Python bytes

Mar 17, 2024 · 2 min read · python ·

About bytes

The Python bytes type represents a binary sequence of bytes.
I use the term "binary" because, although it is sometimes displayed as a sequence of characters, it is not text, nor is it a string!
bytes is immutable, whereas bytearray is a mutable version of it.
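
A minimal sketch of that difference, using made-up variable names (data, buf, frozen): assigning to an item of a bytes object fails, while the same assignment works on a bytearray copy.

```python
data = b"abc"

# bytes is immutable: item assignment raises TypeError
try:
    data[0] = 65
    assignment_failed = False
except TypeError:
    assignment_failed = True

# bytearray is the mutable counterpart
buf = bytearray(data)   # a mutable copy of the same bytes
buf[0] = 65             # 65 is the ASCII code of 'A'
frozen = bytes(buf)     # convert back to an immutable bytes object

print(assignment_failed, buf, frozen)
```

The original data object is left untouched, because bytearray(data) makes a copy.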

Representation

A bytes object looks like a string literal with the letter b in front of it.
Inside the quotes, you may see characters, like this:

>>> b1 = b"uto&&*329f76"
>>> b1
b'uto&&*329f76'
>>> type(b1)
<class 'bytes'>

This means that each byte in the sequence holds the numerical value of the corresponding ASCII character.
You can easily see these values by converting the bytes object to a tuple or a list:

>>> tuple(b1)
(117, 116, 111, 38, 38, 42, 51, 50, 57, 102, 55, 54)
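
As a quick check of that claim, the sketch below compares the byte values with the code points that ord() gives for the same characters in an ordinary string. The names values, from_text and text are only for illustration.

```python
b1 = b"uto&&*329f76"

# Each byte, seen as an int
values = list(b1)

# The code point of each character in the equivalent str
from_text = [ord(ch) for ch in "uto&&*329f76"]

# Going the other way: interpret the bytes as ASCII text
text = b1.decode("ascii")

print(values == from_text)  # True
print(text)                 # uto&&*329f76
```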

Creating a bytes object

You can always create a bytes object by using ASCII letters, but you can also write numerical values in a tuple, and convert to bytes:

>>> bytes((5, 70, 170, 67, 89, 33, 205))
b'\x05F\xaaCY!\xcd'

Note that when a byte value has no printable ASCII character, it is displayed as \xnn, where nn is the 2-digit hexadecimal representation of the byte.
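
If the mix of characters and \xnn escapes is hard to read, one option is to look at every byte in hexadecimal. This is a small sketch using the built-in bytes.hex() and bytes.fromhex() methods; the variable names are my own.

```python
raw = bytes((5, 70, 170, 67, 89, 33, 205))

shown = repr(raw)      # the mixed display: b'\x05F\xaaCY!\xcd'
hex_text = raw.hex()   # every byte as two hex digits: '0546aa435921cd'

# fromhex() turns the hex string back into the same bytes
round_trip = bytes.fromhex(hex_text)

print(shown)
print(hex_text)
print(round_trip == raw)  # True
```

Here 70, 67, 89 and 33 are printable ASCII (F, C, Y, !), so they appear as characters, while 5, 170 and 205 are shown as escapes.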

Of course, since each of these numbers represents an unsigned byte, its value must be between 0 and 255:

>>> bytes((80, 42, 300, 999))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: bytes must be in range(0, 256)
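
When the values come from somewhere else, it may help to find the offending ones before converting. The helper below, out_of_range, is a hypothetical function written for this example and not part of Python.

```python
def out_of_range(values):
    """Return the values that do not fit in one unsigned byte (0..255)."""
    return [v for v in values if not 0 <= v <= 255]

values = (80, 42, 300, 999)
bad = out_of_range(values)

# bytes() itself rejects the whole tuple with a ValueError
try:
    bytes(values)
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(bad)                # [300, 999]
print(conversion_failed)  # True
```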

Using a bytes object

Indexing:

>>> b2 = bytes((5, 70, 170, 67, 89, 33, 205))
>>> b2
b'\x05F\xaaCY!\xcd'
>>> b2[2]
170

Slicing:

>>> b2[1:4]
b'F\xaaC'

Looping:

>>> for b8 in b2[1:5]:
...     print(b8)
...
70
170
67
89
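
One detail worth noticing in the examples above: indexing and looping give int values, while slicing gives a new bytes object, even for a single byte. A short sketch, with illustrative variable names:

```python
b2 = bytes((5, 70, 170, 67, 89, 33, 205))

item = b2[2]       # indexing returns an int
one_byte = b2[2:3] # slicing returns bytes, even when the length is 1
last = b2[-1]      # negative indexes work as with other sequences

print(item, type(item).__name__)          # 170 int
print(one_byte, type(one_byte).__name__)  # b'\xaa' bytes
print(last, len(b2), 70 in b2)            # 205 7 True
```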

Converting int to bytes:

>>> (2257).to_bytes(length=2, byteorder='little')
b'\xd1\x08'

Explanations:
I use parentheses so that the dot is parsed as a member access operation; without them, Python reads "2257." as the start of a float literal and raises a SyntaxError.
I use length=2, because I know that my value (2257, which is 0x08D1) needs 2 bytes to be represented.
I specify byteorder as 'little', because I use an Intel CPU, which is little-endian.
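
The sketch below expands on these points under a few assumptions of mine: it computes the length from int.bit_length() instead of hard-coding it, compares both byte orders, reads the machine's native order from sys.byteorder, and converts back with int.from_bytes().

```python
import sys

value = 2257  # 0x08D1

# Smallest number of bytes that can hold the value (at least 1)
length = max(1, (value.bit_length() + 7) // 8)

little = value.to_bytes(length, byteorder="little")  # low byte first
big = value.to_bytes(length, byteorder="big")        # high byte first

# sys.byteorder is 'little' or 'big', depending on the machine
native = value.to_bytes(length, byteorder=sys.byteorder)

# from_bytes() must be given the same byte order to recover the value
restored = int.from_bytes(little, byteorder="little")

print(length)    # 2
print(little)    # b'\xd1\x08'
print(big)       # b'\x08\xd1'
print(restored)  # 2257
```

Reading the little-endian bytes back with byteorder="big" would give a different number (53512), so the byte order has to be agreed on by both sides.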

Converting a string to bytes:

>>> bytes('יובל', encoding="utf-8")
b'\xd7\x99\xd7\x95\xd7\x91\xd7\x9c'

I used Hebrew letters and the UTF-8 encoding, so I got 2 bytes for each letter.
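
The same conversion can also be written with str.encode(), and bytes.decode() goes the other way. A minimal sketch comparing character and byte counts (the variable names are mine):

```python
text = "יובל"

encoded = text.encode("utf-8")     # same result as bytes(text, encoding="utf-8")
decoded = encoded.decode("utf-8")  # back to the original str

ascii_encoded = "abc".encode("utf-8")  # ASCII letters take 1 byte each in UTF-8

print(len(text), len(encoded))  # 4 8
print(decoded == text)          # True
print(len(ascii_encoded))       # 3
```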