Python bytes

About bytes

  • The Python bytes type represents a binary sequence of bytes.
  • I use the term "binary" because a bytes object is sometimes displayed as a sequence of characters, but it is not text, nor is it a string!
  • bytes is immutable, whereas bytearray is a mutable version of it.
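
A minimal sketch of the difference between the two types (the variable names are mine, chosen only for illustration): assigning to an item of a bytes object raises TypeError, while the same assignment on a bytearray works.

```python
# bytes is immutable: item assignment is rejected.
frozen = b"abc"
try:
    frozen[0] = 65
    assignment_failed = False
except TypeError:
    assignment_failed = True

# bytearray is the mutable counterpart: the same assignment succeeds.
editable = bytearray(b"abc")
editable[0] = 65          # 65 is the ASCII code of 'A'

print(assignment_failed)  # True
print(editable)           # bytearray(b'Abc')
print(bytes(editable))    # b'Abc'
```

Calling bytes() on the bytearray gives back an immutable copy, which is handy once the editing is done.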

Representation

  • A bytes object looks like a string prefixed with the letter b.
    Inside quotes, you may see characters, like this:
>>> b1 = b"uto&&*329f76"
>>> b1
b'uto&&*329f76'
>>> type(b1)
<class 'bytes'>
>>>
  • This means that each byte in the sequence holds the numerical value of the corresponding ASCII character.
  • You can easily see these values by converting the bytes object to a tuple or a list:
>>> tuple(b1)
(117, 116, 111, 38, 38, 42, 51, 50, 57, 102, 55, 54)
>>>
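
The same correspondence can be checked with the built-in ord() and chr() functions. This is only a small sketch; the names text and data are illustrative and not part of the session above.

```python
text = "uto&&*329f76"
data = b"uto&&*329f76"

# ord() gives the code point of each character; for ASCII text this
# equals the value of the corresponding byte.
codes_from_text = [ord(ch) for ch in text]
codes_from_bytes = list(data)

print(codes_from_text == codes_from_bytes)  # True
print(chr(data[0]))                         # u
```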

Creating a bytes object

  • You can always create a bytes object from a literal made of ASCII characters, but you can also put numerical values in a tuple (or any other iterable of integers) and convert it to bytes:
>>> bytes( (5, 70, 170, 67, 89, 33, 205)  )
b'\x05F\xaaCY!\xcd'
>>>

Note that when a byte value does not correspond to a printable ASCII character, it is displayed as an escape sequence, typically \xnn, where nn is the 2-digit hexadecimal value of the byte.
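
To see which values get an escape, here is a small sketch (the sample values are picked by me) that prints the representation of a few single-byte objects. Printable ASCII values appear as characters, a few control characters have short escapes such as \n, and everything else falls back to \xnn.

```python
# Each entry is one byte value; repr() shows how Python displays it.
samples = [65, 33, 10, 5, 170, 255]
shown = [repr(bytes([value])) for value in samples]

for value, text in zip(samples, shown):
    print(value, text)
# 65 b'A'
# 33 b'!'
# 10 b'\n'
# 5 b'\x05'
# 170 b'\xaa'
# 255 b'\xff'
```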

  • Of course, since each of these numbers represents an unsigned byte, its value must be between 0 and 255:
>>> bytes(  (80, 42, 300, 999)  )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: bytes must be in range(0, 256)
>>>

Using a bytes object

  • Indexing:
>>> b2 = bytes( (5, 70, 170, 67, 89, 33, 205)  )
>>> b2
b'\x05F\xaaCY!\xcd'
>>> b2[2]
170
>>>
  • Slicing:
>>> b2[1:4]
b'F\xaaC'
>>>
  • Looping:
>>> for b8 in b2[1:5]:
...   print(b8)
... 
70
170
67
89
>>>
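
One detail that is easy to miss in the examples above: indexing gives an int, while slicing gives another bytes object, even when the slice is only one byte long. A short sketch (single and one_slice are names I made up for the illustration):

```python
b2 = bytes((5, 70, 170, 67, 89, 33, 205))

single = b2[2]       # indexing returns an int
one_slice = b2[2:3]  # slicing returns bytes, even for a single byte

print(type(single).__name__, single)        # int 170
print(type(one_slice).__name__, one_slice)  # bytes b'\xaa'

# Because looping yields ints, numeric functions work directly on a slice.
print(sum(b2[1:5]))                         # 396
```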
  • Converting int to bytes:
>>> (2257).to_bytes(length=2, byteorder='little')
b'\xd1\x08'
>>>

Explanations:
I put the number in parentheses so that the dot is parsed as a member access operation and not as a decimal point.
I use length=2 because I know that my value (2257) needs 2 bytes to be represented.
I specify byteorder as 'little' because I use an Intel CPU, which is little-endian; sys.byteorder reports the native byte order of the machine.
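
Here is a small sketch of the same conversion in both byte orders, plus the way back with int.from_bytes(). The names needed, little, big and restored are mine; the length is computed from bit_length() instead of being hard-coded, which is just one possible approach.

```python
import sys

value = 2257  # 0x08D1

# Smallest number of bytes that can hold the value (at least one).
needed = max(1, (value.bit_length() + 7) // 8)

little = value.to_bytes(needed, byteorder="little")  # low byte first
big = value.to_bytes(needed, byteorder="big")        # high byte first
restored = int.from_bytes(little, byteorder="little")

print(needed)         # 2
print(little)         # b'\xd1\x08'
print(big)            # b'\x08\xd1'
print(restored)       # 2257
print(sys.byteorder)  # native byte order of this machine
```

The byte order used for reading must match the one used for writing; otherwise int.from_bytes() returns a different number.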

  • Converting a string to bytes:
>>> bytes('יובל', encoding="utf-8")
b'\xd7\x99\xd7\x95\xd7\x91\xd7\x9c'
>>>
  • I used Hebrew letters
  • ...and UTF-8 encoding, so I got 2 bytes for each letter.
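
The conversion also works in the other direction. The sketch below uses str.encode() and bytes.decode(), which are the more common spelling of the same operation; the variable names are illustrative.

```python
name = "יובל"

encoded = name.encode("utf-8")     # same result as bytes(name, encoding="utf-8")
decoded = encoded.decode("utf-8")  # back to a str

print(len(name), len(encoded))     # 4 8
print(decoded == name)             # True
print(len("abc".encode("utf-8")))  # 3
```

The last line shows that ASCII characters take one byte each in UTF-8, so the number of bytes per character depends on the text.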