Base64 Encode / Decode

How to encode / decode Base64 in Python?

Base64 encoding converts text (or any binary data) into a string that contains only letters, digits, and a few special characters (+, /, and = for padding). This is useful when you need to store or transfer data through systems that reliably handle only plain text.

In Python, you can encode a string in Base64 using the base64 module:

  1. Convert the string into bytes (because Base64 works with bytes, not text).
  2. Use base64.b64encode() to encode the bytes.
  3. Convert the encoded bytes back into a string so it's easy to read and share.

import base64

def encode_base64(text: str) -> str:
    encoded_bytes = base64.b64encode(text.encode('utf-8'))
    return encoded_bytes.decode('utf-8')

# Example usage
text = "Hello, World!"
encoded_text = encode_base64(text)
print("Encoded:", encoded_text)
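Because b64encode() takes bytes, the same call also works for binary data that is not text at all. A minimal sketch, in which the helper name encode_bytes_base64 and the sample bytes are only illustrative:

```python
import base64


def encode_bytes_base64(data: bytes) -> str:
    # b64encode() returns bytes made up of ASCII characters only,
    # so decoding the result as ASCII cannot fail.
    return base64.b64encode(data).decode("ascii")


# Illustrative binary input: four bytes that are not valid UTF-8 text.
raw = bytes([0x00, 0xFF, 0x10, 0x80])
print("Encoded:", encode_bytes_base64(raw))
```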

Decoding is the reverse process of encoding. When you receive a Base64-encoded string, you need to convert it back into its original text.

To do this in Python:

  1. Use base64.b64decode() to decode the Base64 string into bytes.
  2. Convert the bytes back into a regular text string using .decode('utf-8').

This recovers the original text exactly, provided the input is valid Base64 and the original data was UTF-8 text.

import base64

def decode_base64(encoded_text: str) -> str:
    decoded_bytes = base64.b64decode(encoded_text)
    return decoded_bytes.decode('utf-8')

# Example usage
encoded_text = "SGVsbG8sIFdvcmxkIQ=="
decoded_text = decode_base64(encoded_text)
print("Decoded:", decoded_text)
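Decoding can fail in two places: the input may not be valid Base64, or the decoded bytes may not be UTF-8 text. The sketch below shows one possible way to handle both. The name safe_decode_base64 and the choice to return None on failure are assumptions made for illustration. Passing validate=True makes b64decode() reject characters outside the Base64 alphabet instead of silently discarding them.

```python
import base64
from typing import Optional


def safe_decode_base64(encoded_text: str) -> Optional[str]:
    try:
        # validate=True raises binascii.Error on non-alphabet characters
        # instead of ignoring them.
        decoded_bytes = base64.b64decode(encoded_text, validate=True)
        return decoded_bytes.decode("utf-8")
    except ValueError:
        # binascii.Error and UnicodeDecodeError are both subclasses of
        # ValueError, so this covers invalid Base64 and non-UTF-8 bytes.
        return None


print(safe_decode_base64("SGVsbG8sIFdvcmxkIQ=="))  # Hello, World!
print(safe_decode_base64("not base64!"))           # None
```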

What is Base64 encoding and how does it work?

Base64 encoding is a method for representing binary data as text so that it can be transmitted through text-based channels. It is particularly useful when you need to include binary data in text-based formats like JSON, XML, or URL parameters. For URLs, a URL-safe variant of the alphabet is normally used, because + and / have special meaning there.
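As a small illustration, the sketch below places binary data in a JSON document and also produces a variant suitable for a URL parameter using urlsafe_b64encode(), which writes - and _ where the standard encoding writes + and /. The field names and sample bytes are hypothetical.

```python
import base64
import json

# Hypothetical binary payload; these bytes are chosen so that the
# standard encoding contains both '+' and '/'.
payload = bytes([0xFB, 0xEF, 0xFF])

standard = base64.b64encode(payload).decode("ascii")
url_safe = base64.urlsafe_b64encode(payload).decode("ascii")

# JSON has no binary type, so the bytes travel as a Base64 string.
document = json.dumps({"name": "blob.bin", "data": standard})

print(standard)  # ++//
print(url_safe)  # --__
print(document)
```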

At its core, Base64 represents binary data using an alphabet of 64 ASCII characters: uppercase A-Z, lowercase a-z, digits 0-9, and two additional characters, typically + and /. Each character stands for a 6-bit value from 0 to 63.

The encoding process has four steps:

  1. Group the Bytes: The data is divided into groups of 3 bytes. Since each byte is 8 bits, this results in 24 bits per group.
  2. Divide into Subgroups: Each 24-bit group is split into four 6-bit subgroups.
  3. Map to Characters: Each 6-bit subgroup is mapped to a character using the Base64 character set.
  4. Padding: If the input length is not a multiple of 3 bytes, the last group is filled out with zero bits and = is appended so that the output length stays a multiple of four characters. One leftover byte yields two characters followed by ==, and two leftover bytes yield three characters followed by =.
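
The four steps can be sketched directly in Python. This is an illustrative reimplementation for understanding only (the function name manual_b64encode is made up here); real code should keep using base64.b64encode().

```python
import base64
import string

# The 64-character alphabet: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def manual_b64encode(data: bytes) -> str:
    chars = []
    # Step 1: walk through the data in groups of up to 3 bytes.
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full group
        # Fill a short final group with zero bytes to get 24 bits.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Steps 2 and 3: split into four 6-bit values, map each to a character.
        quartet = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Step 4: characters that came only from the fill bytes become '='.
        if missing:
            quartet[-missing:] = "=" * missing
        chars.extend(quartet)
    return "".join(chars)


# Compare against the standard library for full and partial groups.
for sample in (b"Man", b"Ma", b"M"):
    assert manual_b64encode(sample) == base64.b64encode(sample).decode("ascii")
    print(sample, "->", manual_b64encode(sample))
```

The three samples show each padding case: "Man" needs no padding, "Ma" ends in one =, and "M" ends in ==.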

Example:

Consider the text "Man". Its ASCII binary representation is:

M: 01001101
a: 01100001
n: 01101110

These binary strings combine to form:

010011010110000101101110

Split into 6-bit groups, they become:

010011  010110  000101  101110

Read as numbers, these groups are 19, 22, 5, and 46, which the Base64 alphabet maps to:

TWFu
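
The same walk-through can be reproduced in a few lines of Python as a sanity check. The variable names are arbitrary.

```python
import base64

text = "Man"

# Concatenate the 8-bit binary form of each byte.
bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
print(bits)  # 010011010110000101101110

# Cut the 24 bits into four 6-bit groups and read each one as an integer.
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
indexes = [int(group, 2) for group in groups]
print(groups)   # ['010011', '010110', '000101', '101110']
print(indexes)  # [19, 22, 5, 46]

# The standard library produces the same four characters.
print(base64.b64encode(text.encode("ascii")).decode("ascii"))  # TWFu
```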

Common use cases:

  • Email Attachments: Base64 is commonly used to encode email attachments, ensuring binary data is sent over protocols that only support textual data.
  • Data URIs: It enables embedding images and other media in webpages via data URIs, reducing the need for separate file requests.
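  • Tokens and Credentials: APIs often carry tokens and credentials in Base64 so they survive text-only transports. JSON Web Tokens use the URL-safe variant, and HTTP Basic authentication Base64-encodes the username and password. Base64 only makes the value text-safe; it does not encrypt it.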
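
For the data URI case, here is a sketch of how such a URI might be assembled in Python. The helper name make_data_uri is hypothetical, and the bytes used are just the 8-byte PNG file signature rather than a complete image.

```python
import base64


def make_data_uri(data: bytes, mime_type: str) -> str:
    # A Base64 data URI has the form  data:<mime type>;base64,<encoded data>
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder content: the PNG file signature only, not a viewable image.
png_signature = b"\x89PNG\r\n\x1a\n"
print(make_data_uri(png_signature, "image/png"))
```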

Base64 is a straightforward encoding that turns arbitrary binary data into text that is easy to handle, store, and transmit. It is used across many applications because it preserves the data exactly and works with text-only channels, at the cost of output that is roughly one third larger than the input (4 characters for every 3 bytes). When binary data has to pass through a text-based format, Base64 is a reasonable default choice.
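
To see the size cost concretely: for n input bytes, padded Base64 output always has length 4 * ceil(n / 3). A quick sketch that checks this against the standard library (the helper name base64_length is illustrative):

```python
import base64
import math


def base64_length(n_bytes: int) -> int:
    # Each group of up to 3 bytes produces exactly 4 characters (with padding).
    return 4 * math.ceil(n_bytes / 3)


for n in (1, 3, 100, 3000):
    # bytes(n) is n zero bytes; the content does not affect the length.
    actual = len(base64.b64encode(bytes(n)))
    print(n, "bytes ->", actual, "characters")
    assert actual == base64_length(n)
```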