protobuf API

protobuf

Zero-dependency proto3 encoder/decoder using Python dataclass schemas.

Part of zerodep: https://github.com/Oaklight/zerodep Copyright (c) 2026 Peng Ding. MIT License.

Encode and decode Protocol Buffers (proto3) wire format using plain Python dataclasses as message schemas. No protoc compiler, no .proto files, no C extensions — just stdlib + type annotations.

Basic usage::

from protobuf import message, field, int32, repeated

@message
class Person:
    name: str = field(1)
    id: int32 = field(2)
    emails: repeated[str] = field(3)

data = Person(name="Alice", id=123, emails=["a@b.com"]).serialize()
person = Person.parse(data)
print(person.to_dict())
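
For orientation, the bytes a proto3 encoder would be expected to emit for the Person above can be assembled by hand. This is a sketch using only the standard library. The tag helper is hypothetical, and the code assumes fields are written in field-number order, which the wire format allows but does not require.

```python
# Hand-assembled proto3 bytes for Person(name="Alice", id=123, emails=["a@b.com"]).
# Assumption: fields appear in field-number order.

VARINT, LEN = 0, 2


def tag(field_number: int, wire_type: int) -> bytes:
    # Hypothetical helper: single-byte tag, valid only for field numbers 1..15.
    return bytes(((field_number << 3) | wire_type,))


name = b"Alice"
email = b"a@b.com"

expected = (
    tag(1, LEN) + bytes((len(name),)) + name        # name: str = field(1)
    + tag(2, VARINT) + bytes((123,))                # id: int32 = field(2)
    + tag(3, LEN) + bytes((len(email),)) + email    # emails: one LEN record per string
)

print(expected.hex(" "))
# 0a 05 41 6c 69 63 65 10 7b 1a 07 61 40 62 2e 63 6f 6d
```

Every length and the value 123 are below 128, so each fits in a one-byte varint.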

Scalar type aliases::

int32, int64, uint32, uint64        # varint
sint32, sint64                      # varint + ZigZag
fixed32, sfixed32, float32          # 32-bit fixed
fixed64, sfixed64, double           # 64-bit fixed
bool_                               # varint (0/1)

Composite fields::

repeated[int32]         # packed repeated scalars
map_field[str, int32]   # map<string, int32>

Proto3 semantics:

- All fields are optional with zero-value defaults.
- Fields at their default value are NOT serialized.
- Unknown fields are preserved across parse → serialize round-trips.

WireType

Bases: IntEnum

Proto3 wire types.

Source code in protobuf/protobuf.py
class WireType(IntEnum):
    """Proto3 wire types."""

    VARINT = 0
    FIXED64 = 1
    LEN = 2
    # SGROUP = 3  (deprecated, not supported)
    # EGROUP = 4  (deprecated, not supported)
    FIXED32 = 5

ScalarType

Bases: IntEnum

Identifies the proto3 scalar encoding strategy.

Source code in protobuf/protobuf.py
class ScalarType(IntEnum):
    """Identifies the proto3 scalar encoding strategy."""

    INT32 = 0
    INT64 = 1
    UINT32 = 2
    UINT64 = 3
    SINT32 = 4
    SINT64 = 5
    BOOL = 6
    FIXED32 = 7
    FIXED64 = 8
    SFIXED32 = 9
    SFIXED64 = 10
    FLOAT = 11
    DOUBLE = 12
    STRING = 13
    BYTES = 14
    ENUM = 15

ProtoScalar dataclass

Annotated marker carrying proto wire-type metadata for a scalar field.

Placed inside Annotated[base_type, ProtoScalar(...)] to tell the encoder/decoder how to serialize the value on the wire.

Source code in protobuf/protobuf.py
@dataclasses.dataclass(frozen=True)
class ProtoScalar:
    """Annotated marker carrying proto wire-type metadata for a scalar field.

    Placed inside ``Annotated[base_type, ProtoScalar(...)]`` to tell the
    encoder/decoder how to serialize the value on the wire.
    """

    scalar_type: ScalarType
    wire_type: WireType

    @property
    def is_numeric(self) -> bool:
        return self.scalar_type not in (ScalarType.STRING, ScalarType.BYTES)

    @property
    def is_packable(self) -> bool:
        return self.wire_type != WireType.LEN
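
The Annotated mechanism that ProtoScalar relies on can be shown with the standard library alone. In this sketch, Marker and sint32_like are hypothetical stand-ins, not names from the module. It shows only how metadata placed inside Annotated[...] is recovered at runtime.

```python
import dataclasses
from typing import Annotated, get_args, get_type_hints


@dataclasses.dataclass(frozen=True)
class Marker:
    """Hypothetical stand-in for ProtoScalar: records only an encoding name."""

    encoding: str


# Hypothetical alias shaped like Annotated[base_type, ProtoScalar(...)].
sint32_like = Annotated[int, Marker("sint32")]


class Point:
    x: sint32_like
    y: int


# include_extras=True keeps the Annotated wrapper instead of stripping it.
hints = get_type_hints(Point, include_extras=True)
base_type, marker = get_args(hints["x"])
print(base_type, marker)     # <class 'int'> Marker(encoding='sint32')
print(get_args(hints["y"]))  # () : a plain annotation carries no metadata
```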

Repeated dataclass

Annotated marker for repeated fields.

repeated[int32] expands to Annotated[list[int], Repeated(...), ProtoScalar(...)].

Source code in protobuf/protobuf.py
@dataclasses.dataclass(frozen=True)
class Repeated:
    """Annotated marker for ``repeated`` fields.

    ``repeated[int32]`` expands to
    ``Annotated[list[int], Repeated(...), ProtoScalar(...)]``.
    """

    item_type: type | Any  # The inner type (e.g., int32, str, a message class)

MapField dataclass

Annotated marker for map<K, V> fields.

map_field[str, int32] expands to Annotated[dict[str, int], MapField(...)].

Source code in protobuf/protobuf.py
@dataclasses.dataclass(frozen=True)
class MapField:
    """Annotated marker for ``map<K, V>`` fields.

    ``map_field[str, int32]`` expands to ``Annotated[dict[str, int], MapField(...)]``.
    """

    key_type: type | Any
    value_type: type | Any

OneofGroup dataclass

Annotated marker for oneof field grouping.

Source code in protobuf/protobuf.py
@dataclasses.dataclass(frozen=True)
class OneofGroup:
    """Annotated marker for ``oneof`` field grouping."""

    group_name: str

repeated

Subscriptable type alias for repeated proto fields.

Usage: emails: repeated[str] = field(3)

Source code in protobuf/protobuf.py
class repeated(metaclass=_RepeatedMeta):  # noqa: N801
    """Subscriptable type alias for repeated proto fields.

    Usage: ``emails: repeated[str] = field(3)``
    """

map_field

Subscriptable type alias for map proto fields.

Usage: attrs: map_field[str, int32] = field(5)

Source code in protobuf/protobuf.py
class map_field(metaclass=_MapFieldMeta):  # noqa: N801
    """Subscriptable type alias for map proto fields.

    Usage: ``attrs: map_field[str, int32] = field(5)``
    """

FieldKind

Bases: IntEnum

Categories for how a field is encoded on the wire.

Source code in protobuf/protobuf.py
class FieldKind(IntEnum):
    """Categories for how a field is encoded on the wire."""

    SCALAR = 0
    STRING = 1
    BYTES = 2
    MESSAGE = 3
    ENUM = 4
    REPEATED_SCALAR = 5  # packed
    REPEATED_MESSAGE = 6  # length-delimited per element
    REPEATED_ENUM = 7  # packed
    REPEATED_STRING = 8  # length-delimited per element
    REPEATED_BYTES = 9  # length-delimited per element
    MAP = 10

FieldInfo dataclass

Resolved metadata for a single proto field.

Source code in protobuf/protobuf.py
@dataclasses.dataclass(frozen=True)
class FieldInfo:
    """Resolved metadata for a single proto field."""

    name: str
    number: int
    kind: FieldKind
    wire_type: WireType
    scalar: ProtoScalar | None  # non-None for scalar/enum kinds
    message_type: type | None  # non-None for MESSAGE/REPEATED_MESSAGE
    repeated_marker: Repeated | None
    map_marker: MapField | None
    oneof_group: str | None
    python_type: type  # base Python type (int, str, etc.)
    default_value: Any  # proto3 zero-value
    _decoder: Any = None  # cached dispatch handler, set after _FIELD_DECODERS is built
    _encoder: Any = None  # cached encode handler, set after _FIELD_ENCODERS is built
    _is_default: Any = None  # cached default-value check, bound at build time
    _tag: bytes = b""  # cached tag bytes for this field's native wire type
    _len_tag: bytes = b""  # cached tag bytes for LEN wire type (repeated/map)
    _map_meta: Any = None  # cached _MapMeta for MAP fields

encode_varint(value)

Encode an integer as a varint. Negative values are first masked to their 64-bit two's complement form, so they always occupy 10 bytes.

Parameters:

Name   Type  Description                                                                  Default
value  int   Integer to encode; negative values are treated as 64-bit two's complement.  required

Returns:

Type   Description
bytes  Varint-encoded bytes.

Source code in protobuf/protobuf.py
def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer as a varint.

    Args:
        value: Non-negative integer to encode.

    Returns:
        Varint-encoded bytes.
    """
    if value < 0:
        # Proto3 treats negative int32/int64 as 10-byte two's complement
        value = value & 0xFFFFFFFFFFFFFFFF
    # Fast-paths for common small values (tags, lengths, small ints)
    if value < 0x80:
        return bytes((value,))
    if value < 0x4000:
        return bytes(((value & 0x7F) | 0x80, value >> 7))
    buf = bytearray()
    while value > 0x7F:
        buf.append((value & 0x7F) | 0x80)
        value >>= 7
    buf.append(value & 0x7F)
    return bytes(buf)
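
A few worked values make the encoding concrete. The function below is a compact re-implementation of the listing above without the fast paths, included only so the example runs on its own.

```python
def encode_varint(value: int) -> bytes:
    # Same algorithm as the listing above, minus the small-value fast paths.
    if value < 0:
        value &= 0xFFFFFFFFFFFFFFFF  # 64-bit two's complement
    buf = bytearray()
    while value > 0x7F:
        buf.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    buf.append(value)  # final byte, continuation bit clear
    return bytes(buf)


print(encode_varint(1).hex())    # 01
print(encode_varint(300).hex())  # ac02
print(len(encode_varint(-1)))    # 10
```

300 is 0b10_0101100: the low seven bits (0x2C) go out first with the continuation bit set (0xAC), followed by the remaining bits (0x02).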

decode_varint(data, pos)

Decode a varint from data starting at pos.

Parameters:

Name  Type                            Description           Default
data  bytes | bytearray | memoryview  Buffer to read from.  required
pos   int                             Start offset.         required

Returns:

Type             Description
tuple[int, int]  Tuple of (decoded value, new position after varint).

Raises:

Type        Description
ValueError  If the varint is malformed or exceeds 10 bytes.

Source code in protobuf/protobuf.py
def decode_varint(data: bytes | bytearray | memoryview, pos: int) -> tuple[int, int]:
    """Decode a varint from *data* starting at *pos*.

    Args:
        data: Buffer to read from.
        pos: Start offset.

    Returns:
        Tuple of (decoded value, new position after varint).

    Raises:
        ValueError: If the varint is malformed or exceeds 10 bytes.
    """
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("Unexpected end of data while reading varint")
        byte = data[pos]
        result |= (byte & 0x7F) << shift
        pos += 1
        if not (byte & 0x80):
            break
        shift += 7
        if shift >= 70:
            raise ValueError("Varint too long (> 10 bytes)")
    return result, pos
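
The returned position makes it easy to read consecutive varints from one buffer. The sketch below condenses the listing above so it is self-contained. The signed reinterpretation at the end is an illustration of what a caller has to do for int32/int64 values, not code taken from the module.

```python
def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    # Same algorithm as the listing above, condensed.
    result = shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("Unexpected end of data while reading varint")
        byte = data[pos]
        result |= (byte & 0x7F) << shift
        pos += 1
        if not (byte & 0x80):
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("Varint too long (> 10 bytes)")


buf = b"\xac\x02\x01"
value, pos = decode_varint(buf, 0)
print(value, pos)               # 300 2
print(decode_varint(buf, pos))  # (1, 3)

# A negative int64 arrives as an unsigned 64-bit value.
raw, _ = decode_varint(b"\xff" * 9 + b"\x01", 0)
signed = raw - (1 << 64) if raw >= (1 << 63) else raw
print(signed)                   # -1

try:
    decode_varint(b"\x80", 0)   # continuation bit set, but no next byte
except ValueError as exc:
    print(exc)
```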

zigzag_encode(value)

ZigZag-encode a signed integer.

Maps signed integers to unsigned: 0→0, -1→1, 1→2, -2→3, …

Parameters:

Name   Type  Description      Default
value  int   Signed integer.  required

Returns:

Type  Description
int   ZigZag-encoded unsigned integer.

Source code in protobuf/protobuf.py
def zigzag_encode(value: int) -> int:
    """ZigZag-encode a signed integer.

    Maps signed integers to unsigned: 0→0, -1→1, 1→2, -2→3, …

    Args:
        value: Signed integer.

    Returns:
        ZigZag-encoded unsigned integer.
    """
    return (value << 1) ^ (value >> 63)

zigzag_decode(value)

ZigZag-decode an unsigned integer back to signed.

Parameters:

Name   Type  Description                       Default
value  int   ZigZag-encoded unsigned integer.  required

Returns:

Type  Description
int   Original signed integer.

Source code in protobuf/protobuf.py
def zigzag_decode(value: int) -> int:
    """ZigZag-decode an unsigned integer back to signed.

    Args:
        value: ZigZag-encoded unsigned integer.

    Returns:
        Original signed integer.
    """
    return (value >> 1) ^ -(value & 1)
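
The two ZigZag functions are inverses of each other for any value that fits in a signed 64-bit integer. The short check below copies both one-liners from the listings above and prints the mapping for a few values, including the int32 extremes.

```python
def zigzag_encode(value: int) -> int:
    return (value << 1) ^ (value >> 63)


def zigzag_decode(value: int) -> int:
    return (value >> 1) ^ -(value & 1)


samples = (0, -1, 1, -2, 2, 2**31 - 1, -(2**31))
for n in samples:
    z = zigzag_encode(n)
    assert zigzag_decode(z) == n  # round-trip holds
    print(f"{n:>11} -> {z}")
```

Because -1 maps to 1, a small negative sint32/sint64 value fits in a one-byte varint, whereas the same value in an int32/int64 field takes 10 bytes.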

make_tag(field_number, wire_type)

Pack a field number and wire type into a tag varint.

Parameters:

Name          Type  Description                        Default
field_number  int   Proto field number (1–536870911).  required
wire_type     int   Wire type (0–5).                   required

Returns:

Type   Description
bytes  Varint-encoded tag bytes.

Source code in protobuf/protobuf.py
def make_tag(field_number: int, wire_type: int) -> bytes:
    """Pack a field number and wire type into a tag varint.

    Args:
        field_number: Proto field number (1–536870911).
        wire_type: Wire type (0–5).

    Returns:
        Varint-encoded tag bytes.
    """
    key = (field_number, wire_type)
    cached = _TAG_CACHE.get(key)
    if cached is not None:
        return cached
    tag = encode_varint((field_number << 3) | wire_type)
    _TAG_CACHE[key] = tag
    return tag
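
The tag layout is (field_number << 3) | wire_type, encoded as a varint. The sketch below is an uncached version of the listing above with its own minimal varint encoder, so it runs standalone. It shows why field numbers 1 to 15 are the cheapest.

```python
def encode_varint(value: int) -> bytes:
    # Minimal encoder for non-negative values only.
    buf = bytearray()
    while value > 0x7F:
        buf.append((value & 0x7F) | 0x80)
        value >>= 7
    buf.append(value)
    return bytes(buf)


def make_tag(field_number: int, wire_type: int) -> bytes:
    # Uncached version of the listing above.
    return encode_varint((field_number << 3) | wire_type)


print(make_tag(1, 2).hex())           # 0a   : field 1, LEN
print(make_tag(15, 0).hex())          # 78   : largest field number with a 1-byte tag
print(make_tag(16, 0).hex())          # 8001 : field 16 needs two bytes
print(len(make_tag(536870911, 5)))    # 5    : maximum field number
```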

decode_tag(data, pos)

Decode a tag varint into field number and wire type.

Parameters:

Name  Type                            Description    Default
data  bytes | bytearray | memoryview  Buffer.        required
pos   int                             Start offset.  required

Returns:

Type                  Description
tuple[int, int, int]  Tuple of (field_number, wire_type, new position).

Source code in protobuf/protobuf.py
def decode_tag(data: bytes | bytearray | memoryview, pos: int) -> tuple[int, int, int]:
    """Decode a tag varint into field number and wire type.

    Args:
        data: Buffer.
        pos: Start offset.

    Returns:
        Tuple of (field_number, wire_type, new position).
    """
    tag, pos = decode_varint(data, pos)
    return tag >> 3, tag & 0x07, pos
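
Tag decoding is the first step of every record read. The loop below is a simplified sketch, not the module's parser: it walks a hand-built buffer and handles only the VARINT and LEN wire types. Its decode_varint omits the bounds and length checks of the real function.

```python
def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    # No bounds or length checks in this sketch.
    result = shift = 0
    while True:
        byte = data[pos]
        result |= (byte & 0x7F) << shift
        pos += 1
        if not (byte & 0x80):
            return result, pos
        shift += 7


def decode_tag(data: bytes, pos: int) -> tuple[int, int, int]:
    tag, pos = decode_varint(data, pos)
    return tag >> 3, tag & 0x07, pos


# field 1 (VARINT) = 150, then field 2 (LEN) = b"hi"
buf = bytes.fromhex("089601" "12026869")
pos = 0
records = []
while pos < len(buf):
    number, wire_type, pos = decode_tag(buf, pos)
    if wire_type == 0:    # VARINT
        value, pos = decode_varint(buf, pos)
    elif wire_type == 2:  # LEN: varint length, then that many bytes
        length, pos = decode_varint(buf, pos)
        value, pos = buf[pos:pos + length], pos + length
    else:
        raise ValueError(f"wire type {wire_type} not handled in this sketch")
    records.append((number, wire_type, value))

print(records)  # [(1, 0, 150), (2, 2, b'hi')]
```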

encode_fixed32(value)

Encode a 32-bit value in little-endian.

Source code in protobuf/protobuf.py
def encode_fixed32(value: int) -> bytes:
    """Encode a 32-bit value in little-endian."""
    return struct.pack("<I", value & 0xFFFFFFFF)

decode_fixed32(data, pos)

Decode a 32-bit little-endian unsigned integer.

Source code in protobuf/protobuf.py
def decode_fixed32(data: bytes | bytearray | memoryview, pos: int) -> tuple[int, int]:
    """Decode a 32-bit little-endian unsigned integer."""
    if pos + 4 > len(data):
        raise ValueError("Unexpected end of data reading fixed32")
    return struct.unpack_from("<I", data, pos)[0], pos + 4

encode_sfixed32(value)

Encode a signed 32-bit value in little-endian.

Source code in protobuf/protobuf.py
def encode_sfixed32(value: int) -> bytes:
    """Encode a signed 32-bit value in little-endian."""
    return struct.pack("<i", value)

decode_sfixed32(data, pos)

Decode a signed 32-bit little-endian integer.

Source code in protobuf/protobuf.py
def decode_sfixed32(data: bytes | bytearray | memoryview, pos: int) -> tuple[int, int]:
    """Decode a signed 32-bit little-endian integer."""
    if pos + 4 > len(data):
        raise ValueError("Unexpected end of data reading sfixed32")
    return struct.unpack_from("<i", data, pos)[0], pos + 4

encode_fixed64(value)

Encode a 64-bit value in little-endian.

Source code in protobuf/protobuf.py
def encode_fixed64(value: int) -> bytes:
    """Encode a 64-bit value in little-endian."""
    return struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)

decode_fixed64(data, pos)

Decode a 64-bit little-endian unsigned integer.

Source code in protobuf/protobuf.py
def decode_fixed64(data: bytes | bytearray | memoryview, pos: int) -> tuple[int, int]:
    """Decode a 64-bit little-endian unsigned integer."""
    if pos + 8 > len(data):
        raise ValueError("Unexpected end of data reading fixed64")
    return struct.unpack_from("<Q", data, pos)[0], pos + 8

encode_sfixed64(value)

Encode a signed 64-bit value in little-endian.

Source code in protobuf/protobuf.py
def encode_sfixed64(value: int) -> bytes:
    """Encode a signed 64-bit value in little-endian."""
    return struct.pack("<q", value)

decode_sfixed64(data, pos)

Decode a signed 64-bit little-endian integer.

Source code in protobuf/protobuf.py
def decode_sfixed64(data: bytes | bytearray | memoryview, pos: int) -> tuple[int, int]:
    """Decode a signed 64-bit little-endian integer."""
    if pos + 8 > len(data):
        raise ValueError("Unexpected end of data reading sfixed64")
    return struct.unpack_from("<q", data, pos)[0], pos + 8
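
The fixed-width integer helpers are thin wrappers around struct with little-endian format codes. The calls below use struct directly to show the byte layout and the effect of the mask that the unsigned encoders apply and the signed encoders do not.

```python
import struct

print(struct.pack("<I", 1).hex())    # 01000000 : least significant byte first
print(struct.pack("<i", -1).hex())   # ffffffff

# The same four bytes read back as unsigned vs. signed:
raw = bytes.fromhex("ffffffff")
print(struct.unpack_from("<I", raw, 0)[0])  # 4294967295
print(struct.unpack_from("<i", raw, 0)[0])  # -1

# Masking, as encode_fixed32 does, wraps out-of-range input instead of raising:
print(struct.pack("<I", -1 & 0xFFFFFFFF).hex())  # ffffffff

# encode_sfixed32 has no mask, so out-of-range input raises struct.error:
try:
    struct.pack("<i", 2**31)
except struct.error as exc:
    print("struct.error:", exc)
```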

encode_float(value)

Encode a 32-bit float.

Source code in protobuf/protobuf.py
def encode_float(value: float) -> bytes:
    """Encode a 32-bit float."""
    return struct.pack("<f", value)

decode_float(data, pos)

Decode a 32-bit float.

Source code in protobuf/protobuf.py
def decode_float(data: bytes | bytearray | memoryview, pos: int) -> tuple[float, int]:
    """Decode a 32-bit float."""
    if pos + 4 > len(data):
        raise ValueError("Unexpected end of data reading float")
    return struct.unpack_from("<f", data, pos)[0], pos + 4

encode_double(value)

Encode a 64-bit double.

Source code in protobuf/protobuf.py
def encode_double(value: float) -> bytes:
    """Encode a 64-bit double."""
    return struct.pack("<d", value)

decode_double(data, pos)

Decode a 64-bit double.

Source code in protobuf/protobuf.py
def decode_double(data: bytes | bytearray | memoryview, pos: int) -> tuple[float, int]:
    """Decode a 64-bit double."""
    if pos + 8 > len(data):
        raise ValueError("Unexpected end of data reading double")
    return struct.unpack_from("<d", data, pos)[0], pos + 8
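
Python floats are 64-bit doubles, so a value survives a double round-trip exactly but may lose precision through the 32-bit float encoding. The check below uses struct directly with the same format codes as the helpers above.

```python
import struct

x = 0.1
as_float = struct.unpack("<f", struct.pack("<f", x))[0]
as_double = struct.unpack("<d", struct.pack("<d", x))[0]

print(as_float == x)   # False : 0.1 is rounded to the nearest 32-bit float
print(as_double == x)  # True
print(len(struct.pack("<f", x)), len(struct.pack("<d", x)))  # 4 8
```

Values that are exactly representable in 32 bits, such as 0.5, round-trip through the float encoding unchanged.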

oneof(group_name)

Create a oneof group marker for use in field metadata.

Usage (groups are normally declared through the oneof keyword of field())::

@message
class Msg:
    text: str = field(1, oneof="body")
    image: bytes = field(2, oneof="body")

Parameters:

Name        Type  Description               Default
group_name  str   Name of the oneof group.  required

Returns:

Type        Description
OneofGroup  OneofGroup marker.

Source code in protobuf/protobuf.py
def oneof(group_name: str) -> OneofGroup:
    """Create a oneof group marker for use in field metadata.

    Usage::

        @message
        class Msg:
            text: str = field(1, oneof="body")
            image: bytes = field(2, oneof="body")

    Args:
        group_name: Name of the oneof group.

    Returns:
        OneofGroup marker.
    """
    return OneofGroup(group_name)

field(number, *, default=dataclasses.MISSING, default_factory=dataclasses.MISSING, oneof=None)

Define a proto field with its field number.

Parameters:

Name             Type        Description                                                             Default
number           int         Proto field number (must be >= 1).                                      required
default          Any         Default value for the field.                                            MISSING
default_factory  Any         Factory for mutable default values; ignored if default is also given.  MISSING
oneof            str | None  Optional oneof group name.                                              None

Returns:

Type  Description
Any   A dataclasses.Field with proto metadata.

Raises:

Type        Description
ValueError  If number is less than 1.

Source code in protobuf/protobuf.py
def field(
    number: int,
    *,
    default: Any = dataclasses.MISSING,
    default_factory: Any = dataclasses.MISSING,
    oneof: str | None = None,
) -> Any:
    """Define a proto field with its field number.

    Args:
        number: Proto field number (must be >= 1).
        default: Default value for the field.
        default_factory: Factory for mutable default values.
        oneof: Optional oneof group name.

    Returns:
        A ``dataclasses.Field`` with proto metadata.
    """
    if number < 1:
        raise ValueError(f"Field number must be >= 1, got {number}")
    metadata: dict[str, Any] = {"proto_number": number}
    if oneof is not None:
        metadata["proto_oneof"] = oneof
    kwargs: dict[str, Any] = {"metadata": metadata}
    if default is not dataclasses.MISSING:
        kwargs["default"] = default
    elif default_factory is not dataclasses.MISSING:
        kwargs["default_factory"] = default_factory
    return dataclasses.field(**kwargs)
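
Since field() only stores its arguments in the dataclass field's metadata mapping, the stored values can be read back with dataclasses.fields(). The sketch below uses a trimmed copy of the listing above (default_factory omitted) on a plain @dataclass, so explicit defaults are passed. The keys proto_number and proto_oneof come from the listing.

```python
import dataclasses
from typing import Any, Optional


def field(number: int, *, default: Any = dataclasses.MISSING,
          oneof: Optional[str] = None) -> Any:
    # Trimmed copy of the listing above (default_factory omitted).
    if number < 1:
        raise ValueError(f"Field number must be >= 1, got {number}")
    metadata = {"proto_number": number}
    if oneof is not None:
        metadata["proto_oneof"] = oneof
    kwargs = {"metadata": metadata}
    if default is not dataclasses.MISSING:
        kwargs["default"] = default
    return dataclasses.field(**kwargs)


@dataclasses.dataclass
class Msg:
    text: str = field(1, default="", oneof="body")
    image: bytes = field(2, default=b"", oneof="body")
    note: str = field(3, default="")


numbers = {f.name: f.metadata["proto_number"] for f in dataclasses.fields(Msg)}
groups = {f.name: f.metadata.get("proto_oneof") for f in dataclasses.fields(Msg)}
print(numbers)  # {'text': 1, 'image': 2, 'note': 3}
print(groups)   # {'text': 'body', 'image': 'body', 'note': None}
```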

message(cls)

Decorator that turns a class into a proto3 message.

Applies @dataclass (if not already applied) and injects the proto3 methods serialize(), parse(), to_dict(), from_dict(), and byte_size(). It also wraps __init__ so that every instance gets an _unknown_fields list.

Usage::

@message
class Person:
    name: str = field(1)
    id: int32 = field(2)

Parameters:

Name  Type  Description             Default
cls   type  The class to decorate.  required

Returns:

Type  Description
type  The decorated class with proto3 capabilities.

Source code in protobuf/protobuf.py
def message(cls: type) -> type:
    """Decorator that turns a class into a proto3 message.

    Applies ``@dataclass`` (if not already applied) and injects proto3
    ``serialize()``, ``parse()``, ``to_dict()``, ``from_dict()`` methods.

    Usage::

        @message
        class Person:
            name: str = field(1)
            id: int32 = field(2)

    Args:
        cls: The class to decorate.

    Returns:
        The decorated class with proto3 capabilities.
    """
    if not dataclasses.is_dataclass(cls):
        _apply_proto3_defaults(cls)
        cls = dataclasses.dataclass(cls)

    # Build descriptor
    descriptor = _MessageDescriptor(cls)
    descriptor._bind_handlers()
    cls._proto_descriptor = descriptor  # type: ignore[attr-defined]

    # Inject methods
    cls.serialize = _msg_serialize  # type: ignore[attr-defined]
    cls.parse = classmethod(_msg_parse)  # type: ignore[attr-defined]
    cls.to_dict = _msg_to_dict  # type: ignore[attr-defined]
    cls.from_dict = classmethod(_msg_from_dict)  # type: ignore[attr-defined]
    cls.byte_size = _msg_byte_size  # type: ignore[attr-defined]

    # Add _unknown_fields support
    original_init = cls.__init__

    def _new_init(self: Any, *args: Any, **kwargs: Any) -> None:
        original_init(self, *args, **kwargs)
        if not hasattr(self, "_unknown_fields"):
            object.__setattr__(self, "_unknown_fields", [])

    cls.__init__ = _new_init  # type: ignore[attr-defined]

    return cls
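
The decorator pattern in the listing above (apply @dataclass if needed, attach methods, wrap __init__) can be reduced to a standalone sketch. toy_message is a hypothetical, heavily simplified analogue. It builds no descriptor and injects only a to_dict based on dataclasses.asdict, which is an assumption and not the module's implementation.

```python
import dataclasses
from typing import Any


def toy_message(cls: type) -> type:
    """Hypothetical, heavily reduced analogue of @message."""
    if not dataclasses.is_dataclass(cls):
        cls = dataclasses.dataclass(cls)

    def to_dict(self: Any) -> dict:
        return dataclasses.asdict(self)

    cls.to_dict = to_dict  # inject an instance method

    original_init = cls.__init__

    def _new_init(self: Any, *args: Any, **kwargs: Any) -> None:
        original_init(self, *args, **kwargs)
        if not hasattr(self, "_unknown_fields"):
            # object.__setattr__ also works on frozen dataclasses.
            object.__setattr__(self, "_unknown_fields", [])

    cls.__init__ = _new_init
    return cls


@toy_message
class Person:
    name: str = ""
    id: int = 0


p = Person(name="Alice", id=123)
print(p.to_dict())        # {'name': 'Alice', 'id': 123}
print(p._unknown_fields)  # []
```

Because _unknown_fields is set as a plain attribute and not declared as a dataclass field, it stays out of to_dict() and out of the generated __eq__ and __repr__.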