Define custom field encoders
Use CEncoded when you need full control over how a field is
read from and written to a binary buffer. Unlike CFormat,
which wraps Python's struct module, CEncoded lets you
supply your own pack and unpack functions directly.
When to reach for CEncoded
CFormat covers the common case — fixed-size fields described
by a struct format string. Reach for CEncoded when:
- the field has variable length (size not known until read time)
- the binary encoding is not expressible as a struct format string
- you need to control raw I/O against the buffer yourself
Use the built-in RawBytes alias
The simplest CEncoded field is
RawBytes, which reads all remaining bytes in the buffer.
Place it as the last field in a model.

    from cmodel import CModel
    from cmodel.types import RawBytes


    class Packet(CModel):
        header: int
        data: RawBytes

After unpacking, data contains everything after header. When packing, the raw bytes
are written as-is.
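The read-the-rest behavior can be imitated with the standard library alone. The sketch below is not cmodel's implementation: `unpack_raw` and `pack_raw` are hypothetical helper names, and the one-byte header is an assumption made only for this example.

```python
import io


def unpack_raw(buf: io.BytesIO) -> bytes:
    # read() with no argument returns everything from the current
    # position to the end of the buffer.
    return buf.read()


def pack_raw(buf: io.BytesIO, value: bytes) -> None:
    # The payload is written unchanged.
    buf.write(value)


buf = io.BytesIO(b"\x01payload")
header = buf.read(1)[0]   # assumed one-byte header, consumed first
data = unpack_raw(buf)    # everything that follows the header

out = io.BytesIO()
out.write(bytes([header]))
pack_raw(out, data)
```

Because `read()` consumes the rest of the buffer, nothing is left for a field that comes after it, which is why such a field belongs at the end of a model.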
Define a custom encoder
CEncoded takes a single argument: get_encoder, a factory
function called with the struct's [EndianType][cmodel.schema.EndianType] and
[SizeType][cmodel.schema.SizeType]. It must return a
[CEncoderSchema][cmodel.schema.CEncoderSchema] dict.

    from typing import Annotated

    from cmodel import CEncoded
    from cmodel import CModel
    from cmodel.schema import CEncoderSchema


    def uint24(endian: str, size: str) -> CEncoderSchema[int]:
        # "native" is treated as little-endian here, which holds on x86 and most ARM hosts.
        byteorder = "little" if endian in ("native", "little") else "big"

        def unpack(buf):
            # Read exactly three bytes and decode them as an unsigned integer.
            return int.from_bytes(buf.read(3), byteorder)

        def pack(buf, value):
            # to_bytes raises OverflowError if value is negative or needs more than three bytes.
            buf.write(value.to_bytes(3, byteorder))

        return CEncoderSchema[int](
            type="encoder",
            size=3,
            alignment=1,
            unpack=unpack,
            pack=pack,
            schema_equality_info=("example", "uint24", byteorder),
        )


    UInt24 = Annotated[int, CEncoded(get_encoder=uint24)]


    class AudioSample(CModel):
        left: UInt24
        right: UInt24

Python's struct module has no format character for a 24-bit integer, but 24-bit
fields are common in audio formats and network protocols. Since the encoding falls
outside what struct can express, CFormat cannot describe it
— exactly the kind of field CEncoded is designed for.
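To see the byte layout such an encoder produces, the same `int.to_bytes` and `int.from_bytes` calls can be exercised against an `io.BytesIO` without cmodel. This is a standalone sketch; `pack_uint24` and `unpack_uint24` are illustrative names, not library functions.

```python
import io
import struct


def pack_uint24(buf, value, byteorder):
    # to_bytes raises OverflowError if value does not fit in 3 bytes.
    buf.write(value.to_bytes(3, byteorder))


def unpack_uint24(buf, byteorder):
    return int.from_bytes(buf.read(3), byteorder)


buf = io.BytesIO()
pack_uint24(buf, 0x123456, "little")
pack_uint24(buf, 0x123456, "big")
raw = buf.getvalue()  # b"\x56\x34\x12" followed by b"\x12\x34\x56"

buf.seek(0)
little = unpack_uint24(buf, "little")
big = unpack_uint24(buf, "big")

# The nearest unsigned integer formats in struct occupy 2 and 4 bytes,
# so neither one describes a 3-byte field.
sizes = [struct.calcsize("<H"), struct.calcsize("<I")]
```

Both reads return the original value, since each one decodes with the byte order that was used to write it.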
Understand CEncoderSchema fields
The [CEncoderSchema][cmodel.schema.CEncoderSchema] dict returned by get_encoder
has the following keys:
| Key | Type | Purpose |
|---|---|---|
| type | "encoder" | Must always be "encoder". |
| size | int \| None | Byte size of the encoded value, or None for variable length. |
| alignment | int | Alignment requirement in bytes. |
| unpack | (BytesIO) -> T | Read the value from a buffer. |
| pack | (BytesIO, T) -> Any | Write the value to a buffer. |
| schema_equality_info | Hashable | Used to compare two schemas for equality. |
Set size to None for variable-length fields. A variable-length field should
generally be the last field in a struct, since its unpack function may read to the end
of the buffer.
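As an illustration of a variable-length field, here is a sketch of a factory for a byte string prefixed with a one-byte length. It returns a plain dict with the keys listed above as a stand-in for CEncoderSchema, so the example runs without cmodel. The factory name `pascal_bytes`, the contents of `schema_equality_info`, and the argument values passed to the factory are assumptions.

```python
import io


def pascal_bytes(endian, size):
    # A one-byte length prefix has no byte order, so both arguments are unused.
    def unpack(buf):
        length = buf.read(1)[0]
        return buf.read(length)

    def pack(buf, value):
        if len(value) > 255:
            raise ValueError("payload too long for a one-byte length prefix")
        buf.write(bytes([len(value)]))
        buf.write(value)

    # Plain dict standing in for CEncoderSchema (same keys as the table).
    return {
        "type": "encoder",
        "size": None,  # variable length: not known until read time
        "alignment": 1,
        "unpack": unpack,
        "pack": pack,
        "schema_equality_info": ("example", "pascal_bytes"),
    }


schema = pascal_bytes("little", "native")  # placeholder argument values
buf = io.BytesIO()
schema["pack"](buf, b"hello")
encoded = buf.getvalue()  # b"\x05hello"

buf.seek(0)
decoded = schema["unpack"](buf)
```

The length prefix tells `unpack` where to stop, so this particular encoder does not read to the end of the buffer.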
Respect endianness and size type
The factory function receives the struct's endian and size arguments so you can
build format strings or choose encoding strategies that match the model's byte order.
For a model declared as big-endian, the uint24 factory is called with endian="big", so it writes the three bytes
in big-endian order.
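One way to use the endian argument is to map it to a struct byte-order prefix inside the factory. The sketch below assumes the endian values are the strings "native", "little", and "big", as the uint24 example suggests. `PREFIXES`, `fixed_8_8`, and `encode` are hypothetical names, the alignment value is a guess, and the returned dict is a stand-in for CEncoderSchema.

```python
import io
import struct

# Assumed endian values mapped to struct byte-order prefixes.
# "=" means native byte order with standard sizes and no padding.
PREFIXES = {"native": "=", "little": "<", "big": ">"}


def fixed_8_8(endian, size):
    # Unsigned 8.8 fixed-point number stored in 16 bits.
    fmt = struct.Struct(PREFIXES[endian] + "H")

    def unpack(buf):
        (raw,) = fmt.unpack(buf.read(fmt.size))
        return raw / 256

    def pack(buf, value):
        buf.write(fmt.pack(round(value * 256)))

    return {
        "type": "encoder",
        "size": fmt.size,
        "alignment": 1,  # assumption made for this sketch
        "unpack": unpack,
        "pack": pack,
        # Including the byte order keeps schemas built for different
        # endianness from comparing equal.
        "schema_equality_info": ("example", "fixed_8_8", endian),
    }


def encode(endian, value):
    buf = io.BytesIO()
    fixed_8_8(endian, "native")["pack"](buf, value)  # size value is a placeholder
    return buf.getvalue()


big = encode("big", 1.5)        # b"\x01\x80"
little = encode("little", 1.5)  # b"\x80\x01"
```

The same factory yields mirrored byte sequences depending only on the endian value it receives, so the field definition itself never has to hard-code a byte order.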