baldaquin.pkt — Binary packets#

This module contains all the facilities to deal with binary packet data. By packet we mean one unit of binary data; a packet can typically be imagined as the elementary unit of information output by the hardware, as seen from the DAQ side, containing one (in the simplest case) or more events.

The module provides the AbstractPacket abstract class as a base for all the packet classes. Subclasses should implement the following interfaces:

  • a data property, returning the underlying binary buffer (typically a bytes object);

  • a fields property, i.e., a tuple of strings with the names of all the fields that have to be extracted from the data when a class instance is unpacked;

  • the __len__() dunder method, returning the size of the data in bytes;

  • the __iter__() dunder method, that makes the class iterable;

  • a pack() method, packing all the fields into the corresponding data;

  • an unpack() method, unpacking the data into the corresponding fields, with the understanding that pack() and unpack() should be guaranteed to roundtrip.

From a DAQ standpoint, the main use of concrete packet classes should be something along the lines of

>>> packet = Packet.unpack(data)

That is: you have a piece of binary data from the hardware, you know the layout of the packet, and you can unpack it in the form of a useful data structure that is easy to work with, plot, write to file, and the like.

Being able to go the other way around (i.e., initialize a packet from its fields) is useful from a testing standpoint, and that is the very reason for providing the pack() interface, which works in this direction.

Warning

We have not put much thought, yet, into support for variable-size packets, and the interfaces might change as we actually implement and use them. At this time the user should feel comfortable using the FixedSizePacketBase base class and the associated packetclass decorator.

In addition, the AbstractPacket class provides placeholders to help redirect packet buffers to a text sink. More specifically:

  • text_header() is meant to return a sensible header for a text output file containing packets;

  • to_text() is meant to provide a sensible text representation of the single packet, appropriate to write the packet to disk.
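As a rough illustration of what these two hooks might produce, here is a sketch with free functions. The names sketch_text_header and sketch_to_text, the extra field_names argument, and the exact header wording are assumptions made for this example, not the module's actual output.

```python
from typing import Optional, Sequence


def sketch_text_header(prefix: str, field_names: Sequence[str],
                       creator: Optional[str] = None) -> str:
    """Build a commented header listing the columns of the text file."""
    lines = []
    if creator is not None:
        lines.append(f"{prefix}Created by {creator}")
    lines.append(f"{prefix}{', '.join(field_names)}")
    return "\n".join(lines) + "\n"


def sketch_to_text(values: Sequence, separator: str) -> str:
    """Format the values of a single packet as one line of text."""
    return separator.join(str(value) for value in values) + "\n"


header = sketch_text_header("# ", ("pin_number", "timestamp"), creator="demo")
row = sketch_to_text((1, 15426782), ",")
```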

Fixed-size packets#

In its simplest incarnation, a packet is just a simple set of numbers packed in binary format with a well-defined layout and a fixed size. This module provides the packetclass decorator and the FixedSizePacketBase base class to define concrete fixed-size packet structures.

The packetclass decorator is loosely inspired by the Python dataclass decorator, and what it does is essentially providing a class constructor based on class annotations. The basic contract is that for any annotation in the form of

>>> field_name: Format

a new attribute with the given field_name is added to the class, with the Format specifying the type of the field in the packet layout, according to the rules of the Python struct module. If the format character is not supported, a ValueError is raised.
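To make the contract concrete, the following sketch shows one way class annotations could be turned into a struct format string. SketchFormat, build_format and TriggerLayout are hypothetical names, and the actual decorator may work differently.

```python
import struct
from enum import Enum


class SketchFormat(Enum):
    """Hypothetical subset of the supported format characters."""

    UNSIGNED_CHAR = "B"
    UNSIGNED_SHORT = "H"
    UNSIGNED_LONG_LONG = "Q"


def build_format(annotations: dict, layout: str = ">") -> str:
    """Join the annotated format characters into a struct format string."""
    chars = []
    for name, fmt in annotations.items():
        if not isinstance(fmt, SketchFormat):
            raise ValueError(f"Unsupported format for field {name!r}: {fmt!r}")
        chars.append(fmt.value)
    return layout + "".join(chars)


class TriggerLayout:
    """Plain class carrying only annotations, for illustration."""

    header: SketchFormat.UNSIGNED_CHAR
    pin_number: SketchFormat.UNSIGNED_CHAR
    timestamp: SketchFormat.UNSIGNED_LONG_LONG


fmt = build_format(TriggerLayout.__annotations__)
size = struct.calcsize(fmt)
```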

Additionally, if a value is provided to the class annotation

>>> field_name: Format = value

the value of the corresponding attribute is checked at runtime, and a FieldMismatchError exception is raised if the two do not match. (This is useful, e.g., when a packet has a fixed header that needs to be checked within the event loop.)
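A minimal sketch of this kind of runtime check is shown here. SketchFieldMismatchError and read_header are hypothetical stand-ins; only the fact that the real exception derives from RuntimeError is taken from the module documentation.

```python
import struct


class SketchFieldMismatchError(RuntimeError):
    """Hypothetical stand-in for the module's FieldMismatchError."""

    def __init__(self, field: str, expected: int, actual: int) -> None:
        super().__init__(
            f"Field {field!r} mismatch: expected {expected:#x}, found {actual:#x}"
        )


def read_header(data: bytes, expected: int = 0xFF) -> int:
    """Unpack the first byte and verify it against the expected value."""
    (header,) = struct.unpack_from(">B", data)
    if header != expected:
        raise SketchFieldMismatchError("header", expected, header)
    return header


good = read_header(b"\xff\x01")
try:
    read_header(b"\x00\x01")
    mismatch_caught = False
except SketchFieldMismatchError:
    mismatch_caught = True
```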

Finally, a layout class attribute can be optionally specified to control the byte order, size and alignment of the packet, according to the Layout enum. If no layout is specified, @ (native order and size) is assumed. If the layout character is not supported a ValueError is raised.
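The effect of the layout character can be checked directly with the standard struct module. The sketch below computes the size of a hypothetical BBQ packet (two unsigned chars and one unsigned long long) for each layout.

```python
import struct

FIELDS = "BBQ"  # two unsigned chars followed by an unsigned long long

# Size in bytes of the same set of fields for each layout character.
sizes = {layout: struct.calcsize(layout + FIELDS) for layout in "@=<>!"}

for layout, size in sizes.items():
    print(f"{layout}{FIELDS}: {size} bytes")
```

With the standard-size layouts (=, <, > and !) the packet is always 10 bytes. With the native @ layout the result is platform dependent, as padding may be inserted to align the 8-byte field; this is worth keeping in mind when the layout attribute is left unspecified.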

The FixedSizePacketBase base class complements the decorator and implements the protocol defined by the AbstractPacket abstract class. For instance, the following snippet

@packetclass
class Trigger(FixedSizePacketBase):

    layout = Layout.BIG_ENDIAN

    header: Format.UNSIGNED_CHAR = 0xff
    pin_number: Format.UNSIGNED_CHAR
    timestamp: Format.UNSIGNED_LONG_LONG

defines a fully fledged packet class with three fields (big endian, standard size), where the header is required to be 0xff (this is automatically checked at runtime) and that can be used as advertised:

>>> packet = Trigger(0xff, 1, 15426782)
>>> print(packet)
Trigger(header=255, pin_number=1, timestamp=15426782,
        data=b'\xff\x01\x00\x00\x00\x00\x00\xebd\xde', _format=>BBQ)
>>> print(len(packet))
10
>>> print(isinstance(packet, AbstractPacket))
True

(You will notice that when you create a packet from the constructor, the binary representation is automatically calculated using the pack() interface.)
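The binary representation shown above can be cross-checked independently with the standard struct module, using the same >BBQ format string that appears in the printed packet.

```python
import struct

# Pack the three field values with the big-endian, standard-size layout.
data = struct.pack(">BBQ", 0xFF, 1, 15426782)

# Unpack them back to verify the roundtrip.
header, pin_number, timestamp = struct.unpack(">BBQ", data)
```

Note that 15426782 is 0xEB64DE, and the byte 0x64 is displayed as the ASCII character d in the bytes literal.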

And, of course, in real life (as opposed to unit-testing) you will almost always find yourself unpacking things, i.e.,

>>> packet = Trigger.unpack(b'\xff\x01\x00\x00\x00\x00\x00\xebd\xde')
>>> print(packet)
Trigger(header=255, pin_number=1, timestamp=15426782,
        data=b'\xff\x01\x00\x00\x00\x00\x00\xebd\xde', _format=>BBQ)

(i.e., you have binary data from your hardware, and you can seamlessly turn it into a useful data structure that you can interact with.)

Packet objects defined in this way are as frozen as Python allows: you can't modify the values of the basic underlying fields once an instance has been created,

>>> packet.pin_number = 0
AttributeError: Cannot modify Trigger.pin_number

and this is done with the goal of preserving the correspondence between the binary payload and the unpacked field values at runtime.
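One common way to obtain this behavior in plain Python is to override __setattr__(), as in the following sketch. FrozenFields is a hypothetical class, and the actual mechanism used by the module may differ.

```python
class FrozenFields:
    """Hypothetical class whose declared fields cannot be reassigned."""

    _fields = ("pin_number",)

    def __init__(self, pin_number: int) -> None:
        # Bypass our own __setattr__ while initializing the fields.
        object.__setattr__(self, "pin_number", pin_number)

    def __setattr__(self, name: str, value) -> None:
        if name in self._fields:
            raise AttributeError(f"Cannot modify {type(self).__name__}.{name}")
        object.__setattr__(self, name, value)


packet = FrozenFields(1)
try:
    packet.pin_number = 0
    frozen = False
except AttributeError:
    frozen = True

# Attributes that are not packet fields can still be added.
packet.seconds = 1.5
```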

You can define new fields, though, and the AbstractPacket protocol, just like plain Python dataclasses, provides a __post_init__() hook which is called at the end of the constructor (and does nothing by default). This is useful, e.g., for converting digitized values into the corresponding physical values. Say, for instance, that the timestamp in our simple Trigger class is the number of microseconds since the last reset, latched with an onboard counter, and we want to convert it to seconds. This can be achieved by something along the lines of

@packetclass
class Trigger(FixedSizePacketBase):

    layout = Layout.BIG_ENDIAN

    header: Format.UNSIGNED_CHAR = 0xff
    pin_number: Format.UNSIGNED_CHAR
    microseconds: Format.UNSIGNED_LONG_LONG

    def __post_init__(self):
        self.seconds = self.microseconds / 1000000

with the understanding that

>>> packet = Trigger(0xff, 1, 15426782)
>>> print(packet.seconds)
15.426782

Note that the FixedSizePacketBase base class provides a sensible implementation of the text_header() and to_text() hooks, although in practical situations one is often better off re-implementing them for the specific application at hand.

Reading packet files#

In order to ease the packet I/O, the module provides a PacketFile class to interface with binary files containing packets. The open() method supports the context manager protocol, and the class itself supports the iterator protocol. The basic use semantics is

>>> with PacketFile(PacketClass).open(file_path) as input_file:
...     for packet in input_file:
...         print(packet)

For applications where the post-processing requires all the packets in the file to be in memory at once (e.g., when it is necessary to combine adjacent packets into more complex, high-level quantities), the read_all() method is provided. (It goes without saying that this comes with all the caveats of putting a potentially large amount of information in memory.)
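The idea behind iterating over a file of fixed-size packets can be sketched with the standard library alone. The function iter_packets below is hypothetical and operates on an in-memory stream rather than on a PacketFile object; how the real class handles a truncated trailing packet is not covered here.

```python
import io
import struct

FORMAT = ">BBQ"
SIZE = struct.calcsize(FORMAT)


def iter_packets(stream):
    """Yield one tuple of field values per fixed-size chunk of the stream."""
    while True:
        chunk = stream.read(SIZE)
        if len(chunk) < SIZE:
            # End of stream; in this sketch a trailing partial packet
            # is silently dropped.
            return
        yield struct.unpack(FORMAT, chunk)


# Build a small in-memory "file" containing two packets.
payload = b"".join(
    struct.pack(FORMAT, 0xFF, pin, timestamp)
    for pin, timestamp in [(1, 100), (2, 250)]
)

# Lazy iteration, one packet at a time...
first = next(iter_packets(io.BytesIO(payload)))

# ...versus reading everything in memory at once.
packets = tuple(iter_packets(io.BytesIO(payload)))
```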

Module documentation#

Binary data packet utilities.

class baldaquin.pkt.Format(value)[source]#

Enum class encapsulating the supported format characters from https://docs.python.org/3/library/struct.html#format-characters

PAD_BTYE = 'x'#
CHAR = 'c'#
SIGNED_CHAR = 'b'#
UNSIGNED_CHAR = 'B'#
BOOL = '?'#
SHORT = 'h'#
UNSIGNED_SHORT = 'H'#
INT = 'i'#
UNSIGNED_INT = 'I'#
LONG = 'l'#
UNSIGNED_LONG = 'L'#
LONG_LONG = 'q'#
UNSIGNED_LONG_LONG = 'Q'#
SSIZE_T = 'n'#
SIZE_T = 'N'#
FLOAT = 'f'#
DOUBLE = 'd'#
class baldaquin.pkt.Layout(value)[source]#

Enum class encapsulating the supported layout characters from https://docs.python.org/3/library/struct.html#byte-order-size-and-alignment

NATIVE_SIZE = '@'#
NATIVE = '='#
LITTLE_ENDIAN = '<'#
BIG_ENDIAN = '>'#
NETWORK = '!'#
DEFAULT = '@'#
class baldaquin.pkt.Edge(value)[source]#

Small Enum class encapsulating the edge type of a transition on a digital line.

RISING = 1#
FALLING = 0#
class baldaquin.pkt.AbstractPacket[source]#

Abstract base class for binary packets.

abstract property data: bytes#

Return the packet binary data.

abstract property fields: tuple#

Return the packet fields.

abstract pack() bytes[source]#

Pack the field values into the corresponding binary data.

abstract classmethod unpack(data: bytes)[source]#

Unpack the binary data into the corresponding field values.

_format_attributes(attrs: tuple[str], fmts: tuple[str] | None = None) tuple[str][source]#

Helper function to join a given set of class attributes in a properly formatted string.

This is used, most notably, in the _repr() hook below, which in turn is used in the various __repr__() and/or __str__() implementations, and in the to_text() implementations in subclasses.

Parameters:
  • attrs (tuple) – The names of the class attributes we want to include in the representation.

  • fmts (tuple, optional) – If present determines the formatting of the given attributes.

_text(attrs: tuple[str], fmts: tuple[str], separator: str) str[source]#

Helper function for text formatting.

Note the output includes a trailing endline.

Parameters:
  • attrs (tuple) – The names of the class attributes we want to include in the representation.

  • fmts (tuple) – Determines the formatting of the given attributes.

  • separator (str) – The separator between different fields.

_repr(attrs: tuple[str], fmts: tuple[str] | None = None) str[source]#

Helper function to provide sensible string formatting for the packets.

The basic idea is that concrete classes would use this to implement their __repr__() and/or __str__() special dunder methods.

Parameters:
  • attrs (tuple) – The names of the class attributes we want to include in the representation.

  • fmts (tuple, optional) – If present determines the formatting of the given attributes.

static text_header(prefix: str, creator: str | None = None) str[source]#

Hook that subclasses can overload to provide a sensible header for an output text file.

Parameters:
  • prefix (str) – The prefix to be prepended to each line to signal that that line is a comment and contains no data.

  • creator (str, optional) – An optional string indicating the application that created the file.

to_text(separator: str) str[source]#

Hook that subclasses can overload to provide a text representation of the buffer to be written in an output text file.

_abc_impl = <_abc._abc_data object>#
exception baldaquin.pkt.FieldMismatchError(cls: type, field: str, expected: int, actual: int)[source]#

RuntimeError subclass to signal a field mismatch in a data structure.

baldaquin.pkt._class_annotations(cls) dict[source]#

Small convenience function to retrieve the class annotations.

Note that, in order to support inheritance of @packetclasses, we do iterate over all the ancestors of the class at hand, starting from AbstractPacket, and collect all the annotations along the way. The iteration is in reverse order, so that the final order of annotations is what one would expect.

The try/except clause is needed because in Python 3.7 cls.__annotations__ is not defined when a class has no annotations, while in subsequent Python versions an empty dictionary is returned, instead.
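The ancestor walk described above can be sketched as follows. The function collect_annotations and the two example classes are hypothetical; getattr() with a default is used here in place of the try/except clause mentioned in the docstring, and the walk covers the whole method resolution order rather than starting from AbstractPacket.

```python
def collect_annotations(cls: type) -> dict:
    """Collect the annotations of a class and all of its ancestors."""
    annotations = {}
    # Walk the method resolution order backwards, so that the fields of
    # the base classes come first in the final dictionary.
    for ancestor in reversed(cls.__mro__):
        annotations.update(getattr(ancestor, "__annotations__", {}))
    return annotations


class Base:
    header: int = 0xFF


class Derived(Base):
    pin_number: int
    timestamp: int


names = list(collect_annotations(Derived))
```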

baldaquin.pkt._check_format_characters(cls: type) None[source]#

Check that all the format characters in the class annotations are valid.

baldaquin.pkt._check_layout_character(cls: type) None[source]#

Check that the class layout character is valid.

baldaquin.pkt.packetclass(cls: type) type[source]#

Simple decorator to support automatic generation of fixed-length packet classes.

class baldaquin.pkt.FixedSizePacketBase(*args, data: bytes | None = None)[source]#

Class describing a packet with fixed size.

_fields = ()#
_format = '@'#
size = 0#
_data = None#
property data: bytes#

Return the packet binary data.

property fields: tuple#

Return the packet fields.

pack() bytes[source]#

Pack the field values into the corresponding binary data.

classmethod unpack(data: bytes) AbstractPacket[source]#

Unpack the binary data into the corresponding field values.

classmethod text_header(prefix: str, creator: str | None = None) str[source]#

Overloaded method.

to_text(separator: str) str[source]#

Overloaded method.

_abc_impl = <_abc._abc_data object>#
layout = '@'#
class baldaquin.pkt.PacketFile(packet_class: type)[source]#

Class describing a binary file containing packets.

open(file_path: str)[source]#

Open the file.

read_all() tuple[FixedSizePacketBase][source]#

Read in memory all the packets in the file.

This is meant to support postprocessing applications where one needs all the packets in memory at the same time. Use it cum grano salis.

class baldaquin.pkt.PacketStatistics(packets_processed: int = 0, packets_written: int = 0, bytes_written: int = 0)[source]#

Small container class helping with the event handler bookkeeping.

packets_processed: int = 0#
packets_written: int = 0#
bytes_written: int = 0#
reset() None[source]#

Reset the statistics.

update(packets_processed, packets_written, bytes_written) None[source]#

Update the event statistics.

to_dict() dict[source]#

Serialization.

classmethod from_dict(**kwargs) PacketStatistics[source]#

Deserialization.
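For illustration, a container with the same interface can be sketched as a plain dataclass. SketchStatistics is a hypothetical name, and the assumption that update() accumulates increments (rather than overwriting the counters) is not confirmed by the documentation above.

```python
from dataclasses import asdict, dataclass


@dataclass
class SketchStatistics:
    """Hypothetical container mirroring the PacketStatistics interface."""

    packets_processed: int = 0
    packets_written: int = 0
    bytes_written: int = 0

    def reset(self) -> None:
        """Set all the counters back to zero."""
        self.packets_processed = 0
        self.packets_written = 0
        self.bytes_written = 0

    def update(self, packets_processed, packets_written, bytes_written) -> None:
        """Assumption: the arguments are increments added to the counters."""
        self.packets_processed += packets_processed
        self.packets_written += packets_written
        self.bytes_written += bytes_written

    def to_dict(self) -> dict:
        """Serialize the counters to a dictionary."""
        return asdict(self)

    @classmethod
    def from_dict(cls, **kwargs):
        """Rebuild an instance from keyword arguments."""
        return cls(**kwargs)


stats = SketchStatistics()
stats.update(10, 8, 80)
stats.update(5, 5, 50)
restored = SketchStatistics.from_dict(**stats.to_dict())
```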