mirror of
https://github.com/google/flatbuffers.git
synced 2026-06-02 12:05:50 +00:00
150 lines
5.2 KiB
Markdown
150 lines
5.2 KiB
Markdown
# Annotating FlatBuffers
|
|
|
|
This provides a way to annotate flatbuffer binary data, byte-by-byte, with a
|
|
schema. It is useful for development purposes and understanding the details of
|
|
the internal format.
|
|
|
|
## Annotating
|
|
|
|
Given a `schema`, as either a plain-text (`.fbs`) or a binary schema (`.bfbs`),
|
|
and `binary` file(s) that were created by the `schema`. You can annotate them
|
|
using:
|
|
|
|
```sh
|
|
flatc --annotate SCHEMA -- BINARY_FILES...
|
|
```
|
|
|
|
This will produce a set of annotated files (`.afb` Annotated FlatBuffer)
|
|
corresponding to the input binary files.
|
|
|
|
### Example
|
|
|
|
Taken from the [tests/annotated_binary](https://github.com/google/flatbuffers/tree/master/tests/annotated_binary).
|
|
|
|
```sh
|
|
cd tests/annotated_binary
|
|
../../flatc --annotate annotated_binary.fbs -- annotated_binary.bin
|
|
```
|
|
|
|
Which will produce a `annotated_binary.afb` file in the current directory.
|
|
|
|
The `annotated_binary.bin` is the flatbufer binary of the data contained within
|
|
`annotated_binary.json`, which was made by the following command:
|
|
|
|
```sh
|
|
..\..\flatc -b annotated_binary.fbs annotated_binary.json
|
|
```
|
|
|
|
## .afb Text Format
|
|
|
|
Currently there is a built-in text-based format for outputting the annotations.
|
|
A full example is shown here:
|
|
[`annotated_binary.afb`](https://github.com/google/flatbuffers/blob/master/tests/annotated_binary/annotated_binary.afb)
|
|
|
|
The data is organized as a table with fixed [columns](#columns) grouped into
|
|
Binary [sections](#binary-sections) and [regions](#binary-regions), starting
|
|
from the beginning of the binary (offset `0`).
|
|
|
|
### Columns
|
|
|
|
The columns are as follows:
|
|
|
|
1. The offset from the start of the binary, expressed in hexadecimal format
|
|
(e.g. `+0x003c`).
|
|
|
|
The prefix `+` is added to make searching for the offset (compared to some
|
|
random value) a bit easier.
|
|
|
|
2. The raw binary data, expressed in hexadecimal format.
|
|
|
|
This is in the little endian format the buffer uses internally and what you
|
|
would see with a normal binary text viewer.
|
|
|
|
3. The type of the data.
|
|
|
|
This may be the type specified in the schema or some internally defined
|
|
types:
|
|
|
|
|
|
| Internal Type | Purpose |
|
|
|---------------|----------------------------------------------------|
|
|
| `VOffset16` | Virtual table offset, relative to the table offset |
|
|
| `UOffset32` | Unsigned offset, relative to the current offset |
|
|
| `SOffset32` | Signed offset, relative to the current offset |
|
|
|
|
|
|
4. The value of the data.
|
|
|
|
This is shown in big endian format that is generally written for humans to
|
|
consume (e.g. `0x0013`). As well as the "casted" value (e.g. `0x0013 `is
|
|
`19` in decimal) in parentheses.
|
|
|
|
5. Notes about the particular data.
|
|
|
|
This describes what the data is about, either some internal usage, or tied
|
|
to the schema.
|
|
|
|
### Binary Sections
|
|
|
|
The file is broken up into Binary Sections, which are comprised of contiguous
|
|
[binary regions](#binary-regions) that are logically grouped together. For
|
|
example, a binary section may be a single instance of a flatbuffer `Table` or
|
|
its `vtable`. The sections may be labelled with the name of the associated type,
|
|
as defined in the input schema.
|
|
|
|
An example of a `vtable` Binary Section that is associated with the user-defined
|
|
`AnnotateBinary.Bar` table.
|
|
|
|
```
|
|
vtable (AnnotatedBinary.Bar):
|
|
+0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable
|
|
+0x00A2 | 13 00 | uint16_t | 0x0013 (19) | size of referring table
|
|
+0x00A4 | 08 00 | VOffset16 | 0x0008 (8) | offset to field `a` (id: 0)
|
|
+0x00A6 | 04 00 | VOffset16 | 0x0004 (4) | offset to field `b` (id: 1)
|
|
```
|
|
|
|
These are purely annotative, there is no embedded information about these
|
|
regions in the flatbuffer itself.
|
|
|
|
### Binary Regions
|
|
|
|
Binary regions are contiguous bytes regions that are grouped together to form
|
|
some sort of value, e.g. a `scalar` or an array of scalars. A binary region may
|
|
be split up over multiple text lines, if the size of the region is large.
|
|
|
|
#### Annotation Example
|
|
|
|
Looking at an example binary region:
|
|
|
|
```
|
|
vtable (AnnotatedBinary.Bar):
|
|
+0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable
|
|
```
|
|
|
|
The first column (`+0x00A0`) is the offset to this region from the beginning of
|
|
the buffer.
|
|
|
|
The second column are the raw bytes (hexadecimal) that make up this region.
|
|
These are expressed in the little-endian format that flatbuffers uses for the
|
|
wire format.
|
|
|
|
The third column is the type to interpret the bytes as. For the above example,
|
|
the type is `uint16_t` which is a 16-bit unsigned integer type.
|
|
|
|
The fourth column shows the raw bytes as a compacted, big-endian value. The raw
|
|
bytes are duplicated in this fashion since it is more intuitive to read the data
|
|
in the big-endian format (e.g., `0x0008`). This value is followed by the decimal
|
|
representation of the value (e.g., `(8)`). For strings, the raw string value is
|
|
shown instead.
|
|
|
|
The fifth column is a textual comment on what the value is. As much metadata as
|
|
known is provided.
|
|
|
|
### Offsets
|
|
|
|
If the type in the 3rd column is of an absolute offset (`SOffet32` or
|
|
`Offset32`), the fourth column also shows an `Loc: +0x025A` value which shows
|
|
where in the binary this region is pointing to. These values are absolute from
|
|
the beginning of the file, their calculation from the raw value in the 4th
|
|
column depends on the context.
|