Added functionality to assign field ids manually in a schema

New attribute:

-   `id: n` (on a table field): manually set the field identifier to `n`.
    If you use this attribute, you must use it on ALL fields of this table,
    and the numbers must be a contiguous range from 0 onwards.
    Additionally, since a union type effectively adds two fields, its
    id must be that of the second field (the first field is the type
    field and not explicitly declared in the schema).
    For example, if the last field before the union field had id 6,
    the union field should have id 8, and the unions type field will
    implicitly be 7.
    IDs allow the fields to be placed in any order in the schema.
    When a new field is added to the schema is must use the next available ID.

Change-Id: I8690f105f3a2d31fdcb75a4fab4130692b12c62f
Tested: on Windows
This commit is contained in:
Wouter van Oortmerssen
2014-07-07 17:34:23 -07:00
parent a5f50019bc
commit 9140144d51
8 changed files with 96 additions and 24 deletions

View File

@@ -64,9 +64,9 @@ auto inventory = fbb.CreateVector(inv, 10);
<p><code>CreateString</code> can also take an <code>std::string</code>, or a <code>const char *</code> with an explicit length, and is suitable for holding UTF-8 and binary data if needed.</p>
<p><code>CreateVector</code> can also take an <code>std::vector</code>. The offset it returns is typed, i.e. can only be used to set fields of the correct type below. To create a vector of struct objects (which will be stored as contiguous memory in the buffer, use <code>CreateVectorOfStructs</code> instead. </p><pre class="fragment">Vec3 vec(1, 2, 3);
</pre><p><code>Vec3</code> is the first example of code from our generated header. Structs (unlike tables) translate to simple structs in C++, so we can construct them in a familiar way.</p>
<p>We have now serialized the non-scalar components of of the monster example, so we could create the monster something like this: </p><pre class="fragment">auto mloc = CreateMonster(fbb, &amp;vec, 150, 80, name, inventory, Color_Red, Offset&lt;void&gt;(0), Any_NONE);
<p>We have now serialized the non-scalar components of of the monster example, so we could create the monster something like this: </p><pre class="fragment">auto mloc = CreateMonster(fbb, &amp;vec, 150, 80, name, inventory, Color_Red, 0, Any_NONE);
</pre><p>Note that we're passing <code>150</code> for the <code>mana</code> field, which happens to be the default value: this means the field will not actually be written to the buffer, since we'll get that value anyway when we query it. This is a nice space savings, since it is very common for fields to be at their default. It means we also don't need to be scared to add fields only used in a minority of cases, since they won't bloat up the buffer sizes if they're not actually used.</p>
<p>We do something similarly for the union field <code>test</code> by specifying a <code>0</code> offset and the <code>NONE</code> enum value (part of every union) to indicate we don't actually want to write this field.</p>
<p>We do something similarly for the union field <code>test</code> by specifying a <code>0</code> offset and the <code>NONE</code> enum value (part of every union) to indicate we don't actually want to write this field. You can use <code>0</code> also as a default for other non-scalar types, such as strings, vectors and tables.</p>
<p>Tables (like <code>Monster</code>) give you full flexibility on what fields you write (unlike <code>Vec3</code>, which always has all fields set because it is a <code>struct</code>). If you want even more control over this (i.e. skip fields even when they are not default), instead of the convenient <code>CreateMonster</code> call we can also build the object field-by-field manually: </p><pre class="fragment">MonsterBuilder mb(fbb);
mb.add_pos(&amp;vec);
mb.add_hp(80);
@@ -95,6 +95,15 @@ assert(inv-&gt;Get(9) == 9);
<p>For structs, layout is deterministic and guaranteed to be the same accross platforms (scalars are aligned to their own size, and structs themselves to their largest member), and you are allowed to access this memory directly by using <code>sizeof()</code> and <code>memcpy</code> on the pointer to a struct, or even an array of structs.</p>
<p>To compute offsets to sub-elements of a struct, make sure they are a structs themselves, as then you can use the pointers to figure out the offset without having to hardcode it. This is handy for use of arrays of structs with calls like <code>glVertexAttribPointer</code> in OpenGL or similar APIs.</p>
<p>It is important to note is that structs are still little endian on all machines, so only use tricks like this if you can guarantee you're not shipping on a big endian machine (an <code>assert(FLATBUFFERS_LITTLEENDIAN)</code> would be wise).</p>
<h3>Access of untrusted buffers</h3>
<p>The generated accessor functions access fields over offsets, which is very quick. These offsets are not verified at run-time, so a malformed buffer could cause a program to crash by accessing random memory.</p>
<p>When you're processing large amounts of data from a source you know (e.g. your own generated data on disk), this is acceptable, but when reading data from the network that can potentially have been modified by an attacker, this is undesirable.</p>
<p>For this reason, you can optionally use a buffer verifier before you access the data. This verifier will check all offsets, all sizes of fields, and null termination of strings to ensure that when a buffer is accessed, all reads will end up inside the buffer.</p>
<p>Each root type will have a verification function generated for it, e.g. for <code>Monster</code>, you can call:</p>
<p>bool ok = VerifyMonsterBuffer(Verifier(buf, len));</p>
<p>if <code>ok</code> is true, the buffer is safe to read.</p>
<p>Besides untrusted data, this function may be useful to call in debug mode, as extra insurance against data being corrupted somewhere along the way.</p>
<p>While verifying a buffer isn't "free", it is typically faster than a full traversal (since any scalar data is not actually touched), and since it may cause the buffer to be brought into cache before reading, the actual overhead may be even lower than expected.</p>
<h2>Text &amp; schema parsing</h2>
<p>Using binary buffers with the generated header provides a super low overhead use of FlatBuffer data. There are, however, times when you want to use text formats, for example because it interacts better with source control, or you want to give your users easy access to data.</p>
<p>Another reason might be that you already have a lot of data in JSON format, or a tool that generates JSON, and if you can write a schema for it, this will provide you an easy way to use that data directly.</p>