This function avoids the inefficiency of populating a [ubyte] vector
byte by byte. Since ubyte vectors are probably the most commonly used
type of generic byte buffer, a fast path for them seems worthwhile.
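
For illustration only, a minimal sketch of the idea, not the code in
this change. TinyBuilder and its method names are hypothetical, and the
sketch assumes a buffer that is filled back to front. It leaves out the
length prefix and alignment that a real vector needs.

```python
class TinyBuilder:
    """Hypothetical back-to-front byte buffer, not the real builder."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.head = size  # the next write lands just below this offset

    def prepend_byte(self, b):
        self.head -= 1
        self.buf[self.head] = b

    def create_byte_vector_slow(self, data):
        # Per-byte path: one call per element, issued in reverse so the
        # bytes end up in their original order.
        for b in reversed(data):
            self.prepend_byte(b)
        return self.head

    def create_byte_vector_fast(self, data):
        # Fast path: reserve the space once, then copy in bulk.
        self.head -= len(data)
        self.buf[self.head:self.head + len(data)] = data
        return self.head
```

Both paths leave the same bytes in the buffer. The fast path replaces
one call per byte with a single slice copy.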
Benchmarks show a 6x improvement in throughput on x64.
A new test verifies the function's behavior.
Change-Id: I82e0228ae0f815dd7ea89bf168b8c1925f3ce0d7