Benchmarks indicate that malloc accounts for a significant amount of the
runtime of queries. The message buffer accounts for ~half of that (the
other being channels), and this change should eliminate it.
copy_in data was previously copied ~3 times - once into the copy_in
buffer, once more to frame it into a CopyData frame, and once to write
that into the stream.
Our Codec is now a bit more interesting. Rather than just writing out
pre-encoded data, we can also send along unencoded CopyData so they can
be framed directly into the stream output buffer. In the future we can
extend this to e.g. avoid allocating for simple commands like Sync.
This also allows us to directly pass large copy_in input directly
through without rebuffering it.