AngryAnt - Unity Protocol Buffers

Time for another “ah, crap – better put something up here before heading to GDC”-post. And since I did some re-tinkering with the integration of protobuf in Behave late last year, some sharing on that topic seemed in order.

So yes, stuff is happening with Behave – no abandonment has taken place. On the contrary, I have been working on several long-running feature branches. I’ll get around to merging some of those in and doing another release. More on that soonish.

Anywho, last I checked, protobuf-net was still a more complete solution than the younger protobuf for C# project, so I am sticking with protobuf-net and can only talk about that here. However, it looks like the latter is seeing more rapid development, so I’ll probably review my choice the next time I need to tinker with the integration.

More interestingly, some of the initial people behind protobuf have since gone and built the new and shiny Cap’n Proto. This looks to be even more powerful than its predecessor, but at the time of writing it is still not as mature as protobuf, nor does it have as many implementations. Critically for the context of this post, there is no C# implementation yet. Cool stuff, though – worth keeping an eye on.

Overview

So what is the Protocol Buffers project? Succinctly, it is a compact binary serialisation format with very fast cross-platform and -language implementations. Super useful when you need to quickly move some data between memory and some other location – a file / a network peer / whatever.
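To give a feel for what "compact" means here, below is a minimal sketch of how the protobuf wire format encodes a single integer field. It is a hand-rolled illustration rather than anything protobuf-net exposes; `WireSketch` and its method names are made up for this example.

```csharp
using System.Collections.Generic;

public static class WireSketch
{
    // Wire type 0 marks a varint-encoded value.
    private const ulong WireTypeVarint = 0;

    // Base-128 varint: 7 payload bits per byte, with the high bit set
    // on every byte except the last one.
    public static void WriteVarint(List<byte> output, ulong value)
    {
        while (value >= 0x80)
        {
            output.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        output.Add((byte)value);
    }

    // One integer field: a key of (field number << 3) | wire type,
    // followed by the value, both written as varints.
    public static byte[] EncodeIntField(uint fieldNumber, ulong value)
    {
        var output = new List<byte>();
        WriteVarint(output, ((ulong)fieldNumber << 3) | WireTypeVarint);
        WriteVarint(output, value);
        return output.ToArray();
    }
}
```

With this sketch, field number 1 holding the value 150 comes out as the three bytes 08 96 01. Field names never hit the wire, which is a large part of why the format stays small.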

It was first designed, implemented, and maintained by Google for communications between network peers on their internal network. Since then, it has been implemented in a number of different languages – including the protobuf-net implementation, built and maintained by Marc Gravell.

Serialisation use in Behave has gone through a couple of distinct phases:

.net binary serialiser for asset.
protobuf-net for asset.
protobuf-net for asset and remote debugger protocol.
JSON for asset, protobuf-net for remote debugger protocol.

Additionally I’ve used protobuf-net for runtime saving or build pipelines on various client projects.

Aside from speed and compression, moving from the .net binary serialiser to protobuf-net was also an escape from the frustrating lack of support for versioning or even simple structural refactors of serialised types. The switch to JSON for the .behave asset files of course came when I decided that merge-ability could be a fun thing to support.

On top of this general migration, I also went through a couple of different protobuf-net integrations – as my use cases and requirements changed. It just so happens that this gave me full coverage of the three approaches supported by protobuf-net, so no need for extra research before writing this post. Win!

Integration

For serialisation to work, you need a serialiser and a schema. Protobuf-net offers a couple of ways for you to provide those:

[Attribute] markup with runtime generated serialiser.
[Attribute] markup with pre-generated serialiser.
.proto file description with pre-generated serialiser.

I ordered this list by simplicity, which also happens to be the order in which I went through them in the integration with Behave.

The first option is very .net-esque, and protobuf-net is indeed compatible both with its own serialisation attributes and with the more general-purpose .net serialisation attributes. However, since your interest here is in a Unity context, I’m sure that you have already spotted the problem with this approach.

Given that my initial use for protobuf-net was just for asset serialisation (which is editor-time only), I had no problem with the serialisation solution relying on JIT compilation in order to construct the serialiser at runtime. However as soon as I expanded my use case to include the runtime debugger, relying on JIT would mean not supporting the debugger on AOT platforms like iOS and consoles. Further, as Unity continues transitioning to their IL2CPP solution, you’re looking at a future where most will want to do AOT on all platforms.
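To make the first option concrete, here is a minimal sketch of [Attribute] markup with the runtime-generated serialiser. The `NodeUpdate` type and its fields are hypothetical and not taken from Behave, and the code assumes the protobuf-net assembly is referenced. The static `Serializer` class builds the serialiser for a type the first time it is needed, which is the step that wants JIT.

```csharp
using System.IO;
using ProtoBuf; // protobuf-net

// Hypothetical message type; the numbers are the protobuf field tags.
[ProtoContract]
public class NodeUpdate
{
    [ProtoMember(1)] public int NodeId { get; set; }
    [ProtoMember(2)] public string Status { get; set; }
}

public static class Example
{
    // Serialises the message to memory and reads it back again.
    public static NodeUpdate RoundTrip(NodeUpdate message)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, message);
            stream.Position = 0;
            return Serializer.Deserialize<NodeUpdate>(stream);
        }
    }
}
```

This is fine for editor-only code. On an AOT platform there is no runtime code generation to lean on, so the same call is the part that would need replacing.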

So when introducing the remote debugger (the previous debugger implementation was 100% in-editor), I started to pre-generate the serialiser. This entails feeding your compiled assembly with [Attribute] markup to the protobuf-net precompile tool, which in turn generates an assembly with the serialiser type. For an example of how I used to do that, here’s a snippet of perl.
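On the calling side, a pre-generated serialiser is used through a `TypeModel` instance rather than the static `Serializer` class. The sketch below assumes protobuf-net 2.x and takes any `TypeModel`, so it compiles against protobuf-net alone. In a real project you would pass an instance of the serialiser type from the precompiled assembly, under whatever name that type was given. `Ping` is a hypothetical message type.

```csharp
using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical message type for illustration.
[ProtoContract]
public class Ping
{
    [ProtoMember(1)] public int Sequence { get; set; }
}

public static class ModelExample
{
    // Works with any TypeModel: the runtime model in the editor,
    // or a pre-generated one on AOT platforms.
    public static T RoundTrip<T>(TypeModel model, T message)
    {
        using (var stream = new MemoryStream())
        {
            model.Serialize(stream, message);
            stream.Position = 0;
            // Passing null asks the model to create a new instance.
            return (T)model.Deserialize(stream, null, typeof(T));
        }
    }
}
```

For a quick check in the editor, `ModelExample.RoundTrip(RuntimeTypeModel.Default, new Ping { Sequence = 7 })` exercises the same path with the runtime model. Swapping in the pre-generated model is then a one-argument change.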

Extraction

Everything works! Win, right? Well… as I was expanding debugger support to non-C# targets and doing a general code cleanup, I got increasingly annoyed by having serialisation implementation detail in my general data type code. So I started looking at how protobuf-net supports the standard protobuf .proto format for schema definition.

As things stand, it takes a bit of work – specifically this work – to get going, but once there it is solid. Instead of using the precompile tool, you need to build and run the protogen tool from the protobuf-net repository. If you’re on Windows, things may just work straight out of the gate, but you may want to see what is in my patch anyway.

So how does this work? Well, the input is no longer [Attribute]-marked assemblies, but instead a .proto definition file. You can find great detail on that in the general protobuf documentation, but you may want to consult the protobuf-net docs for implementation specifics/limits. Also, the output is not an assembly, but simply a C# file.

Keep in mind that protogen relies on the protoc tool from the general Google protobuf tools. I fetched this from homebrew as the “protobuf” package.

Again I have a snippet of perl to illustrate use – as well as an example .proto file. In this case the Behave debugger protocol definition.
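For readers who have not seen the format, here is a small illustration of how a .proto message maps to C#. The schema in the comment is a hypothetical proto2-style message, not part of the Behave protocol. The class is a hand-written approximation of the kind of code protogen emits; real output is more verbose and varies between protobuf-net versions. The thing to notice is that the type is `partial` and carries the same attributes as the markup approach.

```csharp
// Hypothetical schema (node_update.proto):
//
//   message NodeUpdate {
//     optional int32 node_id = 1;
//     optional string status = 2;
//   }

using ProtoBuf;

// Approximation of generated code, written by hand for illustration.
[ProtoContract(Name = "NodeUpdate")]
public partial class NodeUpdate
{
    private int _node_id;
    [ProtoMember(1, Name = "node_id")]
    public int node_id
    {
        get { return _node_id; }
        set { _node_id = value; }
    }

    private string _status = "";
    [ProtoMember(2, Name = "status")]
    public string status
    {
        get { return _status; }
        set { _status = value; }
    }
}
```

Since the schema now lives in the .proto file, other language targets can generate their own types from the same definition.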

Using this approach, I now have a nicely separated codebase and no duplication of schema definition between the C# debugger runtime and others. One trick remains though…

Object pooling

A mildly active Behave debugger session involves a lot of messages. I have very little interest in blowing up the garbage collector by constantly instantiating new messages and leaving them to get collected. So how do we integrate an object pool setup with the generated protobuf-net serialiser?

While not exactly as pretty as I would like, my answer is to make use of the partialness of the generated types: each gets a hand-written partial declaration that adds a static pool and a static constructor to hook it up.
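A minimal sketch of that idea, under some assumptions: the first declaration stands in for a generated type, with its serialisation attributes left out so the example compiles without protobuf-net, and `NodeUpdate`, `Acquire` and `Release` are hypothetical names. How the pool gets registered with the generated serialiser is not shown in this post, so the static constructor here only sets up the pool and marks where such a hook would go.

```csharp
using System.Collections.Generic;

// Stand-in for the generated file (attributes omitted for brevity).
public partial class NodeUpdate
{
    public int NodeId { get; set; }
    public string Status { get; set; }
}

// Hand-written file: adds pooling without touching the generated code.
public partial class NodeUpdate
{
    private static readonly Stack<NodeUpdate> s_Pool;

    static NodeUpdate()
    {
        s_Pool = new Stack<NodeUpdate>();
        // A real integration would also tell the serialiser to obtain
        // instances via Acquire here (assumption: hook not shown).
    }

    // Reuses a pooled instance when one is available.
    public static NodeUpdate Acquire()
    {
        lock (s_Pool)
        {
            return s_Pool.Count > 0 ? s_Pool.Pop() : new NodeUpdate();
        }
    }

    // Clears the message and returns it to the pool.
    public void Release()
    {
        NodeId = 0;
        Status = null;
        lock (s_Pool)
        {
            s_Pool.Push(this);
        }
    }
}
```

The lock is there on the assumption that messages may be received and released on different threads; drop it if everything runs on one thread. Callers must not hold on to a message after calling `Release`, since the same instance will be handed out again.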

And that’s pretty much what I’ve got. If I missed something or you have related questions, feel free to ping me and I’ll try to update this post when I can.