I was reading The Book of JOSH and saw the following statement:
“Json delivers on what XML promised. Simple to understand, effective data markup accessible and usable by human and computer alike. Serialization/Deserialization is on par with or faster then XML, Thrift and Protocol Buffers.”
That seemed a bit too definite for my taste. There are so many variables that can affect the results that I was interested in more information, so I asked for it and eventually got an answer.
I had a brief look at the benchmark referenced and that was enough to come up with some talking points. To make it easier to follow, I will just compare protocol buffers and json (jackson). I started by running the benchmark in my machine (java 1.6.0_14-ea-b03):
| | Object create | Serialization | Deserialization | Serialized size (bytes) |
|---|---|---|---|---|
| protobuf | 312.95730 | 3052.26500 | 2340.84600 | 217 |
| json | 182.64535 | 2284.88300 | 3362.31850 | 310 |
Ok, so json doesn’t seem to be faster on deserialization and the size is almost 50% bigger (a big deal if the network is the bottleneck as is often the case). Why is serialization of protobuf so slow though? Let’s see the code:
```java
public byte[] serialize(MediaContent content, ByteArrayOutputStream baos) throws IOException {
    content.writeTo(baos);
    return baos.toByteArray();
}
```
How about we replace that with content.toByteArray()?
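To make the difference concrete, here is a sketch of the two serialization paths side by side. `FakeMessage` below is a hypothetical stand-in for a protobuf-generated class, not the benchmark's actual `MediaContent`; the real generated classes expose the same `writeTo(OutputStream)` and `toByteArray()` methods.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for a protobuf-generated message class.
class FakeMessage {
    private final byte[] payload = "example".getBytes();

    // Streams the encoded bytes into the supplied OutputStream.
    void writeTo(OutputStream out) throws IOException {
        out.write(payload);
    }

    // Allocates and fills a byte[] directly, with no intermediate stream.
    byte[] toByteArray() {
        return payload.clone();
    }
}

public class SerializeSketch {
    // The benchmark's original shape: stream into a ByteArrayOutputStream,
    // then copy its internal buffer out again -- two copies of the data.
    static byte[] serializeViaStream(FakeMessage m) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        m.writeTo(baos);
        return baos.toByteArray();
    }

    // The proposed replacement: one call, one copy.
    static byte[] serializeDirect(FakeMessage m) {
        return m.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        FakeMessage m = new FakeMessage();
        // Both paths produce identical bytes; the direct path just
        // skips the intermediate buffer and its extra copy.
        System.out.println(java.util.Arrays.equals(
                serializeViaStream(m), serializeDirect(m)));
    }
}
```

The bytes are the same either way; only the amount of copying changes, which is what shows up in the serialization column below.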
| | Object create | Serialization | Deserialization | Serialized size (bytes) |
|---|---|---|---|---|
| protobuf | 298.89330 | 2087.79800 | 2339.44450 | 217 |
| json (jackson) | 174.49190 | 2482.53350 | 3599.90800 | 310 |
That’s more like it. Let’s try something a bit more exotic just for fun and add `-XX:+DoEscapeAnalysis`:
| | Object create | Serialization | Deserialization | Serialized size (bytes) |
|---|---|---|---|---|
| protobuf | 260.51330 | 1925.32300 | 2302.74250 | 217 |
| json (jackson) | 176.20370 | 2385.99750 | 3647.01700 | 310 |
That reduces some of the cost of object creation for protobuf, but it’s still substantially slower than json. This is not hard to believe given the builder pattern employed by the Java classes that protocol buffers generates, but I haven’t investigated it in more detail. In any case, protocol buffers is better in 3 of the 4 measures for this particular benchmark.
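For readers unfamiliar with the pattern, here is a minimal sketch of the builder style that protobuf-generated Java classes use. `Media` and its nested `Builder` are simplified stand-ins of my own, not the generated benchmark classes, but they show why construction costs more than filling a plain mutable POJO:

```java
// Minimal sketch of the builder pattern used by protobuf-generated
// Java classes; Media and Builder are simplified stand-ins.
public class BuilderSketch {
    static final class Media {
        final String uri;
        final int duration;

        private Media(Builder b) {
            this.uri = b.uri;
            this.duration = b.duration;
        }

        static final class Builder {
            private String uri;
            private int duration;

            // Each setter returns the builder so calls can be chained.
            Builder setUri(String uri) { this.uri = uri; return this; }
            Builder setDuration(int d) { this.duration = d; return this; }

            // build() allocates the immutable message on top of the
            // builder itself -- two objects where a plain POJO needs one.
            Media build() { return new Media(this); }
        }
    }

    public static void main(String[] args) {
        Media m = new Media.Builder()
                .setUri("http://example.com/video")
                .setDuration(18)
                .build();
        System.out.println(m.uri + " " + m.duration);
    }
}
```

The extra allocation per message is a plausible contributor to the object-creation gap in the tables above, though, as I said, I haven’t profiled it.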
What does this mean? Not a lot. As usual, where performance is important, you should create benchmarks that mirror your application and environment. I just couldn’t let the blanket “json is on par with or faster than…” statement pass without a bit of scrutiny. ;)