Recently I’ve been working on some extensions to ASEXOR, adding direct support for messaging via WebSocket. I use JSON for the small messages that travel between the client (browser or standalone) and the backend. The messages look like this:
messages = [
    {'call_id': 1, 'kwargs': {}, 'args': ['sleep', 0.1]},
    {'call_id': 1, 't': 'r', 'returned': 'd53b2823d35b471282ab5c8b6c2e4685'},
    {'call_id': 2, 'kwargs': {'utc': True}, 'args': ['date', '%d-%m-%Y %H:%M %Z']},
    {'call_id': 2, 't': 'r', 'returned': '77da239342e240a0a3078d50019a20a0'},
    {'call_id': 1, 'data': {'status': 'started', 'task_id': 'd53b2823d35b471282ab5c8b6c2e4685'}, 't': 'm'},
    {'call_id': 2, 'data': {'status': 'started', 'task_id': '77da239342e240a0a3078d50019a20a0'}, 't': 'm'},
    {'call_id': 1, 'data': {'status': 'success', 'task_id': 'd53b2823d35b471282ab5c8b6c2e4685', 'result': None, 'duration': 0.12562298774719238}, 't': 'm'},
    {'call_id': 2, 'data': {'status': 'success', 'task_id': '77da239342e240a0a3078d50019a20a0', 'result': '27-02-2017 11:46 UTC', 'duration': 0.04673957824707031}, 't': 'm'},
]
I wondered whether choosing a different serialization format (similar to JSON, but binary) could make the application more efficient, considering both message size and encoding/decoding time. I ran small tests in Python (see tests here on gist) with a few established serializers that can be used as quick replacements for JSON. The results are below:
Format | Total size of all messages (bytes) | Processing time (10,000 × encoding/decoding of all messages) |
---|---|---|
JSON (standard library) | 798 | 833 ms |
JSON (ujson) | 798 | 193 ms |
MessagePack (official lib) | 591 | 289 ms |
MessagePack (umsgpack) | 585 | 3.15 s |
CBOR | 585 | 163 ms |
UBJSON | 668 | 2.28 s |
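
A measurement like the one in the table can be sketched roughly as follows. This is not the code from the gist, only a minimal illustration: the `measure` helper and the `serializers` dictionary are names made up here, only two of the sample messages are used, and `msgpack` and `cbor2` are assumed third-party packages that are benchmarked only if they happen to be installed (the original tests may have used different packages).

```python
import json
import timeit

# Two of the sample messages shown earlier; the full list works the same way.
messages = [
    {'call_id': 1, 'kwargs': {}, 'args': ['sleep', 0.1]},
    {'call_id': 1, 't': 'r', 'returned': 'd53b2823d35b471282ab5c8b6c2e4685'},
]


def measure(dumps, loads, msgs, rounds=10000):
    """Return (total encoded size in bytes, seconds for `rounds` passes)."""
    encoded = [dumps(m) for m in msgs]
    # Text serializers return str, binary ones return bytes; count bytes on the wire.
    total_size = sum(
        len(e.encode('utf-8')) if isinstance(e, str) else len(e) for e in encoded
    )

    def one_pass():
        # One pass encodes and decodes every message once.
        for m in msgs:
            loads(dumps(m))

    elapsed = timeit.timeit(one_pass, number=rounds)
    return total_size, elapsed


serializers = {'json (stdlib)': (json.dumps, json.loads)}

# Optional third-party packages (assumption: installed under these names).
try:
    import msgpack
    serializers['msgpack'] = (msgpack.packb, lambda b: msgpack.unpackb(b, raw=False))
except ImportError:
    pass
try:
    import cbor2
    serializers['cbor2'] = (cbor2.dumps, cbor2.loads)
except ImportError:
    pass

# Fewer rounds than in the table, to keep the sketch quick to run.
results = {
    name: measure(dumps, loads, messages, rounds=1000)
    for name, (dumps, loads) in serializers.items()
}
for name, (size, seconds) in results.items():
    print(f'{name}: {size} bytes, {seconds * 1000:.0f} ms')
```

Absolute timings depend heavily on the machine and library versions, so only the relative ordering is really comparable with the table.
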
Since messaging clients can run in a web browser, we can also look at the performance of some serializers in JavaScript on this page. As JSON serialization is part of the browser’s Web API, it is unsurprisingly the fastest there.
In Python, the pure-Python libraries (UBJSON, and MessagePack with the umsgpack package) are the slowest (though their performance might be better under PyPy). The standard library JSON serializer can easily be replaced by the better-performing ujson package.
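Swapping the JSON implementation can be as small as changing one import. A minimal sketch, assuming the application only needs `dumps` and `loads` (the `encode` and `decode` wrappers and the `json_impl` alias are hypothetical names, not part of ASEXOR):

```python
# Prefer ujson when it is installed, otherwise fall back to the standard library.
try:
    import ujson as json_impl
except ImportError:
    import json as json_impl


def encode(message):
    """Serialize one message dictionary to a JSON string."""
    return json_impl.dumps(message)


def decode(text):
    """Parse a JSON string back into a message dictionary."""
    return json_impl.loads(text)


request = {'call_id': 2, 'kwargs': {'utc': True}, 'args': ['date', '%d-%m-%Y %H:%M %Z']}
print(decode(encode(request)) == request)
```

The two libraries are not identical in every option they accept, so code that relies on extra `json.dumps` arguments may need checking before such a swap.
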
Conclusions
JSON is truly ubiquitous today, thanks to its ease of use and readability. It’s probably a good choice for many usage scenarios, and luckily JSON serializers show good performance. If message size is of some concern, CBOR looks like a great, almost drop-in replacement for JSON, with similar performance in Python (slower performance in the browser is not a big issue, as a browser will typically process only a few messages) and 27% smaller messages.
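The size savings relative to JSON follow directly from the totals in the results table. A small sketch of the arithmetic (the `sizes` and `savings` names are only for illustration):

```python
# Total message sizes in bytes, taken from the results table.
sizes = {
    'JSON': 798,
    'MessagePack (official lib)': 591,
    'CBOR': 585,
    'UBJSON': 668,
}

baseline = sizes['JSON']
# Percentage saved compared with JSON, rounded to one decimal place.
savings = {
    name: round(100 * (baseline - size) / baseline, 1)
    for name, size in sizes.items()
}

for name, percent in savings.items():
    print(f'{name}: {percent}% smaller than JSON')
```

For CBOR this gives (798 - 585) / 798, or roughly 27%.
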
If message size is a big concern, a carefully designed binary protocol (with Protocol Buffers, for instance) can provide much smaller messages (but at additional development cost).
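To give a feel for how much a hand-designed layout can save, here is a minimal sketch using Python's `struct` module rather than Protocol Buffers. The fixed layout below is purely hypothetical (a 1-byte type tag, a 32-bit call id, and the task id as 16 raw bytes) and covers only the "returned" message type from the samples above.

```python
import json
import struct

# Hypothetical fixed layout for a "returned" message (not Protocol Buffers):
# 1-byte type tag, unsigned 32-bit call_id, 16 raw bytes of task id,
# all in network byte order with no padding.
RETURNED = struct.Struct('!cI16s')


def pack_returned(msg):
    """Encode a 'returned' message into the fixed binary layout."""
    return RETURNED.pack(
        msg['t'].encode('ascii'),
        msg['call_id'],
        bytes.fromhex(msg['returned']),  # 32 hex characters become 16 bytes
    )


def unpack_returned(data):
    """Decode the fixed binary layout back into a message dictionary."""
    tag, call_id, task_id = RETURNED.unpack(data)
    return {'call_id': call_id, 't': tag.decode('ascii'), 'returned': task_id.hex()}


msg = {'call_id': 1, 't': 'r', 'returned': 'd53b2823d35b471282ab5c8b6c2e4685'}
packed = pack_returned(msg)
json_size = len(json.dumps(msg).encode('utf-8'))
print(f'binary: {len(packed)} bytes, JSON: {json_size} bytes')
```

The price is the one mentioned above: every message type needs its own layout, and both ends must agree on it and evolve it together.
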