How RapidJSON was developed to achieve the highest performance among 20 C/C++ JSON libraries. Covers benchmarks, C++ design choices, algorithms, and low-level optimizations.
1. How to Write the Fastest JSON Parser/Writer in the World
Milo Yip
Tencent
28 Mar 2015
2. Milo Yip 叶劲峰
• Expert Engineer (2011 to now)
– Engine Technology Center, R & D Department,
Interactive Entertainment Group (IEG), Tencent
• Master of Philosophy in System Engineering &
Engineering Management, CUHK
• Bachelor of Cognitive Science, HKU
• https://github.com/miloyip
• http://www.cnblogs.com/miloyip
• http://www.zhihu.com/people/miloyip
6. JSON
• JavaScript Object Notation
• Alternative to XML
• Human-readable text to transmit/persist data
• RFC 7159/ECMA-404
• Common uses
– Open API (e.g. Twitter, Facebook, etc.)
– Data storage/exchange (e.g. GeoJSON)
8. Features
• Both SAX and DOM style API
• Fast
• Cross platform/compiler
• No dependencies
• Memory friendly
• UTF-8/16/32/ASCII and transcoding
• In-situ Parsing
• More at http://miloyip.github.io/rapidjson/md_doc_features.html
9. Hello RapidJSON!
#include "rapidjson/document.h"
#include "rapidjson/writer.h"
#include "rapidjson/stringbuffer.h"
#include <iostream>
using namespace rapidjson;
int main() {
// 1. Parse a JSON string into DOM.
const char* json = "{\"project\":\"rapidjson\",\"stars\":10}";
Document d;
d.Parse(json);
// 2. Modify it by DOM.
Value& s = d["stars"];
s.SetInt(s.GetInt() + 1);
// 3. Stringify the DOM
StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);
// Output {"project":"rapidjson","stars":11}
std::cout << buffer.GetString() << std::endl;
return 0;
}
10. Fast, AND Reliable
• 103 Unit Tests
• Continuous Integration
– Travis on Linux
– AppVeyor on Windows
– Valgrind (Linux) for memory leak checking
• Used in real applications
– Used in client and server applications at Tencent
– A user reported parsing 50 million JSON documents daily
11. Public Projects using RapidJSON
• Cocos2D-X: Cross-Platform 2D Game Engine
http://cocos2d-x.org/
• Microsoft Bond: Cross-Platform Serialization
https://github.com/Microsoft/bond/
• Google Angle: OpenGL ES 2 for Windows
https://chromium.googlesource.com/angle/angle/
• CERN LHCb: Large Hadron Collider beauty
http://lhcb-comp.web.cern.ch/lhcb-comp/
• Tell me if you know more
13. Benchmarks for Native JSON libraries
• https://github.com/miloyip/nativejson-benchmark
• Compare 20 open source C/C++ JSON libraries
• Evaluate speed, memory and code size
• For parsing, stringify, traversal, and more
19. Benchmarks for Spine
• Spine is a 2D skeletal animation tool
• Spine-C is the official runtime in C
https://github.com/EsotericSoftware/spine-runtimes/tree/master/spine-c
• It uses JSON as data format
• It has a custom JSON parser
• Adapt RapidJSON and compare loading time
23. The Zero Overhead Principle
• Bjarne Stroustrup[1]:
“What you don't use, you don't pay for.”
• RapidJSON tries to obey this principle
– SAX and DOM
– Combinable options, configurations
24. SAX
StartObject()
Key("hello", 5, true)
String("world", 5, true)
Key("t", 1, true)
Bool(true)
Key("f", 1, true)
Bool(false)
Key("n", 1, true)
Null()
Key("i")
Uint(123)
Key("pi")
Double(3.1416)
Key("a")
StartArray()
Uint(1)
Uint(2)
Uint(3)
Uint(4)
EndArray(4)
EndObject(7)
DOM
When parsing JSON into a DOM, SAX events are used to build the DOM.
When stringifying a DOM, traverse it and generate SAX events.
{"hello":"world", "t":true, "f":false, "n":null,
"i":123, "pi":3.1416, "a":[1, 2, 3, 4]}
26. Handler: Template Parameter
• Handler handles SAX event callbacks
• How to implement callbacks?
– Traditional: virtual function
– RapidJSON: template parameter
template <unsigned parseFlags, typename InputStream, typename Handler>
ParseResult Reader::Parse(InputStream& is, Handler& handler);
• No virtual function overhead
• Inline callback functions
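The handler-as-template-parameter idea can be sketched without RapidJSON itself. The toy reader below is not RapidJSON's Reader; it only accepts a flat array of unsigned integers with no whitespace, and the names SumHandler and ParseUintArray are hypothetical. Because Handler is a template parameter, each callback is a direct call that the compiler is free to inline.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical handler for this sketch: sums the integers it receives.
struct SumHandler {
    unsigned long sum;
    std::size_t elements;
    SumHandler() : sum(0), elements(0) {}
    bool StartArray() { return true; }
    bool Uint(unsigned value) { sum += value; return true; }
    bool EndArray(std::size_t count) { elements = count; return true; }
};

inline bool IsDigit(char c) { return c >= '0' && c <= '9'; }

// Toy SAX-style reader for input such as "[1,2,3,4]".
// No whitespace handling and no overflow check: this is an illustration only.
template <typename Handler>
bool ParseUintArray(const char* s, Handler& handler) {
    assert(s != nullptr);
    if (*s++ != '[' || !handler.StartArray()) return false;
    std::size_t count = 0;
    if (*s == ']') {
        ++s;  // empty array
    } else {
        for (;;) {
            if (!IsDigit(*s)) return false;
            unsigned value = 0;
            while (IsDigit(*s)) value = value * 10 + static_cast<unsigned>(*s++ - '0');
            if (!handler.Uint(value)) return false;  // direct, inlinable call
            ++count;
            if (*s == ',') { ++s; continue; }
            if (*s == ']') { ++s; break; }
            return false;
        }
    }
    return *s == '\0' && handler.EndArray(count);
}
```

A DOM builder would be just another handler type passed to the same function template, with no virtual dispatch involved.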
27. Parsing Options: Template Argument
• Many parse options -> Zero overhead principle
• Use integer template argument
template <unsigned parseFlags, typename InputStream, typename Handler>
ParseResult Reader::Parse(InputStream& is, Handler& handler);
if (parseFlags & kParseInsituFlag) {
// ...
}
else {
// ...
}
• Compiler optimization eliminates unused code
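A minimal sketch of the same pattern outside RapidJSON: the flag is a template argument, so the branch condition is a compile-time constant and the untaken branch can be removed. The names ParseFlag, kNoFlags, kInsituFlag and ParseQuoted are assumptions made for this example, and the function only handles quoted strings without escape sequences.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical flag names for this sketch.
enum ParseFlag { kNoFlags = 0, kInsituFlag = 1 };

// Extracts the contents of a quoted string such as "abc" (no escapes).
// In-situ mode terminates the string inside the source buffer and returns a
// pointer into it; otherwise the characters are copied into `storage`.
template <unsigned parseFlags>
const char* ParseQuoted(char* src, std::string& storage) {
    assert(src != nullptr);
    if (*src != '"') return nullptr;
    char* begin = src + 1;
    char* end = std::strchr(begin, '"');
    if (end == nullptr) return nullptr;
    if (parseFlags & kInsituFlag) {  // constant condition for each instantiation
        *end = '\0';                 // modify the source buffer in place
        return begin;
    } else {
        storage.assign(begin, end);  // leave the source untouched
        return storage.c_str();
    }
}
```

ParseQuoted<kInsituFlag> and ParseQuoted<kNoFlags> are two separate instantiations, so a program that only uses one of them does not pay for the other.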
28. Recursive SAX Parser
• Simple to write/optimize by hand
• Use program stack to maintain parsing state of
the tree structure
• Prone to stack overflow
– So also provide an iterative parser
(Contributed by Don Ding @thebusytypist)
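The difference between the two styles can be illustrated on a much smaller grammar than JSON. The sketch below only checks that '[' ']' and '{' '}' are properly nested around a single top-level group; the function names are hypothetical and this is not how RapidJSON's parsers are written. The recursive version uses one call-stack frame per nesting level, while the iterative version keeps its state in an explicit stack on the heap.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Recursive: parses one bracketed group starting at *p and advances p past it.
// Nesting depth is limited by the size of the program stack.
bool ParseGroupRecursive(const char*& p) {
    assert(p != nullptr);
    const char open = *p;
    if (open != '[' && open != '{') return false;
    const char close = (open == '[') ? ']' : '}';
    ++p;
    while (*p == '[' || *p == '{')
        if (!ParseGroupRecursive(p)) return false;
    if (*p != close) return false;
    ++p;
    return true;
}

bool IsNestedRecursive(const std::string& s) {
    const char* p = s.c_str();
    return ParseGroupRecursive(p) && *p == '\0';
}

// Iterative: same grammar, but the expected closing brackets live in a vector.
bool IsNestedIterative(const std::string& s) {
    std::vector<char> expected;
    for (const char* p = s.c_str(); *p != '\0'; ++p) {
        if (*p == '[') expected.push_back(']');
        else if (*p == '{') expected.push_back('}');
        else {
            if (expected.empty() || expected.back() != *p) return false;
            expected.pop_back();
            if (expected.empty()) return p[1] == '\0';  // top-level group closed
        }
    }
    return false;  // empty input or unclosed group
}
```

With very deep input, for example 100000 nested '[', the iterative version only grows its vector, whereas the recursive version may exhaust the stack.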
31. Parsing Number: the Pain ;(
• RapidJSON supports parsing JSON number to
uint32_t, int32_t, uint64_t, int64_t, double
• Difficult to detect the right type in a single pass
• Even more difficult for double (strtod() is slow)
• Implemented kFullPrecision option using
1. Fast-path
2. DIY-FP (https://github.com/floitsch/double-conversion)
3. Big Integer method [2]
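A fast path of this kind can be sketched as follows. It assumes IEEE 754 doubles with round-to-nearest and no extended-precision intermediate results; the function name FastPathToDouble and the table are made up for this example, and the exact conditions RapidJSON uses may differ. When the decimal significand fits in 53 bits and the power of ten is at most 10^22, both operands are exact doubles, so a single multiplication or division yields the correctly rounded result. Anything else has to fall through to the slower methods.

```cpp
#include <cassert>
#include <cstdint>

// Tries to compute significand * 10^exp10 with one floating-point operation.
// Returns false when the inputs are outside the range where that is safe.
bool FastPathToDouble(std::uint64_t significand, int exp10, double* out) {
    assert(out != nullptr);
    // 10^0 .. 10^22 are all exactly representable as doubles.
    static const double kPow10[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
        1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
    };
    if (significand > (std::uint64_t(1) << 53)) return false;  // not exact as double
    if (exp10 < -22 || exp10 > 22) return false;               // power not exact
    const double d = static_cast<double>(significand);         // exact conversion
    *out = (exp10 >= 0) ? d * kPow10[exp10] : d / kPow10[-exp10];
    return true;
}
```

For example, the text 3.1416 would be scanned as significand 31416 with exponent -4 and resolved by a single division.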
32. How difficult?
• PHP Hangs On Numeric Value 2.2250738585072011e-308
http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/
• Java Hangs When Converting 2.2250738585072012e-308
http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/
• "2.22507385850720113605740979670913197593481954635164564e-308" → 2.2250738585072009e-308
• "2.22507385850720113605740979670913197593481954635164565e-308" → 2.2250738585072014e-308
• And it needs to be fast…
33. DOM Designed for Fast Parsing
• A JSON value can be one of 6 types
– object, array, number, string, boolean, null
• An inheritance-based design needs a new for each value
• RapidJSON uses a single variant type Value
34. Layout of Value
• String: Ch* str | SizeType length | unsigned flags
• Number: one of int i (padded with 0), unsigned u (padded with 0), int64_t i64, uint64_t u64, double d | unsigned flags
• Object: Member* members | SizeType size | SizeType capacity | unsigned flags
• Array: Value* values | SizeType size | SizeType capacity | unsigned flags
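The layout above can be approximated with a tagged union. This is a simplified sketch, not RapidJSON's GenericValue: the Kind enum, the accessor names and the reduced set of number types are assumptions made here. The point is that every JSON type fits in the same fixed-size struct, so an array of values is one contiguous block and no per-value new is required.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned SizeType;
struct Member;  // key/value pair, left incomplete in this sketch

enum Kind { kNull, kBool, kNumber, kString, kArray, kObject };  // hypothetical tags

struct Value {
    // All alternatives share the same storage.
    union {
        struct { const char* str; SizeType length; } s;
        struct { std::int64_t i64; } n;
        struct { Member* members; SizeType size; SizeType capacity; } o;
        struct { Value* values; SizeType size; SizeType capacity; } a;
        bool b;
    } data;
    unsigned flags;  // which alternative is active

    Value() : flags(kNull) { data.n.i64 = 0; }

    void SetInt64(std::int64_t v) { data.n.i64 = v; flags = kNumber; }
    void SetString(const char* str, SizeType length) {
        data.s.str = str;  // not copied: the caller keeps the characters alive
        data.s.length = length;
        flags = kString;
    }

    bool IsNumber() const { return flags == kNumber; }
    bool IsString() const { return flags == kString; }
    std::int64_t GetInt64() const { assert(IsNumber()); return data.n.i64; }
    SizeType GetStringLength() const { assert(IsString()); return data.s.length; }
};
```

Changing a value from a number to a string is just a matter of overwriting the union and the flags, which is what makes building the DOM from SAX events cheap.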
35. Move Semantics
• Deep copying object/array/string is slow
• RapidJSON enforces move semantics
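One way to enforce move semantics without relying on C++11 rvalue references is to make assignment take a non-const reference and transfer ownership. The class below is a hypothetical owning string used only to show the pattern; it is a sketch of the idea, not RapidJSON's Value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class OwnedString {  // hypothetical type for this sketch
public:
    OwnedString() : data_(nullptr), length_(0) {}
    explicit OwnedString(const char* s) : data_(nullptr), length_(0) {
        assert(s != nullptr);
        length_ = std::strlen(s);
        data_ = new char[length_ + 1];
        std::memcpy(data_, s, length_ + 1);
    }
    ~OwnedString() { delete[] data_; }

    // Copy construction is disabled: there is no implicit deep copy.
    OwnedString(const OwnedString&) = delete;

    // Assignment moves: the buffer is handed over and rhs becomes null.
    OwnedString& operator=(OwnedString& rhs) {
        if (this != &rhs) {
            delete[] data_;
            data_ = rhs.data_;
            length_ = rhs.length_;
            rhs.data_ = nullptr;
            rhs.length_ = 0;
        }
        return *this;
    }

    bool IsNull() const { return data_ == nullptr; }
    const char* c_str() const { return data_ ? data_ : ""; }
    std::size_t length() const { return length_; }

private:
    char* data_;
    std::size_t length_;
};
```

After `b = a;` the characters belong to b and a is null, so no allocation or character copy takes place.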
36. The Default Allocator
• Internally allocates a singly linked list of
buffers
• Does not free objects (thus FAST!)
• Suitable for parsing (creating values
consecutively)
• Not suitable for DOM manipulation
37. Custom Initial Buffer
• User can provide a custom initial buffer
– For example, buffer on stack, scratch buffer
• The allocator uses that buffer first until it is full
• Possible to achieve zero allocation in parsing
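The allocator described in the last two slides can be sketched as a bump allocator over a linked list of chunks, with an optional caller-supplied first buffer. The class name PoolAllocator, its constructor parameters and the HeapChunks() helper are assumptions made for this example, and the user buffer is assumed to be 8-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

class PoolAllocator {  // hypothetical name for this sketch
public:
    // userBuffer may be nullptr; it is used first and never freed here.
    PoolAllocator(void* userBuffer, std::size_t userSize,
                  std::size_t chunkCapacity = 64 * 1024)
        : chunks_(nullptr), cur_(static_cast<char*>(userBuffer)), end_(nullptr),
          chunkCapacity_(chunkCapacity), heapChunks_(0) {
        if (cur_ != nullptr) end_ = cur_ + userSize;
    }
    ~PoolAllocator() {
        while (chunks_ != nullptr) {  // release every heap chunk at once
            Chunk* next = chunks_->next;
            std::free(chunks_);
            chunks_ = next;
        }
    }
    PoolAllocator(const PoolAllocator&) = delete;
    PoolAllocator& operator=(const PoolAllocator&) = delete;

    void* Malloc(std::size_t size) {
        size = (size + 7) & ~static_cast<std::size_t>(7);  // keep 8-byte alignment
        if (cur_ == nullptr || static_cast<std::size_t>(end_ - cur_) < size) {
            if (!AddChunk(size > chunkCapacity_ ? size : chunkCapacity_)) return nullptr;
        }
        void* p = cur_;
        cur_ += size;  // bump the pointer; no per-object bookkeeping
        return p;
    }
    static void Free(void*) {}  // deliberately a no-op

    std::size_t HeapChunks() const { return heapChunks_; }

private:
    struct Chunk { Chunk* next; };
    enum { kHeaderSize = 16 };
    static_assert(sizeof(Chunk) <= kHeaderSize, "chunk header must fit");

    bool AddChunk(std::size_t capacity) {
        void* raw = std::malloc(kHeaderSize + capacity);
        if (raw == nullptr) return false;
        Chunk* chunk = static_cast<Chunk*>(raw);
        chunk->next = chunks_;  // push onto the singly linked list
        chunks_ = chunk;
        cur_ = static_cast<char*>(raw) + kHeaderSize;
        end_ = cur_ + capacity;
        ++heapChunks_;
        return true;
    }

    Chunk* chunks_;
    char* cur_;
    char* end_;
    std::size_t chunkCapacity_;
    std::size_t heapChunks_;
};
```

If the whole DOM fits into the user buffer, HeapChunks() stays at zero, which is the zero-allocation case mentioned above.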
38. Short String Optimization
• Many JSON keys are short
• Contributor @Kosta-Github submitted a PR to
optimize short strings
String
Ch* str
SizeType length
unsigned flags
ShortString
Ch str[11];
uint8_t x;
unsigned flags
Let length = 11 – x
So an 11-character string has x = 0, which doubles as the terminating '\0'
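The length trick can be tried out in isolation. In this sketch the 11 character bytes and the x byte are folded into one 12-byte array so that reading the terminator never leaves the array; the struct and member names are hypothetical and the flags field is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t kShortMax = 11;  // longest string stored inline

struct ShortString {
    // bytes[0..10] hold characters; bytes[11] plays the role of x = 11 - length.
    char bytes[kShortMax + 1];

    bool Set(const char* s, std::size_t length) {
        assert(s != nullptr);
        if (length > kShortMax) return false;  // caller must use the pointer form
        std::memcpy(bytes, s, length);
        bytes[length] = '\0';
        // For length == 11 this stores 0, which is also the terminator.
        bytes[kShortMax] = static_cast<char>(kShortMax - length);
        return true;
    }
    std::size_t Length() const {
        return kShortMax - static_cast<unsigned char>(bytes[kShortMax]);
    }
    const char* c_str() const { return bytes; }
};
```

Storing 11 - length instead of the length itself is what allows the full 11 characters to be used while the string stays null-terminated.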
39. SIMD Optimization
• Using SSE2/SSE4 to skip whitespace
(space, tab, LF, CR)
• Each iteration compares 16 chars × 4 chars
• Fast for JSON with indentation
• Visual C++ 2010 32-bit test, skip 1M whitespace (ms):
Method                 Time (ms)
strlen() (for ref.)    752
strspn()               3011
RapidJSON (no SIMD)    1349
RapidJSON (SSE2)       170
RapidJSON (SSE4)       102
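A sketch of the SSE2 variant, assuming an x86 target with SSE2; on other targets it falls back to the scalar loop. Unlike RapidJSON's routine, this version takes an explicit end pointer and uses unaligned loads so that it never reads past the buffer. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

inline bool IsJsonWhitespace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Returns a pointer to the first non-whitespace character in [p, end).
const char* SkipWhitespace(const char* p, const char* end) {
    assert(p != nullptr && p <= end);
#ifdef SKETCH_HAS_SSE2
    const __m128i space = _mm_set1_epi8(' ');
    const __m128i tab = _mm_set1_epi8('\t');
    const __m128i lf = _mm_set1_epi8('\n');
    const __m128i cr = _mm_set1_epi8('\r');
    while (end - p >= 16) {
        // Compare 16 input bytes against each of the 4 whitespace characters.
        const __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
        __m128i ws = _mm_cmpeq_epi8(chunk, space);
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, tab));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, lf));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, cr));
        // One bit per byte; after inversion a set bit marks non-whitespace.
        unsigned nonWs = static_cast<unsigned>(~_mm_movemask_epi8(ws)) & 0xFFFFu;
        if (nonWs != 0) {
            while ((nonWs & 1u) == 0) { nonWs >>= 1; ++p; }
            return p;
        }
        p += 16;
    }
#endif
    while (p != end && IsJsonWhitespace(*p)) ++p;  // scalar tail (or fallback)
    return p;
}

// Convenience wrapper: index of the first non-whitespace character in s.
std::size_t FirstNonWhitespace(const std::string& s) {
    return static_cast<std::size_t>(SkipWhitespace(s.data(), s.data() + s.size()) - s.data());
}
```

The benefit shows up on pretty-printed JSON, where long runs of indentation are consumed 16 bytes at a time.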
42. Double-to-String Optimization
• Double-to-string conversion is very slow
– E.g. 3.14 -> “3.14”
• Grisu2 is a fast algorithm for this[3]
– Correct results in 100% of cases
– Optimal results in >99% of cases
• Google V8 has an implementation
– https://github.com/floitsch/double-conversion
– But not header-only, so…
43. My Grisu2 Implementation
• https://github.com/miloyip/dtoa-benchmark
• Visual C++ 2013 on Windows 64-bit:
45. Tradeoff: User-Friendliness
• DOM only supports move semantics
– Cannot copy-construct Value/Document
– So, cannot pass them by value, put in containers
• DOM APIs need an allocator as a parameter, e.g.
numbers.PushBack(1, allocator);
• Users must manage the life cycle of the allocator
and its allocated values
46. Pausing in Parsing
• Cannot pause in parsing and resume it later
– Not keeping all parsing states explicitly
– Doing so will be much slower
• Typical Scenario
– Streaming JSON from network
– Don’t want to store the JSON in memory
• Solution
– Parse in a separate thread
– Block the input stream to pause
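The blocking-stream workaround might look like the sketch below. BlockingStream is a hypothetical class, not part of RapidJSON, and it only offers Take(), whereas a real RapidJSON input stream needs more members such as Peek() and Tell(). The network thread calls Push() as data arrives; the parser thread calls Take(), which blocks while the buffer is empty, so parsing is effectively paused without the parser keeping any resumable state.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>

class BlockingStream {  // hypothetical type for this sketch
public:
    BlockingStream() : closed_(false) {}

    // Producer side, e.g. the network thread.
    void Push(const std::string& chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            buffer_.insert(buffer_.end(), chunk.begin(), chunk.end());
        }
        ready_.notify_one();
    }
    void Close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Consumer side, i.e. the parser thread. Blocks until a character is
    // available; returns '\0' once the stream is closed and drained.
    char Take() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !buffer_.empty() || closed_; });
        if (buffer_.empty()) return '\0';
        const char c = buffer_.front();
        buffer_.pop_front();  // consumed input is dropped immediately
        return c;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<char> buffer_;
    bool closed_;
};
```

Only the characters that have arrived but not yet been parsed are held in memory, which addresses the streaming scenario above at the cost of an extra thread.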
48. Origin
• RapidJSON started as my hobby project in 2011
• Also my first open source project
• First version released in 2 weeks
49. Community
• Google Code helped with tracking bugs, but made it
hard to attract contributions
• After migrating to GitHub in 2014
– Community much more active
– Issue tracking more powerful
– Pull requests ease contributions
50. Future
• Official Release under Tencent
– 1.0 beta → 1.0 release (after 3+ years…)
– Can work on it during working hours
– Involve marketing and other colleagues
– Establish Community in China
• Post-1.0 Features
– Easy DOM API (but slower)
– JSON Schema
– Relaxed JSON syntax
– Optimization on Object Member Access
• Open source our internal projects at Tencent
51. To Establish an Open Source Project
• Courage
• Start Small
• Make a Difference
– Innovative Idea?
– Easy to Use?
– Good Performance?
• Embrace Community
• Learn
52. References
1. Stroustrup, Bjarne. The design and evolution
of C++. Pearson Education India, 1994.
2. Clinger, William D. How to read floating point
numbers accurately. Vol. 25. No. 6. ACM,
1990.
3. Loitsch, Florian. "Printing floating-point
numbers quickly and accurately with
integers." ACM Sigplan Notices 45.6 (2010):
233-243.