What are the biggest pros and cons of Apache Thrift vs Google's Protocol Buffers?
They both offer many of the same features; however, there are some differences, for example Thrift's built-in Set type.
Basically, they are fairly equivalent (with Protocol Buffers slightly more efficient, from what I have read).
Another important difference is the set of languages supported by default.
Both could be extended to other platforms, but these are the language bindings available out of the box.
RPC is another key difference. Thrift generates code to implement RPC clients and servers, whereas Protocol Buffers seems mostly designed as a data-interchange format.
As I said in the 'Thrift vs Protocol Buffers' topic:
Referring to the Thrift vs Protobuf vs JSON comparison:
Additionally, plenty of interesting tools are available for these solutions, which might decide the matter. Examples for Protobuf: Protobuf-wireshark, protobufeditor.
I was able to get better performance with a text-based protocol than with protobuf in Python. However, that approach has no type checking or other fancy UTF-8 conversion, etc., which protobuf offers.
So, if serialization/deserialization is all you need, then you can probably use something else.
Protocol Buffers seems to have a more compact representation, but that's only an impression I get from reading the Thrift whitepaper. In their own words:
We decided against some extreme storage optimizations (i.e. packing small integers into ASCII or using a 7-bit continuation format) for the sake of simplicity and clarity in the code. These alterations can easily be made if and when we encounter a performance-critical use case that demands them.
Also, it may just be my impression, but Protocol Buffers seems to have some thicker abstractions around struct versioning. Thrift does have some versioning support, but it takes a bit of effort to make it happen.
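The "7-bit continuation format" mentioned in the whitepaper quote is commonly called a varint. The following is a minimal sketch of the idea for non-negative integers only; the function names are hypothetical and this is not taken from either library's code.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, least-significant group first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode one varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


print(encode_varint(300).hex())      # small values need fewer bytes than a fixed int32
print(decode_varint(encode_varint(685230)))
```

Small values fit in one or two bytes, at the cost of a loop and bit manipulation per integer, which is the simplicity-versus-size trade-off the quote describes.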
One obvious thing not yet mentioned, which can be either a pro or a con (and is the same for both), is that they are binary protocols. This allows for a more compact representation and possibly better performance (pros), but with reduced readability, or rather debuggability (a con).
Also, both have a bit less tool support than standard formats like XML (and maybe even JSON).
(EDIT) Here's an interesting comparison that tackles both size and performance differences, and includes numbers for some other formats (XML, JSON) as well.
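As a rough illustration of the compactness versus readability trade-off, the sketch below encodes the same record as JSON text and as a fixed binary layout using only the Python standard library. The record, its field names, and the layout are made up for the example; neither Thrift nor Protocol Buffers uses this exact layout.

```python
import json
import struct

# Hypothetical record used only for this comparison.
record = {"id": 685230, "score": 6.8523015e5, "active": True}

# Text form: self-describing and readable, but field names travel with every record.
text_form = json.dumps(record).encode("utf-8")

# Binary form: "<" = little-endian without padding, i = int32, d = float64, ? = bool.
# The layout must be agreed on out of band, much like a schema or IDL.
RECORD_LAYOUT = "<id?"
binary_form = struct.pack(RECORD_LAYOUT, record["id"], record["score"], record["active"])


def unpack_record(blob: bytes) -> dict:
    """Rebuild the record from the binary form using the agreed layout."""
    ident, score, active = struct.unpack(RECORD_LAYOUT, blob)
    return {"id": ident, "score": score, "active": active}


print(len(text_form), "bytes as JSON:", text_form)
print(len(binary_form), "bytes as binary:", binary_form.hex())
```

The binary form is smaller but cannot be read without knowing the layout, which is the debuggability cost mentioned above.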
And according to the wiki the Thrift runtime doesn't run on Windows.
Protocol Buffers is FASTER.
There is a nice benchmark here:
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
You might also want to look into Avro, as Avro is even faster.
Microsoft has a package here:
http://www.nuget.org/packages/Microsoft.Hadoop.Avro
By the way, the fastest I've ever seen is Cap'n Proto.
A C# implementation can be found in Marc Gravell's GitHub repository.
I think most of these points have missed the basic fact that Thrift is an RPC framework, which happens to have the ability to serialize data using a variety of methods (binary, XML, etc).
Protocol Buffers is designed purely for serialization; it's not a framework like Thrift.
For one, protobuf isn't a full RPC implementation. It requires something like gRPC to go with it.
gRPC is very slow compared to Thrift:
There are some excellent points here, and I'm going to add another one in case someone's path crosses here.
Thrift lets you choose between the thrift-binary and thrift-compact (de)serializers: thrift-binary offers excellent performance but a bigger packet size, while thrift-compact gives you good compression but needs more processing power. This is handy because you can switch between the two modes by changing a single line of code (heck, you can even make it configurable). So if you are not sure whether your application should be optimized for packet size or for processing power, Thrift can be an interesting choice.
PS: See this excellent benchmark project by thekvs, which compares many serializers including thrift-binary, thrift-compact, and protobuf: https://github.com/thekvs/cpp-serializers
PS: There is another serializer named YAS, which gives this option too, but it is schema-less; see the link above.
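The binary versus compact trade-off described above can be sketched without the Thrift library. The example below is only loosely modeled on the two protocols: it encodes a list of 32-bit integers either as fixed-width big-endian values or as zigzag varints, and picks the encoder from a configurable mode string. All names are hypothetical, and real Thrift messages also carry field headers and type information that this sketch omits.

```python
import struct


def zigzag32(n: int) -> int:
    """Map a signed 32-bit value to unsigned so small magnitudes stay small."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def encode_fixed(values) -> bytes:
    """Fixed width: always 4 bytes per value, trivial to encode."""
    return b"".join(struct.pack(">i", v) for v in values)


def encode_compact(values) -> bytes:
    """Variable width: zigzag, then 7 bits per byte with a continuation bit."""
    out = bytearray()
    for v in values:
        z = zigzag32(v)
        while z >= 0x80:
            out.append((z & 0x7F) | 0x80)
            z >>= 7
        out.append(z)
    return bytes(out)


# Switching modes is a one-line (or one-config-value) change.
ENCODERS = {"binary": encode_fixed, "compact": encode_compact}


def serialize(values, mode: str = "binary") -> bytes:
    return ENCODERS[mode](values)


sample = [1, -1, 300]
print(len(serialize(sample, "binary")), "bytes in binary mode")
print(len(serialize(sample, "compact")), "bytes in compact mode")
```

The compact encoder does more work per value but produces fewer bytes for small numbers, which mirrors the packet size versus processing power choice.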
It's also important to note that not all supported languages compare consistently with Thrift or protobuf. At this point it's a matter of the module's implementation in addition to the underlying serialization. Take care to check benchmarks for whatever language you plan to use.
This is a comparison of data serialization formats.

Name | Creator-maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references? [e] | Schema-IDL? | Standard APIs | Supports zero-copy operations |
---|---|---|---|---|---|---|---|---|---|---|
Apache Avro | | N/A | No | Apache Avro™ 1.8.1 Specification | Yes | No | N/A | Yes (built-in) | N/A | N/A |
Apache Parquet | Apache Software Foundation | N/A | No | | Yes | No | No | N/A | Java, Python | No |
ASN.1 | ISO, IEC, ITU-T | N/A | Yes | ISO/IEC 8824; X.680 series of ITU-T Recommendations | Yes (BER, DER, PER, OER, or custom via ECN) | Yes (XER, JER, GSER, or custom via ECN) | Partial [f] | Yes (built-in) | N/A | Yes (OER) |
Bencode | Bram Cohen (creator), BitTorrent, Inc. (maintainer) | N/A | De facto standard via BitTorrent Enhancement Proposal (BEP) | Part of BitTorrent protocol specification | Partially (numbers and delimiters are ASCII) | No | No | No | No | N/A |
Binn | Bernardo Ramos | N/A | No | Binn Specification | Yes | No | No | No | No | Yes |
BSON | MongoDB | JSON | No | BSON Specification | Yes | No | No | No | No | N/A |
CBOR | Carsten Bormann, P. Hoffman | JSON (loosely) | Yes | RFC 7049 | Yes | No | Yes, through tagging | Yes (CDDL) | No | Yes |
Comma-separated values (CSV) | RFC author: Yakov Shafranovich | N/A | Partial (myriad informal variants used) | RFC 4180 (among others) | No | Yes | No | No | No | No |
Common Data Representation (CDR) | Object Management Group | N/A | Yes | General Inter-ORB Protocol | Yes | No | Yes | Yes | ADA, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk | N/A |
D-Bus Message Protocol | freedesktop.org | N/A | Yes | D-Bus Specification | Yes | No | No | Partial (Signature strings) | Yes (see D-Bus) | N/A |
Efficient XML Interchange (EXI) | W3C | XML, Efficient XML | Yes | Efficient XML Interchange (EXI) Format 1.0 | Yes | Yes (XML) | Yes (XPointer, XPath) | Yes (XML Schema) | Yes (DOM, SAX, StAX, XQuery, XPath) | N/A |
FlatBuffers | Google | N/A | No | | Yes | Yes (Apache Arrow) | Partial (internal to the buffer) | Yes [2] | C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScript | Yes |
Fast Infoset | ISO, IEC, ITU-T | XML | Yes | ITU-T X.891 and ISO/IEC 24824-1:2007 | Yes | No | Yes (XPointer, XPath) | Yes (XML schema) | Yes (DOM, SAX, XQuery, XPath) | N/A |
FHIR | Health Level 7 | REST basics | Yes | Fast Healthcare Interoperability Resources | Yes | Yes | Yes | Yes | Hapi for FHIR [1]; JSON, XML, Turtle | No |
Ion | Amazon | JSON | No | The Amazon Ion Specification | Yes | Yes | No | No | No | N/A |
Java serialization | Oracle Corporation | N/A | Yes | Java Object Serialization | Yes | No | Yes | No | Yes | N/A |
JSON | Douglas Crockford | JavaScript syntax | Yes | STD 90/RFC 8259 (ancillary: RFC 6901, RFC 6902), ECMA-404, ISO/IEC 21778:2017 | No, but see BSON, Smile, UBJSON | Yes | Yes (JSON Pointer (RFC 6901); alternately: JSONPath, JPath, JSPON, json:select()), JSON-LD | Partial (JSON Schema Proposal, ASN.1 with JER, Kwalify, Rx, Itemscript Schema), JSON-LD | Partial (Clarinet, JSONQuery, JSONPath), JSON-LD | No |
MessagePack | Sadayuki Furuhashi | JSON (loosely) | No | MessagePack format specification | Yes | No | No | No | No | Yes |
Netstrings | Dan Bernstein | N/A | No | netstrings.txt | Yes | Yes | No | No | No | Yes |
OGDL | Rolf Veen | ? | No | Specification | Yes (Binary Specification) | Yes | Yes (Path Specification) | Yes (Schema WD) | | N/A |
OPC-UA Binary | OPC Foundation | N/A | No | opcfoundation.org | Yes | No | Yes | No | No | N/A |
OpenDDL | Eric Lengyel | C, PHP | No | OpenDDL.org | No | Yes | Yes | No | Yes (OpenDDL Library) | N/A |
Pickle (Python) | Guido van Rossum | Python | De facto standard via Python Enhancement Proposals (PEPs) | [3] PEP 3154 -- Pickle protocol version 4 | Yes | No | No | No | Yes ([4]) | No |
Property list | NeXT (creator), Apple (maintainer) | ? | Partial | Public DTD for XML format | Yes [a] | Yes [b] | No | ? | Cocoa, CoreFoundation, OpenStep, GnuStep | No |
Protocol Buffers (protobuf) | Google | N/A | No | Developer Guide: Encoding | Yes | Partial [d] | No | Yes (built-in) | C++, C#, Java, Python, Javascript, Go | No |
S-expressions | John McCarthy (original), Ron Rivest (internet draft) | Lisp, Netstrings | Partial (largely de facto) | | Yes ('Canonical representation') | Yes ('Advanced transport representation') | No | No | | N/A |
Smile | Tatu Saloranta | JSON | No | Smile Format Specification | Yes | No | No | Partial (JSON Schema Proposal, other JSON schemas/IDLs) | Partial (via JSON APIs implemented with Smile backend, on Jackson, Python) | N/A |
SOAP | W3C | XML | Yes | W3C Recommendations: SOAP/1.1, SOAP/1.2 | Partial (Efficient XML Interchange, Binary XML, Fast Infoset, MTOM, XSD base64 data) | Yes | Yes (built-in id/ref, XPointer, XPath) | Yes (WSDL, XML schema) | Yes (DOM, SAX, XQuery, XPath) | N/A |
Structured Data eXchange Formats | Max Wildgrube | N/A | Yes | RFC 3072 | Yes | No | No | No | | N/A |
Thrift | Facebook (creator), Apache (maintainer) | N/A | No | Original whitepaper | Yes | Partial [c] | No | Yes (built-in) | | N/A |
UBJSON | The Buzz Media, LLC | JSON, BSON | No | [5] | Yes | No | No | No | No | N/A |
eXternal Data Representation (XDR) | Sun Microsystems (creator), IETF (maintainer) | N/A | Yes | STD 67/RFC 4506 | Yes | No | Yes | Yes | Yes | N/A |
XML | W3C | SGML | Yes | W3C Recommendations: 1.0 (Fifth Edition), 1.1 (Second Edition) | Partial (Efficient XML Interchange, Binary XML, Fast Infoset, XSD base64 data) | Yes | Yes (XPointer, XPath) | Yes (XML schema, RELAX NG) | Yes (DOM, SAX, XQuery, XPath) | N/A |
XML-RPC | Dave Winer [2] | XML | No | XML-RPC Specification | No | Yes | No | No | No | N/A |
YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON [3] | No | Version 1.2 | No | Yes | Yes | Partial (Kwalify, Rx, built-in language type-defs) | No | N/A |

Format | Null | Boolean true | Boolean false | Integer | Floating-point | String | Array | Associative array/Object |
---|---|---|---|---|---|---|---|---|
ASN.1 (XML Encoding Rules) | <foo /> | <foo>true</foo> | <foo>false</foo> | <foo>685230</foo> | <foo>6.8523015e+5</foo> | <foo>A to Z</foo> | | An object (the key is a field name); a data mapping (the key is a data value) |
CSV [b] | null [a] (or an empty element in the row) [a] | 1 [a] true [a] | 0 [a] false [a] | 685230 -685230 [a] | 6.8523015e+5 [a] | A to Z "We said, ""no""." | true,-42.1e7,"A to Z" | |
Ion | | true | false | 685230 -685230 0xA74AE 0b111010010101110 | 6.8523015e5 | "A to Z" '''A to Z''' | | |
Netstrings [c] | 0:, [a] 4:null, [a] | 1:1, [a] 4:true, [a] | 1:0, [a] 5:false, [a] | 6:685230, [a] | 9:6.8523e+5, [a] | 6:A to Z, | 29:4:true,0:,7:-42.1e7,6:A to Z,, | 41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,, [a] |
JSON | null | true | false | 685230 -685230 | 6.8523015e+5 | "A to Z" | | |
OGDL | null [a] | true [a] | false [a] | 685230 [a] | 6.8523015e+5 [a] | "A to Z" 'A to Z' NoSpaces | | |
OpenDDL | ref {null} | bool {true} | bool {false} | int32 {685230} int32 {0x74AE} int32 {0b111010010101110} | float {6.8523015e+5} | string {"A to Z"} | Homogeneous array: Heterogeneous array: | |
Pickle (Python) | N. | I01\n. | I00\n. | I685230\n. | F685230.15\n. | S'A to Z'\n. | (lI01\na(laF-421000000.0\naS'A to Z'\na. | (dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas. |
Property list (plain text format)[8] | N/A | <*BY> | <*BN> | <*I685230> | <*R6.8523015e+5> | "A to Z" | ( <*BY>, <*R-42.1e7>, "A to Z" ) | |
Property list (XML format)[9][10] | N/A | <true /> | <false /> | <integer>685230</integer> | <real>6.8523015e+5</real> | <string>A to Z</string> | | |
Protocol Buffers | N/A | true | false | 685230 -685230 | 20.0855369 | "A to Z" | | |
S-expressions | NIL nil | T #t [f] true | NIL #f [f] false | 685230 | 6.8523015e+5 | abc "abc" #616263# 3:abc {MzphYmM=} \|YWJj\| | (T NIL -42.1e7 "A to Z") | ((42 T) ("A to Z" (1 2 3))) |
YAML | ~ null Null NULL [11] | y Y yes Yes YES on On ON true True TRUE [12] | n N no No NO off Off OFF false False FALSE [12] | 685230 +685_230 -685230 02472256 0x_0A_74_AE 0b1010_0111_0100_1010_1110 190:20:30 [13] | 6.8523015e+5 685.230_15e+03 685_230.15 190:20:30.15 .inf -.inf .Inf .INF .NaN .nan .NAN [14] | A to Z "A to Z" 'A to Z' | [y, ~, -42.1e7, "A to Z"] | {"John":3.14, "Jane":2.718} |
XML [e] and SOAP | <null /> [a] | true | false | 685230 | 6.8523015e+5 | A to Z | | |
XML-RPC | | <value><boolean>1</boolean></value> | <value><boolean>0</boolean></value> | <value><int>685230</int></value> | <value><double>6.8523015e+5</double></value> | <value><string>A to Z</string></value> | | |
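
The Netstrings row in the table above follows a simple rule: the payload length in ASCII digits, a colon, the payload bytes, and a trailing comma. A list is written as a netstring whose payload is the concatenation of its items' netstrings. The sketch below illustrates that rule; the function names are hypothetical and this is not a complete implementation.

```python
def netstring(payload: bytes) -> bytes:
    """Wrap payload as <length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def netstring_list(items) -> bytes:
    """Encode a list as a netstring containing the items' netstrings."""
    return netstring(b"".join(netstring(item) for item in items))


def parse_netstring(data: bytes):
    """Return (payload, remaining bytes) for the first netstring in data."""
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(length_text)
    if rest[length:length + 1] != b",":
        raise ValueError("payload shorter than declared or missing trailing comma")
    return rest[:length], rest[length + 1:]


print(netstring(b"A to Z"))
print(netstring_list([b"true", b"", b"-42.1e7", b"A to Z"]))
```

Because the length comes first, a reader never has to scan for a delimiter or unescape the payload.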
Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/Object |
---|---|---|---|---|---|---|---|
ASN.1 (BER, PER or OER encoding) | NULL type | BOOLEAN | INTEGER | REAL: base-10 real values are represented as character strings in ISO 6093 format; binary real values are represented in a binary format that includes the mantissa, the base (2, 8, or 16), and the exponent; the special values NaN, -INF, +INF, and negative zero are also supported | Multiple valid types (VisibleString, PrintableString, GeneralString, UniversalString, UTF8String) | Data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order) | User-definable type |
Binn | \x00 | True: \x01 False: \x02 | Big-endian 2's complement signed and unsigned 8/16/32/64 bits | Single: big-endian binary32; double: big-endian binary64 | UTF-8 encoded, null terminated, preceded by int8 or int32 string length in bytes | Typecode (one byte) + 1-4 bytes size + 1-4 bytes items count + list items | Typecode (one byte) + 1-4 bytes size + 1-4 bytes items count + key/value pairs |
BSON | Null type – 0 bytes for value | True: one byte \x01 False: \x00 | int32: 32-bit little-endian 2's complement or int64: 64-bit little-endian 2's complement | Double: little-endian binary64 | UTF-8 encoded, preceded by int32 encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document |
Concise Binary Object Representation (CBOR) | \xf6 | True: \xf5 False: \xf4 | Small positive number \x00-\x17, small negative number \x20-\x37 (abs(N) <= 23) 8-bit: positive | Typecode (one byte) + IEEE half/single/double | Typecode with length (like integer coding) and content. Bytestring and UTF-8 have different typecodes | Typecode with count (like integer coding) and items | Typecode with pairs count (like integer coding) and pairs |
Efficient XML Interchange (EXI) | xsi:nil element (1-4 bits depending on context) | 1 bit. | 0–12 bits (log2 range) for integers with defined ranges less than 4096. Extensible sequence of octets with infinite range for larger or undefined ranges. Also supports custom representations. | Scalable floating point representation requiring 18 to 88 bits depending on magnitude. Also supports IEEE and custom representations. | Length-prefixed sequence of Unicode code points with partitioned string tables for efficient representation of repeated items. The length and code points are represented as variable-length unsigned integers where values under 128 require 1 octet each. Also supports custom representations. | Repeated elements or length-prefixed list of values. Also supports custom representations. | Ordered (sequence) or unordered (all) group of named elements. |
FlatBuffers | Encoded as absence of field in parent object | True: one byte \x01 False: \x00 | Little-endian 2's complement signed and unsigned 8/16/32/64 bits | Floats: little-endian binary32; doubles: little-endian binary64 | UTF-8 encoded, preceded by 32-bit integer length of string in bytes | Vectors of any other type, preceded by 32-bit integer length of number of elements | Tables (schema-defined types) or Vectors sorted by key (maps / dictionaries) |
MessagePack | \xc0 | True: \xc3 False: \xc2 | Single byte 'fixnum' (values -32..127) or typecode (one byte) + big-endian (u)int8/16/32/64 | Typecode (one byte) + IEEE single/double | Typecode + up to 15 bytes or typecode + length as uint8/16/32 + bytes; encoding is unspecified [15] | As 'fixarray' (single-byte prefix + up to 15 array items) or typecode (one byte) + 2–4 bytes length + array items | As 'fixmap' (single-byte prefix + up to 15 key-value pairs) or typecode (one byte) + 2–4 bytes length + key-value pairs |
Netstrings | 0:, | True: 1:1, False: 1:0, | | | | | |
OGDL Binary | | | | | | | |
Property list (binary format) | | | | | | | |
Protocol Buffers | | | Variable encoding length signed 32-bit: varint encoding of 'ZigZag'-encoded value (n << 1) XOR (n >> 31); variable encoding length signed 64-bit: varint encoding of 'ZigZag'-encoded value | Floats: little-endian binary32; doubles: little-endian binary64 | UTF-8 encoded, preceded by varint-encoded integer length of string in bytes | Repeated value with the same tag | N/A |
Smile | \x21 | True: \x23 False: \x22 | Single byte 'small' (values -16..15 encoded using \xc0 - \xdf), zigzag-encoded | IEEE single/double, BigDecimal | Length-prefixed 'short' Strings (up to 64 bytes), marker-terminated 'long' Strings and (optional) back-references | Arbitrary-length heterogeneous arrays with end-marker | Arbitrary-length key/value pairs with end-marker |
Structured Data eXchange Formats (SDXF) | | | Big-endian signed 24-bit or 32-bit integer | Big-endian IEEE double | Either UTF-8 or ISO 8859-1 encoded | List of elements with identical ID and size, preceded by array header with int16 length | Chunks can contain other chunks to arbitrary depth |
Thrift | | | | | | | |
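
The MessagePack row above describes integers as either a single-byte 'fixnum' or a typecode followed by a big-endian value. The sketch below shows that scheme for integers only, using the typecodes from the MessagePack format specification as I understand them; `pack_int` is a hypothetical helper, not the API of any MessagePack library.

```python
import struct

# Single-byte constants from the Null and Booleans columns of the table.
NIL, FALSE, TRUE = b"\xc0", b"\xc2", b"\xc3"

# (typecode, struct format, exclusive magnitude limit) for each width.
UNSIGNED = ((0xCC, ">B", 1 << 8), (0xCD, ">H", 1 << 16), (0xCE, ">I", 1 << 32), (0xCF, ">Q", 1 << 64))
SIGNED = ((0xD0, ">b", 1 << 7), (0xD1, ">h", 1 << 15), (0xD2, ">i", 1 << 31), (0xD3, ">q", 1 << 63))


def pack_int(n: int) -> bytes:
    """Encode an integer using the smallest representation that fits."""
    if 0 <= n <= 0x7F:
        return bytes([n])            # positive fixnum: the value is the byte
    if -32 <= n < 0:
        return struct.pack("b", n)   # negative fixnum: 0xe0..0xff
    if n > 0:
        for code, fmt, limit in UNSIGNED:
            if n < limit:
                return bytes([code]) + struct.pack(fmt, n)
    else:
        for code, fmt, limit in SIGNED:
            if n >= -limit:
                return bytes([code]) + struct.pack(fmt, n)
    raise OverflowError("integer does not fit in 64 bits")


print(pack_int(5).hex(), pack_int(-1).hex(), pack_int(685230).hex())
```

Unlike a varint, the width here is chosen once per value by the typecode, so a decoder reads one byte and then knows exactly how many more to read.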
Any XML-based representation can be compressed, or generated directly, using Efficient XML Interchange (EXI), a 'schema-informed' (as opposed to schema-required or schema-less) binary compression standard for XML.