Benchmarking Gob vs Protobuf

Roman Sheremeta
8 min read · Jul 11, 2023


This article is Part 2 of my experiment measuring Gob serialization/deserialization performance, this time comparing it to Protobuf.

Part 1 compared Gob vs JSON, XML & YAML.

(Image: an AI-generated icon of Gob)

Gob is a serialization format specific to Go, designed to work with Go’s native data types. For instance, in Go terms, JSON (and other formats too) is built around struct-like data, while Gob has no such restriction and supports all Go data types except functions and channels. That means you can encode simple values like int, string, etc. directly. Gob encodes data into a binary format and decodes it back into the corresponding data type. Another advantage worth mentioning is that Gob doesn’t encode empty (zero) values at all; on decoding, such a field simply gets the default value for its type. This approach helps improve performance.
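For example, here’s a minimal runnable sketch of encoding and decoding a plain string with Gob, no wrapping struct needed:

package main

import (
    "bytes"
    "encoding/gob"
    "fmt"
)

func main() {
    var buf bytes.Buffer

    // Gob happily encodes a plain string at the top level.
    if err := gob.NewEncoder(&buf).Encode("hello, gob"); err != nil {
        panic(err)
    }

    var decoded string
    if err := gob.NewDecoder(&buf).Decode(&decoded); err != nil {
        panic(err)
    }
    fmt.Println(decoded) // prints: hello, gob
}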

Another TL;DR

Here’s my granular analysis of all the outcomes I’ve got — Analysis.md

Kinda Protobuf intro

Protocol Buffers (protobuf) is a data serialization format developed by Google. It provides a concise, efficient, and extensible way to exchange structured data between different systems and programming languages. Protobuf defines a language- and platform-neutral schema using a simple language called the Protocol Buffer Language (proto). By defining the schema in a .proto file, developers can generate code in various languages to serialize, transmit, and deserialize data. Protobuf offers advantages such as smaller message size, faster encoding/decoding, and backward compatibility, making it a popular choice for communication in distributed systems and in data storage applications. Protobuf supports not only Golang but also lots of other programming languages, such as C++, Java, C#, Ruby, Python, PHP, Scala, Swift, Dart, etc.

Prerequisites

In this article, I’ll implement the benchmarking mechanism the same way as in Part 1 — i.e., creating structs of various sizes and complexity, filling them with dummy data, and running the bench tests against them. With that in mind, I’ll skip describing the Gob-related stuff here so as not to overwhelm y’all.

But using Protobuf-eligible structs requires us to do a set of steps beforehand:

  1. write .proto files with needed data structs
  2. install protoc tool for generating protobuf go files
  3. generate the actual files
  4. implement encoding & decoding funcs in a similar way as for Gob and the other formats in Part 1.

So, let’s start with writing the .proto files, which deserve a bit of background first.

The syntax of .proto files is simple and human-readable. It consists of various elements such as messages, fields, enums, and services. Messages represent the structured data being exchanged, and fields define the individual data elements within a message. Enums provide a set of named values, and services define remote procedure calls (RPC) that can be performed on the defined messages. Proto files follow a specific format: they begin with optional package and import statements to organize and import other .proto files. The core definition consists of message and enum declarations, along with their fields and values. Syntax options can be specified to control code generation behavior for different programming languages. You can read more here.

Proto files design

With the syntax and format in mind, the design of all the needed .proto files will eventually look like the ones below:

Tiny struct
Medium struct
Big struct
Huge struct
Complex And Huge map struct
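As a rough illustration of the shape (the message and field names here are hypothetical; see the repo for the real definitions), the tiny struct’s file might look like this:

syntax = "proto3";

package proto;

// Replace this with your own module path (see the note below).
option go_package = "github.com/RSheremeta/gob_proto_bench/proto_gen";

// TinyStruct mirrors the smallest benchmark struct from Part 1;
// the field names are illustrative only.
message TinyStruct {
  int64 id = 1;
  string name = 2;
}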

Important points!

Firstly, replace all the go_package values with your own repo module name — they’re set to mine, and the generated code won’t fit your project if you don’t.

Also, all of the files should be saved in a top-level project dir (I prefer to name it just proto) with the corresponding names and the .proto extension.
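Assuming file names that match the struct captions above (my guess — check the repo for the real ones), the layout would be:

proto/
  tiny.proto
  medium.proto
  big.proto
  huge.proto
  complex_map.proto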

Protoc installation & generating the Go files

Protoc is the protocol buffer compiler, used to compile .proto files that contain service and message definitions.

Depending on which OS you’re on, the installation steps differ, so we’ll need to consult the official docs:

  • if you’re using Linux, there is an option to install it via apt
  • if you’re on MacOS, you can use Homebrew
  • I’m unsure regarding Windows, but it seems you can do it by downloading pre-compiled binaries (e.g., via curl) and unzipping them.

Important! Make sure you’ve installed protoc properly by checking its version!
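For example, on Linux or macOS the whole setup might look like the following (package names as per the official docs; note that the protoc-gen-go plugin is also needed for the Go generation step):

# Ubuntu/Debian
sudo apt install -y protobuf-compiler

# macOS
brew install protobuf

# sanity check: should print the installed version, e.g. "libprotoc 23.4"
protoc --version

# the Go code generator plugin that protoc invokes for *.pb.go files
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest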

Now we’re all set to generate Protobuf Go files!

In terminal, navigate to the project dir and type the following cmd:

# Do not forget to replace the module name with your own!

protoc -Iproto --go_opt=module=github.com/RSheremeta/gob_proto_bench --go_out=. proto/*.proto

If everything went well, you should see files like tiny.pb.go, big.pb.go, etc. in your proto_gen dir.

Encoding & decoding funcs

We’re moving on to the next chapter: implementing the Protobuf encoding and decoding functions.

Note that unlike Gob (and JSON, XML, etc.), the Protobuf funcs for encoding/decoding require strictly typed parameters, so it’s not possible to make the input arg an interface{} — they require a proto.Message as the param:

Encode & decode functions
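A minimal sketch of what such helpers might look like (the function names are my assumption, not necessarily the repo’s exact ones):

package bench

import (
    "bytes"
    "encoding/gob"

    "google.golang.org/protobuf/proto"
)

// encodeGob serializes any Gob-compatible value into bytes.
func encodeGob(v interface{}) []byte {
    var buf bytes.Buffer
    // errors are deliberately ignored to keep the benchmarks simple
    _ = gob.NewEncoder(&buf).Encode(v)
    return buf.Bytes()
}

// decodeGob deserializes Gob bytes into the value pointed to by out.
func decodeGob(data []byte, out interface{}) {
    _ = gob.NewDecoder(bytes.NewReader(data)).Decode(out)
}

// encodeProto serializes a Protobuf message; note the strict proto.Message param.
func encodeProto(m proto.Message) []byte {
    data, _ := proto.Marshal(m)
    return data
}

// decodeProto deserializes Protobuf bytes into the given message.
func decodeProto(data []byte, m proto.Message) {
    _ = proto.Unmarshal(data, m)
}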

I decided to keep it as simple as possible, so the errors aren’t handled properly. That’s not a good approach in real code.

Do not forget to also implement the functions that fill the structs up with dummy data!
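As a hypothetical example of such a filler (the real ones, covering all the struct sizes, are in the repo):

// getTinyStruct returns a TinyStruct pre-filled with dummy data;
// the type and field names here are illustrative only.
func getTinyStruct() TinyStruct {
    return TinyStruct{
        ID:   42,
        Name: "dummy",
    }
}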

We’re all set to move on to the most delicious part (at least to me ;)) — implementing the actual benchmark tests!

Designing tests

As we’ll be using the built-in testing pkg (obviously), we need to implement two structs (one for Gob and one for Proto) to run the tests conveniently. Each should contain a name (printed for clarity), a target struct (one of those created above), and an encoding function.

After that, we’ll populate a slice of that testing struct with the data needed for benchmarking and iterate over it, invoking each item via the b.Run() function.

So eventually it’ll look like:

encoding benchmarks
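A rough sketch of the idea, reusing the helper funcs and fillers sketched above (case names and types are illustrative):

package bench

import (
    "testing"

    "google.golang.org/protobuf/proto"
)

// BenchmarkEncodeSingle sketches the encoding benchmarks. Proto cases
// need their own slice because encodeProto takes a proto.Message.
func BenchmarkEncodeSingle(b *testing.B) {
    gobCases := []struct {
        name   string
        target interface{}
        encode func(interface{}) []byte
    }{
        {name: "type=GOB struct_size=tiny", target: getTinyStruct(), encode: encodeGob},
    }
    protoCases := []struct {
        name   string
        target proto.Message
        encode func(proto.Message) []byte
    }{
        {name: "type=Proto struct_size=tiny", target: getTinyProto(), encode: encodeProto},
    }

    for _, c := range gobCases {
        b.Run(c.name, func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _ = c.encode(c.target)
            }
        })
    }
    for _, c := range protoCases {
        b.Run(c.name, func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _ = c.encode(c.target)
            }
        })
    }
}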

If the inner loop going through the b.N var is unclear to you: it’s a built-in variable that holds the number of iterations each benchmark runs. Normally the testing framework picks a suitable value by itself, growing it until the measurement is stable. We’ll use a separate flag to pin it to an exact count instead.

For the decoding benchmarks, we need the same encoding step first (it won’t be counted towards the time and performance figures), and of course we’ll also add a decoding func:

decoding benchmarks
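Again, a sketch under the same assumptions; note how the encode step happens before b.ResetTimer(), so it isn’t measured:

package bench

import "testing"

// BenchmarkDecodeSingle sketches the decoding benchmarks; TinyStruct and
// TinyProto are the same illustrative type names used above.
func BenchmarkDecodeSingle(b *testing.B) {
    cases := []struct {
        name    string
        getData func() []byte // encodes a pre-filled struct
        decode  func([]byte)  // decodes into a fresh instance
    }{
        {
            name:    "type=GOB struct_size=tiny",
            getData: func() []byte { return encodeGob(getTinyStruct()) },
            decode:  func(data []byte) { var out TinyStruct; decodeGob(data, &out) },
        },
        {
            name:    "type=Proto struct_size=tiny",
            getData: func() []byte { return encodeProto(getTinyProto()) },
            decode:  func(data []byte) { out := &TinyProto{}; decodeProto(data, out) },
        },
    }

    for _, c := range cases {
        b.Run(c.name, func(b *testing.B) {
            data := c.getData() // setup only, not part of the measurement
            b.ResetTimer()
            for i := 0; i < b.N; i++ {
                c.decode(data)
            }
        })
    }
}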

As you can see in the snippet above, we define a list of benchmark structs, each containing a name, a func that returns a struct instance filled with data, and the encoding and decoding functions for the corresponding format.

I also want to point out the b.ResetTimer() lines you may notice in the snippet — they reset the timer so that only the single operation we care about (decoding or encoding) is measured, not everything else in the function.

The other tests can be found here.

Running the benchmarks

Now we’re all set to run everything, obtain the results, and analyze them!

Run the benchmarks from inside the project root dir by invoking the following terminal cmd:

go test -bench=. -benchmem -benchtime=10x > result.csv

Wait for a while (it may take a few minutes, as the struct sets are indeed huge) and have a look at the generated csv file with the results.

FYI, I made separate Make commands for more comfortable running — here.

It should look something like the outputs below, but more comprehensive.

Encoding a single struct:

goos: darwin
goarch: arm64
pkg: github.com/RSheremeta/gob-proto-bench/test
BenchmarkEncodeSingleComplexMap/type=GOB_struct_size=huge_complex_map-10 10 51716921 ns/op 84000585 B/op 731909 allocs/op
BenchmarkEncodeSingleComplexMap/type=Proto_struct_size=huge_complex_map-10 10 40617621 ns/op 43575244 B/op 466002 allocs/op
BenchmarkEncodeSingle/type=GOB_struct_size=tiny-10 10 2762 ns/op 1137 B/op 20 allocs/op
BenchmarkEncodeSingle/type=GOB_struct_size=medium-10 10 4925 ns/op 1720 B/op 38 allocs/op
BenchmarkEncodeSingle/type=GOB_struct_size=big-10 10 7025 ns/op 2792 B/op 61 allocs/op
BenchmarkEncodeSingle/type=GOB_struct_size=huge-10 10 687012 ns/op 1184441 B/op 5407 allocs/op
BenchmarkEncodeSingle/type=Proto_struct_size=tiny-10 10 587.5 ns/op 16 B/op 1 allocs/op
BenchmarkEncodeSingle/type=Proto_struct_size=medium-10 10 904.2 ns/op 48 B/op 1 allocs/op
BenchmarkEncodeSingle/type=Proto_struct_size=big-10 10 875.0 ns/op 128 B/op 1 allocs/op
BenchmarkEncodeSingle/type=Proto_struct_size=huge-10 10 456542 ns/op 172032 B/op 1 allocs/op
PASS
ok github.com/RSheremeta/gob-proto-bench/test 1.224s

Decoding a slice of structs:

goos: darwin
goarch: arm64
pkg: github.com/RSheremeta/gob-proto-bench/test
BenchmarkDecodeSliceComplexMap/type=GOB_struct_size=huge_complex_map_slice-10 10 578728700 ns/op 633277214 B/op 10645890 allocs/op
BenchmarkDecodeSliceComplexMap/type=Proto_struct_size=huge_complex_map_slice-10 10 2147280442 ns/op 2337694045 B/op 46637416 allocs/op
BenchmarkDecodeSlice/type=GOB_struct_size=tiny-10 10 13550 ns/op 8035 B/op 219 allocs/op
BenchmarkDecodeSlice/type=GOB_struct_size=medium-10 10 18817 ns/op 11537 B/op 322 allocs/op
BenchmarkDecodeSlice/type=GOB_struct_size=big-10 10 65404 ns/op 40107 B/op 1133 allocs/op
BenchmarkDecodeSlice/type=GOB_struct_size=huge-10 10 55004492 ns/op 36011551 B/op 1064870 allocs/op
BenchmarkDecodeSlice/type=Proto_struct_size=tiny-10 10 783.3 ns/op 544 B/op 15 allocs/op
BenchmarkDecodeSlice/type=Proto_struct_size=medium-10 10 3504 ns/op 2872 B/op 66 allocs/op
BenchmarkDecodeSlice/type=Proto_struct_size=big-10 10 31708 ns/op 35080 B/op 708 allocs/op
BenchmarkDecodeSlice/type=Proto_struct_size=huge-10 10 42068254 ns/op 46749560 B/op 932608 allocs/op
PASS
ok github.com/RSheremeta/gob-proto-bench/test 36.820s

Decoding a slice of maps:

goos: darwin
goarch: arm64
pkg: github.com/RSheremeta/gob-proto-bench/test
BenchmarkDecodeSliceComplexMap/type=GOB_struct_size=huge_complex_map_slice-10 10 573950896 ns/op 633277254 B/op 10645891 allocs/op
BenchmarkDecodeSliceComplexMap/type=Proto_struct_size=huge_complex_map_slice-10 10 2153558850 ns/op 2337695762 B/op 46637425 allocs/op
PASS
ok github.com/RSheremeta/gob-proto-bench/test 35.524s

As you can see, each benchmark line contains the value “10” — that’s the number of times it was run. It’s the b.N value mentioned earlier, which we set via -benchtime=10x in the terminal command above.

All results can be found on my Github.

Analyzing the results

Based on the outcomes of multiple runs, the analysis turned out to be pretty comprehensive, so I put it in a separate doc called Analysis.md in my repo and will only summarize the results briefly here.

So, my brief analysis is the following:

  1. It’s not worth using Gob for small- or medium-sized data (either single or sliced); Protobuf looks way better here.
  2. Although small- or medium-sized data is slower to work with using Gob, Gob can beat Protobuf only in memory per operation (and not even always), while Protobuf looks better in terms of memory allocs and speed (in nanos).
  3. Protobuf is better at encoding/decoding a single custom complex map.
  4. Gob is better at both encoding and decoding really huge and comprehensive data structures, e.g., a slice of custom complex maps.

Instead of a summary

Data formats are a critical aspect of programming, as they determine how efficiently data is stored and processed. In this article, we benchmarked Gob against Protobuf to see which one performs better in terms of speed and memory usage. However, the optimal choice of data format ultimately depends on the specific needs of your project and the trade-offs you’re willing to make.

Thank you note

Thank you for reading the article. I do hope you’ll find my experiment useful for yourself!

Feedback and subscriptions are appreciated!

Also, I’d invite you to check out my other articles.
