Why milliseconds matter

Developers are always looking for the best and most efficient way to make things happen. But making things more efficient is sometimes just a matter of milliseconds. At Yonego, we recently started on such an improvement: switching our web services from REST to gRPC.

Web services

A web service is a service that can be called by an application. These web services are separate programs that are independent from other applications and can be run on different machines.

As a real-life example, you can compare a web service to a coffee machine. This machine only has one job: making coffee. A user can select a coffee, and without worrying about how it’s done, a cup of coffee is made. A web service basically does the same. We can have a web service that sends emails. As a client you can just say: “I want to send an email with this content…”. The web service then handles how the email is actually sent, so the client doesn’t have to worry about that anymore.

<< For programmers >>
A web service exposes an API (Application Programming Interface) to communicate over a network, usually through HTTP (Hypertext Transfer Protocol). HTTP is an application protocol often used for communication between a web client and a web server. A web service exposes a port that gives other programs access to its functionality.
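As a rough illustration of a web service exposing a port over HTTP, here is a minimal sketch using only Python's standard library. The handler name, the `/hello` path and the JSON message are made up for this example and are not part of the services discussed in this article.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a small JSON document."""

    def do_GET(self):
        body = json.dumps({"message": "Hello from the web service"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call_service_once():
    """Start the service on a free port, call it once, and shut it down."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The client only needs the address and port of the service.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/hello")
        data = json.loads(conn.getresponse().read())
        conn.close()
        return data
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_service_once())
```

The client does not know how the response is produced; it only knows where to send the request, just like the coffee machine.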

Other kinds of APIs can be much broader than that. Depending on the API, they can communicate by any means they wish. Internal, external, over a network, over Bluetooth: it does not matter, as long as the API is reachable in some way.

Service API

As mentioned earlier, these web services are able to run on different machines. This means that there has to be some sort of communication between the applications and the web service, also known as a ‘Service API’.

A Service API is a technique that handles the communication between two machines, such as REST (REpresentational State Transfer) and gRPC (a Remote Procedure Call framework originally developed by Google). Of course, there are dozens of service APIs, each with its own advantages. But because Yonego recently started to migrate its web services from REST to gRPC, I got curious. How much faster is gRPC than REST?

<< For programmers >>
REST is a popular resource-based technique that uses the HTTP methods GET, POST, PUT and DELETE. Resources are data entities on which you perform these actions. REST communication often involves sending JSON and can run over HTTP/1.1 or HTTP/2.

gRPC is an open-source RPC framework created and used by Google. It is built on HTTP/2, which makes bi-directional communication possible. By default, gRPC serializes structured data into a binary format using protocol buffers. gRPC servers support cross-language unary calls and streaming calls.


To compare these techniques, I created a benchmark application that runs two tests:

Multiple Clients

This test simulates several users that try to use the web service at the same time (concurrently), or more technically: it simulates several clients that make a call to the server at the same time. A client is a user of a service. This can be a website requesting resources from a web service, a mobile application that initializes a UI (user interface) and requires data to show, or even a web service using another web service. Just like calling someone on the phone, a client can call a service and ask it to do a certain thing.

The winner of this test might be the better choice when making a web service that expects several client-calls at the same time.

Parallel Calls

This test simulates one client that asks a web service to do several tasks at once, or more technically: one client making n calls to the service with n threads.

These results are interesting when you build a web service that expects a single client to ask the server to do several things at the same time.

<< For programmers >>
gRPC and REST each run these tests 50 times in their own Docker container based on an Alpine image. This is done to minimize external dependencies. Docker is a platform for running containers (it is not a virtual machine). These containers are isolated environments that contain only the software your program needs to run. Alpine is a lightweight, security-focused Linux distribution.

The result of each test is the total time the test takes to finish. Only the test itself is timed: the time needed to create the server and the clients is not included in the total duration.
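To make the measurement approach concrete, here is a minimal sketch of a timing harness in Python. It is not the benchmark application used for this article: `fake_call` is a hypothetical stand-in for a real gRPC or REST call, and the function names are invented for the example. Note that in this sketch the worker threads are started lazily, so their start-up cost falls inside the timed region.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fake_call(payload):
    """Hypothetical stand-in for a real gRPC or REST call."""
    return "Hello " + payload


def time_parallel_calls(call, payload, n_calls):
    """Run n_calls calls on up to n_calls threads; return the duration in ms."""
    with ThreadPoolExecutor(max_workers=n_calls) as pool:
        # The clock starts after the pool object exists and stops
        # before the pool is shut down.
        start = time.perf_counter()
        results = list(pool.map(call, [payload] * n_calls))
        elapsed_ms = (time.perf_counter() - start) * 1000
    assert len(results) == n_calls
    return elapsed_ms


def run_benchmark(call, payload, n_calls, iterations=50):
    """Repeat the test and return every duration plus the average."""
    durations = [time_parallel_calls(call, payload, n_calls) for _ in range(iterations)]
    return durations, mean(durations)


if __name__ == "__main__":
    durations, average = run_benchmark(fake_call, "test", n_calls=10)
    print(f"{len(durations)} iterations, average {average:.3f} ms")
```

Swapping `fake_call` for a function that performs a real network call would give one data point per iteration, plus the average over all iterations.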

Test calls

Because there are a lot of different web services doing different things, each testing method makes different calls to the web service. Even though these calls are still fairly simple, they do differ in size, which can result in different test results.

Normal load

A client sends the word “test” to the web service, and the web service returns the sentence “Hello test”. Note that during this call, the server adds the word “Hello” to the response. Adding “Hello” takes some time, which sometimes results in a longer call duration than in the Heavy load test, because the server has to do some work before it can send a response back.

Heavy load

A client sends a small sentence, a big sentence, a small number, and a big number to the web service, and the web service returns these values. During this call, the server only returns the values. The server does not need to do any work before sending a response back.
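Stripped of all networking, the server-side work of the two calls could look like the sketch below. The function names and parameter names are hypothetical and only illustrate the difference between the two loads.

```python
def normal_load(name):
    """Normal load: the server does a little work by prefixing "Hello"."""
    return "Hello " + name


def heavy_load(small_text, big_text, small_number, big_number):
    """Heavy load: the server echoes the four values back unchanged."""
    return small_text, big_text, small_number, big_number


if __name__ == "__main__":
    print(normal_load("test"))
    print(heavy_load("short", "long " * 200, 7, 2**40)[2:])
```

The heavy load is "heavy" because of the amount of data that travels over the wire, not because of the work the server does.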

Test Results

The tests resulted in different graphs. The first graph shows the results of 50 tests (iterations). The lines represent the duration (in milliseconds) of the call. The big blue line shows the results of gRPC, and the big red line shows the results of REST. The small blue line shows the average gRPC results, and the small green line shows the average REST results.

The second graph shows the average call duration of all 50 iterations.

Note that these graphs have a lot of spikes. This is because we are talking about milliseconds. A millisecond is such a small unit that it’s very easy to take one millisecond longer than another test.

Multiple clients

Normal load

As you can see, REST is faster in this test. gRPC has more overhead, which affects the duration of a call. Overhead is all the extra processing or resources required to perform a specific task (overhead, 2017 https://en.wikipedia.org/wiki/Overhead_(computing)). For example, gRPC has a security element that takes up a little time. Because the payload (the data transmitted) is very small, fixed costs like this make up a larger share of the overall call duration.

Another reason might be that these tests are performed on the same computer, on the same network, on the same port. This is similar to people trying to get off a train at the same time through the same door. People have to get off one by one.

Internally, REST might handle this better than gRPC. How this is handled internally could be a research topic of its own, so in this test it is only speculation. In a more realistic situation, these clients are all on different networks with their own port and might not have this issue.

Heavy load

This test shows that gRPC is faster. This is because gRPC often uses protocol buffers, a mechanism for serializing structured data. Serialization is the process of translating data structures into a format that can be stored, transmitted or reconstructed later (Serialization, 2017, https://en.wikipedia.org/wiki/Serialization). REST often sends data using XML or JSON. These are human-readable serialization formats.

According to the official website of protocol buffers, protocol buffer serialization is simpler, faster and smaller than XML (Why not use XML?, 2017 https://developers.google.com/protocol-buffers/docs/overview). This means that gRPC’s payload is smaller and faster to process than REST’s payload.

With this load, the overhead doesn’t outweigh the extra speed gRPC gains from using protocol buffers.
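To get a feeling for why a binary format can be smaller than a text format, here is a small sketch that encodes the same record as JSON and as packed bytes. The binary layout is hand-rolled with Python's `struct` module and is not the protocol buffers wire format; the record and its field names are invented for the example.

```python
import json
import struct

# Hypothetical record used only for this comparison.
record = {"id": 1234567, "score": 98.5, "active": True}

# Text encoding: field names and values are written out as characters.
as_json = json.dumps(record).encode("utf-8")

# Hand-rolled binary encoding (NOT the protocol buffers wire format):
# a 4-byte unsigned int, an 8-byte float and a 1-byte bool, little-endian.
as_binary = struct.pack("<Id?", record["id"], record["score"], record["active"])


def decode_binary(data):
    """Reverse of the struct.pack call above."""
    record_id, score, active = struct.unpack("<Id?", data)
    return {"id": record_id, "score": score, "active": active}


if __name__ == "__main__":
    print("JSON bytes:  ", len(as_json))
    print("binary bytes:", len(as_binary))
    print(decode_binary(as_binary) == record)
```

The binary version can leave out the field names because both sides agree on the layout in advance, which is also the basic idea behind a protocol buffers schema.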

Parallel Calls

Normal load

While REST was faster in the other test with the same load, gRPC is faster in this test.

This is because gRPC is made to create long-lasting client connections. This test uses only one client that makes a maximum of 100 calls. All these calls are made while the client connection is still open, making it easier for gRPC to communicate with the server.

REST, on the other hand, has to create that connection on every call it makes to the server, so each call takes longer to finish.
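The difference between keeping one connection open and opening a new one for every call can be made visible with a small sketch. It uses plain HTTP/1.1 from Python's standard library rather than gRPC or the benchmark from this article, and the handler and function names are invented for the example. The server simply counts how many TCP connections it accepts.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows keep-alive connections
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection.
        with CountingHandler.lock:
            CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b"Hello test"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def count_connections(n_calls, reuse):
    """Make n_calls requests and return how many connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        for _ in range(n_calls):
            if not reuse:
                # Throw the connection away and start from scratch.
                conn.close()
                conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()
        conn.close()
        return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("reused connection:", count_connections(5, reuse=True))
    print("new connection per call:", count_connections(5, reuse=False))
```

Every extra connection means an extra TCP handshake before any data is sent, which is the cost a long-lasting connection avoids.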

Heavy load

In this test, we can see a big difference between REST and gRPC. Not only does gRPC have the advantage of a smaller payload from serialized data, it also has the advantage of a long-lasting client connection.

Remarkably, in this test the average duration of a gRPC server call decreases with every iteration, whereas in the other test the average call duration increases with every iteration.

The power of a millisecond

The tests show that gRPC is often just a few milliseconds faster than REST. One might think: “Why does this matter? A millisecond is such a small unit, I won’t even notice…”

Indeed, a millisecond is a very small unit. A user won’t directly notice it when their call lasts 1 ms longer. But this 1 ms can be a problem for a company.

A server cannot process all requests at the same time, because it has limited resources. A server can, for example, only process 10 requests at the same time. This is comparable to a company that can only answer 10 phone calls at the same time, because it has a limited number of phones or employees. If a server gets 1000 requests and can only handle 10 at the same time, the last request has to wait a while before it gets handled. That last request could be you, waiting for the web page to finally load.

To put things in perspective, let’s say I made a benchmark test that made 10,000 calls to a server that can only handle 10 requests at the same time. We assume that every gRPC call takes 1 ms, while a REST call takes 2 ms. The server then works through 1,000 batches of 10 calls, so gRPC will finish in 1 second and REST in 2 seconds, which is a difference of 1 second. And what if my web service made a call to another web service with the same limited resources before responding? That could add another second. This really starts to add up when larger data is sent to web services, making the difference in duration even bigger.
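The arithmetic behind this example can be written out as a short sketch. It assumes an idealized server that processes calls in full batches with no other delays; the function name is made up for the example.

```python
import math


def total_duration_ms(total_calls, concurrent_slots, call_duration_ms):
    """Time to drain all calls when only `concurrent_slots` run at once."""
    batches = math.ceil(total_calls / concurrent_slots)
    return batches * call_duration_ms


# 10,000 calls, 10 at a time: 1,000 batches.
grpc_ms = total_duration_ms(10_000, 10, 1)  # 1 ms per call
rest_ms = total_duration_ms(10_000, 10, 2)  # 2 ms per call

if __name__ == "__main__":
    print(f"gRPC: {grpc_ms} ms, REST: {rest_ms} ms, difference: {rest_ms - grpc_ms} ms")
```

A difference of a single millisecond per call turns into a full second once the calls have to wait in line.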

A web page could use a lot of different web services that get called all the time. All these web services can have different resources available. When a web page gets more visitors, it isn’t a matter of milliseconds anymore. The load time of a web page, for example, can increase, making the user experience worse. Some companies fix this issue by buying another server. But it can also be fixed by looking into how data is sent from web service to web service.


<< For programmers >>
As the results show, gRPC is faster than REST in most tests. The only test that REST won was the test where the payload was small and several clients made a server call at the same time.

According to these tests, when creating a web service that expects several client-calls at the same time, and uses a small payload as input and output, REST might be the better choice.

In all other cases, gRPC was faster. We can conclude that gRPC is our overall winner, but speed aside, gRPC and REST both have their own advantages:


gRPC can use protocol buffers for data serialization. This makes payloads faster, smaller and simpler.

Just like REST, gRPC can be used cross-language, which means that if you have written a web service in Golang, an application written in Java can still use that web service. This makes gRPC web services very scalable.

gRPC uses HTTP/2 to support highly performant and scalable APIs and makes use of binary data rather than just text, which makes the communication more compact and more efficient. gRPC makes better use of HTTP/2 than REST. For example, gRPC makes it possible to turn off message compression. This might be useful if you want to send an image that is already compressed. Compressing it again just takes up more time.

It is also type-safe. This basically means that you can’t give an apple while a banana is expected. When the server expects an integer, gRPC won’t allow you to send a string because these are two different types.


As said earlier, REST can be used cross-language which makes these web services flexible and scalable.

REST is also widely used. A lot of people have experience with it and a lot of other web services (and clients) use REST. Having a REST web service makes it easier for other people to interact with your web service.

Communication often happens using JSON, which is human-readable. This makes it easier for developers to determine whether the client input is sent correctly to the server, and back.

But one of the main advantages of REST is that you do not need to set up a client. You just make a call to a server address (for example: www.test.nl/thisworks). This even works if you just copy a REST server address (of a GET method) into your web browser. Other techniques, like gRPC, often require you to set up a client.


With more than a million websites, and even more programs, a lot of techniques are used that barely anyone knows. This article covers only two communication techniques that are used in specific situations, but choosing between them is the kind of problem that back-end developers handle every day.

In these tests, gRPC proved to be faster than REST. At Yonego, we chose to migrate our REST web services to gRPC web services because of all the advantages gRPC has to offer.

But apart from which technique is used, the most remarkable thing about all this is the speed at which it happens. During these tests, whole stories were sent in less than 25 milliseconds from the client machine to the server and back. Thousands of lines of code were read and processed just to send the data from server to client, and all this in such a small amount of time.

Sending stories in less than 25 milliseconds is fast. But programmers are still trying to decrease this duration, because they know that on a larger scale milliseconds do matter.

Hopefully, after reading this article, you know a little bit more about what those coffee-consuming programmers do in their everyday job to make sure your user experience is as good as possible.



As an intern at Yonego I am migrating Yonego’s REST web services to gRPC web services, starting with a simple ‘location-service’. This service returns a country code (ISO) from an IP address. This country code can then be used to, for example, translate a web page according to your current location.

After reading on a lot of sites that one of the advantages of gRPC is that it’s pretty fast, I got curious and started doing my own research by creating an application that tests the speed of both gRPC and REST.

While this article is about my research, I also wanted to make people a little more aware of what back-end developers are doing to make sure your application keeps working as well as possible. And don’t worry, I tried to keep this article readable for non-programmers.
~ Willem Toemen
