C++ Kourier vs Rust Hyper vs Go http: Memory Consumption Benchmark

4.7x less memory than Rust/Hyper, 7.7x less memory than Go/http, and a fully-compliant parser

Glauco Pacheco · 963 words · 5 minute read

Hi, I’m the developer of Kourier, the fastest server on Earth, and in this post, I put all servers under high load and benchmark their memory consumption.

In another post, I compare the performance of Kourier, Rust/Hyper, and Go/http and show how far ahead of the alternatives Kourier is. In this post, I run the same benchmarks with different command-line options to push the number of connections to 500K.

As the results below indicate, Kourier is also unbeatable regarding memory consumption.

Kourier is open source, and all assets used in the benchmarks are container-based and publicly available in the Kourier repository, so you can reproduce the results on your local machine quickly and easily.

Kourier delivers its never-before-seen performance without cutting corners: many frameworks sacrifice HTTP conformance to be faster, but in another post I show that Kourier is much more HTTP syntax-compliant than Rust/Hyper and Go/http.

All Docker images used in this benchmark can be easily built using the build script contained in Kourier’s repository.

I start the servers with the following commands:

# Kourier. The server listens on port 3275
# and uses six threads to process incoming requests.
docker run --rm -d --network host kourier-bench:kourier -a 127.0.0.1 -p 3275 --worker-count=6 --request-timeout=20 --idle-timeout=60
# Rust (Hyper). The server listens on port 8080
# and uses six threads to process incoming requests.
docker run --rm -d --network host kourier-bench:rust-hyper
# Go (net/http). The server listens on port 7080
# and uses six threads to process incoming requests.
docker run --rm -d --network host kourier-bench:go-net-http

I use wrk to load the servers. As I do not want to benchmark the network, I run the benchmarks over localhost and split the available cores evenly between the server and wrk. The following command makes wrk open 500K connections over six threads for 60 seconds, with client sockets bound to multiple IPs so the clients do not exhaust the available ephemeral ports. (The -b option is only available in specific wrk commits; see the wrk.dockerfile file in the Kourier repository, under the Src/Tests/Resources/Benchmarks/wrk folder, for how I build wrk.)

# PORT is 3275 for Kourier, 8080 for Rust (Hyper), or 7080 for Go (net/http).
docker run --rm --network host -it kourier-bench:wrk -c 500000 -d 60 -t 6 -b 127.0.0.0/8 --timeout 60s --latency http://localhost:PORT/hello
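A quick back-of-the-envelope calculation shows why binding to multiple source IPs is necessary. Each (source IP, destination) pair can only supply one ephemeral port per connection; the range below is the default Linux ephemeral port range, which is my assumption here, not something stated in the benchmark setup:

```go
package main

import "fmt"

func main() {
	const connections = 500000
	// Default Linux ephemeral port range (assumption): 32768-60999.
	const portsPerIP = 60999 - 32768 + 1 // 28232 ports per source IP
	// Round up: number of distinct source IPs needed to open all connections.
	ips := (connections + portsPerIP - 1) / portsPerIP
	fmt.Printf("source IPs needed: %d\n", ips) // prints "source IPs needed: 18"
}
```

A single loopback address could never carry 500K concurrent client connections, which is why wrk binds across the whole 127.0.0.0/8 block.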

I set timers to exercise as much framework code as possible in the benchmark. On Kourier, I set request and idle timers; on Rust (Hyper), I set timers only for requests; on Go (net/http), I set request, idle, and write timers, as you can see in Kourier's repository (all benchmark code is available in the Src/Tests/Resources/Benchmarks folder).

Results

# Testing Kourier server at 127.0.0.1:3275
glauco@ldh:~$ sudo docker run --rm -d --network host kourier-bench:kourier -a 127.0.0.1 -p 3275 --worker-count=6 --request-timeout=20 --idle-timeout=60
d1606417573b4bbdb7645435b199eee4a6c4e6dd6ddf0b428bdee40fa3f7bb54
glauco@ldh:~$ sudo docker run --rm --network host -it kourier-bench:wrk -c 500000 -d 60 -t 6 -b 127.0.0.0/8 --timeout 60s --latency http://localhost:3275/hello
Running 1m test @ http://localhost:3275/hello
  6 threads and 500000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   909.75ms  300.77ms  27.83s    66.08%
    Req/Sec    42.39k    15.88k  111.69k    71.53%
  Latency Distribution
     50%  903.25ms
     75%    1.12s
     90%    1.31s
     99%    1.58s
  13898103 requests in 1.01m, 1.36GB read
Requests/sec: 229600.45
Transfer/sec:     22.99MB
glauco@ldh:~$ sudo docker inspect -f '{{.State.Pid}}' d16064
9284
glauco@ldh:~$ grep VmPeak /proc/9284/status
VmPeak:  1261344 kB
# Testing Rust (Hyper) server at 127.0.0.1:8080
glauco@ldh:~$ sudo docker run --rm -d --network host kourier-bench:rust-hyper
ee93817a0bbde5a4f1815731bc7749bc0a464d1c845d61ba4f55896745d9fffd
glauco@ldh:~$ sudo docker run --rm --network host -it kourier-bench:wrk -c 500000 -d 60 -t 6 -b 127.0.0.0/8 --timeout 60s --latency http://localhost:8080/hello
Running 1m test @ http://localhost:8080/hello
  6 threads and 500000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.14s   157.73ms   7.90s    96.38%
    Req/Sec    37.80k     4.87k   47.44k    87.65%
  Latency Distribution
     50%    1.13s
     75%    1.14s
     90%    1.17s
     99%    1.54s
  12695817 requests in 1.01m, 1.22GB read
  Socket errors: connect 0, read 1, write 0, timeout 0
Requests/sec: 209752.03
Transfer/sec:     20.60MB
glauco@ldh:~$ sudo docker inspect -f '{{.State.Pid}}' ee93817
18104
glauco@ldh:~$ grep VmPeak /proc/18104/status
VmPeak:  5974808 kB
# Testing Go (net/http) server at 127.0.0.1:7080
glauco@ldh:~$ sudo docker run --rm -d --network host kourier-bench:go-net-http
01dc6bd5097bb8e21ea74488d4e7fc6064073f3ea966f8609a64661e5a27e1a5
glauco@ldh:~$ sudo docker run --rm --network host -it kourier-bench:wrk -c 500000 -d 60 -t 6 -b 127.0.0.0/8 --timeout 60s --latency http://localhost:7080/hello
Running 1m test @ http://localhost:7080/hello
  6 threads and 500000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.69s   361.08ms   3.38s    79.32%
    Req/Sec    25.55k     8.42k   48.62k    73.07%
  Latency Distribution
     50%    1.57s
     75%    2.00s
     90%    2.07s
     99%    3.01s
  8575493 requests in 1.01m, 1.01GB read
Requests/sec: 141699.75
Transfer/sec:     17.03MB
glauco@ldh:~$ sudo docker inspect -f '{{.State.Pid}}' 01dc6bd
18492
glauco@ldh:~$ grep VmPeak /proc/18492/status
VmPeak:  9801832 kB

Conclusion

Kourier is the next level of network-based communication. It is in another league regarding performance, compliance, and memory consumption, and it leaves everything else in the dust, including enterprise network appliances.

Creating the fastest server ever requires much more than a stellar HTTP parser: from ring buffers to socket programming to a custom timer implementation, I implemented it all and open-sourced it alongside Kourier.

I developed Kourier with strict and demanding requirements, where all possible behaviours are comprehensively verified in specifications written in the Gherkin style. To this end, I created Spectator, a test framework that I also open-sourced with Kourier. You can check all files ending in spec.cpp in the Kourier repository to see how meticulously tested Kourier is. There is a stark difference in testing rigor between Kourier and other frameworks.

Kourier can empower the next generation of network appliances/solutions, making businesses that rely on them run at a fraction of their infrastructure cost and in a much more HTTP-compliant way.

You can contact me if your business is not compatible with the requirements of the AGPL and you want to license Kourier under alternative terms.