C++ Didn’t Get Slower — Go Got Better

By Clint Edwards · Published April 24, 2026 · 4 min read · Source: Level Up Coding
What a side‑by‑side gRPC benchmark reveals about modern concurrency, scheduling, and the real cost of performance

C++ didn’t lose its edge. It didn’t suddenly become slow, unsafe, or obsolete.

What changed is that languages like Go — and increasingly Rust — got good enough that raw performance differences are often small, while the cost of building, operating, and maintaining equivalent systems in C++ remains high.

That shift matters more than most benchmarks suggest.

We benchmarked the same production gRPC proxy written twice: once in Go, once in C++. The results don’t show a clear performance winner — they show why runtime design, tail latency, and engineering economics now drive the decision more than peak throughput ever did.

The experiment

We recently open sourced Arke — a production gRPC proxy for message brokers written in Go.

The predictable response was:

“Sure, but how much faster would this be in C++?”

So we answered it directly.

We ported Arke to C++, kept the .proto files identical, and ran both implementations through the same benchmark harness. No synthetic loops, no micro‑benchmarks — just real gRPC clients exercising real concurrency and I/O.

What Arke does (briefly)

Arke sits between your application and a message broker (RabbitMQ, etc.) and exposes a uniform gRPC API regardless of backend.

It provides three services:

Producer — Connect, PublishOne, Publish (bidi), Disconnect
Consumer — Connect, Consume (bidi), SourceStats, Disconnect
Healthz — Check (bidi heartbeat)
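The service surface above can be sketched as plain Go interfaces. This is a simplified illustration only: the method names come from the article, but the signatures are hypothetical placeholders, not the real generated gRPC stubs (which carry contexts, streams, and protobuf message types).

```go
package main

import "fmt"

// Hypothetical interfaces mirroring Arke's three services. Signatures
// are simplified placeholders for illustration, not the generated API.

type Producer interface {
	Connect(broker string) error
	PublishOne(topic string, msg []byte) error
	Disconnect() error
}

type Consumer interface {
	Connect(broker string) error
	SourceStats() (delivered, pending int)
	Disconnect() error
}

type Healthz interface {
	Check() string // a bidi heartbeat stream in the real API; a single probe here
}

// noopHealthz is a stand-in implementation for the sketch.
type noopHealthz struct{}

func (noopHealthz) Check() string { return "SERVING" }

func main() {
	var h Healthz = noopHealthz{}
	fmt.Println(h.Check())
}
```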

The differences are strictly runtime and ecosystem: both implementations use the same generated gRPC interfaces.

Benchmark setup (short version)

The goal wasn’t micro‑performance. It was understanding behavior under load.

Scenario 1 — Connection‑heavy paths

Producer.Connect is called on existing connections and negotiates AMQP sessions with the broker.

What happened

At 10 workers (p99.9 latency):

With raw throughput so similar, the focus is on tail behavior. Goroutines park cooperatively. std::threads block in the kernel. Under burst load, that difference matters.

Scenario 2 — Steady‑state publishing

Producer.PublishOne represents the typical case: established connections publishing messages as fast as possible.

What happened

Key point:
Once the broker becomes the bottleneck, language choice largely disappears from the equation.

The saturation point is a product of the runtime environment.

Scenario 3 — Pure gRPC overhead

Healthz.Check removes the broker entirely, isolating the runtime.

What happened

At 10 workers (p99.9 latency):

This gap exists with no I/O, minimal allocation pressure, and no GC involvement. It comes directly from scheduling and blocking behavior.

How big is the performance difference?

Smaller than it appears in isolation.

Across the benchmarks, throughput numbers were often close, peak rates frequently aligned, and measured differences commonly fell within single‑digit percentages. In many cases, both implementations reached similar limits despite very different runtime designs.

Which leads to the more consequential point.

The cost that does differ significantly

Performance gaps have narrowed, but C++ development and maintenance costs have not. Extracting marginal gains typically requires more complex concurrency, stricter lifetime management, and ongoing vigilance against subtle correctness issues. These costs don’t appear in benchmarks, but they compound over time — and when performance gains are small, they dominate the trade‑off.

What this actually shows

This benchmark isn’t saying “Go is faster than C++.”

It shows something more important:

Modern systems fail less often because of raw speed, and more often because of scheduling pathologies, queue buildup, and unpredictable tails.

That’s where Go — and increasingly Rust — change the equation.

Should you rewrite a C++ system?

Maybe — if the gains justify the cost.

In most systems, the performance difference is small. Throughput converges, saturation points align, and wins are often negligible.

The development and maintenance cost is not.

If C++’s added complexity isn’t buying you meaningful gains in latency, reliability, or capacity, then Go or Rust can deliver near‑equivalent performance at a much lower long‑term cost.

The takeaway

The modern trade‑off isn’t speed versus safety.

It’s marginal performance gains versus ongoing engineering cost.

C++ still defines the floor — but Go and Rust are reshaping the ceiling.

