AmtocSoft Tech Insights: Why Companies Are Rewriting Critical Systems in Rust in 2026

Friday, April 10, 2026

Why Companies Are Rewriting Critical Systems in Rust in 2026

Hero: Rust crab mascot surrounded by benchmark charts showing memory safety metrics and performance graphs from real-world rewrites at Cloudflare, AWS, and Discord


Something unusual is happening in production engineering. Companies that spent years building critical infrastructure in C, C++, and even Go are rewriting those systems in Rust — not because Rust is trendy, but because the cost of not rewriting is becoming too high.

The Linux kernel merged Rust support in version 6.1. Windows kernel modules are being developed in Rust. Android has shipped Rust in the platform since Android 12. AWS built Firecracker, the virtual machine monitor powering Lambda and Fargate, in Rust. Cloudflare replaced their nginx-based proxy with a Rust service called Pingora. Meta is moving away from C++ for systems work. Google is using Rust in Chromium. Discord dropped Go for Rust in one of their most latency-sensitive services.

This isn't a coincidence. It isn't hype. There is a specific, measurable set of problems that Rust solves better than anything else available today, and the industry has reached the point where the learning curve is an acceptable cost for the guarantees Rust provides.

This post explains exactly what those problems are, what Rust actually solves, and which organizations have made the switch — with the results they reported.

The Memory Safety Crisis

In 2019, Microsoft's Security Response Center published an analysis of the CVEs they had fixed over the previous twelve years. The finding was stark: approximately 70% of the security vulnerabilities in Microsoft products were memory safety bugs. Buffer overflows, use-after-free errors, heap corruption, out-of-bounds reads and writes.

Microsoft isn't unique. Google reported similar numbers for Chrome: around 70% of high-severity bugs are memory safety issues. The NSA issued an advisory in 2022 recommending that organizations transition to memory-safe languages. CISA, ONCD, and multiple other government cybersecurity agencies followed with similar guidance in 2023 and 2024.

To understand why memory safety bugs are so prevalent, you need to understand what C and C++ allow. In C, and in any C++ code that does not consistently use RAII and smart pointers, memory is managed manually. You allocate with malloc, and you are responsible for calling free at the right time: not too early, not too late, and never twice. The language itself has no mechanism to enforce this. Here is what a use-after-free looks like:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *data;
    size_t len;
} Buffer;

Buffer* create_buffer(const char *input) {
    Buffer *buf = malloc(sizeof(Buffer));
    buf->len = strlen(input);
    buf->data = malloc(buf->len + 1);
    strcpy(buf->data, input);
    return buf;
}

void free_buffer(Buffer *buf) {
    free(buf->data);
    free(buf);
    // The caller's copy of buf is now a dangling pointer: it still holds the freed address
}

int main() {
    Buffer *buf = create_buffer("hello");
    free_buffer(buf);

    // This is undefined behavior — the memory has been freed
    // In practice, this may read garbage, crash, or be exploited
    printf("Data: %s\n", buf->data);

    return 0;
}

Most compilers accept this code, often without any warning, because the free happens in a different function from the use. The program may print garbage, crash, or, most dangerously, produce a security vulnerability that an attacker can exploit to execute arbitrary code.

The financial cost of a single exploited memory safety CVE in a widely deployed system can be enormous. The Heartbleed bug in OpenSSL (2014), a buffer over-read, affected an estimated half a million servers and, by some estimates, cost hundreds of millions of dollars in remediation. The same class of bugs appears, in slightly different form, year after year.

What makes this particularly frustrating is that these are not logic errors. A use-after-free isn't a conceptual mistake in the program's design. It's a mechanical property of how memory is managed — the kind of thing a language runtime or a type system should be able to catch automatically.

That's exactly what Rust does.

What Rust Actually Solves

Rust's core innovation is its ownership system — a set of compile-time rules that guarantee memory safety without requiring a garbage collector.

The rules are conceptually simple:
1. Every value has exactly one owner.
2. When the owner goes out of scope, the value is dropped (memory freed).
3. You can lend a reference to a value (borrowing), but the compiler enforces that you cannot use a value after it has been moved or freed.
4. Multiple immutable references can coexist, but only one mutable reference may exist at a time — and not simultaneously with any immutable references.

These rules eliminate entire classes of bugs at compile time in safe Rust (code that does not opt out with the unsafe keyword). Use-after-free is ruled out because the compiler tracks ownership and refuses to compile code that accesses freed memory. Double-free is ruled out for the same reason. Data races are ruled out because the borrow checker, together with the Send and Sync traits, prevents two threads from having unsynchronized mutable access to the same data.
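The four rules can be seen working together in a small program that does compile. This is a minimal sketch: the Buffer type and the byte_len and consume functions are hypothetical names chosen for illustration, not part of any library.

```rust
// Illustrative only: `Buffer`, `byte_len`, and `consume` are hypothetical names.
#[derive(Debug)]
struct Buffer {
    data: String,
}

// Borrows the buffer immutably; the caller keeps ownership.
fn byte_len(buf: &Buffer) -> usize {
    buf.data.len()
}

// Takes ownership; the buffer is dropped (freed) when this function returns.
fn consume(buf: Buffer) -> String {
    buf.data.to_uppercase()
}

fn main() {
    // Rule 1: `buf` is the single owner of this value.
    let mut buf = Buffer { data: String::from("hello") };

    // Rule 4: any number of shared borrows may coexist.
    let a = &buf;
    let b = &buf;
    println!("{} {}", byte_len(a), byte_len(b));

    // The shared borrows are no longer used, so one exclusive borrow is allowed.
    let m = &mut buf;
    m.data.push_str(", world");

    // Rules 2 and 3: ownership moves into `consume`, which frees the buffer on return.
    let shouted = consume(buf);
    println!("{shouted}");

    // println!("{:?}", buf); // would not compile: error[E0382], `buf` was moved
}
```

Uncommenting the last line turns the program into a compile error rather than a runtime bug, which is the whole point of the ownership rules.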

Here is the C use-after-free from above, translated into a Rust attempt:

// This Rust code does NOT compile — the borrow checker catches it

struct Buffer {
    data: String,
}

fn main() {
    let buf = Buffer {
        data: String::from("hello"),
    };

    drop(buf); // explicitly drop buf, freeing the memory

    // compile error: borrow of moved value: `buf`
    // buf.data has been moved/dropped; this is a use-after-free in C,
    // but Rust catches it at compile time with error E0382
    println!("{}", buf.data);
}

The program doesn't compile. The error message tells you exactly what went wrong and where. You fix it before it ever reaches production, staging, or even your test suite.

Beyond memory safety, Rust also offers genuine zero-cost abstractions and predictable performance. Unlike languages with garbage collectors (Go, Java, Python), Rust has no runtime pause for memory reclamation. This matters enormously for latency-sensitive systems: network proxies, trading systems, audio processing, game engines. When the latency budget is measured in microseconds, even a GC pause of a few hundred microseconds is unacceptable.
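To make "no runtime pause for memory reclamation" concrete, here is a small sketch of deterministic destruction. The Tracked type and the drop_order function are hypothetical, written only for this illustration; the Drop trait and RefCell are standard library items. Each value is cleaned up at the closing brace of its scope, so the order of events is fixed by the source code, not by a collector's schedule.

```rust
use std::cell::RefCell;

// Hypothetical type that records the moment it is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the sequence of events so the ordering can be inspected.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` is dropped exactly here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` is dropped exactly here
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```

The same mechanism releases heap memory, file handles, and locks, which is why cleanup cost is spread evenly through the program instead of arriving in bursts.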

Rust's concurrency story has also matured significantly. The tokio crate provides production-grade async I/O, and the async/await syntax is ergonomic enough for real workloads. The same ownership rules apply to plain OS threads, as this standard-library example shows:

// Concurrent thread-safe counter using Rust's ownership model
// This is the pattern that replaces manual mutex management in C/C++
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Arc = Atomically Reference Counted — shared ownership across threads
    // Mutex = mutual exclusion — only one thread accesses the data at a time
    let counter = Arc::new(Mutex::new(0u64));
    let mut handles = vec![];

    for _ in 0..10 {
        // Clone the Arc to give each thread a shared reference
        let counter = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            // lock() returns a MutexGuard — automatically released when dropped
            let mut num = counter.lock().unwrap();
            *num += 1;
            // MutexGuard is dropped here — lock is released automatically
        });
        handles.push(handle);
    }

    // Wait for all threads to complete
    for handle in handles {
        handle.join().unwrap();
    }

    // Always prints "Result: 10": the Mutex serializes the increments, and the
    // type system rejects any attempt to share the counter without synchronization
    println!("Result: {}", *counter.lock().unwrap());
}

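In Go, the equivalent code compiles and runs even if you forget the mutex, and you must run go test -race to discover the resulting data race. The race detector is a runtime tool, not a compile-time guarantee, so it only reports races on code paths that actually execute. In Rust, the type system itself (the borrow rules plus the Send and Sync traits) prevents data races from existing. If safe Rust code compiles, it cannot have a data race, although deadlocks and higher-level race conditions are still possible.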
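As a further hedged illustration of that compile-time guarantee, the sketch below uses std::thread::scope from the standard library to let several threads read one slice without any lock. The parallel_sum function is a hypothetical helper written for this post. It compiles because every thread only holds a shared borrow; a version in which two threads tried to mutate the same local variable would be rejected by the borrow checker before it ever ran.

```rust
use std::thread;

// Sums a slice in parallel by handing each scoped thread a shared borrow of one chunk.
// `parallel_sum` is an illustrative helper, not a library function.
fn parallel_sum(values: &[u64], workers: usize) -> u64 {
    // Guard against zero workers and empty input: `chunks` panics on a size of 0.
    let chunk_len = values.len().div_ceil(workers.max(1)).max(1);

    thread::scope(|s| {
        // Each closure captures only `chunk`, an immutable view into `values`.
        let handles: Vec<_> = values
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        // Joining inside the scope collects one partial sum per thread.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("sum = {}", parallel_sum(&data, 4));
}
```

No Arc or Mutex is needed here because nothing is mutated through a shared reference, and the scope guarantees every thread finishes before the borrowed slice can go away.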

This distinction between compile-time and runtime discovery has enormous practical implications. A bug caught at compile time costs close to nothing: no test infrastructure to run, no deployment to roll back, no customer impact. A bug caught at runtime in production can mean minutes of downtime, data corruption, or a security incident.

Architecture diagram: Rust ownership model showing value lifecycle — creation, borrowing, moving, and dropping — with the borrow checker rules annotated at each stage


Real-World Rewrites: Who, Why, and What They Found

The most compelling case for Rust isn't theoretical. It's the published results from organizations that have already made the switch.

Cloudflare Pingora

Cloudflare's HTTP proxy, Pingora, replaced nginx as the foundation of their edge network. Nginx is written in C and has served Cloudflare extraordinarily well — but it has architectural limitations that made certain features difficult or impossible to implement cleanly, and its memory safety posture is that of any large C codebase.

Pingora, written from scratch in Rust, handles over 1 trillion requests per day at Cloudflare's scale. The published results from Cloudflare's engineering blog (2022) are striking:

- CPU usage: roughly 70% lower than the nginx-based service under the same traffic load
- Memory usage: roughly 67% lower
- New connections: about one third as many per second, because connection pools are shared across all threads instead of being split per worker process
- Security: memory safety bugs are ruled out by construction in the safe Rust portions of the code

The memory savings alone justified the rewrite at Cloudflare's scale. At a trillion requests per day, even a kilobyte of memory saved per connection translates to gigabytes of RAM freed across the fleet.

AWS Firecracker

AWS built Firecracker to power AWS Lambda and Fargate, their serverless and container execution platforms. Firecracker is a virtual machine monitor (VMM) built on Linux KVM: it creates and manages lightweight virtual machines (microVMs), each isolated by the hardware virtualization boundary.

The constraints were extreme: Lambda functions must start in milliseconds, memory overhead per VM must be minimal (Lambda runs thousands of functions per physical host), and security isolation must be absolute (untrusted customer code runs in every VM).

Firecracker, written in Rust, achieves:
- Boot time: as little as 125 milliseconds to reach guest user space
- Memory overhead: approximately 5MB per VM (compared to 100MB+ for a full QEMU VM)
- Security: Rust's memory safety eliminates an entire class of hypervisor vulnerabilities

The 5MB overhead figure is particularly remarkable. At that density, a single server can host thousands of Lambda execution environments simultaneously, which is what makes Lambda's pricing model economically viable.
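The density argument is simple arithmetic, sketched below. The 5 and 100 figures come from the comparison above and are treated as MiB; the count of 1,000 VMs per host is a hypothetical number chosen only to make the totals easy to read, and total_overhead_bytes and gib are illustrative helper names.

```rust
const MIB: u64 = 1024 * 1024;

// Total per-VM overhead in bytes for `vms` virtual machines.
fn total_overhead_bytes(vms: u64, per_vm_mib: u64) -> u64 {
    vms * per_vm_mib * MIB
}

// Converts bytes to GiB for display.
fn gib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0 * 1024.0)
}

fn main() {
    let vms = 1_000; // hypothetical VM count per host, for illustration only

    let firecracker = total_overhead_bytes(vms, 5); // ~5 MiB per microVM
    let full_vm = total_overhead_bytes(vms, 100); // ~100 MiB per full VM

    println!("microVM overhead: {:.1} GiB", gib(firecracker));
    println!("full VM overhead: {:.1} GiB", gib(full_vm));
    println!("ratio:            {}x", full_vm / firecracker);
}
```

Under these assumptions the overhead alone is roughly 5 GiB versus roughly 98 GiB, before any guest workload has been given a single byte.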

Discord: Go to Rust

Discord published a detailed engineering post in 2020 describing their migration of a critical service — the service responsible for tracking which users have read which messages — from Go to Rust.

The Go implementation was correct and fast enough for most workloads. But it had one problem: the garbage collector. As the service grew, GC pressure caused latency spikes every two minutes, coinciding with GC cycles. The 99th percentile latency reached 150ms during these spikes, far above their target.

After rewriting in Rust:
- P99 latency: dropped from 150ms to 10ms
- P95 latency: dropped from 40ms to 5ms
- Average latency: similar between the two implementations
- Memory usage: lower in Rust, with no GC-induced spikes

The team reported that the Rust version was also faster than the Go version in absolute terms — not just more consistent — because Rust could optimize memory layout in ways the GC-managed Go runtime could not.
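Why does a periodic pause show up so sharply at P99 while the typical request looks healthy? The sketch below uses synthetic data, not Discord's measurements: a hypothetical 1 ms base latency, with 2% of requests hitting a 150 ms pause. The percentile, mean, and synthetic_latencies functions are illustrative helpers, and the percentile uses the simple nearest-rank method.

```rust
// Nearest-rank percentile over an already sorted slice (p in 0..=100).
fn percentile(sorted: &[u64], p: usize) -> u64 {
    assert!(!sorted.is_empty() && p <= 100);
    let rank = (p * sorted.len()).div_ceil(100).max(1);
    sorted[rank - 1]
}

fn mean(samples: &[u64]) -> f64 {
    samples.iter().sum::<u64>() as f64 / samples.len() as f64
}

// Synthetic latencies in microseconds: every `spike_every`-th request hits a pause.
// A `spike_every` of 0 means no pauses at all.
fn synthetic_latencies(n: usize, base_us: u64, spike_us: u64, spike_every: usize) -> Vec<u64> {
    (0..n)
        .map(|i| {
            if spike_every != 0 && i % spike_every == 0 {
                spike_us
            } else {
                base_us
            }
        })
        .collect()
}

fn main() {
    // Hypothetical workload: 1 ms base latency, 150 ms pause on 1 request in 50.
    let mut with_pauses = synthetic_latencies(1_000, 1_000, 150_000, 50);
    let mut without_pauses = synthetic_latencies(1_000, 1_000, 150_000, 0);
    with_pauses.sort_unstable();
    without_pauses.sort_unstable();

    for (label, samples) in [("with pauses", &with_pauses), ("without pauses", &without_pauses)] {
        println!(
            "{label}: p50 = {} us, p99 = {} us, mean = {:.0} us",
            percentile(samples, 50),
            percentile(samples, 99),
            mean(samples)
        );
    }
}
```

In this synthetic run the median is identical in both cases and the mean moves by a few milliseconds, while P99 jumps to the full length of the pause. That is why tail percentiles, not averages, are the metric that exposes GC behavior.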

Figma

Figma rewrote their multiplayer collaboration server — the component responsible for synchronizing edits between users — from TypeScript to Rust. The results: approximately 3x improvement in memory usage and significant latency improvements under load.

The pattern across all four of these cases is identical: the existing implementation was functional but had a ceiling imposed by its runtime model (GC pauses, memory fragmentation, unsafe memory access). Rust removed that ceiling.

flowchart LR
    subgraph Before["Before Rewrite (Go/C++)"]
        direction TB
        A1[Request arrives] --> B1[Process request]
        B1 --> C1[GC pressure builds]
        C1 --> D1[GC pause: 50-150ms spike]
        D1 --> E1[P99 latency degraded]
    end
    subgraph After["After Rewrite (Rust)"]
        direction TB
        A2[Request arrives] --> B2[Process request]
        B2 --> C2[Memory freed deterministically]
        C2 --> D2[No GC pause]
        D2 --> E2[P99 latency stable]
    end
    Before -->|"Rust rewrite"| After
    style D1 fill:#ff6b6b,color:#fff
    style D2 fill:#51cf66,color:#fff

The Learning Curve Is Real — and Worth It

Rust has a reputation for being difficult to learn. That reputation is earned. The borrow checker enforces rules that most programmers have never had to think about explicitly, and the compiler will reject code that would compile fine in any other language.

The first few weeks of Rust often look like this:

error[E0502]: cannot borrow `data` as mutable because it is also borrowed as immutable
 --> src/main.rs:8:5
  |
5 |     let r1 = &data;
  |              ----- immutable borrow occurs here
...
8 |     data.push_str(" world");
  |     ^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
9 |     println!("{}", r1);
  |                    -- immutable borrow later used here

This error is the borrow checker working exactly as intended. You can't hold an immutable reference while mutating the data — that would invalidate the reference. The compiler is telling you that your code has a potential aliasing/mutation bug.
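There are usually two ways out of an E0502 like this one, sketched below with hypothetical function names. The first relies on the fact that a borrow ends at its last use, so re-borrowing after the mutation is enough. The second keeps an owned copy when the old value is what you actually need.

```rust
// Option 1: stop using the shared borrows before mutating, then borrow again.
fn reborrow_after_mutation() -> String {
    let mut data = String::from("hello");

    let r1 = &data;
    let r2 = &data;
    println!("{} {}", r1, r2); // last use of r1 and r2: both borrows end here

    data.push_str(" world"); // the exclusive borrow is now allowed

    let r1 = &data; // fresh shared borrow that sees the new contents
    println!("{}", r1);

    data
}

// Option 2: take an owned snapshot when the pre-mutation value must survive.
fn snapshot_before_mutation() -> (String, String) {
    let mut data = String::from("hello");
    let before = data.clone(); // independent copy, not a reference into `data`
    data.push_str(" world");
    (before, data)
}

fn main() {
    println!("{}", reborrow_after_mutation());
    let (before, after) = snapshot_before_mutation();
    println!("{before} -> {after}");
}
```

Option 1 costs nothing at runtime and is the usual fix. Option 2 pays for an allocation, which is a reasonable trade when the alternative is restructuring a lot of code.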

The frustrating part is that this is correct behavior from the compiler. The reassuring part is that once you understand why the compiler is complaining, you understand something true about memory safety that you didn't fully understand before.

The common experience among developers who push through the learning curve is that after a few weeks, the friction decreases dramatically. And there's a phrase you hear repeatedly in the Rust community: "if it compiles, it works." That's an overstatement, but it captures something real: the class of bugs that Rust eliminates at compile time is exactly the class that is hardest to catch in code review and most expensive to debug in production.

The bug discovery timeline comparison tells the whole story:

gantt
    title Bug Discovery Timeline by Language
    dateFormat X
    axisFormat %s
    section C / C++
    Code written :done, c1, 0, 1
    Bug introduced :crit, c2, 1, 2
    Passes code review :done, c3, 2, 3
    Passes QA :done, c4, 3, 5
    Deployed to prod :done, c5, 5, 7
    Customer reports bug :crit, c6, 7, 10
    section Go
    Code written :done, g1, 0, 1
    Bug introduced :crit, g2, 1, 2
    Race detector catches :active, g3, 2, 3
    Or: runtime panic :crit, g4, 3, 5
    section Rust
    Code written :done, r1, 0, 1
    Compiler rejects :active, r2, 1, 2
    Fix immediately :done, r3, 2, 3

The cost of a bug correlates directly with how late in the process it is discovered. Compile-time discovery is nearly free. Production discovery is expensive. Rust shifts memory safety and data race bugs to compile time; logic errors still need tests and review.

When to Use Rust — and When Not To

Rust is not the right tool for every job. The same properties that make it excellent for systems programming make it verbose and slow to iterate with for applications where performance and memory safety are not the primary concerns.

Use Rust when:
- You are building a network daemon or proxy that handles thousands of concurrent connections
- You need predictable, sub-millisecond latency without GC pauses
- You are writing a CLI tool that will be distributed as a binary (fast startup, no runtime dependency)
- You are targeting WebAssembly, where Rust has one of the most mature toolchains available
- You are writing embedded software where memory is constrained
- You are replacing existing C/C++ code and need the same performance envelope
- Security is a first-class concern and you need structural guarantees, not just best-effort practices

Don't use Rust when:
- You are building a standard CRUD web API — Go, Node, or Python will be faster to ship and easier to maintain
- You are writing machine learning pipelines — Python with PyTorch/JAX is the ecosystem and there's no good reason to fight that
- You need to move extremely fast and the team doesn't know Rust — the learning curve is a real productivity cost in the short term
- You are building a simple script or automation tool — the complexity overhead is not justified

flowchart TD
    A[New project or rewrite decision] --> B{Performance critical?}
    B -->|Yes| C{Memory safety critical?}
    B -->|No| D{Rapid prototyping?}
    C -->|Yes| E{Latency sensitive?}
    C -->|No| F[Go or Java]
    D -->|Yes| G[Python or Node.js]
    D -->|No| H{Team knows Rust?}
    E -->|Yes - no GC pauses| I[Rust]
    E -->|No - GC acceptable| J[Go]
    H -->|Yes| I
    H -->|No| K{Worth learning curve?}
    K -->|Long-lived system| I
    K -->|Short timeline| F
    style I fill:#b7410e,color:#fff
    style J fill:#00add8,color:#fff
    style G fill:#3572A5,color:#fff
    style F fill:#4B8BBE,color:#fff

Getting Started in 2026

The Rust toolchain has matured considerably. Installation is a single command:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

rustup manages your Rust installation, and cargo is the build system and package manager. Unlike C/C++, where the build system ecosystem is fragmented (Make, CMake, Bazel, Meson), Rust has a single, excellent build tool that the entire community uses.

The core ecosystem libraries you'll use in most projects:

# Cargo.toml — typical dependencies for a network service
[dependencies]
tokio = { version = "1.36", features = ["full"] }  # async runtime
axum = "0.7"                                        # web framework
serde = { version = "1.0", features = ["derive"] } # serialization
serde_json = "1.0"                                  # JSON support
clap = { version = "4.5", features = ["derive"] }  # CLI argument parsing
rayon = "1.9"                                       # data parallelism
tracing = "0.1"                                     # structured logging
anyhow = "1.0"                                      # error handling

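The 2024 edition (editions are Rust's opt-in language versioning system, separate from the compiler version) stabilized with Rust 1.85 in early 2025. It tightened the scope of temporaries in if let and block tail expressions, changed how impl Trait return types capture lifetimes, and added Future and IntoFuture to the prelude, which removes some of the borrow checker and async friction points that beginners commonly hit.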

A minimal async HTTP server with axum, one of the most widely used Rust web frameworks in 2026, looks like this:

use axum::{
    extract::Path,
    routing::get,
    Json, Router,
};
use serde::Serialize;
use tokio::net::TcpListener;

#[derive(Serialize)]
struct Health {
    status: &'static str,
    version: &'static str,
}

async fn health_check() -> Json<Health> {
    Json(Health {
        status: "ok",
        version: env!("CARGO_PKG_VERSION"),
    })
}

async fn greet(Path(name): Path<String>) -> String {
    format!("Hello, {}!", name)
}

#[tokio::main]
async fn main() {
    let app = Router::new()
        .route("/health", get(health_check))
        // Path syntax for axum 0.7 (as pinned above); axum 0.8 and later use "/greet/{name}"
        .route("/greet/:name", get(greet));

    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();
    println!("Listening on {}", listener.local_addr().unwrap());
    axum::serve(listener, app).await.unwrap();
}

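This is idiomatic Rust and a reasonable starting point, though a production service would replace the unwrap() calls with real error handling. The async/await syntax is clean, the routing is type-safe, and the whole thing compiles to a single executable that needs no language runtime or interpreter on the target machine (and can be fully statically linked by building for a musl target).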

Comparison chart: Rust ecosystem crates — tokio, axum, serde, rayon, clap — with download statistics and production adoption ratings, positioned against their C++ and Go counterparts


The learning resources have also improved dramatically. "The Rust Programming Language" (colloquially "the book") is available free online at doc.rust-lang.org and is genuinely one of the best language references ever written. Rustlings (the interactive exercises), Rust by Example, and the official async book together provide a learning path that most developers can work through in two to four weeks of part-time study.

Conclusion

Rust isn't replacing everything. Python will continue to dominate ML. Go will continue to dominate DevOps tooling and microservices. JavaScript will continue to dominate the browser. Each has its place.

But for the systems that sit at the foundation of modern infrastructure (the network proxies, the hypervisors, the kernels, the security-critical daemons), Rust is becoming the default choice, and that shift is accelerating in 2026. The reasons are concrete: the memory safety bug class behind roughly 70% of serious CVEs is ruled out by design in safe code, latency is predictable with no GC, performance is competitive with C and C++, and the toolchain has matured to the point where it no longer feels experimental.

The organizations that have made the switch — Cloudflare, AWS, Discord, Figma, Google, Microsoft, the Linux kernel team — are not doing it out of enthusiasm for new technology. They are doing it because the cost of the alternative is too high.

If you maintain systems written in C, C++, or even Go, the question isn't whether Rust is worth learning. It's whether your systems belong in the category where Rust's guarantees matter. For a surprising number of them, the answer is yes.

In the next post in this series, we'll go deeper on the Rust versus Go comparison — when to choose each, what the performance tradeoffs actually look like in practice, and how senior engineers think about picking between them for new systems in 2026.

Next: Rust vs Go in 2026: A Practical Guide to Choosing the Right Language

About the Author

Toc Am

Founder of AmtocSoft. Writing practical deep-dives on AI engineering, cloud architecture, and developer tooling. Previously built backend systems at scale. Reviews every post published under this byline.


Published: 2026-04-10 · Written with AI assistance, reviewed by Toc Am.
