Hacker News Clone

jandrewrogersMay 26, 2026, 3:30 AM

I would summarize it thusly: Rust is roughly as performant as C. This matches my experience and Rust is more ergonomic than C in many regards. The caveat is that modern C++ is notably more performant than C and by implication Rust. This also matches my experience for both C and Rust.

I think most of this is attributable to the ergonomics of compile-time expressiveness. C++ can effortlessly do things that require mountains of ugly boilerplate and macros in C or Rust. In principle they can express the same things but no one wants to write or deal with that ugly boilerplate so the equivalency is never realized in real code bases.

Zig is interesting because it slots in as a C-like language with a competent and expressive compile-time story. I don’t use Zig but I recognize its game.

root-parentMay 26, 2026, 8:08 PM

>> The caveat is that modern C++ is notably more performant than C and by implication Rust.

Please provide proof for this outrageous statement.

musicaleMay 27, 2026, 6:34 AM

Is it outrageous because "performant" is kind of a vague term? Does it mean... Fast? GPU-friendly? Scalable? Energy-efficient? Reliable? User-friendly? Maintainable? For what kind of applications?

Modern Fortran has a lot to offer for scientific and numeric computation - easier to learn than C++, and easier to optimize in many cases. Scales from small systems to supercomputers, and there is even CUDA Fortran.

mynegationMay 26, 2026, 8:20 PM

I was also dumbfounded by this claim. The only thing I could think of were C++ monomorphic templates that will avoid the penalty of some indirection and DIY dynamic typing.

apiMay 26, 2026, 9:54 PM

I think they may be talking about math algorithm heavy code, where C++'s looser almost-just-a-substitution generics system (really "templates" not even quite generics) can be used to create abstractions that compile everything down to inlined maximally performant code.

This type of code tends to be hard to maintain though.

AFAIK you can get there in Rust but it's a little more cumbersome. You have to implement a lot of operators, and for that type of code you might actually benefit from #[inline(always)] which is discouraged in normal Rust.
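A minimal sketch of what "implement a lot of operators" can look like in Rust. The `Vec3` type and the `axpy` helper are hypothetical names made up for illustration, and whether `#[inline(always)]` actually helps is something to measure rather than assume:

```rust
use std::ops::{Add, Mul};

/// Hypothetical 3-component vector, generic over its scalar type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3<T> {
    pub x: T,
    pub y: T,
    pub z: T,
}

/// Component-wise addition: `a + b`.
impl<T: Add<Output = T>> Add for Vec3<T> {
    type Output = Vec3<T>;

    #[inline(always)]
    fn add(self, rhs: Self) -> Self::Output {
        Vec3 { x: self.x + rhs.x, y: self.y + rhs.y, z: self.z + rhs.z }
    }
}

/// Scaling by a scalar: `a * k`.
impl<T: Mul<Output = T> + Copy> Mul<T> for Vec3<T> {
    type Output = Vec3<T>;

    #[inline(always)]
    fn mul(self, k: T) -> Self::Output {
        Vec3 { x: self.x * k, y: self.y * k, z: self.z * k }
    }
}

/// Computes `a * k + b` through the operator impls above.
/// Each concrete `T` gets its own monomorphized copy.
#[inline(always)]
pub fn axpy<T>(a: Vec3<T>, k: T, b: Vec3<T>) -> Vec3<T>
where
    T: Add<Output = T> + Mul<Output = T> + Copy,
{
    a * k + b
}
```

Every operator needs its own trait impl with explicit bounds, which is the "more cumbersome" part compared to a C++ template that simply substitutes the type.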

cogman10May 26, 2026, 7:44 PM

> modern C++ is notably more performant than C and by implication Rust

I don't think this holds. Rust has the same facilities which C++ has. Rust's metaprogramming capabilities are now on par with C++ (they weren't always). Rust has a similar generics implementation which allows it to do what C++ does in terms of method dispatch and generation. And now Rust has pretty much the same compile time constant generation capabilities that C++ has.
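As a small illustration of compile-time constant generation in stable Rust, here is a sketch that builds a lookup table with a `const fn`. The names `popcount8`, `build_table`, `POPCOUNT` and `count_bits` are made up for this example, and it says nothing about how far the parity with C++ `constexpr` really goes:

```rust
/// Number of set bits in a byte; usable in const contexts.
const fn popcount8(mut b: u8) -> u8 {
    let mut n = 0;
    while b != 0 {
        n += b & 1;
        b >>= 1;
    }
    n
}

/// Builds the full 256-entry table. `while` is used because `for`
/// loops are not available in `const fn` on stable Rust.
const fn build_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = popcount8(i as u8);
        i += 1;
    }
    table
}

/// The initializer is evaluated by the compiler, so the table is
/// stored in the binary instead of being computed at startup.
pub static POPCOUNT: [u8; 256] = build_table();

/// Counts set bits in a byte slice using the precomputed table.
pub fn count_bits(data: &[u8]) -> u32 {
    data.iter().map(|&b| POPCOUNT[b as usize] as u32).sum()
}
```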

I don't think there's a part of C++ which isn't in Rust at this point. The only thing potentially missing is the experience and investment using those features.

spacechild1May 26, 2026, 9:40 PM

> Rust's metaprogramming capabilities are now on par with C++ (they weren't always).

Is that really true, though? I haven't really written any Rust code, so I have no idea, but I don't think Rust has static reflection. Also, aren't const generics much more limited? I've also heard there is no template specialization and no "if constexpr". Or what about dynamic allocations in constexpr functions?

cogman10May 26, 2026, 9:59 PM

> I don't think Rust has static reflection.

Rust had it before C++, in fact, through procedural macros. You can do everything you can do with C++ static reflection.

Now, it could be better. Proc macros require you to pull in secondary packages for parsing the token stream. But all the sorts of operations you can do via static reflection you can do via proc macros. That's how the most popular Rust serialization package, serde, works. It's also how some of the more popular database libs, like sqlx, work.

> Also, aren't const generics much more limited? I've also heard there is no template specialization and no "if constexpr".

Both have been added and expanded. AFAIK they are now roughly on par with what C++ const expressions can do. What they can't do, proc macros can do.

> Or what about dynamic allocations in constexpr functions?

IDK if that's possible in rust. Const expr capabilities of rust have been rapidly expanding though in the last year.
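For a concrete reference point, here is a sketch of what const generics on stable Rust can express: array dimensions carried in the type and checked at compile time. The `Matrix` type is hypothetical. This does not settle the specialization or "if constexpr" questions, and as far as I know arithmetic on const parameters inside types (something like `[T; N + 1]`) is still not available on stable:

```rust
/// Hypothetical fixed-size matrix; R and C are const generic parameters.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    pub data: [[i64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    /// Matrix product. The inner dimension `C` must match the row
    /// count of `rhs`, otherwise the call does not type-check.
    pub fn mul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0i64; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}
```

Multiplying a `Matrix<2, 3>` by a `Matrix<2, 2>` is rejected by the compiler, since the second operand would have to be a `Matrix<3, K>`.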

spacechild1May 26, 2026, 10:54 PM

> You can do everything you can do with C++ static reflection.

Are you really sure about that?

I have a slight problem with such sweeping statements and also with your original claim that "Rust's metaprogramming capabilities are now on par with C++". I think you can only make such claims if you know both languages really, really well.

That being said, I acknowledge that Rust's metaprogramming capabilities have improved significantly in recent years.

> Both have been added and expanded.

In stable Rust?

dwatttttMay 26, 2026, 11:32 PM

It's hard to be concrete without talking about something specific. At the limit, in stable Rust (for 8 years?), a proc_macro consumes and emits an arbitrary token stream at compile time; it's not ergonomic, but it's possible.

The equivalent in C++ is in the realm of arbitrary codegen.

zamalekMay 26, 2026, 8:50 PM

Last I checked (which was a while ago to be fair), LLVM machine code quality still lagged behind GCC - so things should be slightly more interesting with the GCC back end.

There were also some bugs (hence disabled optimization passes) and missed opportunities from the lack of aliasing Rust precipitates - again, not sure where those sit - and GCC will have to play catch up here (unless there are other languages that exercise this part of the backend).

stephc_int13May 26, 2026, 9:11 PM

C++ being more performant than C is not something that I've seen in any benchmarks or personal experience.

In practice, some of the specialization that C++ constructs made possible is also achieved by modern C compilers.

flohofwoeMay 26, 2026, 6:41 AM

> The caveat is that modern C++ is notably more performant than C and by implication Rust.

This really needs more realworld evidence to back up the claim. In the end the important optimizations happen down in the Clang optimizer passes on the LLVM IR, and those optimizations are the same across C, C++, Rust (or Zig for that matter) - assuming of course that the optimizer can see all function bodies, which in C can be achieved via LTO or alternatively via 'unity builds'.

If the output of one of those languages differs so much (on an LLVM-based compiler) that there are noticeable performance differences I would start investigating whether there's a compile/link setting missing somewhere instead.

logicchainsMay 26, 2026, 7:43 AM

OP said "C++ can effortlessly do things that require mountains of ugly boilerplate and macros in C or Rust". In theory Rust can be as performant but some things are much less ergonomic to do in Rust macros than in C++ metaprogramming, so often end up not being done.

flohofwoeMay 26, 2026, 8:25 AM

Often that's also because the programmer doesn't know that the optimizer will help them remove inactive code in C as well. As a simple example, when I have a 'general' bulk-getter function in C which returns a large struct with tons of values but the caller is only interested in one value, the compiler will 'collapse' the entire function call to a single memory access (if it can see the function body, but this is where LTO comes in), e.g.:

https://www.godbolt.org/z/n3Y54Yhqr

This is basically the gist of C++ 'zero cost abstraction', but C-style (the bulk of what enables C++ zero-cost-abstraction doesn't happen up in the language, but down in the optimization passes).
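A rough Rust analogue of the bulk-getter pattern described above. This is not the code behind the godbolt link; `Device`, `DeviceInfo`, `query_info` and `width_of` are hypothetical names. The expectation, which should be verified in the generated assembly, is that an optimizing build reduces `width_of` to a single load once `query_info` is inlined:

```rust
/// Hypothetical device state with several fields.
pub struct Device {
    pub width: u32,
    pub height: u32,
    pub depth: u32,
    pub sample_count: u32,
    pub frame_count: u64,
}

/// Result of the "bulk getter": a snapshot of everything at once.
pub struct DeviceInfo {
    pub width: u32,
    pub height: u32,
    pub pixel_count: u64,
    pub sample_count: u32,
    pub frame_count: u64,
}

/// General-purpose getter that fills in every field, including a derived one.
pub fn query_info(dev: &Device) -> DeviceInfo {
    DeviceInfo {
        width: dev.width,
        height: dev.height,
        pixel_count: dev.width as u64 * dev.height as u64 * dev.depth as u64,
        sample_count: dev.sample_count,
        frame_count: dev.frame_count,
    }
}

/// Caller that only needs one value. When the optimizer can see the
/// body of `query_info`, the unused fields are dead code it can drop.
pub fn width_of(dev: &Device) -> u32 {
    query_info(dev).width
}
```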

elcritchMay 26, 2026, 8:49 AM

Nim also has top notch meta programming, probably more so than Zig. You can easily do loop unrolling, specialization, etc. For example Constantine, which is a constant time crypto library that outperforms C, etc.

To me programming Rust feels so limiting due to lack of good compile time meta programming with types. That’s the key.

zamalekMay 26, 2026, 8:52 PM

I have tried Nim meta programming (to make a tree vaguely like the one used by Zed), and the tooling support is pretty dreadful. I ran into multiple compiler crashes or simply unhelpful and confusing error messages, alongside LSP hangs.

afdbcreidMay 26, 2026, 12:01 PM

How can you create constant-time code with Nim when none of its backends support it (e.g. LLVM may turn an apparently-constant-time code into non-constant-time assembly)?

cb321May 26, 2026, 3:31 PM

You can see all the details at: https://github.com/mratsim/constantine , but to answer your "how" question briefly here, something Nim shares with most (all?) "systems programming languages" is "easy" integration with assembly languages -- whatever the backend for "most" compiled code is (whatever that "most" even is - weighted by any number of measures of static source size or dynamic instruction counts). Of course, hand-rolled assembly can cost you a lot in portability/effort to port to new platforms/etc.

The entire concept of the "performance of a PLang" in terms of the run-time of programs written "mostly in it" is rather seriously under-specified, TBH. This is (or should be) uncontentious in spite of the slew of articles with titles like the one for this thread.

elcritchMay 26, 2026, 3:58 PM

Exactly, Constantine generates assembly output and links that into normal Nim objects (e.g. C). That can then be used in Nim or in Rust, Go, etc.

From its "Why Nim" in the readme:

- Assembly support either inline or a simple {.compile: "myasm.S".} away

- No GC if no GC-ed types are used (automatic memory management is set at the type level and optimized for latency/soft-realtime by default and can be totally deactivated).

- Procedural macros working directly on AST to create generic curve configuration, derive constants, write a size-independent inline assembly code generator

nsajkoMay 26, 2026, 10:53 AM

Julia is another contender. Julia code can be as performant as C++ code, but Julia code may be even more elegant than C++. Even without accounting for Julia's metaprogramming features, the compile-time expressiveness is top-notch.

It shares some of the same drawbacks as C++, though. The language is extremely powerful, so while it is easy to write performant code, it is also easy for non experts to write very suboptimal code.

afdbcreidMay 26, 2026, 12:05 PM

Julia is not a systems language. Also its design (GC, dynamic typing) does not allow it to reach exactly the same level as C++.

postflopclarityMay 26, 2026, 6:59 PM

you are right one has to be careful to avoid the GC and dynamic dispatch, but if you do it can for sure reach the same level as C++. with tightly optimized Julia code there is little to no overhead over any other low-level language.

rirzeMay 26, 2026, 12:33 PM

Julia only cares about numerical performance, and in that regime, it’s pretty fast.

So not generally fast, no.

vvandersMay 26, 2026, 9:41 AM

I'm surprised to see someone putting forth the argument that templates are easier to use than macros. I've found the opposite, and in many cases I've seen the monomorphization of templates explode code size, which has a fairly material impact on performance in my domains. Debugging macros with cargo expand is infinitely easier than debugging template errors.

While you can write high performance C++ my experience is that many people will reach for shared_ptr and their like while Rust will force them into proper structure/ownership as Arc and their like have a lot higher friction.

tialaramexMay 26, 2026, 11:04 AM

You say you are "summarizing" something but instead you seem to have just injected your opinion that C++ is "notably more performant than C and by implication Rust".

It's true that you can express many things in C++ -- the problem is that the language deliberately doesn't distinguish whether the things you've expressed are nonsense, so you might well have written total nonsense and you only find out when, much later, diagnosing a real world event you discover oh, this is nonsense, why did this even compile? Well sorry, it was "more performant" to allow nonsense.

gobdovanMay 26, 2026, 2:51 AM

Rust is in an awkward position of being already complicated enough that adding proofs for skipping bounds checks probably will not happen for a long time, even though this kind of low-level operation is where a lot of optimisation is lost.

Compounding on this, Rust is also unstable underneath, since there is no public, stable contract for carrying high-level semantics from HIR into MIR. Because these high-level invariants are lost during compilation, the compiler cannot easily use them to prove and eliminate low-level safety checks. But even if the frontend was perfect, Rust relies on LLVM's language-neutral SCEV, which operates purely on low-level math and cannot reason about high-level language semantics.

Ultimately, a lot of things would need to change for Rust to pay no performance for safety features.

aw1621107May 26, 2026, 3:48 AM

> Compounding on this, Rust is also unstable underneath, since there is no public, stable contract for carrying high-level semantics from HIR into MIR. Because these high-level invariants are lost during compilation

Not sure if I'm just out of the loop, but I'm having a hard time following this line of reasoning. Why is a public and/or stable contract needed to carry high-level semantics from HIR to MIR? Neither seems necessary to me; from what I understand HIR and MIR are rustc-internal so public contracts shouldn't matter, and the lack of stability means the Rust devs aren't precluded by backwards compatibility from modifying the IRs to add the ability to carry such invariants.

gobdovanMay 26, 2026, 5:41 AM

Whoops! Although there is no public contract between HIR and MIR, the public part was not relevant here. What I wanted to highlight is that if they'd want to add proper proof machinery to eliminate low-level safety checks, they'd have to do it at: surface language, which is already complex enough; then HIR->MIR boundary with clean provenance (which I think current MIR would collapse too aggressively) and which may require a much clearer contract; then, even if they do the full front- and mid- ends properly, if you leave it up to LLVM, it ends up in SCEV, which is language neutral and would not be a good fit to support the proof obligations that would be specific to Rust.

I dug up a proposal from 2021 around bounds check hoisting in MIR, and from the discussion, details are pretty thorny [0]. It's narrower than general proofs but the frictions are very similar. The easiest example that shows HIR -> MIR difficulties is the part around `for i in 0..32 { a[i] = 1; }`. At the source level the range fact is super obvious, but after the for-loop/iterator lowering the MIR optimiser has to recover that `i` comes from exactly that range before it can turn 32 checks into the one hot-path check. Then it also would have to check for panic strategy to maintain the correct behaviour after optimisation.

[0] https://github.com/rust-lang/rust/issues/92327
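To make the "for-loop/iterator lowering" point concrete, here is roughly what a `for` loop over a range turns into before optimization. This is a hand-written approximation of the desugaring, not actual compiler output, and the function names are made up:

```rust
/// Source-level form: the range 0..32 is obvious to a human reader.
pub fn fill_for(a: &mut [u8]) {
    for i in 0usize..32 {
        a[i] = 1; // bounds-checked index
    }
}

/// Approximately what the loop above desugars to. By this point `i`
/// is just the payload of `Some(..)` returned from `next()`, so an
/// optimizer has to rediscover that it always lies in 0..32.
pub fn fill_desugared(a: &mut [u8]) {
    let mut iter = IntoIterator::into_iter(0usize..32);
    loop {
        match iter.next() {
            Some(i) => a[i] = 1, // same bounds-checked index
            None => break,
        }
    }
}
```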

nicoburnsMay 26, 2026, 10:53 AM

Of course you can write the above as:

a[0..32].iter_mut().for_each(|el| *el = 1)

and have per-iteration bounds checks elided in Rust today.
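A runnable version of that one-liner, plus a non-panicking variant for comparison. The function names are made up; the idea in both is that the slice is bounds-checked once and the iteration itself needs no further checks:

```rust
/// Slices once (a single bounds check that panics if `a` is shorter
/// than 32), then iterates without indexing.
pub fn fill_sliced(a: &mut [u32]) {
    a[0..32].iter_mut().for_each(|el| *el = 1);
}

/// Same idea without the panic: `get_mut` returns `None` when the
/// slice is too short, and `fill` writes the whole sub-slice.
pub fn try_fill(a: &mut [u32]) -> bool {
    match a.get_mut(0..32) {
        Some(head) => {
            head.fill(1);
            true
        }
        None => false,
    }
}
```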

rstuart4133May 26, 2026, 10:35 PM

Or as mentioned in the OP, just add at the top:

    assert!(a.len() >= 32);
    for i in 0..32 {
        a[i] = 0;
    }

Or:

    for i in 0..std::cmp::min(a.len(), 32) {
        a[i] = 0;
    }

I confess I hadn't thought about the implications of any of this before reading the article. If you need to squeeze the last 10% of performance out of your code, I'd consider it required reading.

As for the speed comparisons with C++, the OP says at the end that if you tell the C++ compiler to be as strict as Rust using "-D_FORTIFY_SOURCE=3 -fsanitize=bounds,object-size" and a hardened STL, it slows to below Rust speeds for the same safety unless you use the same techniques.

It's a shame the other optimisation techniques you need to bring Rust in line with C++ aren't as easy to apply.

gobdovanMay 27, 2026, 2:49 AM

[dead]

aw1621107May 26, 2026, 9:05 AM

OK, I think that makes more sense. Thanks for taking the time to explain!

afdbcreidMay 26, 2026, 3:30 AM

The overhead of bounds checking varies a lot. In the common case it's negligible (a few percent), but in some cases, depending on what you build, it can go up to 20%. And if it prevents autovectorization it can cost even more.

There are techniques to minimize the perf loss, though (safely), and of course you can use unsafe code. If you do it smartly, in the vast majority of cases bound checks do not matter (in fact, even in C++ there is a push for a hardened standard library that does bound checks, and e.g. Google uses that).

Rust will never include full proofs, but it might include ranged integers which can minimize bound checks even more.
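A sketch of the two routes mentioned above, on a toy problem (summing products of adjacent pairs). The function names are made up. The safe version leans on `chunks_exact`, where each chunk has a known length, so the indexing checks can typically be elided; the unsafe version skips the checks explicitly and carries the proof in a comment instead:

```rust
/// Safe: every chunk is exactly 2 elements long, so `p[0]` and `p[1]`
/// are known to be in bounds. A trailing odd element is ignored.
pub fn pair_products_safe(xs: &[u32]) -> u64 {
    xs.chunks_exact(2)
        .map(|p| p[0] as u64 * p[1] as u64)
        .sum()
}

/// Unsafe: the same computation with unchecked indexing.
pub fn pair_products_unchecked(xs: &[u32]) -> u64 {
    let pairs = xs.len() / 2;
    let mut acc = 0u64;
    for k in 0..pairs {
        // SAFETY: k < pairs, so 2*k + 1 <= 2*pairs - 1 < xs.len().
        let (a, b) = unsafe { (*xs.get_unchecked(2 * k), *xs.get_unchecked(2 * k + 1)) };
        acc += a as u64 * b as u64;
    }
    acc
}
```

Whether the unsafe version is measurably faster depends on the compiler version and the surrounding code, so it is worth benchmarking before giving up the safe form.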

CrazyboyQCDMay 26, 2026, 3:55 AM

[dead]

peterfireflyMay 26, 2026, 8:19 PM

You can sometimes just add asserts for the index variable(s) and have the LLVM optimizer go "hmm, I should try to prove that those are true" and then get the range checks optimized away.
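A minimal sketch of the assert trick with two slices; `add_assign` is a made-up name. The loop bound comes from `dst`, so without the assertion nothing tells the optimizer that `src[i]` is in range. With it, the optimizer has a fact it may be able to use to drop the per-iteration check, though that is not guaranteed:

```rust
/// Adds `src` into `dst` element by element.
pub fn add_assign(dst: &mut [f32], src: &[f32]) {
    // One up-front check establishes src.len() == dst.len() for the loop.
    assert_eq!(dst.len(), src.len(), "length mismatch");
    for i in 0..dst.len() {
        dst[i] += src[i];
    }
}
```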

IshKebabMay 26, 2026, 7:48 PM

The benchmarks in this talk show that the bounds checks are mostly insignificant, and actually it's the integer overflow checks that are far more costly.

Actually nm, I forgot those are disabled in release mode. Good decision I guess.

IcyWindowsMay 26, 2026, 11:40 PM

Do they even count towards safety if they aren't in release mode?

IshKebabMay 27, 2026, 7:21 AM

You can enable them in release mode optionally. But I would say not. Really we need ISAs to provide a no-overhead way to check integer overflow.
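For reference, the release-mode switch is the `overflow-checks = true` setting under `[profile.release]` in Cargo.toml. Independently of that setting, the standard integer types let you choose the overflow behaviour per operation. A small sketch, with made-up function names:

```rust
/// Returns `None` as soon as the running sum would overflow,
/// regardless of build profile.
pub fn sum_checked(xs: &[u32]) -> Option<u32> {
    xs.iter().try_fold(0u32, |acc, &x| acc.checked_add(x))
}

/// Explicitly wraps around on overflow, in both debug and release builds.
pub fn sum_wrapping(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}
```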

encodedroseMay 26, 2026, 1:47 AM

If I followed, Rust's memory safety guarantee means sacrificing roughly ~3% performance with some worst case paths being ~15% (compared to C++ performance)?

marcosdumayMay 26, 2026, 2:28 AM

That's in line with the typical cost of bounds checking in C too.

But no, "memory safety" includes most of the things discussed on the slides, and those numbers are for bounds checking only.

encodedroseMay 26, 2026, 4:02 AM

Ah, I was using GH's webui instead of downloading to view the PDF and it stopped loading at slide#47...rereading it now paints a much better picture. Thanks!

AnimatsMay 26, 2026, 3:02 AM

There's a discussion of "delayed bounds checking", but not "hoisted bounds checking", where bounds checking is done early. Consider

    let mut tab: [usize;100] = [0;100];
    ...
    for i in 0..101 {
        tab[i] = i;
    }

This must panic at i=100. Panic becomes inevitable at entry to the loop. Is the compiler entitled to generate a check that will panic at loop entry? The slides suggest that Rust does not hoist such checks, and, so, with nested loops, it has trouble getting checks out of the loop, which prevents vectorization.

guerbyMay 26, 2026, 2:28 PM

On https://godbolt.org/ select Ada and compiler option "-O2"

    function Square(num : Integer) return Integer is
        tab : array (0..100) of integer;
    begin
        for i in 0..101 loop 
            tab(i):=i; 
        end loop;
        return tab(100);
    end Square;

The assembly code generated is :

    sub     rsp, 8    #,
    mov     esi, 11   #,
    mov     edi, OFFSET FLAT:.LC0     #,
    call    "__gnat_rcheck_CE_Index_Check"  #

The loop is not run and the exception handler is called directly.

Link : https://godbolt.org/z/qT4TsKPxz

AnimatsMay 26, 2026, 6:42 PM

Right, that's the extreme case, where the problem is detected at compile time. Unfortunately, it's not a user-visible error message at compile time.

Need to try an example where the size isn't known until run time.

afdbcreidMay 26, 2026, 3:27 AM

Currently LLVM cannot do that because the panic message includes the erroneous index. You can do it manually though if you add `_ = tab[100]`.

Even if the panic message did not include the index, LLVM would be unable to do that if the previous iterations had side effects (for example if `tab` is not a local variable).
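A sketch of the manual hoist for the case where the size is only known at run time. `fill_prefix` is a made-up name, and a slice parameter is used instead of the fixed-size local array from the example above. Note that this changes observable behaviour: with the up-front access, an out-of-range `n` panics before any element is written, whereas the plain loop writes the valid prefix first and then panics.

```rust
/// Writes `tab[i] = i` for every `i` in `0..n`.
pub fn fill_prefix(tab: &mut [usize], n: usize) {
    if n == 0 {
        return;
    }
    // Manual hoist: touch the highest index first. If n > tab.len()
    // this panics here, before the loop has any side effects.
    let _ = tab[n - 1];
    for i in 0..n {
        tab[i] = i;
    }
}
```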

santiagolertoraMay 26, 2026, 1:27 PM

[dead]

jarymMay 26, 2026, 9:21 AM

I've been doing more and more Rust. Even with sccache the compile times are not great, so for any moderately sized codebase that requires frequent rebuilds I don't know how everyone else is doing it.

wongarsuMay 26, 2026, 9:55 AM

I'd assume mostly by avoiding the need for frequent rebuilds. Incremental builds are pretty fast (at least fast enough for my needs on a moderate codebase), full rebuilds can be brutal

There are also some optimization tricks related to how you split your code among crates, since a unit of compilation is mostly one crate. Putting your FFI code in a separate crate (-sys crates are the norm) and splitting some of your code in libraries that can be compiled in parallel are the common examples

unsolved73May 26, 2026, 10:11 AM

the linking of the project can take more time than actual compilation.

Use the lld linker instead of the default one, see https://kerkour.com/rust-production-checklist#use-the-lld-li...

cauterizeMay 27, 2026, 5:17 AM

kache helps

peterfireflyMay 26, 2026, 8:22 PM

On Windows? Use a dev drive.

suis_sivaMay 26, 2026, 6:15 AM

I worked professionally with C, C++, Zig and Rust (in that order). My experience is that writing performant code is by far the easiest in C++, and by far the most difficult in C. Most of this, in practical experience, is due to ergonomics, in my opinion.

Templates in C++ benefit from being part of the core language, -- stick a `template` above your `class`, and you're in metaprogramming land. Stick a template specialization, and you've done a niche optimization. You didn't need a separate crate or a whole macro DSL. Variadic templates are also really really nice for monomorphizing N-ary generic functions. The duck typing of templates makes

This is precisely where I struggle with Rust the most -- monomorphization is limited within generics, so you end up going to the `proc_macro` hell, which involves a separate crate, a separate Cargo.toml, etc.

Zig seems like it would fit the bill -- and doing micro-optimizations within zig is surprisingly easy. The language's comptime facilities allow for really good niche optimizations -- however, the language also has some strange decisions. The allocator interface is notoriously a vtable, so a lot of the DOD optimizations that andrewrk has spoken of numerous times (and to be clear -- I did learn a lot about DOD from his talks back when I was a wee engineer), raise one of my eyebrows.

C seems like it should be fast, but implementing any data structure, any generic algorithm in C is impossible. Either you're copy-pasting, or you're making macro DSLs. None of which is great.

---

To further talk about the C++ situation -- the monomorphic allocator interface was always awesome. Compared to Zig's vtables and Rust's nothing (up until a couple days ago), having a way to pass custom allocators with types was awesome. The new std::pmr::* interfaces and containers are also really exciting -- monomorphization, as beautiful as it is, does cost a lot -- refactoring it is not easy, compilation times are a mess. Sometimes the right tool is a vtable interface, and, C++ gives you those facilities.

And this is C++'s no1 problem when it comes to performance too -- it's a leviathan -- it'll give you the tools to write REALLY fast code, but it will also give you inheritance -- forget about your caches then.

When I was working at Tesla, there were some pretty gnarly vtable jumps in firmware (of all places), and I suspect part of that could've been alleviated if people knew more about CRTP.

So, here's where I land -- C++ really will give you the tools it can to let you write the fastest code possible. But it will also give you the tools to make your code really slow. Committee language means everyone in the committee needs to be happy.

Rust, on the other hand, is really designed to promote safe-but-very-fast practices -- had the firmware that I discussed used Rust, my guess is that we would've gravitated towards generics and monomorphization, rather than the heavy dynamic inheritance. C++, when it comes to performance, as it does to all other things, is a barreled shotgun. Rust's design almost always promotes the best available pattern and that's why I rarely reach out for C++.
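To illustrate the generics-versus-vtable choice in Rust terms, here is a small sketch with a hypothetical `Shape` trait. The generic function is monomorphized per concrete type, so the `area` call can be resolved and inlined at compile time, which is roughly the role CRTP plays in C++. The `dyn` version keeps a single copy of the function and dispatches each call through a vtable:

```rust
pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);
pub struct Rect(pub f64, pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Static dispatch: one specialized copy per `S`, no vtable lookup.
/// All elements of the slice must have the same concrete type.
pub fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Dynamic dispatch: one copy of the function, mixed element types
/// allowed, each `area` call goes through the trait object's vtable.
pub fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

The trade-off is the one described above for `std::pmr`: monomorphization costs code size and compile time, while the vtable form costs an indirect call per element.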

PanzerschrekMay 26, 2026, 6:36 AM

For a couple of years I have written an advanced software rasterizer (like in old PC games) using Rust. With a little bit of unsafe code it was doable and the resulting performance was great. I only used unsafe in places mentioned in the article above, like in tight loops where the compiler's optimizer struggles to remove bounds checks and in a couple of places where CPU intrinsics were used.

smasher164May 26, 2026, 6:07 AM

You end up needing something like refinement types to control the way you statically enforce bounds. That being said, there's stuff like https://flux-rs.github.io/flux/ which uses macros to layer a refinement type system on top of rust's. You can use it to statically eliminate bounds checks.

ozgrakkurtMay 26, 2026, 8:35 AM

There is no such thing as the performance of a language; it depends heavily on the compiler in any language. I don’t think even clang/gcc can “fully” optimize C.

_alphageekMay 26, 2026, 3:35 AM

I would have liked to have the checks-off delta plotted across rustc versions - the deck notes this stuff moves non-monotonically, so a trend line would say more than a single-version snapshot.

DeathArrowMay 26, 2026, 6:30 AM

I was looking at Zig. It's performant, and it's easier to reason about Zig code than Rust code, but its API is unstable and there are a lot of breaking changes. Coding agents have a difficult time writing proper Zig because of the breaking changes and the small amount of new Zig code in the wild.

yogthosMay 26, 2026, 5:45 PM

I find the real issue with Rust is the compiler performance. I decided to use it for a project, and frankly I have huge regrets now that it's grown big enough. It literally takes over a minute to compile on my M1 laptop.

I just don't understand how people find this sort of thing normal. If you implement a feature, and then you want to see it in action, the feedback loop for that is insanely slow. It's incredibly jarring coming from Clojure where you have a live development experience.

I've used Go before, and while you still have to wait for compiling, at least the compiler is actually fast.

I get the problem Rust aims to solve, but the ergonomics are just not there in my opinion.

dralleyMay 26, 2026, 8:32 PM

There are a number of compiler performance enhancements (and correctness improvements) that are being worked on that are kind of at a chokepoint behind some other piece of work. Unfortunately it's not that easy to discern what state they're in or how quickly they're making progress.

At some point though a lot of work will be able to start advancing at once, so long as people exist to do the work.

e.g. https://rust-lang.github.io/rust-project-goals/2026/parallel...

DauntingPear7May 26, 2026, 6:20 PM

Yeah rust has issues. Have you tried splitting your project into multiple crates if possible?

yogthosMay 26, 2026, 7:38 PM

I haven't tried that yet, I guess that would be an option once I get a few stable pieces that I'm not likely to be touching much.

sibidharanMay 27, 2026, 7:47 AM

[dead]

sspoiskMay 26, 2026, 9:37 AM

[flagged]

jccx70May 26, 2026, 9:06 AM

[dead]

Performance of Rust Language [pdf]
