At my previous job, I wrote production eBPF exclusively in Rust using Aya (mentioned by a sibling comment), and it was a blast. Being able to share the type definitions between the kernel-space and the user-space code is a blessing for avoiding subtle issues when going through the maps. And, at least in Rust, you can reuse crates and types that save you time. As a (simple) example, being able to use the standard library's IpAddr types or the ipnet crate, instead of rolling your own IP and network manipulation libraries, is a (small) time saver. Its main value is not needing to onboard new developers.
The Rust type system is a good helper in keeping the verifier happy. Slices, iterators, match statements, etc. are very good in my experience (e.g. Option is a godsend to ensure you stay within the bounds of the input packet, esp. slice::split_at when parsing headers).
But you're right that reading C is non-negotiable, especially since pretty much all example code on the internet is in C.
I agree though, I think it makes more sense just to write the C code. The hard part of eBPF isn't writing the code; it's getting it past the verifier.
> gc, the Go compiler, has no LLVM-based BPF backend. Adding one is a multi-year compiler project. rustc is built on LLVM and that's why Aya works. So gobee emits C and reuses clang's BPF backend, which gives us mature codegen, BTF, and CO-RE relocations for free.
I wonder if TinyGo (https://tinygo.org/) might be a better fit here:
> TinyGo brings the Go programming language to embedded systems and to the modern web by creating a new compiler based on LLVM.
I've not played with TinyGo much so would be interested to hear other people's experiences.
You could likely improve gobee to use tinygo's packages directly, instead of transpiling to C and calling into clang, and the licenses of the two projects look compatible. You'll still need to deal with defining a subset to pass the verifier, of course.
---
From the README:
> Replace clang. clang's BPF backend gives us CO-RE, BTF, and verifier-friendly codegen for free. Reimplementing that costs years and gains nothing.
The primary gotcha you may hit if you try this is how much of the BPF feature set is implemented by clang and how much is instead implemented in core LLVM. Even with an LLVM sitting next door that you could pull from, the harnesses may not exist independent of clang, but I have not looked THAT deep.
[1] https://github.com/mickael-kerjean/filestash/blob/master/ser...
What I was more curious about was how TinyGo compares with Google's Go compiler and whether TinyGo’s LLVM compiler is compatible with vanilla LLVM compilers (e.g. can TinyGo compile to all the same targets that, for example, Rust could?)
Apart from that, the usual qualms one might have about C are not really relevant in eBPF land, so I’ve actually found it the nicest experience writing C I’ve ever had, the verifier is just the price we have to pay.
- portability isn't a concern
- BPF ASM syntax is quite readable
- it can often let you write simpler code by directly doing what the verifier needs instead of dancing around trying to make Clang do it for you.
I think the most exciting alternative BPF language would be one where the compiler interacts with the verifier. E.g. if the program included a logical proof of correctness that the verifier could check more efficiently than its limited builtin analysis.
Personally I would choose Rust as well, but I would choose Rust for almost everything I do. I can see why a Go developer would want a similar experience.
If this was compiling the Golang to BPF then yeah, that would feel ridiculous, but given that it's transpiling instead then, assuming that it's generating correct and reasonable code, I think this is certainly fine enough. Especially if you're just writing a proof of concept or something pretty basic, there's no reason not to start here.
If you're doing something like trying to filter 40Gbps of network traffic in eBPF then you'd probably want to consider something more hand-tuned/low-level, but that might well be a premature optimization for all I know.
Note that we at Bomfather have our userspace code written in Golang and our eBPF code written in C.
But, either way, this is a really cool solution/idea and could make writing eBPF code a lot easier.
The tooling is phenomenal and fast. It won't let me accidentally leave a variable unused, meaning that it won't let me write `foo, err := something()` and not check err. It makes a lot of stuff explicit (e.g. there's no `array.add(item)`, just `array = append(array, newitem)`, which makes it more obvious that I might be creating a lot more arrays than just the one I'm trying to work with), but it lets me do `make([]string, 5000)` to pre-allocate the length I want if I know what I need.
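To make the append point concrete, here is a minimal sketch that counts how often the backing array is replaced while appending, with and without reserving space first. The function names are made up for illustration. One detail worth knowing: `make([]string, 5000)` sets the length (you get 5000 empty strings), while `make([]string, 0, 5000)` reserves capacity and leaves the length at zero, which is the form that pairs with append.

```go
package main

import "fmt"

// appendGrowth (hypothetical helper) appends n items to a nil slice and
// counts how many times the capacity changed, i.e. how many times append
// had to allocate a new backing array.
func appendGrowth(n int) (items []string, reallocs int) {
	for i := 0; i < n; i++ {
		before := cap(items)
		items = append(items, "item")
		if cap(items) != before {
			reallocs++
		}
	}
	return items, reallocs
}

// appendPrealloc (hypothetical helper) does the same work, but reserves
// capacity for n items up front so append never needs to reallocate.
func appendPrealloc(n int) (items []string, reallocs int) {
	items = make([]string, 0, n) // length 0, capacity n
	for i := 0; i < n; i++ {
		before := cap(items)
		items = append(items, "item")
		if cap(items) != before {
			reallocs++
		}
	}
	return items, reallocs
}

func main() {
	_, grown := appendGrowth(5000)
	_, pre := appendPrealloc(5000)
	fmt.Println("reallocations without preallocation:", grown)
	fmt.Println("reallocations with preallocation:", pre)
}
```

The exact number of reallocations in the first case depends on the Go version's growth strategy, so the sketch only prints it rather than assuming a value.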
Every variable type has a default 'empty' value that is a valid value; an int with no value assigned is 0, a string with no value assigned is "", so you never get corrupted or random data when one of your branches doesn't set the value.
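A small sketch of the zero-value behaviour described above. The `counters` struct and `classify` function are invented for the example; the point is only that a declared-but-unassigned variable has a defined value on every branch.

```go
package main

import "fmt"

// counters is a hypothetical struct used only to show zero values.
type counters struct {
	packets int
	iface   string
	dropped bool
	tags    []string
}

// classify leaves label unassigned on one branch; the caller still
// gets a well-defined empty string rather than garbage.
func classify(port int) string {
	var label string // zero value is ""
	if port == 443 {
		label = "https"
	}
	return label
}

func main() {
	var c counters // every field starts at its zero value
	fmt.Printf("%d %q %t %t\n", c.packets, c.iface, c.dropped, c.tags == nil)
	// prints: 0 "" false true
	fmt.Printf("%q %q\n", classify(443), classify(22))
	// prints: "https" ""
}
```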
It has a lot of nice thread-safety stuff, since goroutines are such a thing. There's built-in functionality to say "Spawn all these goroutines and then wait until they're done", but there's also functionality to say "Here's a function, it should be called at most once across the lifetime of the program" so that you don't have to manually synchronize "did I do this initialization yet? Is it done yet? Get a lock and then check everything and then set everything."
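The two facilities described here are presumably `sync.WaitGroup` ("spawn these and wait") and `sync.Once` ("call this at most once"). A minimal sketch combining them, with made-up names `setup` and `runWorkers`:

```go
package main

import (
	"fmt"
	"sync"
)

var (
	initOnce  sync.Once
	initCalls int // how many times setup actually ran
)

// setup stands in for some one-time initialization.
func setup() { initCalls++ }

// runWorkers starts n goroutines, each of which requests initialization,
// and returns once all of them have finished.
func runWorkers(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			initOnce.Do(setup) // runs setup exactly once across all goroutines
			mu.Lock()
			done++
			mu.Unlock()
		}()
	}
	wg.Wait() // blocks until every goroutine has called Done
	return done
}

func main() {
	fmt.Println("workers finished:", runWorkers(8)) // prints: workers finished: 8
	fmt.Println("setup calls:", initCalls)          // prints: setup calls: 1
}
```

No hand-written "did I initialize yet?" flag or lock is needed around `setup`; `sync.Once` handles both the check and the synchronization.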
And it's fast. It's really, really fast. It's so fast that I was testing a GOCACHEPROG program to cache intermediate compilation results instead of recompiling them and in at least some cases it was faster to recompile than to use the cache. The cache was a cloud storage bucket in another country, mind you, but with Rust or C++ that would still be a huge win. With Golang I had to work really, really hard to get cloud storage of intermediate artifacts to be faster than just recompiling on my laptop.
So yeah, I hate Golang and I hate writing Golang but... yeah, it's pretty good.
Since you don't want to handle any kind of crash, out of bounds exception, etc., the eBPF verifier does a ton of impressively paranoid stuff. It ensures that the program doesn't loop (or if it loops that the loop is provably bounded and cannot be infinite), it guarantees that you don't read from a register that might not have been written to, etc.
Basically, it needs to be able to mathematically prove without a doubt that the program behaves as it's supposed to or the verifier refuses to load it at all. WASM doesn't do that, since WASM is a general-purpose 'machine' and WASM programs could theoretically just run forever in entirely reasonable cases.
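To give a feel for the shape of code this pushes you toward, here is a sketch in ordinary Go (not eBPF, and not necessarily something gobee accepts, since its supported subset isn't described here). It reads the EtherType from an Ethernet frame, skipping VLAN tags, with a constant loop bound and an explicit length check before every read. The function name and the limit of two tags are assumptions made for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	ethHeaderLen  = 14     // dst MAC (6) + src MAC (6) + EtherType (2)
	vlanTagLen    = 4      // one 802.1Q tag
	maxVLANTags   = 2      // constant loop bound (assumed limit for this sketch)
	etherTypeVLAN = 0x8100 // 802.1Q tag protocol identifier
)

// innerEtherType (hypothetical helper) returns the EtherType after skipping
// up to maxVLANTags VLAN tags. It returns false if the packet is too short.
// If more tags are present than the bound allows, it returns the VLAN
// EtherType it stopped at.
func innerEtherType(pkt []byte) (uint16, bool) {
	if len(pkt) < ethHeaderLen {
		return 0, false // bounds check before the first read
	}
	offset := 12
	etherType := binary.BigEndian.Uint16(pkt[offset : offset+2])
	for i := 0; i < maxVLANTags; i++ { // provably terminates
		if etherType != etherTypeVLAN {
			break
		}
		offset += vlanTagLen
		if offset+2 > len(pkt) {
			return 0, false // bounds check before every further read
		}
		etherType = binary.BigEndian.Uint16(pkt[offset : offset+2])
	}
	return etherType, true
}

func main() {
	frame := make([]byte, 18)
	frame[12], frame[13] = 0x81, 0x00 // VLAN tag present
	frame[16], frame[17] = 0x08, 0x00 // inner EtherType: IPv4
	et, ok := innerEtherType(frame)
	fmt.Printf("%#04x %t\n", et, ok) // prints: 0x0800 true
}
```

In real eBPF C the same checks are written against the packet's start and end pointers; the verifier rejects the program if any read could fall outside them.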
This is why National Aeronautics & Space Administration (NASA) guidance is the following:
> Acronyms often confuse readers. Avoid them whenever possible. If an acronym is necessary for future reference, spell the full word and follow with the acronym in parentheses on the first reference. For example, The General Services Administration (GSA).
https://nasa.github.io/content-guide/abbreviations-and-acron...
There is also this longer memo on the NASA Technical Reports Server: https://ntrs.nasa.gov/citations/19950025292
So I can't blame the project's author for the lack of explanation about what BPF is. Particularly when it's just someone's personal project.
And before anyone complains about this comment: I do think the GP is completely fair in asking for clarification as to what BPF is too. There sometimes seems to be backlash on HN against people asking for a term to be explained. This comment isn't that.
My first thought would have been Band-Pass Filter, which is also a filter potentially related to computer systems.
I work in an industry with a lot of Three-Letter Acronyms (TLAs) and eXtended Three-Letter Acronyms (XTLAs) (sometimes known as Four-Letter Acronyms (FLAs)), and there they are often overloaded in their meanings. So in my experience, being clear about the definition is helpful to readers so they can immediately understand the document without having to triangulate meanings from the rest of the document.
Anyone who might need this would already know what BPF is. And anyone who isn't familiar with the term BPF in this context wouldn't be the target audience for this.
It's also worth noting that BPF isn't ever referred to in its non-acronym form. Literally no-one in the field calls this "Berkeley Packet Filter". Just like nobody calls PHP "PHP: Hypertext Preprocessor" (or whatever backronym they've decided on this week), nor SQL "Structured Query Language". The technology is literally just referred to as "BPF".
So while I agree with your point in general -- it's not really a fair complaint in this specific occurrence.
And it's been explained on HN exactly what it is. So problem solved.
By the way, I noticed you didn't follow your own recommendation for the "RF" acronym in your comment. Nor "NASA" in your first reply. Perhaps you should check your own comments before you criticize others for doing the same.
Maybe you should also leave a comment in the Mullvad story (currently #1 on HN) that nobody has explained the VPN acronym there. Likewise for the threads where people reminisce about BASIC, of which there have been many lately. They're also only obvious if you already know the subject matter.
Not quite as much as Radar or Laser, but halfway there.
~On the other hand, BPF means different things in different domains, and isn't ubiquitous in the same way~
Edit: I should have written it out, that's on me :-)
Here you go: https://github.com/pratyushanand/learn-bpf
Per 6.4.3 (Identifiers) of C23 (ISO/IEC 9899:202y N3886):
— All identifiers that begin with a double underscore (__) or begin with an underscore (_) followed by an uppercase letter are reserved for any use, except those identifiers which are lexically identical to keywords.
— All identifiers that begin with an underscore are reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
https://open-std.org/JTC1/SC22/WG14/www/docs/n3886.pdf
And per 7.1.3 (Reserved identifiers) of C11 (ISO/IEC 9899:201x N1570):
— All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
Assuming it's your project, here is some unsolicited feedback:
(1) imprint missing, no idea where the company is operating based on name or tld, cannot rule out it is in adversary country
(2) not a fan of curl | sh, looks way more professional if some prebuilt packages for common distros are also offered. maybe remove the yellow box and add some distro logos "available on your favorite distro"
(3) On landing page I think the last section with the cost comparison should actually be at the very top. No sysadmin wants to have AI chat on their machines. The cost comparison chart shows well-known tools that every sysadmin knows (splunk etc), and directly relates yeet to it - this is very good.
(4) the main landing page hero text is not really explanatory - linux ops is a big term, and there was not a lot of info I got out of it. Further down there is "yeet gives you kernel level visibility with featherweight overhead. Nothing gets dropped.", which I'd personally prefer. Maybe instead of "yeet is a JavaScript runtime for Linux Ops." use something like "yeet is a JavaScript runtime for your Linux kernel".
Generally the sysadmins I know are not looking for AI chats or agent toolkits, and right now these are "features" that might make people close the tab. But sysadmins do want to easily get custom analytics and reduce SaaS costs; those are the features they look for.
Maybe it makes sense to more clearly split up the "specialized JavaScript for the Linux kernel" thing from the AI features. No manager bats an eye if I install a new JavaScript runtime that allows better LOCAL-FIRST (!) Linux kernel analytics, but a lot of explaining needs to be done if there are "agents" or "AI chats" which can potentially exfiltrate data.
I'm not really sure why you'd want to use this. If you're writing eBPF, you already need to know how to read C kernel source.