OpenTelemetry profiles enters public alpha

https://opentelemetry.io/blog/2026/profiles-alpha/

Comments

BigRedEye · Mar 27, 2026, 11:07 AM
At Perforator, we also started from Google's beautiful pprof, but then eliminated all nested repeated fields, converging to https://github.com/yandex/perforator/blob/main/perforator/pr.... Repeated fields in protobufs are really memory- and CPU-hungry.

This layout allows us to quickly merge hundreds of millions of samples into a single profile. The only practical limit is protobuf's 2GB message size cap.
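
A rough sketch (not Perforator's actual code) of why flat, index-based tables merge quickly: each profile is modeled pprof-style as a string table, a stack table of string indices, and `(stack, value)` samples, so merging many profiles becomes a single linear pass that interns strings and stacks and sums values per stack.

```python
from collections import defaultdict

def merge_profiles(profiles):
    """Merge many profiles into one by summing sample values per stack.

    Each profile is a dict with flat, index-based tables:
      strings: list of function names (string table)
      stacks:  list of tuples of indices into `strings`
      samples: list of (stack_index, value) pairs
    """
    out_strings, string_ids = [], {}
    out_stacks, stack_ids = [], {}
    totals = defaultdict(int)

    def intern_string(s):
        if s not in string_ids:
            string_ids[s] = len(out_strings)
            out_strings.append(s)
        return string_ids[s]

    for p in profiles:
        for stack_idx, value in p["samples"]:
            # Rewrite the stack's string indices into the merged string table.
            frames = tuple(intern_string(p["strings"][i])
                           for i in p["stacks"][stack_idx])
            if frames not in stack_ids:
                stack_ids[frames] = len(out_stacks)
                out_stacks.append(frames)
            totals[stack_ids[frames]] += value

    return {"strings": out_strings,
            "stacks": out_stacks,
            "samples": sorted(totals.items())}
```

The per-sample cost is one dictionary lookup and one addition, which is what makes merging hundreds of millions of samples practical.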

felixge · Mar 27, 2026, 7:54 PM
We had reached out to y'all last year to explore taking ideas from your format. It definitely looks interesting! But IIRC nobody from your team ended up making it to one of our SIG meetings?

https://github.com/yandex/perforator/issues/13

petterroea · Mar 27, 2026, 1:49 AM
This is cool, OTel is getting somewhere.

I've found OTel to still have rough edges, and it's not really the one-stop shop for telemetry we want it to be at $work yet. In particular, there's no good (Sentry-style) exception capturing yet. They also recently changed a lot of metric names (for good reason, it seems), which breaks most dashboards you find on the internet.

I have been warned by people that OTel isn't mature yet, and I find that to still be true, but it seems the maintainers are trying to do something about it nowadays.

darkwater · Mar 27, 2026, 12:47 PM
I think the "issue" with OTel is that instrumentation is easy and free (as in both beer and freedom), but the dashboarding part is where there are literally tens of different SaaS solutions, all more or less involved in OTel development itself, that want you to use their product (and pay $$$ for it). And even though you can go a loooong way with self-hosted Grafana + Tempo, even Grafana Labs is putting more and more new features behind the Grafana Cloud subscription model.
tuo-lei · Mar 27, 2026, 2:26 AM
Do you have any suggestions for alternatives, then (besides Sentry)? I do feel OTel has pretty wide support in general in terms of traces.
petterroea · Mar 27, 2026, 4:36 AM
I know a lot of shops that prefer the Datadog stack, which apparently does have its own Sentry-like exception capturing system. To me, exception capturing is an obvious core feature, and it is humiliating to discuss OTel with people who agree, use Datadog, and are satisfied.
genthree · Mar 26, 2026, 5:11 PM
Relatedly: has anyone profiled the performance and reliability characteristics of rsyslogd (the Linux and FreeBSD syslog daemon; maybe other platforms too) in the mode where it ships logs to a central node? I've configured and used it with relatively small set-ups (high single-digit node counts, bursts of activity up to a million or two requests per minute or so), but I've wondered if there's a reason it's not a more common solution for distributed logging and tracing (yes, it doesn't solve the UI problem for those, but it does solve collecting your logs).

Like… has anyone done a Jepsen-like stress test on rsyslogd and shared the results? I’ve half-assedly looked before and not been able to find anything.
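
For reference, the client-to-central forwarding mode described above is a short rsyslog config; a sketch (hostname `loghost` and port are placeholders):

```
# Client side: forward everything to the central node over TCP,
# with an on-disk queue so bursts and outages don't drop messages.
*.* action(type="omfwd" target="loghost" port="514" protocol="tcp"
           queue.type="LinkedList" queue.filename="fwd"
           queue.saveOnShutdown="on" action.resumeRetryCount="-1")

# Central side: accept TCP syslog from clients.
module(load="imtcp")
input(type="imtcp" port="514")
```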

jbaiter · Mar 26, 2026, 7:38 PM
We're doing this with a few dozen GiB of logs a day (rsyslog -> central rsyslog -> Elasticsearch). It works reliably, but the config is an absolute nightmare, the documentation is a mixed bag, and troubleshooting often involves deep dives into the C code. We're planning to migrate to Alloy+Loki.
baby_souffle · Mar 27, 2026, 2:12 PM
Similar experience here as well. Syslog configs and plugins are a mess. Vector is not perfect, but it's got a decent amount of tooling and has native support for tests, which is really useful.
ehostunreach · Mar 26, 2026, 11:17 PM
Since this is an OTel-related submission, you could also use OTel collectors to collect and forward logs to a central OTel collector instance.
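
A minimal sketch of that agent-to-gateway pattern, assuming the contrib collector's filelog receiver (log paths and the endpoint are placeholders):

```yaml
# Agent config (runs on each node): tail local logs, forward via OTLP.
# The central gateway runs a matching `otlp` receiver on port 4317.
receivers:
  filelog:
    include: [/var/log/app/*.log]
exporters:
  otlp:
    endpoint: central-collector:4317
    tls:
      insecure: true
service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [otlp]
```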

> yes it doesn’t solve the UI problem for those, but it does solve collecting your logs

I work for Netdata, and over the last couple of months we've developed an external Netdata plugin that can ingest/index OTel logs [1]. The current implementation stores logs in systemd-compatible journal files, and our visualization is effectively the same one someone would get when querying systemd journal logs [2].

> Like… has anyone done a Jepsen-like stress test on rsyslogd and shared the results? I've half-assedly looked before and not been able to find anything.

I've not used rsyslogd specifically, but I don't see how you'd have any issues with the log volume you described.

[1] https://github.com/netdata/netdata/tree/master/src/crates/ne...

[2] https://learn.netdata.cloud/docs/logs/systemd-journal-logs/s...

nesarkvechnep · Mar 26, 2026, 6:50 PM
People don’t care about syslog. 98% of my colleagues haven’t heard of it.
malux85 · Mar 26, 2026, 7:28 PM
You are drawing a global conclusion from a tiny sample!
nesarkvechnep · Mar 27, 2026, 5:48 AM
I really hope that I am because I care about it and like to use it whenever I can.
rrrix1 · Mar 27, 2026, 2:51 AM
Except every sysadmin and security engineer ever
nesarkvechnep · Mar 27, 2026, 5:48 AM
I've never met one who knew about syslog, but I'm glad there's a good chance I'm wrong in general.
SEJeff · Mar 26, 2026, 7:25 PM
I wonder how this compares to grafana pyroscope, which is really good for this sort of thing and already quite mature:

https://grafana.com/oss/pyroscope/

https://github.com/grafana/pyroscope

ollien · Mar 26, 2026, 7:59 PM
As far as I'm aware, Pyroscope itself is not a profiler, but a place you can send and query profiles. OpenTelemetry is releasing a profiler, so the two aren't directly comparable. One can be used with the other.
SEJeff · Mar 27, 2026, 12:52 AM
It definitely has profiling client libraries: https://grafana.com/docs/pyroscope/latest/configure-client/l...
sciurus · Mar 26, 2026, 8:50 PM
You can send profiles collected by opentelemetry to pyroscope.

https://grafana.com/docs/pyroscope/latest/configure-client/o...

AbanoubRodolf · Mar 27, 2026, 3:08 AM
Not really either/or. Pyroscope already has an OTel-compatible receiver, so you can send OTel profiles directly into it without any format conversion. They're complementary.

The OTel profiling standard is more valuable as a client contract than a backend choice. Instrument once with the OTel SDK, then route to Pyroscope, Grafana Cloud, Datadog, or your own Tempo instance, without changing application code. That's the actual pitch.
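
In practice, that "instrument once, route anywhere" contract usually comes down to the standard OTLP exporter environment variables defined by the OTel spec; a sketch with placeholder endpoints and a placeholder binary:

```shell
# Same instrumented binary; only the destination changes.
export OTEL_EXPORTER_OTLP_ENDPOINT="http://pyroscope.internal:4317"        # self-hosted backend
# export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.vendor.example:4317"    # or a SaaS backend
export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
./my-service
```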

The current friction isn't the standard itself. It's language coverage gaps. JVM continuous profiling via eBPF is solid and production-ready. Node.js still falls back to the V8 sampling profiler, which adds 2-5% overhead compared to sub-1% for kernel-level eBPF approaches. That's the gap worth watching in the alpha.

ollien · Mar 26, 2026, 6:47 PM
Very excited for this. We've used the Elixir version of this at $WORK a handful of times and have found it exceptionally useful.
secondcoming · Mar 26, 2026, 4:48 PM
> Continuously capturing low-overhead performance profiles in production

It surprises me that anything designed by the OTel community could ever meet 'low-overhead' expectations.

tanelpoder · Mar 26, 2026, 5:19 PM
The reference implementation of the profiler [1] was originally built by the Optimyze team that Elastic then acquired (and donated to OTEL). That team is very good at what they do. For example, they invented the .eh_frame walking technique to get stack traces from binaries without frame pointers enabled.

Some of the OGs from that team later founded Zymtrace [2] and they're doing the same for profiling what happens inside GPUs now!

[1] https://github.com/open-telemetry/opentelemetry-ebpf-profile...

[2] https://zymtrace.com/article/zero-friction-gpu-profiler/

rnrn · Mar 26, 2026, 9:06 PM
> For example, they invented the .eh_frame walking technique to get stack traces from binaries without frame pointers enabled.

This is not an accurate summary of what they developed.

Using .eh_frame to unwind stacks without frame pointers is not novel; it is exactly what it is for, and perf has had an implementation doing it since ~2010. The problem is that kernel support for this was repeatedly rejected, so the kernel samples kilobytes of stack and then userspace does the unwind.

What they developed is an implementation of unwinding from an eBPF program running in the kernel using data from eh_frame.
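
For contrast, the long-standing userspace approach can be seen in perf's DWARF mode, which copies a chunk of user stack per sample and unwinds it after the fact from DWARF/.eh_frame data (the binary name is a placeholder):

```shell
# Sample at 99 Hz, copying 8 KiB of user stack per sample for
# post-hoc DWARF unwinding in userspace (no frame pointers needed).
perf record -F 99 --call-graph dwarf,8192 -- ./my-app
perf report
```

Shipping those stack copies out of the kernel is exactly the overhead the eBPF in-kernel unwinder avoids.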

tanelpoder · Mar 26, 2026, 10:12 PM
True, I should have been more specific about the context:

Their invention is about pushing down the .eh_frame walking to kernel space, so you don't need to ship large chunks of stack memory to userspace for post-processing. And eBPF code is the executor of that "pushed down" .eh_frame walking.

The GitHub page mentions a patent on this too: https://patents.google.com/patent/US11604718B1/en

BigRedEye · Mar 27, 2026, 10:52 AM
I believe this is a case of convergent invention – the idea of pushing DWARF/.eh_frame unwinding into eBPF seems to have occurred to several people around the same time. For example, there's a working implementation discussed as early as March 2021: https://github.com/iovisor/bcc/issues/1234#issuecomment-7875...
felixge · Mar 26, 2026, 5:13 PM
OTel Profiling SIG maintainer here: I understand your concern, but we’ve tried our best to make things efficient across the protocol and all involved components.

Please let us know if you find any issues with what we are shipping right now.

phillipcarter · Mar 26, 2026, 4:52 PM
Anything to actually add?
antonvs · Mar 27, 2026, 5:45 AM
Do you feel better now?