by podoman on 2/1/24, 3:32 PM with 32 comments
Today we’re excited to be sharing a performance tool we’ve been working on, which helps you inspect the latency of code execution from the client to the server. As engineers at past startups, we often had to debug slow queries, poor load times, inconsistent errors, etc. While tools like Jaeger [2] helped us inspect server-side performance, we had no way to tie user events to the traces we were inspecting. In other words, although we had an idea of which API route was slow, there wasn’t much visibility into the actual bottleneck.
This is where our performance product comes in: we’re rethinking a tracing/performance tool that focuses on bridging the gap between the client and server.
What’s unique about our approach is that we lean heavily into creating traces from the frontend. For example, if you’re using our Next.js SDK, we automatically connect browser HTTP requests with server-side code execution, all from the perspective of a user. We find this much more powerful because you can understand what part of your frontend codebase causes a given trace to occur. There’s an example here [3].
From an instrumentation perspective, we’ve built our SDKs on top of OTel, so you can create custom spans in server routes to expand Highlight-created traces, and they will transparently roll up into the flame graph you see in our UI. You can also send us raw OTel traces and manually set up the client-server connection if you want. [4] Here’s an example of what a trace looks like with a database integration using our Golang GORM SDK, triggered by a frontend GraphQL query [5] [6].
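For readers wondering what connecting a browser request to a server-side trace can look like at the wire level, here is a minimal sketch using the W3C Trace Context `traceparent` header, which OTel propagators commonly use. This is an assumption for illustration: the post does not say which header the Highlight SDKs rely on, and the function names below are hypothetical. The sketch only accepts the version `00` layout.

```python
import re
import secrets

# W3C Trace Context, version 00: "00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>".
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Client side: start a trace and build the header for an outgoing request."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str):
    """Server side: return (trace_id, parent_span_id, sampled), or None if invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero trace or parent IDs are invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return trace_id, parent_id, bool(int(flags, 16) & 0x01)


def child_traceparent(header: str) -> str:
    """Server side: keep the caller's trace ID, but use a fresh span ID downstream."""
    parsed = parse_traceparent(header)
    if parsed is None:
        # No usable context from the client, so start a new trace instead.
        return new_traceparent()
    trace_id, _parent_id, sampled = parsed
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

Because the trace ID survives every hop, spans recorded in the browser, the API route, and the database layer can all be grouped into a single flame graph.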
In terms of how it's built, we continue to rely heavily on ClickHouse as our time-series storage engine. Given that traces require that we also query based on an ID for specific groups of spans (more akin to an OLTP db), we’ve leveraged the power of CH materialized views to make these operations efficient (described here [7]).
To try it out, you can spin up the project with our self-hosted docs [8] or use our cloud offering at app.highlight.io. The entire stack runs in Docker via a compose file, including an OpenTelemetry collector for data ingestion. You’ll need to point your SDK to export data to it by setting the relevant OTLP endpoint configuration (e.g., the environment variable OTEL_EXPORTER_OTLP_LOGS_ENDPOINT [9]).
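As a rough illustration of how the OTLP endpoint variables interact, the sketch below follows the resolution rules in the OpenTelemetry exporter specification for OTLP over HTTP: a signal-specific variable is used as-is, while the generic OTEL_EXPORTER_OTLP_ENDPOINT is treated as a base URL with `/v1/<signal>` appended. The helper name and the collector hostnames are hypothetical, and the port your self-hosted collector actually listens on may differ.

```python
import os

# Default base URL for OTLP over HTTP in the OpenTelemetry specification.
DEFAULT_HTTP_ENDPOINT = "http://localhost:4318"

SIGNALS = ("traces", "metrics", "logs")


def resolve_otlp_http_endpoint(signal: str, env=None) -> str:
    """Return the URL an OTLP/HTTP exporter would send `signal` data to.

    `env` is any mapping of variable names to values; it defaults to os.environ.
    """
    if signal not in SIGNALS:
        raise ValueError(f"unknown signal: {signal!r}")
    env = os.environ if env is None else env

    # 1. A signal-specific variable wins and is used without modification.
    specific = env.get(f"OTEL_EXPORTER_OTLP_{signal.upper()}_ENDPOINT")
    if specific:
        return specific

    # 2. Otherwise the generic variable (or the default) is a base URL,
    #    and the per-signal path is appended to it.
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT") or DEFAULT_HTTP_ENDPOINT
    return base.rstrip("/") + f"/v1/{signal}"


if __name__ == "__main__":
    # Hypothetical collector address, for illustration only.
    example_env = {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.example:4318"}
    for name in SIGNALS:
        print(name, "->", resolve_otlp_http_endpoint(name, example_env))
```

Note that OTEL_EXPORTER_OTLP_LOGS_ENDPOINT only affects logs; traces have their own OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, so setting the generic variable is the simplest way to route every signal to one collector.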
Overall, we’d really appreciate feedback on what we’re building here. We’re also all ears if anyone has opinions on what they’d like to see in a product like this!
[1] https://github.com/highlight/highlight/blob/main/LICENSE
[2] https://www.jaegertracing.io
[3] https://app.highlight.io/1383/sessions/COu90Th4Qc3PVYTXbx9Xe...
[4] https://www.highlight.io/docs/getting-started/native-opentel...
[5] https://static.highlight.io/assets/docs/gorm.png
[6] https://github.com/highlight/highlight/blob/1fc9487a676409f1...
[7] https://highlight.io/blog/clickhouse-materialized-views
[8] https://www.highlight.io/docs/getting-started/self-host/self...
[9] https://opentelemetry.io/docs/concepts/sdk-configuration/otl...
by kirubakaran on 2/2/24, 9:10 PM
by Kovah on 2/3/24, 11:30 AM
by omeze on 2/3/24, 6:41 AM
I clicked into the demo (reference 3 in your post) and I don't get how this is different from any off-the-shelf OTel solution like Lightstep or Jaeger. OTel already has client-side vendors and SDKs (I was the engineer who introduced OpenTracing to Plaid, and we did both). Bridging cross-network traces is literally the point of OTel, so I'm not sure why that's a differentiator for you… every system does it.
by PreInternet01 on 2/2/24, 8:53 PM
IBM Plex Mono,Menlo,DejaVu Sans Mono,Bitstream Vera Sans Mono,Courier,monospace
...will end up being "Courier" and that's not good in any way, shape or form. Consolas, Cascadia Code|Mono, even Lucida [Sans] Console are much nicer, and would spruce up the Safari experience for some segment of your audience. Plus, not sure why timestamps use font-light (this is Tailwind, right?), it just makes these less readable, while not taking up less space or anything.
by fifthofhisname on 2/3/24, 6:42 PM
by jamager on 2/2/24, 8:38 PM
Would be nice to be able to see the product without having to sign in.
by samstave on 2/2/24, 11:43 PM
Can this be used as a pen-testing tool to Highlight-TraceRt through a {target url}?
--
Re:
Yeah, though what benefits would you surmise a Shopify shopkeeper could get from your service? Or do you think this is a tool that Shopify Corporate should be using on their hosting infra?
by viraptor on 2/3/24, 1:05 AM
by withinboredom on 2/3/24, 10:33 AM
by ngalstyan4 on 2/2/24, 10:30 PM
Does this only collect logs from the frontend?
Or can it also collect the backend and DB latency data related to a frontend interaction?
by loceng on 2/3/24, 9:23 AM