Exploring OpenTelemetry Profiling Progress — Quick Start with the eBPF Agent
This article introduces the progress of OpenTelemetry Profiling and provides a quick guide to trying out the profiling eBPF agent. We’ll cover the concept and implementation of OTEL Profile in the next article.
Although profiling has progressed rapidly, the profile feature remains in the experimental phase and is not recommended for production use.
Background
In software engineering, observability refers to the capability of collecting and analyzing data on a program’s execution, internal module states, and inter-component communications. The three pillars of observability — metrics, traces, and logs — are key tools for gaining deep insights into application behavior, especially in distributed systems.
In distributed systems, application components are deployed across multiple nodes, increasing the complexity of event chains and data flows. Effective use of these observability tools can help identify potential issues, locate the source of failures, and greatly improve system stability and maintainability.
While the three pillars of observability provide extensive insights into system behavior and performance, they cannot delve into runtime performance at the code-function level. Profiling provides guidance for understanding resource usage and execution efficiency in applications. Continuous profiling allows ongoing performance analysis during application runtime, helping developers understand how code performance evolves over time.
In March of this year, the OpenTelemetry Profiling SIG, founded two years ago, announced support for the profiling signal. Since then, significant progress has been made, from standardization to tool development.
Progress
OTLP Profile Type
Following traces, metrics, logs, and baggage, OTLP 1.3 added a new data type, profiles. Compatibility with Google's pprof format was an initial goal, but it was later dropped.
eBPF Agent Improvements
In June, Elastic donated its profiling agent to the OpenTelemetry community, leading to the creation of the opentelemetry-ebpf-profiler project.
The ultimate goal of the eBPF agent is to function as a collection receiver on each node, gathering host profiling data and forwarding it via OTLP.
Collector Support for Profile Data
Since v0.112.0, the OpenTelemetry Collector can receive, process, and export profiling data, with OTLP-based receiving and exporting enabled via the service.profilesSupport feature gate.
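As a rough illustration of what enabling this might look like, the sketch below writes a minimal Collector configuration with a profiles pipeline. The helper name write_profiles_config, the file name, the otelcol binary name, the 4317 gRPC port, and the choice of the debug exporter are all assumptions for illustration and are not part of this article's setup. Whether a given receiver or exporter accepts profiles depends on the Collector version.

```shell
# Hypothetical helper: writes a minimal Collector config with a profiles pipeline to the path in $1.
write_profiles_config() {
  cat > "$1" <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # assumed default OTLP gRPC port
exporters:
  debug:
    verbosity: detailed          # print received data to the Collector's console
service:
  pipelines:
    profiles:
      receivers: [otlp]
      exporters: [debug]
EOF
}

# Example (assumes a Collector binary named otelcol is on PATH):
#   write_profiles_config ./profiles-config.yaml
#   otelcol --config ./profiles-config.yaml --feature-gates=service.profilesSupport
```

Without the feature gate, a Collector that sees a profiles pipeline in its configuration is expected to refuse to start, so the gate and the pipeline go together.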
Let’s now walk through a quick start with the eBPF profiling agent.
Quick Start
The eBPF agent has specific OS kernel requirements, with minimum kernel versions varying by platform:
- amd64/x86_64: 4.19
- arm64/aarch64: 5.5
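A quick way to check a Linux host against these minimums is sketched below. The helper names min_kernel_for_arch and kernel_at_least are hypothetical, and only the first two numeric components of the release string are compared, which assumes the usual major.minor.patch layout reported by uname -r.

```shell
# Minimum kernel version for an architecture, taken from the list above.
min_kernel_for_arch() {
  case "$1" in
    x86_64|amd64)  echo "4.19" ;;
    aarch64|arm64) echo "5.5" ;;
    *)             return 1 ;;   # architecture not covered by the list above
  esac
}

# Returns 0 when kernel release $1 (e.g. "5.15.0-91-generic") is at least $2 (e.g. "4.19").
# Note: plain POSIX sh has no local variables, so these names are global.
kernel_at_least() {
  have_major=${1%%.*}              # "5.15.0-91-generic" -> "5"
  rest=${1#*.}                     # -> "15.0-91-generic"
  have_minor=${rest%%[!0-9]*}      # -> "15"
  want_major=${2%%.*}
  want_minor=${2#*.}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Check the current host (meant to be run on the Linux machine, not on macOS).
arch="$(uname -m)"
if min="$(min_kernel_for_arch "$arch")" && kernel_at_least "$(uname -r)" "$min"; then
  echo "kernel $(uname -r) meets the minimum for $arch (>= $min)"
else
  echo "kernel $(uname -r) is too old, or $arch is not in the list above"
fi
```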
This guide uses a macOS + Linux cloud server environment. Currently, M1 virtual machines cannot access data outside the kernel, but this may change as tracked in this issue.
On a Linux host, we’ll build and run the eBPF agent. Docker is required for the build process, and official Docker images are available for convenience.
On macOS, we’ll run Elastic’s devfiler, which enables data visualization. Devfiler supports both x86 and arm64 versions of macOS and Linux. In development, devfiler can act as the profiling data receiver, listening on 0.0.0.0:11000.
The Linux agent needs access to the devfiler instance on macOS to report data. If both are on the same network, they can communicate directly. Here, because the agent runs on a cloud host, I’ve connected the two networks through a ZTM tunnel, which is why the agent is configured with 127.0.0.1:11000. For network bridging, refer to my guide on enhancing remote access for Zspace NAS with ZTM.
Let’s get started!
Build and Compile
Clone the repository and compile it with make. The executable ebpf-profiler will be available upon success.
git clone https://github.com/open-telemetry/opentelemetry-ebpf-profiler.git
cd opentelemetry-ebpf-profiler
make docker-image
make agent
The eBPF agent hasn’t released a stable version yet, so the version number is still v0.0.0 🤣.
./ebpf-profiler --version
v0.0.0
Start Devfiler
Download Elastic’s devfiler and authenticate using token c74dfc4db2212015.
If macOS refuses to open the application because it is unsigned, run the following command after extracting it:
xattr -d com.apple.quarantine ~/Downloads/devfiler-apple-silicon.app.zip
If no data is received, the screen will remain empty.
Start the Agent
On the Linux host, use the following command to start the agent, specifying the devfiler address via the -collection-agent flag.
sudo ./ebpf-profiler -collection-agent=127.0.0.1:11000 -disable-tls
Here, I used Docker to run a Java application to simulate application activity.
docker run --rm addozhang/spring-boot-rest:latest
Viewing Data
If the eBPF agent and network are configured correctly, you should see data in devfiler, such as a flame graph displaying the Java application’s call stack.
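If nothing shows up, a first thing to rule out is that the agent host cannot reach the devfiler endpoint at all. The sketch below is one way to test that from the Linux host. The helper name endpoint_reachable is hypothetical, and it assumes an nc (netcat) build that supports the -z and -w options, which is not true of every variant.

```shell
# Returns 0 if a TCP connection to host $1, port $2 succeeds within 2 seconds.
# Assumes an nc (netcat) build that supports -z (connect only) and -w (timeout).
endpoint_reachable() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Example, using the address passed to the agent above:
#   if endpoint_reachable 127.0.0.1 11000; then
#     echo "devfiler endpoint is reachable"
#   else
#     echo "devfiler endpoint is not reachable (check the tunnel and that devfiler is running)"
#   fi
```

A successful connection only shows that something is listening on that port. It does not confirm that the listener is devfiler or that the agent's data is being accepted.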