Efficient Application Log Collection and Analysis using OpenTelemetry and Loki
In previous articles, we introduced how to use Otel’s automatic instrumentation in Kubernetes and how Otel works with a service mesh to implement distributed tracing. Both articles focused on distributed tracing, but logs, one of the three pillars of observability, are also a widely used way to observe a system. This article walks through the complete operational loop for application logs, from collection to querying.
Background
Introduction to OpenTelemetry
OpenTelemetry (referred to as Otel) is an open-source project aimed at providing unified standards for distributed tracing, metrics, and logging, simplifying the observability of applications. It offers a range of tools and APIs for collecting and transmitting performance data and logs of applications, helping developers and operations teams better understand system behavior. Its features include automatic and manual instrumentation for application trace data, collection of key metrics, and capturing and transmitting logs. Otel supports various programming languages and frameworks and can be integrated with several backend systems, such as Prometheus, Jaeger, and Elasticsearch.
Logs are one of the signals of the OpenTelemetry project, which aims to standardize the collection, transmission, and storage of log data.
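To make the idea of a standardized log record more concrete, here is a minimal sketch of the main fields such a record carries. The class name `LogRecordSketch` and its layout are illustrative assumptions made for this article, not the OpenTelemetry SDK's own types; the field names loosely follow the OpenTelemetry logs data model.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of a log record (not the OpenTelemetry SDK class).
final class LogRecordSketch {
    final long timestampUnixNano;          // when the event occurred
    final String severityText;             // e.g. "INFO"
    final int severityNumber;              // numeric severity; 9 corresponds to INFO
    final String body;                     // the log message itself
    final String traceId;                  // links the log to a trace (empty if none)
    final String spanId;                   // links the log to a span (empty if none)
    final Map<String, String> resource;    // who produced the log, e.g. service.name
    final Map<String, String> attributes;  // additional per-record key/value pairs

    LogRecordSketch(long timestampUnixNano, String severityText, int severityNumber,
                    String body, String traceId, String spanId,
                    Map<String, String> resource, Map<String, String> attributes) {
        this.timestampUnixNano = timestampUnixNano;
        this.severityText = severityText;
        this.severityNumber = severityNumber;
        this.body = body;
        this.traceId = traceId;
        this.spanId = spanId;
        this.resource = resource;
        this.attributes = attributes;
    }

    // A record can be correlated with a trace only when both IDs are present.
    boolean hasTraceContext() {
        return !traceId.isEmpty() && !spanId.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> resource = new LinkedHashMap<>();
        resource.put("service.name", "java-sample");
        LogRecordSketch record = new LogRecordSketch(
                System.currentTimeMillis() * 1_000_000L, "INFO", 9, "Hello World",
                "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7",
                resource, new LinkedHashMap<String, String>());
        System.out.println(record.severityText + " " + record.body
                + " traceContext=" + record.hasTraceContext());
    }
}
```

Because every record carries the same fields regardless of the logging library that produced it, a backend can correlate logs with traces without parsing free-form text.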
Introduction to Loki
Loki is a horizontally scalable, high-availability, multi-tenant log aggregation system developed by Grafana Labs. Designed for efficiency and ease of use, Loki primarily indexes the metadata of log content rather than the content itself, making it both lightweight and efficient. Loki adopts a tagging system similar to Prometheus, enabling more flexible and powerful log queries. It is commonly used for storing and querying large volumes of log data, especially when used in conjunction with Grafana, providing powerful log visualization and analysis capabilities.
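The idea of indexing labels rather than log content can be illustrated with a toy in-memory store. This is only a sketch of the concept under simplifying assumptions (exact label matching, substring search); the class `LabelIndexSketch` is hypothetical and does not reflect how Loki is implemented internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a toy store that indexes label sets, not log content.
final class LabelIndexSketch {
    // Each distinct label set identifies one stream; the lines themselves are not indexed.
    private final Map<Map<String, String>, List<String>> streams = new LinkedHashMap<>();

    void append(Map<String, String> labels, String line) {
        streams.computeIfAbsent(new TreeMap<>(labels), k -> new ArrayList<>()).add(line);
    }

    // Step 1: select streams by label. Step 2: scan only those streams for the text.
    List<String> query(String labelName, String labelValue, String contains) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Map<String, String>, List<String>> entry : streams.entrySet()) {
            if (!labelValue.equals(entry.getKey().get(labelName))) {
                continue;
            }
            for (String line : entry.getValue()) {
                if (line.contains(contains)) {
                    result.add(line);
                }
            }
        }
        return result;
    }

    int streamCount() {
        return streams.size();
    }

    // Small helper to build a label map from key/value pairs.
    static Map<String, String> labels(String... keyValues) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        LabelIndexSketch store = new LabelIndexSketch();
        store.append(labels("service_name", "java-sample"), "Hello World");
        store.append(labels("service_name", "java-sample"), "Shutting down");
        store.append(labels("service_name", "other"), "Hello World");
        System.out.println(store.query("service_name", "java-sample", "Hello"));
    }
}
```

The index stays small because it grows with the number of distinct label sets, not with the volume of log text. This is also why labels with many distinct values are usually avoided.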
Demonstration
This demonstration uses a Java application to walk through the complete log workflow. Among the languages supported by Otel logs, Java has one of the most complete implementations.
Architecture
- Otel Operator installs probes and loads configurations for Java workloads through automatic instrumentation settings.
- The application reports logs to the Otel collector via the otlp endpoint.
- The Otel collector outputs logs to Loki.
- Grafana visualizes logs using Loki as a data source.
Prerequisites
- Kubernetes cluster
- kubectl cli
- helm cli
Installing Loki and Grafana
Install the Grafana helm repository.
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
Prepare the configuration file values.yaml for Loki.
loki:
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: 'filesystem'
singleBinary:
  replicas: 1
Install Loki.
helm install --values values.yaml loki grafana/loki
Install Grafana.
helm install grafana grafana/grafana
Access Grafana through port forward at http://localhost:3000.
POD_NAME="$(kubectl get pod -l app.kubernetes.io/name=grafana -o jsonpath='{.items[0].metadata.name}')"
kubectl --namespace default port-forward $POD_NAME 3000
Configure the Loki data source in Grafana, pointing to the deployed Loki.
Installing Otel Operator
The Otel Operator relies on cert-manager for certificate management. Install cert-manager before installing the operator.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
Execute the following command to install the Otel Operator.
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
Configuring Instrumentation
After successfully installing the Otel Operator, the next step is to configure how probes are installed and configured. For detailed configuration instructions, refer to the Instrumentation API documentation.
Instrumentation is another CRD of the Otel Operator, used for automatic installation and configuration of Otel probes. Although this demonstration primarily focuses on logs, we still retain the distributed tracing configuration used previously to ensure that trace context is propagated.
- propagators configures how trace context is propagated between services.
- sampler configures the sampler.
- env and [language].env are environment variables added to the container.
For Java applications, set the OTLP endpoint through the environment variable OTEL_EXPORTER_OTLP_ENDPOINT, and set the application's log exporter to otlp with OTEL_LOGS_EXPORTER. It can also be set to logging,otlp, which outputs logs to both the console and the OTLP endpoint.
kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: instrumentation-sample
spec:
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "1"
  env:
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: otel-collector.default:4318
  java:
    env:
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: http://otel-collector.default:4317
      - name: OTEL_LOGS_EXPORTER
        value: otlp
EOF
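The tracecontext propagator in the configuration above corresponds to the W3C Trace Context format, which carries the trace ID and span ID in a traceparent HTTP header. As a rough illustration of what is being propagated, the following sketch parses such a header. The class `TraceParentSketch` is hypothetical, handles only the version 00 layout, and is not the Otel agent's implementation.

```java
// Illustrative parser for the W3C Trace Context "traceparent" header (version 00 only).
final class TraceParentSketch {
    final String traceId;   // 32 lowercase hex characters
    final String spanId;    // 16 lowercase hex characters
    final boolean sampled;  // lowest bit of the trace-flags field

    private TraceParentSketch(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    // Returns null when the header does not match version-traceid-spanid-flags.
    static TraceParentSketch parse(String header) {
        if (header == null) {
            return null;
        }
        // The -1 limit keeps trailing empty fields so malformed input is rejected.
        String[] parts = header.trim().split("-", -1);
        if (parts.length != 4 || !parts[0].equals("00")) {
            return null;
        }
        if (!parts[1].matches("[0-9a-f]{32}")
                || !parts[2].matches("[0-9a-f]{16}")
                || !parts[3].matches("[0-9a-f]{2}")) {
            return null;
        }
        // All-zero trace or span IDs are invalid.
        if (parts[1].matches("0+") || parts[2].matches("0+")) {
            return null;
        }
        boolean sampled = (Integer.parseInt(parts[3], 16) & 0x01) == 0x01;
        return new TraceParentSketch(parts[1], parts[2], sampled);
    }

    public static void main(String[] args) {
        TraceParentSketch ctx =
                parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
        System.out.println(ctx.traceId + " " + ctx.spanId + " sampled=" + ctx.sampled);
    }
}
```

The same trace ID and span ID are what allow a log line emitted during a request to be matched with the corresponding trace later on.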
Configuring OpenTelemetry Collector
In our design, the Otel Collector outputs logs to Loki. In reality, this is done by sending logs through Loki’s HTTP API. Therefore, an exporter compatible with the Loki API is needed: lokiexporter.
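To give a sense of what the exporter sends, the sketch below builds a JSON body in the shape accepted by Loki's push endpoint: a list of streams, each with a label set and a list of [timestamp in nanoseconds, log line] pairs. The class `LokiPushPayloadSketch` and its hand-written JSON escaping are assumptions made for illustration; the real lokiexporter builds its requests differently, and a real client would normally use a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: builds a JSON body in the shape of a Loki push request.
final class LokiPushPayloadSketch {

    // Minimal JSON string escaping for the characters most likely to appear in logs.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // One stream (identified by its labels) carrying a single log line.
    static String build(Map<String, String> labels, long timestampNanos, String line) {
        StringBuilder sb = new StringBuilder("{\"streams\":[{\"stream\":{");
        boolean first = true;
        for (Map.Entry<String, String> label : labels.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(escape(label.getKey())).append("\":\"")
              .append(escape(label.getValue())).append('"');
            first = false;
        }
        // The timestamp is sent as a string of nanoseconds since the Unix epoch.
        sb.append("},\"values\":[[\"").append(timestampNanos).append("\",\"")
          .append(escape(line)).append("\"]]}]}");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("service_name", "java-sample");
        long nowNanos = System.currentTimeMillis() * 1_000_000L;
        System.out.println(build(labels, nowNanos, "Hello World"));
    }
}
```

A body like this would be sent with an HTTP POST and a Content-Type of application/json to the push endpoint. In this demonstration the collector's exporter takes care of that, so the application never talks to Loki directly.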
The lokiexporter comes from the Otel Collector Contrib library and is not included in the official release. There are two ways to use lokiexporter in the collector:
- Use the official tool OpenTelemetry Collector Builder (ocb) to include lokiexporter in the binary during the collector build.
- Use the official distribution package otelcol-contrib, which contains all third-party components from the Contrib library. However, it is not recommended for production environments, only for testing. Our demonstration will use this distribution package.
For detailed configuration of the Otel collector, refer to the official documentation.
- Receiver: we configure otlp to receive telemetry data (in this demonstration, logs) from applications.
- Processor: convert some resource attributes in the log into Loki labels, such as service name, container name, namespace, and pod name.
- Exporter: configure the HTTP API endpoint of Loki, http://loki.default:3100/loki/api/v1/push.
- Pipeline service: use otlp as the input source and loki as the output destination.
kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
spec:
  image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.90.1
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      resource:
        attributes:
          - action: insert
            key: loki.resource.labels
            value: service.name, k8s.container.name, k8s.namespace.name, k8s.pod.name
    exporters:
      debug:
        verbosity: detailed
      loki:
        endpoint: "http://loki.default:3100/loki/api/v1/push"
        tls:
          insecure: true
        default_labels_enabled:
          exporter: true
          job: true
    service:
      pipelines:
        logs:
          receivers: [otlp]
          processors: [resource]
          exporters: [loki]
EOF
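The loki.resource.labels hint in the processor lists resource attributes by their dotted names, while the label used later in Grafana is service_name. The sketch below shows one way such a conversion could work: each listed attribute that is present becomes a label, with characters that are not valid in label names replaced by underscores. The class `LokiLabelHintSketch` and its exact replacement rule are assumptions for illustration, not the exporter's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: turns a comma-separated attribute hint into Loki-style labels.
final class LokiLabelHintSketch {

    // Replace characters outside letters, digits and '_' with '_' (assumed rule).
    static String toLabelName(String attributeName) {
        return attributeName.replaceAll("[^a-zA-Z0-9_]", "_");
    }

    // For each attribute named in the hint, copy its value under the sanitized name.
    static Map<String, String> labelsFromHint(String hint,
                                              Map<String, String> resourceAttributes) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (String name : hint.split(",")) {
            String key = name.trim();
            String value = resourceAttributes.get(key);
            if (key.isEmpty() || value == null) {
                continue; // attributes missing from the resource produce no label
            }
            labels.put(toLabelName(key), value);
        }
        return labels;
    }

    public static void main(String[] args) {
        Map<String, String> resource = new LinkedHashMap<>();
        resource.put("service.name", "java-sample");
        resource.put("k8s.namespace.name", "default");
        String hint = "service.name, k8s.container.name, k8s.namespace.name, k8s.pod.name";
        System.out.println(labelsFromHint(hint, resource));
    }
}
```

With the attributes from the example, only service_name and k8s_namespace_name become labels, because the other two attributes are not set on the resource.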
Deploying the Sample Application
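This is a very simple Java application that listens on port 8080 and prints a log line when responding to requests.
import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@Slf4j
@RestController
public class SpringBootRestApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootRestApplication.class, args);
    }

    // Writes one INFO log line per request to the root path.
    @GetMapping("/")
    public String hello() {
        log.info("Hello World");
        return "Hello World";
    }
}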
In the Maven pom.xml, only two dependencies are included: spring-boot-starter-web and lombok.
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.18.28</version>
    </dependency>
</dependencies>
Deploy the application.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: java-sample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: java-sample
  template:
    metadata:
      labels:
        app: java-sample
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: java-sample
          image: addozhang/spring-boot-rest
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
EOF
After deploying the application, it can be successfully accessed via port forwarding.
curl localhost:8080
Hello World
Testing
After configuring Loki as a data source in Grafana, select the configured Loki data source in Explore, and then choose the filter name service_name and value java-sample in Label Filters.
Clicking Run query displays the search results.
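The same label filter can also be expressed as a LogQL selector and sent to Loki's HTTP query API. The sketch below only builds the request URL for the query_range endpoint. The base URL http://localhost:3100 is an assumption (for example, a port-forward to the Loki service), and the class `LokiQuerySketch` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative only: builds a URL for Loki's query_range endpoint.
final class LokiQuerySketch {

    // LogQL stream selector matching a single label exactly.
    static String selector(String label, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{" + label + "=\"" + escaped + "\"}";
    }

    // The query is passed URL-encoded in the "query" parameter.
    static String queryRangeUrl(String baseUrl, String logql, int limit) {
        try {
            return baseUrl + "/loki/api/v1/query_range?query="
                    + URLEncoder.encode(logql, "UTF-8") + "&limit=" + limit;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String logql = selector("service_name", "java-sample");
        // Assumed base URL; adjust to wherever Loki is reachable.
        System.out.println(queryRangeUrl("http://localhost:3100", logql, 10));
    }
}
```

Requesting the resulting URL with any HTTP client should return the matching log lines as JSON, which is roughly what Grafana does behind the Explore view.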
Summary
In this article, we explored how to collect application logs using OpenTelemetry’s automatic instrumentation, process them with the OpenTelemetry Collector, and send the log data to Loki using the Loki exporter. Finally, we demonstrated how to use Grafana to query and analyze these logs. This process streamlines the log management workflow and makes the log data easier to visualize and use, giving developers and operations teams a more complete view of their applications and infrastructure. In particular, incorporating distributed tracing information such as traceid and spanid into the logs stored in Loki significantly improves the traceability and analyzability of log data.