In this guide, we will cover some of the most useful OpenTelemetry Collector Processors out there. Hopefully, you will either optimize how you currently use the Collector or even consider implementing the Collector to take advantage of the Processors’ capabilities.
Before jumping into each Processor, it’s important to understand the Collector’s role and structure.
What to Expect
- The OpenTelemetry Collector
- OpenTelemetry Collector Processors
- Data flow: Filter, Group by and Routing
- Sampling: Tail and Probabilistic
- Runtime Environment
- Data Control: Attribute and Redaction
- Transformation: Transform, Span Metrics, Metrics Generation, and Cumulative to Delta / Delta to Rate
- Other: Schema and Graph
- Core: Batch Processor and Memory Limiter
- Pro Tip #2: Order of Processors
- Wrapping Up
The OpenTelemetry Collector
The Collector is a backend component that receives telemetry data sent by the OpenTelemetry SDK (or anything else that can produce OTel data, e.g., Envoy or Istio) and allows us to deliver it to any destination (e.g., a database or a visualization tool).
You can learn more about it in our Collector guide.
The OpenTelemetry Collector has 3 main components: Receivers, Processors, and Exporters.
As their name suggests, Receivers are in charge of receiving data from our applications. We can use a native OpenTelemetry SDK to create spans and export them to a receiver that listens on a specified port on the Collector. We can configure our receivers to accept both gRPC and HTTP protocols.
The OpenTelemetry Collector supports a number of built-in receivers that can be used to collect data from different sources. For a list of receivers, you can use this link. Some of the built-in receivers in the OpenTelemetry Collector include otlp, jaeger, zipkin, statsd, and prometheus for metrics.
Exporters are the components that export processed telemetry data to a specific destination. The OpenTelemetry Collector supports a number of built-in exporters that can be used to send data to different destinations.
The goal of the exporters is to convert OpenTelemetry data to different data formats as needed and then send it to the endpoint you define (using either HTTP or gRPC). For traces, you can export directly to Elasticsearch, Jaeger, Zipkin, and other vendors to enable distributed services visualization. For metrics, you can export to Prometheus, for example. Similar to the receivers, there are built-in exporters that include otlp, jaeger, zipkin, statsd, and prometheus for metrics.
Visit the OTel Collector Contrib to see a list of additional exporters you can use.
💡Pro Tip #1
In addition to the built-in exporters and receivers, the Collector allows us to build custom exporters and receivers to receive and send data to proprietary systems or destinations.
We don’t have much flexibility when it comes to receiving or exporting data. We have to follow predetermined criteria – for instance, we must use an Elasticsearch exporter to export data to Elasticsearch and a Kafka exporter to export data to Kafka. It is quite straightforward.
Within the Collector, we find the Processors component to be the most interesting since it provides a lot of options and room to play.
The Processor’s goal is to enable us to manipulate the data the Collector receives right before we export it to our DB or backend. It is really where the Collector’s magic happens.
There are several processors for each telemetry type (logs, metrics, traces). Each signal has its own pipeline you can alter accordingly. This makes sense since logs, metrics, and traces have different purposes. Therefore, we want to treat them differently.
OpenTelemetry Collector Processors
Here are some of the most common OpenTelemetry Collector Processors organized into categories:
- Data flow: filter, group by, routing
- Sampling: tail, probabilistic
- Runtime environment: K8s, detector, resource
- Data control: attribute, redaction
- Transformation: transform, cumulative to delta, delta to rate, metrics generation, span metrics
- Others: schema, graph
- Core: batch, memory limiter
We will be covering each one along with use cases.
Note: The categories we used to organize the Processors are not formal and are simply our own way of organizing them.
It is also important to mention that the Processors listed are part of the OpenTelemetry Collector contrib repository and are not part of the core functionality of the OpenTelemetry collector (other than the Core category). They are separate and can be added to the Collector if needed.
Let’s review our Processors.
The Processors in this category are useful when you want to control the flow of data as it is ingested. This can involve omitting certain data or altering the flow of the data in some way. These Processors help you manage the data being collected and ensure that it is being used effectively.
The Filter Processor allows you to filter telemetry by type and value, so you can control which data continues through the pipeline to the exporter and on to its destination.
For example, you may receive a large number of logs from production with varying levels of verbosity. It would be costly and unnecessary to send every single log to your vendor, so you can use the Filter Processor to only send logs with a level of “warning” or “critical” to your vendor. This way, you can control the amount of data being sent to the vendor with a single configuration change.
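To make the idea concrete, here is a minimal Python sketch of level-based log filtering. It only models the logic and is not the Filter Processor's code or configuration; the record shape, the `KEEP_LEVELS` set, and the `filter_logs` helper are hypothetical.

```python
# Illustrative model of severity filtering, not the Collector's own implementation.
KEEP_LEVELS = {"warning", "critical"}  # assumed allowlist, matching the example above

def filter_logs(logs, keep_levels=KEEP_LEVELS):
    """Return only the log records whose level is in the allowlist."""
    return [log for log in logs if log["level"] in keep_levels]

logs = [
    {"level": "debug", "body": "cache miss"},
    {"level": "warning", "body": "slow query"},
    {"level": "info", "body": "user logged in"},
    {"level": "critical", "body": "database unreachable"},
]
kept = filter_logs(logs)
```

Changing what reaches the vendor then comes down to editing one value (`KEEP_LEVELS` in this sketch).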
The Group by Attributes Processor allows you to combine multiple spans, logs, or metrics that share the same attribute values into a single group. This allows you to address the group as a whole, rather than each item individually. This Processor is useful for organizing and managing large amounts of data.
The Routing Processor reads a header from the incoming request (gRPC or plain HTTP) and directs the telemetry to specific exporters based on the header’s value. This allows us to split or redirect the pipeline (for example, we can have multiple pipelines for traces).
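As a rough illustration of grouping, the Python sketch below buckets records by the value of one attribute. It assumes records are plain dicts with an `attributes` map; `group_by_attribute` and the sample `host.name` values are hypothetical and not taken from the Collector.

```python
from collections import defaultdict

def group_by_attribute(records, key):
    """Bucket records by the value of one attribute (None if it is missing)."""
    groups = defaultdict(list)
    for record in records:
        groups[record["attributes"].get(key)].append(record)
    return dict(groups)

records = [
    {"name": "span-1", "attributes": {"host.name": "web-1"}},
    {"name": "span-2", "attributes": {"host.name": "web-2"}},
    {"name": "span-3", "attributes": {"host.name": "web-1"}},
]
groups = group_by_attribute(records, "host.name")
```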
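The routing decision can be pictured as a simple lookup. The Python sketch below is only a model of that decision: the `X-Tenant` header name, the exporter names, and `pick_exporters` are all assumptions made for illustration.

```python
# Hypothetical routing table: header value -> exporter names.
ROUTES = {
    "acme": ["jaeger/acme"],
    "globex": ["jaeger/globex"],
}
DEFAULT_EXPORTERS = ["jaeger/default"]

def pick_exporters(headers, header_name="X-Tenant"):
    """Choose exporters from one header's value, falling back to a default."""
    return ROUTES.get(headers.get(header_name), DEFAULT_EXPORTERS)
```

A request carrying `X-Tenant: acme` would go to `jaeger/acme` in this sketch, while a request without the header falls back to the default exporter.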
The Processors in the Sampling category are similar to the Filter Processor in that they allow us to decide which data to keep in the pipeline and which to drop.
The Tail Processor decides which traces to sample after their spans have arrived, based on a set of defined criteria. For example, you can keep only traces that contain errors or that match specific HTTP routes or status codes.
To properly handle and manage this processor, it is necessary to do some research. However, the effort is worth it because the Tail Processor allows you to gain valuable insights while minimizing costs. To learn more about tail sampling, refer to this article on managing OpenTelemetry cost.
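The key point of tail sampling is that the keep-or-drop decision is made per trace, once its spans are available. Here is a minimal Python sketch of that idea, assuming spans are dicts with `trace_id` and `status` fields; it is an illustration, not the Tail Processor's actual policy engine.

```python
from collections import defaultdict

def tail_sample(spans):
    """Keep every span of any trace that contains at least one error span."""
    traces = defaultdict(list)
    for span in spans:
        traces[span["trace_id"]].append(span)
    kept = []
    for trace_spans in traces.values():
        # The decision looks at the whole trace, not at a single span.
        if any(s["status"] == "error" for s in trace_spans):
            kept.extend(trace_spans)
    return kept

spans = [
    {"trace_id": "a", "name": "GET /cart", "status": "ok"},
    {"trace_id": "a", "name": "db.query", "status": "error"},
    {"trace_id": "b", "name": "GET /health", "status": "ok"},
]
sampled = tail_sample(spans)
```

Note that the healthy `GET /cart` span is kept because it belongs to a trace with an error, which is what makes the sampled trace useful for debugging.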
The Probabilistic Processor uses probabilistic sampling to randomly sample a portion of your traffic. For example, you can sample 10% of your traffic.
Unlike the Tail Processor, this Processor treats all data equally: you keep X% of your entire trace data regardless of its content. As a result, the sample will mostly reflect your most common traffic rather than the most insightful traces.
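One common way to sample a fixed percentage consistently is to hash the trace ID into a bucket and compare it to the threshold, so that all spans of a trace get the same decision. The Python sketch below assumes that approach; the choice of SHA-256 and the `keep_trace` helper are ours and may differ from what the Probabilistic Processor does internally.

```python
import hashlib

def keep_trace(trace_id: str, percentage: float) -> bool:
    """Deterministically keep roughly `percentage` percent of trace IDs."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the hash to one of 10,000 buckets (0..9999).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percentage * 100
```

Because the decision depends only on the trace ID, the content of the trace (errors, latency, route) plays no role, which is exactly the limitation described above.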
These Processors are highly recommended because they allow you to collect information about the environment in which your code is running.
There are three Processors in this category: the Kubernetes Processor, the Detectors Processor, and the Resource Processor.
1) The Kubernetes Processor detects metadata specific to the Kubernetes environment, such as the pod ID.
2) The Detectors Processor gathers more general information about the environment, such as whether it is in staging or production, and the AWS region.
3) The Resource Processor allows you to manually add resource attributes.
By using these Processors, you can easily gather important information about the environment and include it in your logs, metrics, and traces.
The Attribute Processor and Redaction Processor are useful tools for modifying and protecting data.
The Attribute Processor allows you to modify the attributes of the data itself. This Processor is useful in a number of situations, such as when something is wrong with the data or when you want to add additional information to the data for enrichment purposes. This is in contrast to data flow and sampling, which allow you to control the flow and selection of data, respectively.
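To show what modifying attributes can look like, here is a small Python sketch that applies insert, update, upsert, and delete actions to an attribute map. The action names mirror common attribute operations, but the `apply_actions` helper and the dict-based action format are hypothetical and not the Processor's configuration syntax.

```python
def apply_actions(attributes, actions):
    """Return a copy of `attributes` with each action applied in order."""
    result = dict(attributes)
    for action in actions:
        key, kind = action["key"], action["action"]
        if kind == "insert" and key not in result:
            result[key] = action["value"]   # add only if the key is missing
        elif kind == "update" and key in result:
            result[key] = action["value"]   # change only if the key exists
        elif kind == "upsert":
            result[key] = action["value"]   # add or change
        elif kind == "delete":
            result.pop(key, None)           # remove if present
    return result

span_attributes = {"http.method": "GET", "internal.debug_id": "42"}
enriched = apply_actions(span_attributes, [
    {"key": "env", "action": "insert", "value": "production"},
    {"key": "http.method", "action": "update", "value": "get"},
    {"key": "internal.debug_id", "action": "delete"},
])
```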
The Redaction Processor is an effective tool for protecting personally identifiable information (PII) and complying with the General Data Protection Regulation (GDPR). For example, you can use the Redaction Processor to obscure email addresses for data privacy purposes.
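As an illustration of obscuring email addresses, the Python sketch below masks anything that looks like an email in string attribute values. The regular expression is a simplified assumption (it will not match every valid address), and the `redact` helper and `****` mask are hypothetical.

```python
import re

# Simplified email pattern, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def redact(attributes, mask="****"):
    """Replace email-like substrings in string values; leave other values as-is."""
    return {
        key: EMAIL_PATTERN.sub(mask, value) if isinstance(value, str) else value
        for key, value in attributes.items()
    }

attributes = {
    "user.email": "jane@example.com",
    "message": "reset link sent to jane@example.com today",
    "http.status_code": 200,
}
safe = redact(attributes)
```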
The Processors in this category allow us to modify and customize our data (e.g., modify fields within spans and generate new metric fields).
The Transform Processor allows you to modify fields in traces, metrics, and logs within the collector. You can define transformations and conditions, then apply them to your telemetry data. This way, you can customize the data collected and processed by the collector.
Using the Span Metrics Processor, we can aggregate data from spans into metrics and then export them to a relevant backend. For example, you could gather data about the HTTP path and response status code, enabling you to see which routes have the most 500 errors.
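The aggregation from spans to metrics can be sketched in a few lines of Python. This example counts calls per route and status code and then looks up the route with the most 500 responses; the span shape and the `count_calls` helper are assumptions for illustration, not the Span Metrics Processor's output format.

```python
from collections import Counter

def count_calls(spans):
    """Count spans per (route, status code) pair."""
    return Counter((s["http.route"], s["http.status_code"]) for s in spans)

spans = [
    {"http.route": "/checkout", "http.status_code": 500},
    {"http.route": "/checkout", "http.status_code": 200},
    {"http.route": "/checkout", "http.status_code": 500},
    {"http.route": "/login", "http.status_code": 500},
]
calls = count_calls(spans)

# Keep only the 500s and find the route that produced the most of them.
errors_by_route = Counter(
    {route: n for (route, status), n in calls.items() if status == 500}
)
worst_route, worst_count = errors_by_route.most_common(1)[0]
```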
The Metrics Generation Processor is a useful tool for deriving new metrics from existing data. For example, you can use this Processor to calculate the ratio of actual memory usage to total available memory for a specific application. This can be useful for understanding the performance of the application and identifying any potential issues.
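The memory ratio example boils down to dividing one existing metric by another. Here is a minimal Python sketch of that calculation; the `memory_utilization` name and the byte-based inputs are hypothetical.

```python
def memory_utilization(used_bytes: float, total_bytes: float) -> float:
    """Derive a new metric: the fraction of available memory in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes

# Example: 2 GiB used out of 8 GiB available.
ratio = memory_utilization(2 * 1024**3, 8 * 1024**3)
```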
Cumulative to Delta / Delta to Rate
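These two Processors change how metric data is represented. Cumulative to Delta takes values accumulated over time and reports the change between consecutive measurements instead. Essentially, it converts monotonic, cumulative sum and histogram metrics to monotonic, delta metrics. Delta to Rate then converts delta sums into a rate, i.e., the change per unit of time.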
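The arithmetic behind both conversions is simple enough to sketch in Python. The example assumes a list of `(timestamp_seconds, cumulative_value)` points sorted by time and ignores counter resets; both helper functions are hypothetical and only illustrate the math.

```python
def cumulative_to_delta(points):
    """Turn (time, cumulative) points into (start, end, delta) intervals."""
    return [
        (t0, t1, v1 - v0)
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    ]

def delta_to_rate(deltas):
    """Turn (start, end, delta) intervals into (end, per-second rate) points."""
    return [(end, delta / (end - start)) for start, end, delta in deltas]

# A request counter read every 10 seconds.
cumulative = [(0, 100), (10, 150), (20, 180)]
deltas = cumulative_to_delta(cumulative)
rates = delta_to_rate(deltas)
```

Here the counter grew by 50 and then by 30 requests, which corresponds to 5 and 3 requests per second over the two intervals.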
OpenTelemetry publishes a telemetry schema as part of the specification. This schema constantly changes and you may get different versions of the semantic conventions. This Processor helps you close the gap between the different versions and transform one schema into another.
The Graph Processor allows you to generate a graph view of your services.
The Batch Processor is used to group spans, metrics, or logs into batches for more efficient transmission. This can help to compress the data and reduce the number of connections needed to send it. The Batch Processor supports both size-based and time-based batching.
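To illustrate size-based and time-based batching together, here is a small Python sketch: a batch is sent when it reaches a maximum size, or when a timeout has passed since its first item arrived. The `Batcher` class, its parameter names, and the `tick` method are hypothetical; the clock is injectable so the behavior is easy to test.

```python
import time

class Batcher:
    """Collect items and send them in batches by size or by age."""

    def __init__(self, send, max_size=100, timeout_s=5.0, clock=time.monotonic):
        self.send = send            # callable that receives a list of items
        self.max_size = max_size
        self.timeout_s = timeout_s
        self.clock = clock
        self.items = []
        self.first_item_at = None

    def add(self, item):
        if not self.items:
            self.first_item_at = self.clock()
        self.items.append(item)
        if len(self.items) >= self.max_size:   # size-based flush
            self.flush()

    def tick(self):
        """Call periodically; flushes a partial batch once it is old enough."""
        if self.items and self.clock() - self.first_item_at >= self.timeout_s:
            self.flush()                        # time-based flush

    def flush(self):
        if self.items:
            self.send(self.items)
            self.items = []
```

The timeout matters for low-traffic periods: without it, a partially filled batch could wait indefinitely for enough items to arrive.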
The Memory Limiter helps to prevent out-of-memory failures by limiting the amount of memory the Collector can use. This helps ensure that the Collector does not exceed its available memory resources and helps prevent performance issues.
💡 Pro Tip #2: Order of Processors
The order of processors is important when building your pipeline because each processor passes its output to the next one. For example, it’s best to sample spans first and then alter the remaining data, since there’s no point in processing spans that will eventually be filtered out.
The Processors are, in our view, the most interesting of the Collector’s three components. They are where much of the processing and management of telemetry data occurs, allowing users to optimize their use of telemetry data, improve performance, and potentially save costs (which we LOVE).
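The effect of ordering can be shown with a toy pipeline in Python, where each processor is a function applied in sequence. Both orderings below produce the same output, but sampling first means the enrichment step touches far fewer spans. All names here (`run_pipeline`, `keep_errors`, `make_enricher`) are hypothetical.

```python
def run_pipeline(spans, processors):
    """Pass the spans through each processor in the order given."""
    for processor in processors:
        spans = processor(spans)
    return spans

def keep_errors(spans):
    """Sampling step: keep only error spans."""
    return [s for s in spans if s["status"] == "error"]

def make_enricher(work_log):
    """Build a processor that tags spans and records how many it touched."""
    def add_env(spans):
        work_log.append(len(spans))
        return [{**s, "env": "prod"} for s in spans]
    return add_env

# One error span and nine healthy spans.
spans = [{"id": i, "status": "error" if i == 0 else "ok"} for i in range(10)]

sample_first_work, enrich_first_work = [], []
out_a = run_pipeline(spans, [keep_errors, make_enricher(sample_first_work)])
out_b = run_pipeline(spans, [make_enricher(enrich_first_work), keep_errors])
```

In this sketch, `sample_first_work` records a single enriched span, while `enrich_first_work` records ten, nine of which are thrown away by the next step.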
As we see it, familiarizing yourself with the different processors available in the OpenTelemetry Collector is extremely helpful to fully utilize its potential. The Collector, in general, is especially important to consider when looking to scale your use of OpenTelemetry.
To learn more about the OpenTelemetry Collector, we have created many resources that you can refer to, and as always, we are available to answer any questions you may have about OpenTelemetry via our chat.