Aspecto blog

On microservices, OpenTelemetry, and anything in between

Open-Source Malabi: How to Use Jaeger Tracing as a Backend

Malabi is an open-source library that helps you write integration tests by doing 2 things:

  1. Collecting data from your tested microservice during the test run
  2. Exposing an endpoint for making assertions on this data (with helper functions to call the endpoint).

It is important to understand that Malabi is an implementation of a new testing paradigm called “Trace-Based Testing”, which means that we utilize the power of traces to make test assertions.

You can learn more about trace-based testing in this article (more on that later).

In this guide, I’ll explain how to use Malabi with Jaeger Tracing as a storage backend (and tell you what I mean by that).

Malabi Architecture Overview

By the end of this guide, your setup will look like this (detailed explanations of each component follow):

1) Data Collection – OpenTelemetry Under The Hood

To use this library, it’s important to understand a bit about how it works.

The collected data is generated by OpenTelemetry, an open-source project for creating and collecting telemetry data. Telemetry data is the data that we use to monitor our applications. It’s an umbrella term for metrics, logs, and traces.

Using OpenTelemetry to collect this data enables us to understand our microservices better. The data OpenTelemetry generates is represented by traces and spans.

Spans are the atomic unit of OpenTelemetry. Each operation is represented by a span that contains info about this operation.

For example – an HTTP call would have a span containing the method, route, hostname, etc. A MongoDB save operation would have info about the saved document.

A trace is a directed acyclic graph of spans. For the purposes of this guide and the open-source library, you can think of it as a collection of spans.
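To make the span and trace vocabulary concrete, here is a minimal sketch in plain JavaScript. The span objects are simplified and hand-written for illustration (real OpenTelemetry spans carry more fields, such as timestamps and status), and groupSpansByTrace is a hypothetical helper, not part of Malabi or OpenTelemetry.

```javascript
// Simplified, hand-written spans for illustration only.
const spans = [
  {
    traceId: 'trace-1',
    spanId: 'span-a',
    parentSpanId: null, // root span: the incoming HTTP request
    name: 'GET /user/tasks',
    attributes: { 'http.method': 'GET', 'http.route': '/user/tasks' },
  },
  {
    traceId: 'trace-1',
    spanId: 'span-b',
    parentSpanId: 'span-a', // child span: the DB query made while handling the request
    name: 'mongodb query',
    attributes: { 'db.system': 'mongodb' },
  },
  {
    traceId: 'trace-2',
    spanId: 'span-c',
    parentSpanId: null,
    name: 'POST /',
    attributes: { 'http.method': 'POST', 'http.route': '/' },
  },
];

// Hypothetical helper: treat a trace as the collection of spans sharing a traceId.
function groupSpansByTrace(allSpans) {
  const traces = new Map();
  for (const span of allSpans) {
    if (!traces.has(span.traceId)) traces.set(span.traceId, []);
    traces.get(span.traceId).push(span);
  }
  return traces;
}

const traces = groupSpansByTrace(spans);
```

The parentSpanId links are what make a trace a graph; for assertions, though, the flat per-trace list is usually all we need.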

Trace-span diagram

2) Using The Data – Exposing an Endpoint for Span Retrieval on The Tested Microservice

To assert on spans created during the test run, the test runner needs to have an option to retrieve those spans.

That’s why Malabi creates an endpoint exactly for this (at the port of your choice) and provides a wrapper function (called malabi) to access those spans. Example:

const telemetryRepo = await malabi(async () => {
 // Get todos by user
 await axios(`http://localhost:${SERVICE_UNDER_TEST_PORT}/user/tasks`);
});

The code inside the callback is the code we want to create spans for, and the telemetryRepo contains the spans created – the ones we can assert on.
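Conceptually, the wrapper pattern can be pictured as "run the callback, then fetch what was recorded". The sketch below only illustrates that idea and is not Malabi’s real implementation: collectSpansFor, fakeRequest, and fakeFetchSpans are hypothetical names, and the fake functions stand in for the real HTTP calls so the example runs without a network.

```javascript
// Conceptual sketch only: NOT Malabi's real implementation.
// `fetchSpans` is a hypothetical stand-in for the HTTP call to the span retrieval endpoint.
async function collectSpansFor(action, fetchSpans) {
  await action(); // 1. run the code under test; the service records spans
  const spans = await fetchSpans(); // 2. retrieve what was recorded
  return { spans }; // 3. hand the result to the test for assertions
}

// Fake "service" that records one span per request, so the sketch needs no network.
const recorded = [];
const fakeRequest = async () => {
  recorded.push({ name: 'GET /user/tasks' });
};
const fakeFetchSpans = async () => [...recorded];

const resultPromise = collectSpansFor(fakeRequest, fakeFetchSpans);
```

Keeping the action inside a callback is what lets the wrapper know which operations the returned spans belong to.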

Storage Backends:

To export spans and traces for assertion purposes, as seen above, we need to store them somewhere.

Malabi supports 2 storage backends:

  1. In memory (default)
  2. Jaeger Tracing

With the in-memory backend, as the name suggests, Malabi stores the spans in the memory of the tested service. With the Jaeger backend, it sends the relevant spans to Jaeger, which is an open-source distributed tracing system.
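As a rough illustration of choosing between two backends with an in-memory default, here is a hypothetical sketch. It is not Malabi’s actual code: resolveStorageBackend is an invented name, 'Jaeger' is the value used later in this guide, and 'InMemory' is an assumed label for the default.

```javascript
// Hypothetical sketch: NOT Malabi's actual code.
// 'Jaeger' is the value used in this guide; 'InMemory' is an assumed label for the default.
const SUPPORTED_BACKENDS = ['InMemory', 'Jaeger'];

function resolveStorageBackend(env = process.env) {
  const requested = env.MALABI_STORAGE_BACKEND;
  // Nothing configured: fall back to the in-memory default.
  if (requested === undefined || requested === '') return 'InMemory';
  if (!SUPPORTED_BACKENDS.includes(requested)) {
    throw new Error(`Unsupported storage backend: ${requested}`);
  }
  return requested;
}
```

Failing loudly on an unknown value is a design choice of this sketch; it avoids silently falling back to memory when the variable is misspelled.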

As a bonus, Jaeger also provides a UI where we can see those spans and traces, which is very useful if something is not working as expected: we can inspect the data from the test run visually.

These are the traces that were created during the test run (each ‘it’ clause has a separate trace regardless of which storage backend we use).

traces that were created during the test run in Jaeger tracing

This is what it looks like when we select a specific trace with errors:

specific trace with errors in Jaeger UI

P.S. If you’ve heard of Jaeger but never looked into it, here’s a quick guide that covers all the basics (and some more).

Connecting The Dots – Running Malabi with Jaeger as a Storage Backend

Let’s assume we have a todo-api microservice responsible for creating and retrieving TODOs from our MongoDB database and serving them to a TODO app we wrote.

We want to make sure that when we query the DB for a specific user’s tasks – we always query the DB only by the user’s ID and not by some other user’s ID (if this happens – that’s quite a bug).

The service is written with ExpressJS, with the following code (authentication is mocked for simplicity, and the mongoose code is in the same file for convenience):

const { instrument, serveMalabiFromHttpApp } = require('malabi');
const instrumentationConfig = {
 serviceName: 'service-under-test',
};
instrument(instrumentationConfig);
serveMalabiFromHttpApp(18393, instrumentationConfig);

const express = require('express');
const router = express.Router();
const mongoose = require('mongoose');

main().catch(err => console.log(err));

async function main() {
 await mongoose.connect('mongodb://localhost:27017/testmongoose');
}

const { Schema } = mongoose;

const taskSchema = new Schema({
 text:  String,
 isDone: Boolean,
 userId: String,
});

const getCurrentUser = () => {
 return {
   id: '123'
 }
}

const Task = mongoose.model('Task', taskSchema);

router.get('/user/tasks', async function(req, res, next) {
 const { id: userId } = getCurrentUser();

 // find a task by user id (findOne returns a single document)
 const tasks = await Task.findOne({ userId }).exec();
 console.log('tasks', tasks);
 res.json({ body: 'Successfully got task for user' });
});

router.post('/', async function(req, res, next) {
 const task = new Task();
 task.text = 'write an article'

 // In real life this would come from some middleware
 task.userId = getCurrentUser().id;

 await task.save();
 res.json({ title: 'Saved task' });
});

module.exports = router;

Notice the above code responsible for initializing Malabi:

const { instrument, serveMalabiFromHttpApp } = require('malabi');
const instrumentationConfig = {
 serviceName: 'service-under-test', // 1
};
instrument(instrumentationConfig); // 2
serveMalabiFromHttpApp(18393, instrumentationConfig); // 3

Line 1 – sets the name of the service under test.

Line 2 – tells Malabi to instrument the service with that name (so that traces have the correct service name when we inspect them).

Line 3 – tells Malabi to expose the span retrieval endpoint on port 18393.

When we run the service, it’s important to launch it with the MALABI_STORAGE_BACKEND env var set to Jaeger, because we want to use Jaeger as a storage backend. In my case, it would look like this:

MALABI_STORAGE_BACKEND=Jaeger node ./bin/www

But before we actually do this, we need to run Jaeger and write some integration test code.

To run Jaeger, simply use Docker with the following command:

docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14250:14250 \
  -p 14268:14268 \
  -p 14269:14269 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.31

As you can see in the service code above, we have a GET endpoint for serving TODOs and a POST endpoint for creating a TODO item.

Before my test, I pre-populate the DB with a task from a user with ID ‘123’ and want to make sure that only that user’s ID is being queried.

And here is the integration test code (again with auth mocked for simplicity):

const assert = require('assert');
const { instrument, malabi } = require('malabi');
const SERVICE_UNDER_TEST_PORT = 3000;

instrument({
 serviceName: 'tests-runner',
});

const axios = require('axios');

async function login({ username, password }) {
 return ({ id: "123" });
}

describe('integration test', () => {
 it('should verify that mongodb is fetched with authenticated user id', async () => {
   // perform login
   const user = await login({ username: 'user', password: 'password' });
   const userId = user.id;

   // get information about what operations happened
   const telemetryRepo = await malabi(async () => {
     // Get todos by user
     await axios(`http://localhost:${SERVICE_UNDER_TEST_PORT}/user/tasks`);
   });

   // Validate internal mongo call (by asserting on the data we collected above)
   const getUserByIdOperation = telemetryRepo.spans.mongo().first;
   assert.equal(getUserByIdOperation.attribute('db.statement'), `{"condition":{"userId":"${userId}"}}`)
 });
});

You can see that when we want to generate spans for some operation (like getting tasks for the current user), we must put it inside the malabi callback.

We then get the results in a variable we call telemetryRepo (Malabi’s name for the object holding all telemetry data, i.e. the spans).

We can make assertions on it using an assertion library of our choice, in this case Node’s built-in assert module.

Lastly, we make the assertion itself: we select the first MongoDB span (the only one in our case) from our telemetryRepo and check that its db.statement attribute queries only by the authenticated user’s ID.
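The test above compares db.statement as a raw string, which depends on the exact serialization. A possible alternative, sketched below, is to parse the JSON and compare it structurally. assertQueriedOnlyByUserId is a hypothetical helper (not part of Malabi), and it assumes the statement has the {"condition": {...}} shape shown in the test.

```javascript
const { deepStrictEqual } = require('assert');

// Hypothetical helper: parse the JSON stored in `db.statement` and compare structurally,
// so whitespace differences do not matter but extra query keys still fail.
function assertQueriedOnlyByUserId(dbStatement, userId) {
  const { condition } = JSON.parse(dbStatement);
  deepStrictEqual(condition, { userId });
}

// Example using the statement format from the test above.
assertQueriedOnlyByUserId('{"condition":{"userId":"123"}}', '123');
```

In the test itself, the first argument would be getUserByIdOperation.attribute('db.statement').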

To actually run it, we use mocha.

We also need to set MALABI_STORAGE_BACKEND, the same way we did above:

MALABI_STORAGE_BACKEND=Jaeger MALABI_ENDPOINT_PORT_OR_URL=http://localhost:18393 mocha

The MALABI_ENDPOINT_PORT_OR_URL variable lets Malabi know where to call to get the spans. If only a port is given, localhost is assumed.
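The "port or URL" rule can be sketched as follows. This is a hypothetical illustration of the behavior described above, not Malabi’s actual code; resolveMalabiEndpoint is an invented name, and the use of plain http for the localhost case is an assumption.

```javascript
// Hypothetical sketch: NOT Malabi's actual code.
function resolveMalabiEndpoint(portOrUrl) {
  const value = String(portOrUrl).trim();
  // Digits only: treat the value as a port on localhost (http is assumed here).
  if (/^\d+$/.test(value)) return `http://localhost:${value}`;
  // Otherwise expect a full URL; `new URL` throws a TypeError on invalid input.
  new URL(value);
  return value;
}
```

With this sketch, both 18393 and http://localhost:18393 resolve to the same endpoint.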

When running – I see that everything is working as expected.

Malabi integration test

And you can also see the trace created for this it clause in Jaeger:

That’s it! You can use this approach not only for MongoDB queries but for almost any other DB, API, or other part of your microservice.

End Note

I hope you find this library useful. As one of its authors, I would love to hear from you.

If you have ideas for improvement or something is unclear, feel free to reach out to me on Twitter or open an issue in the GitHub repo.
