Introduction to Lightrun with Java

Last modified: June 8, 2022

1. Introduction


In this article, we’re going to explore Lightrun – a Developer Observability platform – by introducing it into an application and showing what we can achieve with it.


2. What Is Lightrun?


Lightrun is an observability platform that allows us to instrument our Java applications (other languages are also supported) and then view the results directly from within IntelliJ and Visual Studio Code, as well as in many logging platforms and APMs. It’s designed to add instrumentation seamlessly to applications running in any environment and to make that instrumentation accessible from anywhere, allowing us to quickly diagnose issues anywhere from our local workstation all the way to production instances.

Lightrun works with two different components that integrate together:


  • The Lightrun Agent runs as part of the application and instruments telemetry as requested. In Java applications, this works as a Java Agent. We’ll run this agent as part of every application that we want to use Lightrun with.
  • The Lightrun Plugin runs as part of our development environment and allows us to communicate with the agents. This is our means to see what is running, add new instrumentation to an application and receive the results of this instrumentation.

Once all of this is set up, we can then manage three different types of instrumentation:


  • Logs – These let us add arbitrary log statements to the running application at any point, logging any available values (including complex expressions). These logs can be sent to the standard output, back to the Lightrun plugin in our development environment, or both at the same time. In addition, they can be invoked conditionally – for example, based on a specific user or session ID pre-defined in the code.
  • Snapshots – These allow us to capture a live snapshot of the application at any point. This will record the details of exactly when and where the snapshot was triggered, the value of all variables, and the complete call stack to this point. These can also be invoked conditionally, much like Logs.
  • Metrics – These allow us to record metrics similar to what can be generated by Micrometer, allowing us to count the number of times a line of code is executed, record timings for a block of code, or any other numerical calculation we might want.

All of these things can be done easily in our code already. What Lightrun gives us here is the ability to do these things in an already running application without needing to change or re-deploy the application. This means we can get targeted instrumentation in production with zero downtime.
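For comparison, here is a minimal sketch of what hand-written versions of these three kinds of instrumentation might look like in ordinary Java code. The class name, the process() method, and the "debug-session" value are hypothetical and are not part of the example application; the point is only that every change of this kind normally requires a rebuild and a redeploy, which is the step Lightrun is meant to remove.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical class for illustration; it is not part of the example application.
public class HandWrittenInstrumentation {
    private static final Logger LOG =
            Logger.getLogger(HandWrittenInstrumentation.class.getName());

    // A hand-rolled "metric": counts how many times process() has run.
    static final AtomicLong CALLS = new AtomicLong();

    static int process(String sessionId, int value) {
        CALLS.incrementAndGet();
        long start = System.nanoTime();

        // A conditional log: only emitted for one specific (assumed) session ID.
        if ("debug-session".equals(sessionId)) {
            LOG.info("process() called with value=" + value);
        }

        int result = value * 2;

        // A hand-rolled timing measurement for this block of code.
        long elapsedNanos = System.nanoTime() - start;
        LOG.fine("process() took " + elapsedNanos + " ns");
        return result;
    }

    public static void main(String[] args) {
        process("debug-session", 21);
        process("other-session", 1);
        System.out.println("calls=" + CALLS.get());
    }
}
```

Each of these lines has to be written, compiled, and deployed before it produces any output, and removed again the same way afterwards.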


Furthermore, all of this instrumentation is ephemeral. It does not persist in the source code or the running application, and it can be added and removed as needed.


3. Example Application


For this article, we have an application that is already built and ready to work with. This application is designed for tracking tasks that are assigned to people and allows users to query this data. The code can be found on GitHub and requires Java 17+ and Maven 3.6 to build correctly.

This application is architected as three different services – one for managing users, another for managing tasks, and a third that orchestrates the other two. The tasks-service and users-service each have their own database, and there is a JMS queue between the two, allowing the users-service to indicate that a user was deleted so that the tasks-service can tidy things up.


These databases and the JMS queue are all embedded within the applications for convenience. However, in reality, this would naturally use real infrastructure.


3.1. Tasks Service


In this article, we’re only interested in the tasks-service. However, in future articles, we’re going to explore all three of them and how they interact with each other.


This service is a Spring Boot application built with Maven on Java 17. When running, this has HTTP endpoints for:

  • GET / – Allows the client to search tasks, filtering by the user who created them and by their status.
  • POST / – Allows the client to create a new task.
  • GET /{id} – Allows the client to get a single task by ID.
  • PATCH /{id} – Allows the client to update a task, changing the status and the user it’s assigned to.
  • DELETE /{id} – Allows the client to delete a task.

We also have a JMS listener, which can indicate when a user was deleted from our users-service. In this case, we automatically delete all tasks created by that user and unassign all tasks assigned to that user.
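The clean-up rule described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the Task class, its fields, and the onUserDeleted() method are hypothetical, an in-memory list stands in for the database, and the JMS wiring is omitted entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real tasks-service uses a database and a JMS listener.
public class UserDeletedSketch {

    // Simplified, assumed task model.
    static final class Task {
        final String id;
        final String createdBy;
        String assignedTo; // null means "unassigned"

        Task(String id, String createdBy, String assignedTo) {
            this.id = id;
            this.createdBy = createdBy;
            this.assignedTo = assignedTo;
        }
    }

    // Applies the rule: delete tasks created by the user, unassign tasks assigned to them.
    static void onUserDeleted(List<Task> tasks, String userId) {
        tasks.removeIf(task -> userId.equals(task.createdBy));
        for (Task task : tasks) {
            if (userId.equals(task.assignedTo)) {
                task.assignedTo = null;
            }
        }
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("t1", "alice", "bob"));
        tasks.add(new Task("t2", "bob", "alice"));
        tasks.add(new Task("t3", "bob", "bob"));

        onUserDeleted(tasks, "alice");

        for (Task task : tasks) {
            System.out.println(task.id + " assignedTo=" + task.assignedTo);
        }
    }
}
```

In this run, t1 is removed because alice created it, t2 is kept but becomes unassigned, and t3 is untouched.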


We also have a couple of bugs in our application that we’ll be able to diagnose with the help of Lightrun.


4. Setting Up Lightrun


Before we start, we’ll need an account with Lightrun and to set it up locally. This can be done by visiting the Lightrun website and following the instructions.


Once we have registered, we’ll need to select the development environment and programming language. For this article, we’ll be using IntelliJ and Java, so we’ll select those and move on:


[Screenshot: Lightrun setup]

We then get instructions for how to install the Lightrun plugin into our environment, so we can just follow these.


We also need to ensure that we sign in to our new account from our development environment, after which we’ll have access to our Lightrun agents – none yet – from within the editor:


[Screenshot: Lightrun connect]

Finally, we get instructions on how to download the Java agent that we’ll use to instrument our applications. These instructions are platform-specific, so we need to make sure we follow the ones that work for our exact setup.


Once we’ve done this, we can start our application with the agent installed. Make sure that the tasks-service is built, and then we can run it:


$ java -jar -agentpath:../agent/ target/tasks-service-0.0.1-SNAPSHOT.jar

At this point, the Onboarding screen in our web browser will allow us to progress, and the UI in our development environment will update automatically to show our application running:


[Screenshot: Lightrun connected]

Note that these are all connected to our Lightrun account, so we can see them regardless of where the applications are running. This means we can use the exact same tooling on our applications running on our local machine, inside Docker containers, or any other environment that supports our runtime, regardless of where it is in the world.


5. Capturing Snapshots


One of the most powerful features of Lightrun is the ability to add snapshots to currently running applications. These will then allow us to capture the exact state of execution at a given point in our application. This can then give invaluable insights into exactly what is happening within our code. They can be thought of as “virtual breakpoints”, except that they don’t interrupt the flow of the program. Instead, they capture all of the information that you would be able to see from a breakpoint for us to look at later.
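To make the "virtual breakpoint" idea concrete, the following sketch shows the kind of information a snapshot holds: a timestamp, the call stack, and the values of the visible variables, all recorded without pausing the program. It is purely illustrative and is not how Lightrun is implemented; the SnapshotSketch class, the Snapshot record, and the capture() method are hypothetical, and unlike a real snapshot this one has to be written into the code by hand.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of what a snapshot contains; not Lightrun's implementation.
public class SnapshotSketch {

    // When the snapshot was taken, the call stack at that point, and the variable values.
    record Snapshot(Instant takenAt, List<StackTraceElement> stack, Map<String, Object> variables) {}

    static Snapshot capture(Map<String, Object> variables) {
        StackTraceElement[] trace = new Throwable().getStackTrace();
        // Drop the frame for capture() itself so the stack starts at the caller.
        int from = Math.min(1, trace.length);
        List<StackTraceElement> stack =
                List.copyOf(Arrays.asList(trace).subList(from, trace.length));
        // Copy the variables (null values allowed) so later changes don't alter the snapshot.
        Map<String, Object> copy = Collections.unmodifiableMap(new LinkedHashMap<>(variables));
        return new Snapshot(Instant.now(), stack, copy);
    }

    static Snapshot search(String createdBy, String status) {
        Map<String, Object> visible = new LinkedHashMap<>();
        visible.put("createdBy", createdBy);
        visible.put("status", status);
        // Execution continues immediately after the capture; nothing is paused.
        return capture(visible);
    }

    public static void main(String[] args) {
        Snapshot snapshot = search(null, "PENDING");
        System.out.println("Taken at: " + snapshot.takenAt());
        System.out.println("Variables: " + snapshot.variables());
        snapshot.stack().forEach(frame -> System.out.println("  at " + frame));
    }
}
```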

Snapshots – as well as Logs and Metrics – are added from within our development environment. We’ll typically do this by right-clicking on the line that we want to add the instrumentation and then selecting the “Lightrun” option.

Then we can add our instrumentation by selecting it from the subsequent menu:


[Screenshot: Lightrun snapshots menu]

This will then open a panel allowing us to add the snapshot:


[Screenshot: Lightrun create snapshot panel]

Here we need to select the agent that we want to instrument, and possibly specify other details about exactly how it will work.


When we’re happy with everything, we then hit the Create button. This will then add a new Snapshot entry into our sidebar, and we’ll get a blue camera icon against the line of code.


This then indicates that this line will capture a snapshot when executed:


[Screenshot: Lightrun snapshot entry]

Note that if something goes wrong, the camera will be red instead. Typically, this would mean that the running code doesn’t correspond to the source code, though other reasons might exist and need to be explored here as well.


6. Diagnosing A Bug – Searching Tasks


Our tasks-service, unfortunately, has a bug where performing a filtered search of tasks never returns anything. If we perform an unfiltered search, then this will correctly return all tasks, but as soon as a filter is added – whether it’s createdBy, status, or both – then we suddenly get no results.

For example, if we make a call to http://localhost:8082?status=PENDING then we should get some results, but instead, we always get an empty array.


Our application is architected such that we have a TasksController to handle the incoming HTTP request. This then calls the TasksService to do the real work, and this works in terms of a TasksRepository.


This repository is a Spring Data interface meaning that we have no code in there directly that we can instrument. Instead, we’ll add a snapshot in the TasksService. In particular, we’ll add it on the very first line of the search() method. This will let us see the initial conditions that exist when the method is called, regardless of which code path we end up going through inside the method:

[Screenshot: Lightrun add snapshot]

Having done this, we’ll then call our endpoint. Again, we’ll get the same result of an empty array.


However, this time we’ll capture a snapshot in our development environment – which we can see on the Snapshots tab:


[Screenshot: Lightrun snapshots tab]

This shows us the stack trace to where our snapshot was captured and the state of all visible variables at the time it was captured. Let’s focus on the variables here. Two of these are the parameters that were passed to the method, and the third is this. The parameters are the ones that are potentially most interesting, so we’ll look at those.


Immediately, we can see the problem. We’ve been given the value “PENDING” – which is the status that we’re searching for – in the createdBy parameter!

Looking closer at the code, we see that we’ve unfortunately transposed the parameters between TasksController and TasksService. This is an easy fix, and if we were to make it – either by swapping the parameters in TasksService or the values passed in from TasksController – then suddenly, our search will start working properly.
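A stripped-down sketch of this class of bug is shown below. It is not the application's actual code: the class, the Task record, the sample data, and the method names are all hypothetical, and plain static methods stand in for the Spring controller and service. Because both parameters are Strings, the compiler cannot catch the swap.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical reduction of the bug; plain methods stand in for the controller and service.
public class TransposedParametersSketch {

    record Task(String title, String createdBy, String status) {}

    static final List<Task> TASKS = List.of(
            new Task("Write report", "alice", "PENDING"),
            new Task("Review code", "bob", "DONE"));

    // "Service": expects createdBy first, then status. A null filter matches everything.
    static List<Task> search(String createdBy, String status) {
        return TASKS.stream()
                .filter(task -> createdBy == null || createdBy.equals(task.createdBy()))
                .filter(task -> status == null || status.equals(task.status()))
                .collect(Collectors.toList());
    }

    // Buggy "controller": passes the arguments in the wrong order.
    static List<Task> buggyEndpoint(String createdBy, String status) {
        return search(status, createdBy);
    }

    // Fixed "controller": argument order matches the service signature.
    static List<Task> fixedEndpoint(String createdBy, String status) {
        return search(createdBy, status);
    }

    public static void main(String[] args) {
        System.out.println("buggy, unfiltered: " + buggyEndpoint(null, null).size());
        System.out.println("buggy, status=PENDING: " + buggyEndpoint(null, "PENDING").size());
        System.out.println("fixed, status=PENDING: " + fixedEndpoint(null, "PENDING").size());
    }
}
```

The buggy version behaves just like the symptom described earlier: an unfiltered search still returns everything because both arguments are null, while any filtered search compares a status against the creator field and matches nothing.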


7. Summary


Here we’ve seen a quick introduction to the Lightrun observability platform, how to get started with it, and some of the benefits it can give us. We’ll be exploring these in more depth in upcoming articles.


Why not use it in your next application to gain more confidence in and insight into the way it operates?