Making JFR Quack: Importing JFR Files into DuckDB


In my previous post, I showed you how tricky it is to compare objects from the JFR Java API. You probably wondered why I wrote about this topic. Here is the reason: In this blog post, I’ll cover how to load JFR files into a DuckDB database to allow querying profiling data with simple SQL queries, all JFR views included.

This blog post will start a small series on making JFR quack.

TL;DR

You can now use a query tool (available on GitHub) to transform JFR files into similarly sized DuckDB files:

> java -jar target/query.jar duckdb import jfr_files/recording.jfr duckdb.db
> duckdb duckdb.db "SELECT * FROM Events";
┌───────────────────────────────┬───────┐
│             name              │ count │
│            varchar            │ int32 │
├───────────────────────────────┼───────┤
│ GCPhaseParallel               │ 69426 │
│ ObjectAllocationSample        │  6273 │

Or run queries directly, with the database file being cached (unless you pass --no-cache), supporting all built-in JFR views:

> java -jar target/query.jar query jfr_files/metal.jfr "hot-methods"
Method                                                                                                     Samples Percent
-------------------------------------------------------------------------------------------------------- ------- -------
java.util.concurrent.ForkJoinPool.deactivate(ForkJoinPool.WorkQueue, int)                                    1066   8.09%
scala.collection.immutable.RedBlackTree$.lookup(RedBlackTree.Tree, Object, Ordering)                          695   5.27%
akka.actor.dungeon.Children.initChild(ActorRef)                                                               678   5.14%

This view is implemented as:

CREATE VIEW "hot-methods" AS
SELECT (c.javaName || '.' || m.name || m.descriptor) AS "Method",
       COUNT(*) AS "Samples",
       format_percentage(COUNT(*) / (SELECT COUNT(*) FROM ExecutionSample)) AS "Percent"
FROM ExecutionSample es
JOIN Method m ON es.stackTrace$topMethod = m._id
JOIN Class c ON m.type = c._id
GROUP BY es.stackTrace$topApplicationMethod, c.javaName, m.name, m.descriptor
ORDER BY COUNT(*) DESC
LIMIT 25

Previously

My new tool is the next evolution of my JFR-query-based tool, which I showcased in my post, An Experimental Front-End for JFR Queries, and which essentially lets you execute queries directly on JFR files using the JFR query syntax. It allowed running queries like the following, the JFR-query equivalent of the SQL view defined above:

COLUMN 'Method', 'Samples', 'Percent'
FORMAT none, none, normalized
SELECT stackTrace.topFrame AS T, COUNT(*), COUNT(*)
FROM ExecutionSample GROUP BY T LIMIT 25

But the JFR query system has severe limitations:

  1. It was only ever built to implement a narrow set of views for the jfr view tool.
  2. Its SQL variant is truly custom, with syntax like DIFF([B|E].startTime) tailored to the jfr view use case. This makes the syntax hard to understand and hard to use.
  3. The SQL variant lacks even basic features like non-equality comparisons.
  4. It has a custom-built query engine, which is what makes the custom syntax possible, but also why it doesn’t have any query optimisations.
  5. Extending the simple code is possible, but developing and maintaining a version with the missing SQL features would be a considerable effort.

The JFR query system is suitable for its intended use case, but not for general-purpose querying. This is why I went searching for a viable alternative, and I found it in DuckDB. The resulting database files are more verbose than JFR files, but queries harness the power of a proper database engine and are thereby significantly faster than the equivalent JFR queries.

Alternatives

There are several alternatives to developing a custom project. First and foremost, of course, is the jfr-analytics project by Gunnar Morling, which uses Apache Calcite to directly support SQL queries on top of the JFR Java API. This tool allows you to write queries like the following (source):

SELECT ts."startTime",
       ts."parentThread"."javaName" AS "parentThread",
       ts."eventThread"."javaName" AS "newThread",
       TRUNCATE_STACKTRACE(ts."stackTrace", 20) AS "stackTrace"
FROM "jdk.ThreadStart" ts
LEFT JOIN "jdk.ThreadEnd" te
  ON ts."eventThread"."javaThreadId" = te."eventThread"."javaThreadId"
WHERE te."startTime" IS NULL;

to find ThreadStart events without a corresponding ThreadEnd event.

The advantage is that it can easily support the nested structs in JFR data and doesn’t require the data to be transformed. The main disadvantage is that the tool uses non-standard SQL and is sadly not maintained anymore.

Other database importers exist for InfluxDB or Grafana, but all implementations I could find only support a limited number of JFR events.

Another notable project is jfrv, a web-based JFR viewer with a custom DuckDB reader to import JFR files directly. The JFR reader is written in Rust, which allows it to be compiled to WASM together with DuckDB and run directly in the browser. The main disadvantage of this project is that developing and supporting your own JFR reader is not advisable, as the JFR file format is neither well-documented nor well-specified.

Update: Frederic Thevenet wrote an interesting adapter for Lucene that I somehow must have missed.

I also toyed with a similar idea a while back: https://binjr.eu/blog/2023/08/new-data-adapter-jdk-flight-recorder/
With that said, there are some differences in the approach I took over the one you discussed in your post.
For one, I opted to use an inverted index (#Lucene) instead of a relational DB as my backend, which comes with its own trade-offs, like offering a query language that is somewhat easier to use, but not nearly as powerful.
The other main difference is that the route I used to get there is kinda like the opposite of the one you took: while you went from the backend and worked your way up to the UI, I very much started there (as I already had it) and worked my way down.
Doing things this way around meant that I could benefit immediately from the UI features that were already there (which was the whole point, of course), but it makes integrating new ones that don’t fit so naturally with the rest of the tool much more time-consuming…

At any rate, I would love to hear your thoughts if you find the time to give it a try!
(you can get it here: https://github.com/binjr/binjr/releases)

Frederic Thevenet on Mastodon

This is an interesting approach, but as Frederic wrote, it has some significant shortcomings. In the end, though, it’s good to have a whole set of tools.

Why DuckDB

DuckDB is an easy-to-use in-process database, optimized for analytical workloads:

DuckDB is designed to support analytical query workloads, also known as online analytical processing (OLAP). These workloads are characterized by complex, relatively long-running queries that process significant portions of the stored dataset, for example aggregations over entire tables or joins between several large tables.

“Why DuckDB” on duckdb.org

DuckDB is open-source, has a proper query optimizer, and supports many SQL features. The Java library already bundles the DuckDB binaries for multiple platforms, so no additional software needs to be installed.

SQLite would be another good option, but DuckDB seems better optimized for my use case. Server-based systems like PostgreSQL or Prometheus would force the user into a more complex setup process, countering my project goals.

Goals and Non-Goals

The main goal of this project is to be able to query and analyze JFR data, with a focus on garbage-collection-related data, using SQL, which is already widely known. The aim is to provide a simple tool that mimics the jfr tool and doesn’t require any setup or installation of additional software. The produced database files should be self-contained and usable with other tools.

But of course, with all that, the tool should also be maintainable. The eventual goal is to include a basic UI.

On the other hand, it is not a goal to replace the JFR tool or to be too configurable. So, it will probably never support databases other than DuckDB. It’s also not a goal to support every custom JFR event in existence, nor is it a goal to support deep stack traces in the events. But the goal is to support all JVM-internal JFR events and most user-defined ones, too, so there is no special handling of specific events.

Simplicity over complexity, and maintainability over feature richness.

Modelling JFR Events as Tables

We need to model JFR events as simple tables, because we don’t have the luxury of Apache Calcite querying the object graph for us. So we start by translating the most common elements.

Events become tables. For example, an event like jdk.CPULoad is modelled as a table consisting of a timestamp and three floats:

D SHOW CPULoad;
┌──────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name  │ column_type │  null   │   key   │ default │  extra  │
│   varchar    │   varchar   │ varchar │ varchar │ varchar │ varchar │
├──────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ startTime    │ TIMESTAMP   │ YES     │ NULL    │ NULL    │ NULL    │
│ jvmUser      │ FLOAT       │ YES     │ NULL    │ NULL    │ NULL    │
│ jvmSystem    │ FLOAT       │ YES     │ NULL    │ NULL    │ NULL    │
│ machineTotal │ FLOAT       │ YES     │ NULL    │ NULL    │ NULL    │
└──────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘

The start time is modelled as a timestamp. Any additional data type information is sadly lost in translation, as DuckDB (in contrast to PostgreSQL) doesn’t have proper custom data types. The specific data type is only noted in the table comment:

D SELECT comment FROM duckdb_tables() WHERE table_name = 'CPULoad';
comment = DESCRIPTION(Information about the recent CPU usage of the JVM process);
          Column "jvmUser": Percentage; Column "jvmSystem": Percentage;
          Column "machineTotal": Percentage

This isn’t pretty, but it should at least help for documentation purposes. You can also see the table description in the comment.
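To get a feel for querying such an event table, here is a minimal sketch that aggregates the columns shown above (all column names come straight from the SHOW output):

SELECT AVG(jvmUser)      AS avg_jvm_user,     -- average JVM user-mode CPU load
       AVG(jvmSystem)    AS avg_jvm_system,   -- average JVM kernel-mode CPU load
       MAX(machineTotal) AS max_machine_total -- peak machine-wide CPU load
FROM CPULoad;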

Inlined Structs

Structs are modelled in two ways: either they are inlined, or a new table is created for them with an _id column. A struct is inlined if it either consists of a single property or if, like VirtualSpace, it consists only of numeric properties:

This struct is inlined into jdk.GCHeapSummary:

This makes it easier to formulate SQL queries by reducing the number of joins, which improves both speed and file size. The event table looks like the following:

D SHOW GCHeapSummary;
┌─────────────────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│       column_name       │ column_type │  null   │   key   │ default │  extra  │
│         varchar         │   varchar   │ varchar │ varchar │ varchar │ varchar │
├─────────────────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ startTime               │ TIMESTAMP   │ YES     │ NULL    │ NULL    │ NULL    │
│ gcId                    │ INTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ when                    │ VARCHAR     │ YES     │ NULL    │ NULL    │ NULL    │
│ heapSpace$start         │ BIGINT      │ YES     │ NULL    │ NULL    │ NULL    │
│ heapSpace$committedEnd  │ BIGINT      │ YES     │ NULL    │ NULL    │ NULL    │
│ heapSpace$committedSize │ BIGINT      │ YES     │ NULL    │ NULL    │ NULL    │
│ heapSpace$reservedEnd   │ BIGINT      │ YES     │ NULL    │ NULL    │ NULL    │
│ heapSpace$reservedSize  │ BIGINT      │ YES     │ NULL    │ NULL    │ NULL    │
│ heapUsed                │ BIGINT      │ YES     │ NULL    │ NULL    │ NULL    │
└─────────────────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘
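Because the heapSpace$ columns live directly on the event table, no join is needed to, for example, track committed and used heap after each GC. A small sketch (the 'After GC' literal assumes JFR’s usual GCWhen values, which this post doesn’t show):

SELECT startTime,
       heapUsed,
       "heapSpace$committedSize" AS committed
FROM GCHeapSummary
WHERE "when" = 'After GC' -- "when" must be quoted, as it is a SQL keyword
ORDER BY startTime;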

The inlined struct’s properties are prefixed by the event property name. These inlined structs are also included in the comments:

D .mode line
D SELECT comment FROM duckdb_tables() WHERE table_name = 'GCHeapSummary';
comment = Column "heapSpace$start": MemoryAddress with DESCRIPTION(Start address of the virtual space);
          Column "heapSpace$committedEnd": MemoryAddress with DESCRIPTION(End address of the committed memory for the virtual space);
          Column "heapSpace$committedSize": DataAmount(BYTES) with DESCRIPTION(Size of the committed memory for the virtual space);
          Column "heapSpace$reservedEnd": MemoryAddress with DESCRIPTION(End address of the reserved memory for the virtual space);
          Column "heapSpace$reservedSize": DataAmount(BYTES) with DESCRIPTION(Size of the reserved memory for the virtual space);
          Column "heapUsed": DataAmount(BYTES) with DESCRIPTION(Bytes allocated by objects in the heap)

You might notice that the comments also include descriptions of properties where available. A front-end could parse this.
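For example, a front-end could pull the per-column annotations out of the comment with a regular expression. A sketch (the pattern only matches the format shown above and is deliberately not exhaustive):

SELECT table_name,
       -- extract every 'Column "...": Type' annotation from the table comment
       regexp_extract_all(comment, 'Column "[^"]+": [A-Za-z]+') AS annotations
FROM duckdb_tables()
WHERE comment IS NOT NULL;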

Referenced Structs

But not all structs are inlined, so what do they look like? As hinted before, we create a separate table for them with an _id column, which is used to reference the structs. An example of this is the Class struct:

This is modelled as:

D SHOW Class;
┌─────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name │ column_type │  null   │   key   │ default │  extra  │
│   varchar   │   varchar   │ varchar │ varchar │ varchar │ varchar │
├─────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ _id         │ UINTEGER    │ NO      │ PRI     │ NULL    │ NULL    │
│ classLoader │ UINTEGER    │ YES     │ NULL    │ NULL    │ NULL    │
│ name        │ VARCHAR     │ YES     │ NULL    │ NULL    │ NULL    │
│ package     │ UINTEGER    │ YES     │ NULL    │ NULL    │ NULL    │
│ modifiers   │ INTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ hidden      │ BOOLEAN     │ YES     │ NULL    │ NULL    │ NULL    │
│ javaName    │ VARCHAR     │ YES     │ NULL    │ NULL    │ NULL    │
└─────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘

This table itself references the ClassLoader struct. But be aware that the ClassLoader struct cannot, in turn, reference classes, because such recursion would make insertion into the tables far more complicated.

The Class table comment includes the relationship between the two tables:

D SELECT comment FROM duckdb_tables() WHERE table_name = 'Class';
comment = Column "classLoader": references "ClassLoader"(_id);
          Column "package": references "Package"(_id)

Sadly, foreign keys didn’t work for me for this purpose. Some references in JFR can be null, which is modelled by a reference value of zero: the first row of every struct table is a default row for exactly this purpose:

D SELECT * FROM Class LIMIT 1;
┌─────┬─────────────┬──────┬─────────┬───────────┬────────┬──────────┐
│ _id │ classLoader │ name │ package │ modifiers │ hidden │ javaName │
├─────┼─────────────┼──────┼─────────┼───────────┼────────┼──────────┤
│   0 │        NULL │ NULL │    NULL │         0 │ false  │ NULL     │
└─────┴─────────────┴──────┴─────────┴───────────┴────────┴──────────┘
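Putting the reference columns to use is then a matter of regular joins. A sketch that counts loaded classes per package (it assumes the Package table has a name column, which this post doesn’t show):

SELECT p.name   AS package,
       COUNT(*) AS classes
FROM Class c
JOIN Package p ON c.package = p._id
WHERE c._id <> 0 -- skip the default placeholder row
GROUP BY p.name
ORDER BY classes DESC
LIMIT 10;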

Now on to stack traces, which are specially handled.

Stack Traces

Stack traces are modelled in JFR as structs with arrays:

The problem is that supporting arrays in the database in general is hard, as the JDBC driver for DuckDB lacks full array support. So the converter only supports arrays for stack traces and handles both stack traces and stack frames with custom logic.

All stack trace structs are inlined, as seen, for example, with the ExecutionSample event:

D SHOW ExecutionSample;
┌─────────────────────────────────┬──────────────┬─────────┬─────────┬─────────┬─────────┐
│           column_name           │ column_type  │  null   │   key   │ default │  extra  │
│             varchar             │   varchar    │ varchar │ varchar │ varchar │ varchar │
├─────────────────────────────────┼──────────────┼─────────┼─────────┼─────────┼─────────┤
│ startTime                       │ TIMESTAMP    │ YES     │ NULL    │ NULL    │ NULL    │
│ sampledThread                   │ UINTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ stackTrace$topMethod            │ UINTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ stackTrace$topApplicationMethod │ UINTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ stackTrace$topNonInitMethod     │ UINTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ stackTrace$length               │ SMALLINT     │ YES     │ NULL    │ NULL    │ NULL    │
│ stackTrace$truncated            │ BOOLEAN      │ YES     │ NULL    │ NULL    │ NULL    │
│ stackTrace$methods              │ UINTEGER[10] │ YES     │ NULL    │ NULL    │ NULL    │
│ state                           │ VARCHAR      │ YES     │ NULL    │ NULL    │ NULL    │
└─────────────────────────────────┴──────────────┴─────────┴─────────┴─────────┴─────────┘

I made the conscious decision, for now, to include only the method property per frame; this allows me to save a significant amount of memory. But maybe I’ll add the option to include the line number and type if the need arises.

You might notice I capped the stack trace depth at 10 by default to save memory. This is adjustable. The additional $topMethod, $topApplicationMethod, and $topNonInitMethod are just convenience columns that are often used in JFR views.

Of course, I could have normalized the SQL table by replacing the array with ten columns, but this clutters the table view and doesn’t offer space benefits.
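The array column can still be queried with DuckDB’s list functions. A sketch that counts the samples whose captured (depth-limited) stack contains a given method (the method name 'lookup' is purely illustrative, and the cast assumes the fixed-size array converts to a list as usual in DuckDB):

SELECT COUNT(*) AS samples
FROM ExecutionSample es
WHERE list_contains(
        es."stackTrace$methods"::UINTEGER[], -- cast the fixed-size array to a list
        (SELECT _id FROM Method WHERE name = 'lookup' LIMIT 1));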

Now on to possible front-ends:

Front-End

Currently, there is no custom front-end for the DuckDBified JFR files, but there are several existing options. One is the Grafana DuckDB source, which should allow you to view the profiling and monitoring data using Grafana dashboards.

Furthermore, there is a built-in DuckDB UI (via duckdb -ui):

This is a simple web-based UI that allows you to run queries. I used it a lot while developing the current tool. Unfortunately, it doesn’t allow you to create plots.

Name

This project doesn’t have a proper name yet, so I’m open to suggestions. It could be “JFRDuck” or “JFRQuery,” but I’m unsure.

Tool

You can either build the tool via Maven, download the latest release from GitHub, or directly call it using jbang (jbang jfr-query@parttimenerd/jfr-query). Don’t be afraid of its size of almost 80MB, which is primarily due to the embedded DuckDB database driver.

The tool currently has essentially five commands:

> java -jar target/query.jar
Usage: query.jar [-hV] [COMMAND]
Querying JFR recordings with DuckDB
  -h, --help      Show this help message and exit.
  -V, --version   Print version information and exit.
Commands:
  import   Import a JFR recording into a DuckDB database
  query    Execute a SQL query or view on the JFR DuckDB database and print the results.
  macros   List available SQL macros (views) for JFR analysis.
  views    List available SQL views for JFR analysis.
  context  Create a description of the tables, macros and views for generating SQL queries using AI
  help     Display help information about the specified command.

The two most common use cases are the transformation of a JFR file into a database:

> java -jar target/query.jar duckdb import jfr_files/default.jfr duckdb.db

And querying via the query command, as shown at the beginning of this blog post. The idea is to have a self-contained tool that bundles all of its dependencies.

Conclusion

In this blog post, I showed my newest work on querying JFR files using DuckDB. In my opinion, this is the way forward and a building block for future JFR-related tooling. The DuckDB database allows you to easily analyze JFR data using SQL, without me having to implement a custom query engine or extend the existing jfr tool.

Thanks for making it this far. I’ll see you in a week or two for the next part of this series, most probably a short blog post on how AI can help generate SQL queries.

This blog post is part of my work in the SapMachine team at SAP, making profiling easier for everyone.

P.S.: Enjoy the autumn…

  • Johannes Bechberger

    Johannes Bechberger is a JVM developer working on profilers and their underlying technology in the SapMachine team at SAP. This includes improvements to async-profiler and its ecosystem, a website to view the different JFR event types, and improvements to the FirefoxProfiler, making it usable in the Java world. He started at SAP in 2022 after two years of research studies at the KIT in the field of Java security analyses. His work today comprises many open-source contributions, his blog, where he writes regularly on in-depth profiling and debugging topics, and his JEP Candidate 435 to add a new profiling API to the OpenJDK.


New posts like these come out at least every two weeks. To get notified about new posts, follow me on BlueSky, Twitter, Mastodon, or LinkedIn, or join the newsletter.
