Data grid transformation: architecture evolution in a maturing product

Initiate conversation by prototypes
Evolving the architecture
Replacing TanStack Table with AG Grid
Get the whole team on board
Lessons learned

Co-authored with Adam Peresztegi.

We enjoy programming to provide digital solutions that solve people’s problems. While building is definitely an exciting part of the process, we also value reflecting on how things went, what to keep doing, and learning from mistakes.

As of this writing, we work for a client in the supply chain management industry. Our team is responsible for building a data platform that ingests various logistics data sources and a web application that provides custom views of the data and streamlines the workflow of the operators who work with shipment containers.

Recently, we were faced with a particular product need that raised questions about one of the core components of the web application. In the following, we share how we approached the dilemma and eventually resolved the product need.

Initiate conversation by prototypes

Gabor: I have been working on this project for a year. I’m leading the development of a web application that provides logistics operators easy access to container shipment data required by their day-to-day workflow. The main UI component of the application is a data grid. Container data is fetched from the data warehouse, optionally filtered, grouped, or sorted on the backend, and then displayed in the UI. Users can save their customized grid views to support recurring tasks. The founding team of the web application chose to integrate TanStack Table for the data grid. It is a lightweight, modular framework that enabled the team to build grid features and easily fit into the existing TypeScript-React-based frontend stack.

Fast forward to today, with a better understanding of user needs and an increased demand for new grid features, it has become clear that the original grid architecture has lately been more constraining than supporting product development. Since we manipulated data on the server side, it required custom development to introduce features like advanced data filtering or grouping. Each time a user altered the data view, it meant an extra round trip between the server and the browser, which also led to a poor user experience. We often felt that our development team’s capacity was being used to implement standard grid features instead of solving logistics-related business problems.

Operators need to stay on top of the container shipments they track. Therefore, it’s crucial that they can view the same dataset from different perspectives, sometimes organizing datasets around ETA dates, other times around MBLs, etc. A valuable tool for this is the ability to group shipment rows by various data points. When the product team conveyed this request, we realized it was time to revisit the technical foundation. Implementing nested grouping on top of the existing architecture would have required significant custom development effort from our team.

Around this time, Adam, a long-time friend and senior full-stack engineer, joined our development team. He was new to our codebase and TanStack Table, and he was challenged to explore options to improve data fetching performance and propose a solution to implement nested data grouping. Could you elaborate on this work, Adam?

Adam: We had already talked a bit about the project before I joined, so I came in with some understanding of the tech setup. I had zero experience with logistics software, though. I quickly explored the product, and it became apparent that all workflows start and end with the grid. It is the centerpiece of the application. I understood that there were performance issues with the grid and that quite a few upcoming features would be built on top of it. I decided to slow down a bit and consider all possible solutions before touching this component. I had the feeling that we should nail it in one go, fixing the performance issues and building for maintainability, because coming back to refactor it later wouldn’t be an option.

When I started working on the grid, I created an in-application prototype. First, I explored the performance issues, and it turned out that there was a big bottleneck on the front-end side: we rendered too many components at the same time, even though most of them were not on screen. I learned about virtualization techniques in React. Virtualization simply means that only the components visible in the viewport are rendered, rather than every component in the list; it is also a standard technique in game engines. I made a prototype using TanStack Table and an easy-to-configure library chosen from the several virtualization libraries available.
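
To make the idea concrete, here is a minimal sketch of the calculation at the heart of row virtualization. It is not the code from the prototype or from any particular library; the function name `visibleRange`, the fixed row height, and the `overscan` buffer are assumptions made for illustration.

```typescript
// Hypothetical helper: decides which row indices need to be mounted
// for the current scroll position, assuming every row has the same height.
interface VisibleRange {
  start: number; // first row index to render (inclusive)
  end: number; // last row index to render (exclusive)
}

function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number = 3, // extra rows above and below to soften fast scrolling
): VisibleRange {
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const lastVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, firstVisible - overscan),
    end: Math.min(rowCount, lastVisible + overscan),
  };
}

// A 600px viewport over 50,000 rows of 40px each, scrolled to 12,000px.
const mounted = visibleRange(12000, 600, 40, 50000);
console.log(`${mounted.end - mounted.start} of 50000 rows mounted`);
```

In this made-up scenario only a couple of dozen rows are mounted at any time, no matter how large the dataset is, which is why the technique removes the rendering bottleneck.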

So there was a prototype that was fast, but it was not perfect. I couldn’t just plug it into the grid component because not all features were supported. With this virtualization technique, you must reimplement the whole view, which I didn’t want to do at this point. Going all in on TanStack Table would have required a significant time investment without guaranteed returns.

A new feature request had also just dropped: multiple levels of data grouping were needed. What did this mean? Grouping is visualized in the grid with nested rows, where groups of rows are opened and hidden on demand. A single level of grouping was already implemented. After reading the code and examining the TanStack Table library, I found that our solution was already stretching the limits of the library. Many workarounds had been put into the code to make TanStack Table work with single-level grouping. I quickly created a basic TanStack Table prototype for multi-level grouping. I also implemented an AG Grid prototype to explore the API of that library. It proved easy to demonstrate the multi-level grouping feature with a view similar to the existing grid.
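
As an illustration of what multi-level grouping means for the data, the sketch below turns a flat list of rows into a nested tree, one level per grouping field. It is a hand-written example, not the project’s code and not how either grid library implements grouping; the `Shipment` shape, its field names, and the sample values are assumptions.

```typescript
// Hypothetical row shape; the real dataset has many more columns.
interface Shipment {
  container: string;
  mbl: string;
  eta: string;
}

type GroupNode =
  | { kind: "group"; field: keyof Shipment; value: string; children: GroupNode[] }
  | { kind: "row"; data: Shipment };

// Recursively buckets rows by the first field, then groups each bucket
// by the remaining fields. An empty field list yields plain leaf rows.
function groupRows(rows: Shipment[], fields: (keyof Shipment)[]): GroupNode[] {
  if (fields.length === 0) {
    return rows.map((data): GroupNode => ({ kind: "row", data }));
  }
  const field = fields[0];
  const remaining = fields.slice(1);
  const buckets = new Map<string, Shipment[]>();
  rows.forEach((row) => {
    const bucket = buckets.get(row[field]);
    if (bucket) {
      bucket.push(row);
    } else {
      buckets.set(row[field], [row]);
    }
  });
  const groups: GroupNode[] = [];
  // Map iteration follows insertion order, so groups appear in first-seen order.
  buckets.forEach((members, value) => {
    groups.push({ kind: "group", field, value, children: groupRows(members, remaining) });
  });
  return groups;
}

// Made-up sample data, grouped by ETA first and by MBL second.
const shipments: Shipment[] = [
  { container: "C1", mbl: "MBL-A", eta: "2025-01-10" },
  { container: "C2", mbl: "MBL-A", eta: "2025-01-12" },
  { container: "C3", mbl: "MBL-B", eta: "2025-01-10" },
];
const grouped = groupRows(shipments, ["eta", "mbl"]);
console.log(JSON.stringify(grouped, null, 2));
```

Swapping the order of the fields (`["mbl", "eta"]`) reorganizes the same dataset around MBLs instead of ETA dates, which is the kind of perspective change operators asked for.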

So this is how I continued to expand the in-app prototype: first virtualization, then multi-level grouping. Everyone in the company could start testing the new library features. We started a dialogue with the stakeholders about next steps. Implementing new grid features became our primary goal, so we had the budget to explore different solutions rather than immediately jumping into implementation mode. AG Grid solved the performance issues with its built-in virtualization and also appeared to support the upcoming grid features, so it seemed to be a solid choice from the tech perspective.

Gabor: One of the topics we began discussing with the product team was the future vision for the data grid. Until then, we documented most of our needs as separate items in the product backlog. We had not yet reached a consensus on the specific features we wanted for the data grid in our product. Our focus had primarily been on features for upcoming sprints.

We realized it was important to step back and define a mid-term scope. A clear scope was essential for evaluating different grid solutions and estimating the development costs associated with those features if we decided to implement them in-house. After this evaluation, we concluded that, both from a financial perspective and in terms of product features, it made sense to transition our data grid component from TanStack Table to AG Grid Enterprise.

Within a month of concentrated engineering effort, we resolved the data fetching performance issues and implemented the originally requested nested data grouping. Additionally, we were able to deliver several enterprise grid features from the product backlog, which we could easily do through AG Grid’s configuration options.

This shift in our approach allowed us to move the conversation from the time required to implement specific data grid features to how we could customize the grid to better meet our users’ needs. In this regard, opting to buy the solution proved to be the right choice in the classic “build vs. buy” dilemma.

Evolving the architecture

Adam: It is fascinating to observe how an agile software project moves forward. You want to schedule items to focus on for the next weeks or months, but while planning these short iterations you also need a broader vision to find a good architectural solution, with work spanning multiple months or even years.

The broader rethinking had several triggers: technical issues, upcoming features, and partly me, because I didn’t fully understand the reasons behind the existing architectural choices, since I hadn’t been involved in making them. The code seemed hard to maintain, so I immediately started to think about alternatives. On the other hand, I invested quite a bit of time in trying to prove that TanStack Table could be a viable option, but I found that it was not the best choice for us. Had I been involved in writing all the previous iterations with TanStack Table, I might have been pushed toward investing more time in it, as opposed to exploring other options.

So how can we find an architecture in such a situation? A rewrite is expensive, but building on shaky foundations can be even more expensive. Comparing the current solution with alternative architectures rationally, not just on a gut-feeling level, seems to be the best idea. To collect information for rational decisions, a programmer sometimes needs to go deep: read the code, explore, and invest time. So I spent time reading the code, paying the tech debt, so that I could tear down this Chesterton’s Fence with an understanding of why it was there.

Stepping away from the code, it really helps to think through what a particular client needs in the upcoming months or years, so we can plan ahead not just one month but further. For example, would AG Grid compose well with our design component library, which was another track inside the company? We had to invest time exploring the broader context before choosing AG Grid.

Gabor: Based on my professional experience, I have often encountered the dilemma of deciding when to rewrite a solution versus how much to invest to maintain the current solution. In the case of the data grid, introducing TanStack Table was a reasonable choice at the time, considering the product needs and engineering constraints. It was an open-source, lightweight component that provided essential grid features and could be integrated quickly, making it a solid option for an MVP.

However, as the product matured and user needs became clearer, it became necessary to reevaluate earlier choices. TanStack was initially promoted as a good fit, especially for use cases where bundle size and SEO were important. In our case, however, we were developing a web app that required authentication for a limited number of users, and the dataset they needed to download was already a few megabytes. AG Grid’s Enterprise version emerged as a compelling alternative. It offers a wide range of features, including design system support, advanced data filtering, grouping, aggregation, and charting capabilities.

While the product features were already impressive, technical support and maintenance also played crucial roles in our decision-making process. We recognized that, in the long run, a different group of people would be responsible for maintaining the product. It is essential to consider how easily they can be onboarded and whether they will need to manage a significant amount of custom code, or if they will be using an open-source, well-documented software component. This consideration is particularly important in a client-agency setup, but it also applies to company-employee relationships.

Adam: We mentioned that AG Grid is this heavyweight enterprise library, but what does that actually mean? My first thought was that enterprise would mean loading a lot of data and working with it easily in the same grid, like in a desktop app. I checked the database, and our data size seemed to be easily within the manageable range for loading everything into AG Grid. It handles this amount of data easily on the front end because of the virtualization techniques it applies. But does this architecture also satisfy the client’s future needs?

We are building a platform for our client, and the most-used part fits in the enterprise application space. This part of the platform is used by internal teams, so the concurrency needs are not that high. If it were possible to load all the data into the grid for clients to work efficiently, and to solve all the performance issues we had, that would be ideal, right? Except that we also had issues with our backend.

There was quite a nice architecture in place: TanStack Table with TanStack Query on the frontend and a GraphQL server on the backend. In theory, we could create complex queries to the backend and get data to the frontend based on these queries with filters, groups, and so on, relying on GraphQL’s flexibility. This is the best-fit architecture for GraphQL, one would think, right?

But it turned out that it’s easier to fetch all the data and work with it in the frontend, because making a paginated view work with a huge amount of data requires a much more complex architecture, especially with grouping. With pagination, you can load the data granularly, but what happens when you open a group of thousands of nested rows? Pagination doesn’t seem easy when you need to open and close parts of the data on demand.
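
The sketch below illustrates why expanding and collapsing groups becomes simple once the whole dataset is in the browser. It assumes a hypothetical in-memory tree (`TreeNode`) and a set of expanded group ids; opening a group is then only a recomputation of the flat list of visible rows, with no extra request to the server. None of these names come from the project or from AG Grid.

```typescript
// Hypothetical in-memory tree: groups have children, leaf rows have none.
interface TreeNode {
  id: string;
  children: TreeNode[];
}

interface FlatRow {
  id: string;
  depth: number; // nesting level, used for indentation in the grid
}

// Walks the tree depth-first and emits only the rows that are currently
// visible: every top-level node, plus the children of expanded groups.
function flattenVisible(roots: TreeNode[], expanded: ReadonlySet<string>): FlatRow[] {
  const out: FlatRow[] = [];
  const walk = (nodes: TreeNode[], depth: number): void => {
    for (const node of nodes) {
      out.push({ id: node.id, depth });
      if (expanded.has(node.id)) {
        walk(node.children, depth + 1);
      }
    }
  };
  walk(roots, 0);
  return out;
}

// Made-up sample: two ETA groups holding three containers.
const sampleTree: TreeNode[] = [
  {
    id: "eta:2025-01-10",
    children: [
      { id: "C1", children: [] },
      { id: "C3", children: [] },
    ],
  },
  { id: "eta:2025-01-12", children: [{ id: "C2", children: [] }] },
];

const collapsed = flattenVisible(sampleTree, new Set<string>());
const firstOpen = flattenVisible(sampleTree, new Set<string>(["eta:2025-01-10"]));
console.log(collapsed.map((row) => row.id));
console.log(firstOpen.map((row) => row.id));
```

With server-side pagination, the equivalent of `flattenVisible` would have to be coordinated between page boundaries, group boundaries, and the server’s sort order, which is the complexity described above.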

It turned out that GraphQL query performance was the next bottleneck once we had solved frontend performance with virtualization. When I tried loading all the data into the grid, it took half a minute with GraphQL. It was interesting to see how the new approach didn’t work well with the old stack. So after replacing just one piece (TanStack Table) with another piece (AG Grid), we had to replace the GraphQL endpoint implementation with a simple JSON API as well. GraphQL has performance overhead, especially with Python it seems, and it just didn’t seem right to try patching it.
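
On the frontend, a plain JSON endpoint can be consumed with very little machinery. The sketch below assumes the response body is a flat JSON array of rows and validates it before it reaches the grid; the `ContainerRow` shape, its field names, and the sample payload are hypothetical, and the actual endpoint contract is not described in this article.

```typescript
// Hypothetical row shape returned by the JSON endpoint.
interface ContainerRow {
  container: string;
  mbl: string;
  eta: string;
}

// Runtime type guard: JSON.parse returns untyped data, so each row is checked.
function isContainerRow(value: unknown): value is ContainerRow {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record.container === "string" &&
    typeof record.mbl === "string" &&
    typeof record.eta === "string"
  );
}

// Parses the whole response body in one step and fails loudly on bad rows.
function parseContainerRows(body: string): ContainerRow[] {
  const parsed: unknown = JSON.parse(body);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of container rows");
  }
  return parsed.map((item: unknown, index: number) => {
    if (!isContainerRow(item)) {
      throw new Error(`Row ${index} does not match the expected shape`);
    }
    return item;
  });
}

// Made-up payload standing in for the response body.
const payload = '[{"container":"C1","mbl":"MBL-A","eta":"2025-01-10"}]';
const parsedRows = parseContainerRows(payload);
console.log(`${parsedRows.length} row(s) ready for the grid`);
```

Because the application is the only consumer of the data, a fixed payload shape like this trades GraphQL’s query flexibility for a simpler and cheaper code path.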

Finally, I made all these changes, frontend and backend, and came up with a performant enterprise grid prototype.

Replacing TanStack Table with AG Grid

Gabor: We also realized that the assumptions we had made earlier were no longer valid. First, there was no real need for the features of GraphQL, as the only consumer of the data was the application we were building. Second, exposing the entire data set to the client made sense since the application could handle basic grid operations like filtering and grouping. This approach helped eliminate extra network traffic and reduce server load.

I thought it would be helpful to touch on the migration process as well. Ultimately, we decided to go with AG Grid, and we are about to launch the new grid to our users.

The transition to AG Grid wasn’t particularly challenging from a developer’s perspective. The main task was adopting the mindset of how AG Grid operates and familiarizing ourselves with its feature set to establish a solid initial configuration. We also needed to port several of our custom grid components since we don’t only display string and number values in the table; we also have complex components that enable users to interact with the data and perform specific functions within the application.

We needed to verify how data mutations would work with AG Grid, which ultimately integrated seamlessly with our TanStack Query solution. Additionally, AG Grid provides nice user experience feedback when the underlying data changes.

What I found most challenging during the transition to AG Grid was that, while it has good documentation, the examples mainly showcase very simple use cases. Our complex scenarios, particularly nested data grouping, required thorough analysis. Through experimenting with different approaches, I identified an effective pattern that suited our needs. I guess this learning curve was the cost of adapting to AG Grid on the go.

Overall, we didn’t face any major issues during the grid migration. What do you think about this?

Adam: Before I joined the project, a single developer primarily worked with TanStack Table and related features. We decided that we would work on the grid migration as a team. It was a great decision because we came up with a more consistent solution, talking about pros and cons, and everyone needed to touch all parts of the code. Another factor shaping the implementation was that we have a design system with an automated pipeline pulling design tokens into the codebase. It meant we had to figure out ways to make AG Grid look native within our design system and component library. As we were migrating an existing solution, we found a few usability issues and technical bugs with AG Grid. We reported these issues and implemented workarounds during the process.
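
One possible way to connect design tokens to the grid is to translate them into the CSS custom properties that AG Grid themes read. The sketch below is an assumption about how such a mapping could look, not the pipeline used in the project: the `DesignTokens` shape is hypothetical, and the `--ag-*` variable names and the `.ag-theme-quartz` selector should be checked against the theming documentation of the AG Grid version in use.

```typescript
// Hypothetical subset of design tokens pulled in by the automated pipeline.
interface DesignTokens {
  colorSurface: string;
  colorText: string;
  colorHeaderSurface: string;
  fontFamily: string;
  fontSizePx: number;
}

// Maps tokens to AG Grid CSS custom properties (names assumed from its theming docs).
function toAgGridCssVariables(tokens: DesignTokens): Record<string, string> {
  return {
    "--ag-background-color": tokens.colorSurface,
    "--ag-foreground-color": tokens.colorText,
    "--ag-header-background-color": tokens.colorHeaderSurface,
    "--ag-font-family": tokens.fontFamily,
    "--ag-font-size": `${tokens.fontSizePx}px`,
  };
}

// Renders the variables as a CSS rule that can be written to a generated stylesheet.
function toCssRule(selector: string, variables: Record<string, string>): string {
  const declarations = Object.keys(variables)
    .map((key) => `  ${key}: ${variables[key]};`)
    .join("\n");
  return `${selector} {\n${declarations}\n}`;
}

// Made-up token values for illustration.
const cssVars = toAgGridCssVariables({
  colorSurface: "#ffffff",
  colorText: "#1f2933",
  colorHeaderSurface: "#f5f7fa",
  fontFamily: "Inter, sans-serif",
  fontSizePx: 14,
});
const cssRule = toCssRule(".ag-theme-quartz", cssVars);
console.log(cssRule);
```

Keeping the mapping in one function means a token change in the design system flows into the grid without touching individual components.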

We could work almost exclusively on the migration as a team for a whole iteration (one month). This was crucial for the migration and was only possible with the support of business stakeholders and a change in product backlog priorities. We got the green light, I think, mostly because all parties were involved in the research and decision-making process, so we reached alignment quite quickly once we decided to migrate.

So we moved into the implementation phase with a good understanding of the future goals and quite elaborate documentation on why we chose AG Grid. Throughout our work, we communicated within the team and continued to update the engineering documentation. I think it can be said that not only did we enable the application with more advanced grid features, but the new implementation has a better structure and much better documentation as well.

Also, this was the first tech project I worked on with you, Gabor. We had spent countless hours talking about programming and tech in general before, going back to high school. I was a bit worried about how it would pan out. Looking back, I can say that we should have done something together earlier; building things together is fun.

Get the whole team on board

Adam: My main takeaway is that it’s really important to quickly assess how important certain parts of an application are, and when there’s upcoming work, to investigate all of it, not just a small batch. Prioritize finding ways to absorb a healthy amount of knowledge from the business side of things, especially when you’re new to a project or working as a contractor.

It is almost self-evident that you always have to ask for the broader context and connect with someone who understands the future plans, but it’s really important. I spent too much time analyzing the existing code when I should have first gathered requirements by talking more with stakeholders before building any prototype. In retrospect, pulling in AG Grid from the beginning would have streamlined my first round of performance optimizations.

Takeaway: Collect information on how the technical component you are working on is planned to be used. Avoid creating technically sound solutions solving outdated problems.

Gabor: Regarding what you shared, I found the timing interesting. You had just joined the project and wanted to better understand the context, which allowed you to challenge some of the status quo. Fresh perspectives can often reveal insights that others might overlook. This approach helped us step back and view the project from a different angle.

A key takeaway for me is the importance of establishing a rhythm between maintaining focus on deliverables and taking time to reflect. It’s crucial to regularly step back to validate whether our actions still align with our goals. Retrospective meetings are a great format for this purpose. It’s important to reflect on both product and technical perspectives.

If your team has a dedicated product lead and a technical architect, they can take responsibility for this reflection. However, ultimately, the whole team shares the responsibility for delivering value to the user. It’s essential to deliver quickly to achieve this, which requires that the underlying architecture and infrastructure support that speed.

Additionally, regularly checking in with the product team will help us better understand how to utilize our development capacity, and whether we should buy or build solutions. In our case, we went through this process and concluded that we needed to change the architecture of a key component of our software. As a result, we reduced the amount of custom code and minimized maintenance requirements. This not only saved development costs but also added user value by introducing previously lacking grid features by the time we completed the migration.

Lessons learned

Adam: Another key takeaway: integrating prototypes directly into the main application can be more valuable than building standalone versions — but any quick prototyping beats theoretical debates.

I build two types of prototypes: small standalone applications for initial exploration (quickly discarded after gaining insights) and prototypes embedded within the main application. For this project, the embedded approach proved to be effective. The logistics data grid solution benefited from testing the performance and new features with real data within the application.

Modern development tools make it straightforward to rapidly build these integrated prototypes. Stakeholders can interact with them immediately, generating short feedback loops.

Gabor: I want to share another important lesson. In every team I’ve joined, I’ve found that writing technical RFCs effectively addresses technological challenges or architectural changes. These documents serve as valuable resources for newcomers to the project, helping them understand why the architecture was designed in a certain way. RFCs allow the team to review the decisions they are about to make, leading to better choices in the present. Furthermore, they provide a way to revisit past decisions in the future, offering insight into the reasoning behind them.

Adam: RFCs serve different purposes in different contexts (similar to ADRs, Architecture Decision Records). Based on my experience, each company presents unique challenges when introducing change. In the current team, the challenge was collecting and integrating information. In teams with a longer history, decision-making is hard despite well-documented information. One thing is common, though: RFCs function as communication channels between members of the organization. I’d even say that creating RFCs helps you understand the company culture and see how decision-making happens naturally. I also like that it has a “social engineering” aspect, making it possible to discover how people think within the organization.

Understanding the dynamics within the organization makes future changes smoother, because the goal is getting everyone on the same page, not just documenting decisions.

Over the course of a month, our newly formed development team identified an architectural challenge and delivered a solution that addressed the initial user need while also tackling long-standing items from the product backlog. We migrated the data grid from TanStack Table to AG Grid, introduced nested data grouping, enhanced the performance of data fetching, resolved several user experience issues with the previous implementation, and adapted the grid to align with our design system.

The new features and better performance are the visible outcomes of a development effort. A more hidden result of a successful development project is the evolution of the organization. Starting fruitful cross-functional discussions, paving the road for building early prototypes, and creating a habit out of gathering and recording the lessons learned contribute to delivering value.
