Blog
Tired of Slow Gantt Charts? IBM DOC/DB Gene Handles Huge Datasets with Ease
DecisionBrain’s IBM DOC/DB Gene platform makes it possible to quickly build applications that present data and performance indicators in an intelligent, intuitive way and that, through integration with optimization and machine learning methods, help professionals make informed decisions.
The applications developed to date with IBM DOC/DB Gene address use cases relating to supply chain, logistics, industrial production, and workforce management across a wide variety of sectors, such as agriculture, pharmaceuticals, textiles, automotive, paper, port terminals, banks, supermarkets, electronics, and call centers.
Since production planning and resource allocation are central to many IBM DOC/DB Gene applications, an important visualization tool for such planning, in addition to KPIs and charts, is the Gantt chart.
A Gantt chart is used to visualize a plan. It represents assignments of resources (machines, materials, or persons) to activities. Time is presented horizontally, and resources vertically. Assignments (called “events” in this article) are displayed as rectangles spanning from a start time to an end time, aligned with the corresponding resource. Additional information can be used to define the rendering and the text displayed on events.
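The structure described above can be sketched as a small data model. The names below are illustrative, not the actual IBM DOC/DB Gene API; the `timeToX` helper shows the basic mapping from a timestamp to a horizontal pixel position.

```typescript
// Minimal sketch of a Gantt data model (hypothetical names):
// resources on the vertical axis, events as rectangles on the
// horizontal time axis.
interface GanttResource {
  id: string;
  label: string;
  group?: string; // e.g. team, site, or production line
}

interface GanttEvent {
  resourceId: string;
  start: number; // epoch milliseconds
  end: number;
  label?: string; // extra data can drive rendering and text
}

// Map a timestamp to a horizontal pixel position for a given
// visible time window and chart width.
function timeToX(
  t: number,
  windowStart: number,
  windowEnd: number,
  chartWidth: number
): number {
  return ((t - windowStart) / (windowEnd - windowStart)) * chartWidth;
}
```

For example, with a window of `[1000, 2000]` and a 100-pixel-wide chart, the timestamp `1500` maps to the midpoint, `x = 50`.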
Resources can also be grouped based on their properties. For example, human resources can be presented in teams or sites, and machines can be grouped by type or production line.
IBM DOC/DB Gene applications can handle large plans and manipulate hundreds of thousands of events. When using a large number of events, there are two areas where performance can become an issue:
- Rendering
- Data loading and parsing
In this blog post, we detail how the Gantt chart component of IBM DOC/DB Gene was developed to address performance issues in the two areas above.
Rendering Performance
DOM-Based Implementation
A Gantt chart requires a simple representation and has a grid-like layout. Therefore, it makes sense to use DOM (the webpage’s Document Object Model) elements such as DIV to render the Gantt chart. The old Gantt chart component used by IBM DOC/DB Gene was based on this technical choice.
Manipulating the DOM provides certain benefits:
- CSS can be used easily to alter the rendering of events
- Mouse events can be used on DOM elements and provide consistent behavior with other components
- Using a scrollable container provides the scroll capability
The main drawback of this choice is the size of the DOM, which can become problematic when trying to place hundreds of thousands of DIVs with individually assigned absolute positions.
Canvas-Based Implementation
The new Gantt chart component developed by DecisionBrain renders chart elements using the HTML canvas. Besides keeping the DOM light, this brings several benefits:
- Improved performance of rendering
- High flexibility in rendering
- Easy filtering of the rendered elements, based on what will be visible on screen
- Possibility to render sub-pixel elements when needed (zooming out on a large dataset, for example)
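The visibility filtering mentioned above can be sketched as a pure function: before painting the canvas, only events overlapping the visible time window and the visible row range are kept. This is an illustrative sketch, not the actual component code.

```typescript
// Hypothetical event shape used by a canvas renderer.
interface Ev {
  rowIndex: number; // vertical position (resource row)
  start: number;    // time range of the event
  end: number;
}

// Keep only the events that intersect the visible time window
// [windowStart, windowEnd] and the visible rows [firstRow, lastRow].
function visibleEvents(
  events: Ev[],
  windowStart: number,
  windowEnd: number,
  firstRow: number,
  lastRow: number
): Ev[] {
  return events.filter(
    (e) =>
      e.end > windowStart &&
      e.start < windowEnd &&
      e.rowIndex >= firstRow &&
      e.rowIndex <= lastRow
  );
}
```

With this filter in place, the cost of a repaint depends on what is on screen rather than on the total dataset size.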
The example below shows a Gantt chart in IBM DOC/DB Gene. In this example, the Gantt chart displays a plan of randomly generated data containing 100,000 events for 200 resources. The “show-all” feature is possible thanks to this canvas-based implementation. With the data already in memory, the rendering of this chart takes less than 400ms (depending on client hardware):
Loading Performance
Loading Steps
The loading is performed in two steps:
- First, load the resources. If the resource is a single primitive field of the event entity, distinct values for this field are retrieved. If the configured resource is another entity, all entities are retrieved
- Once the resource entities are retrieved, events are retrieved
The main reason why these two steps are needed, even when loading all data at once, is to support the option to display resources without any assigned events.
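The first step, in the primitive-field case, amounts to a distinct-values query. In the platform this runs server-side; the sketch below simulates it in memory with illustrative names, just to show the idea.

```typescript
// Hypothetical raw event whose resource is a primitive field
// ("machine") rather than a separate entity.
interface RawEvent {
  machine: string;
  start: number;
  end: number;
}

// Step 1 (primitive-field case): the distinct values of the field
// become the resource rows. A real implementation would issue a
// DISTINCT query server-side instead of scanning events client-side.
function distinctResources(events: RawEvent[]): string[] {
  return [...new Set(events.map((e) => e.machine))].sort();
}
```

When the resource is a separate entity, step 1 instead fetches all resource entities, which is what makes it possible to display resources that have no events at all.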
Default Loading
The default loading gets all the resources, then all the events, before building the complete Gantt chart data model.
Dynamic Loading
When manipulating large datasets, it makes little sense to load all the data at once, especially when the resulting Gantt chart displays only 20 to 30 rows. The main idea is to implement something similar to pagination in data grids, which typically comes in two flavors:
- Classic pagination: display one page of 50 rows at a time, and provide a page navigator
- Infinite scroll: display the first 50 rows, and when the user scrolls down, start loading the next rows
The Gantt chart of IBM DOC/DB Gene uses an approach similar to infinite scroll, with the difference that it loads all resources up front, together with some aggregated data.
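The infinite-scroll part boils down to computing which resource rows are currently visible from the scroll offset, and requesting events only for those rows. A minimal sketch, with an optional overscan margin so the next rows start loading before they come into view (names and parameters are assumptions, not the actual API):

```typescript
// Compute the range of visible resource rows for a given scroll
// position. "overscan" extends the range so rows just outside the
// viewport are prefetched.
function visibleRowRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan = 5
): { first: number; last: number } {
  const first = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const last = Math.min(
    totalRows - 1,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1 + overscan
  );
  return { first, last };
}
```

The event query for a scroll step then filters on the resources in `[first, last]`, keeping each response small regardless of the total dataset size.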
In this situation, the data source builds the resource tree by using IBM DOC/DB Gene’s chart aggregation API. It can build a tree of resources, associated with the minimum and maximum timestamps of the associated events, without actually loading the events. If the resources are entities, only the internal IDs are loaded at this stage. Labels are loaded in a later step.
Loading the whole resource tree, as well as the time range information, allows the component to display the correct time range and vertical scrollbar as soon as the resource tree is ready, before loading the events.
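The aggregation step produces, for each resource, the minimum and maximum timestamps of its events. In IBM DOC/DB Gene this is a server-side aggregation query; the sketch below simulates the same computation in memory to show what the data source receives (names are illustrative).

```typescript
// Hypothetical event row as seen by the aggregation.
interface EvRow {
  resourceId: string;
  start: number;
  end: number;
}

// For each resource, compute the min start and max end of its
// events. In production this would be a GROUP BY aggregation on
// the server, so the events themselves are never transferred.
function aggregateTimeRanges(
  events: EvRow[]
): Map<string, { min: number; max: number }> {
  const out = new Map<string, { min: number; max: number }>();
  for (const e of events) {
    const agg = out.get(e.resourceId);
    if (!agg) {
      out.set(e.resourceId, { min: e.start, max: e.end });
    } else {
      agg.min = Math.min(agg.min, e.start);
      agg.max = Math.max(agg.max, e.end);
    }
  }
  return out;
}
```

From these per-resource ranges, the component can derive the overall time range and size the scrollbars before any event has been loaded.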
The queries are managed in a queue that allows cancelling unnecessary queries if the user scrolls faster than the data is loaded.
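A minimal sketch of such a cancellable queue, assuming an `AbortController`-based HTTP layer (the actual IBM DOC/DB Gene implementation may differ): starting a new query aborts the one still in flight, so responses for rows the user has already scrolled past are never processed.

```typescript
// Each new query cancels the pending one via the standard
// AbortController; the actual fetch is supplied by the caller
// and should pass the signal to its HTTP request.
class QueryQueue {
  private current: AbortController | null = null;

  run<T>(query: (signal: AbortSignal) => Promise<T>): Promise<T> {
    this.current?.abort(); // cancel the stale in-flight query, if any
    const controller = new AbortController();
    this.current = controller;
    return query(controller.signal);
  }
}
```

A real queue would also deduplicate identical requests and handle abort errors; this sketch only shows the cancellation mechanism.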
Loading the entire dataset with the dynamic approach may take longer than with the classic approach, due to the larger number of queries and the filters they apply. However, there is no risk of timeout (from the HTTP request or the database access), since the amount of data loaded at once is limited.
Performance Comparison of Classic vs Dynamic Loading
The following chart gives an approximate comparison of the performance between the two models, DEFAULT (load everything) and DYNAMIC (infinite-scroll), for three different datasets containing, respectively:
- 150 resources and 3,000 events
- 500 resources and 100,000 events
- 1,500 resources and 300,000 events
Remarks:
- The smallest dataset has a different number of events per resource, which explains the shorter timing for the dynamic loading. For the larger datasets, the timings are similar because the number of events to load and parse is very close.
- Only the main event query (the first event query in the case of the dynamic loading) is displayed.
- The rendering time was left on the chart, but is negligible compared to the loading and parsing times.
Conclusion
Performance Improvements
Thanks to progressive data loading, the time to display the Gantt chart was brought down from more than 10 seconds (30 seconds for the largest dataset) to less than one second. The time to display a new page while scrolling is around half a second under these testing conditions.
Paths for Further Improvements
The current implementation only relies on loading data progressively based on the visible resources. Other dimensions and aggregations could be used to load the data progressively:
- Set the default view to a predefined period (e.g., current week) and only load events for this period. Then enable progressive loading while scrolling on the time axis
- Collapse the resource groups by default and only load data when a group is expanded
- Load a summary of events (e.g., number of events between two timestamps) and load the details when zooming in or clicking the data
These approaches and more are possible through the extensive customization possibilities of IBM DOC/DB Gene’s widget controller framework.
About the Author
Cédric Villeneuve is a senior software architect with 25 years of experience across various industries and companies. He has been with DecisionBrain for the past 5 years. After contributing to customer projects, he now works on the IBM DOC/DB Gene platform. His main areas of focus are the user interface and data visualization components, and he is responsible for developing the Gantt chart. Cédric holds a Master’s degree in Computer Science and Automation from ENSPS (now Physique & Télécom Strasbourg), as well as a Master’s degree in Photonics and Imaging from the University of Strasbourg. You can reach Cédric at: [email protected]
At DecisionBrain, we deliver AI-driven decision-support solutions that empower organizations to achieve operational excellence by enhancing efficiency and competitiveness. Whether you’re facing simple challenges or complex problems, our modular planning and scheduling optimization solutions for manufacturing, supply chain, logistics, workforce, and maintenance are designed to meet your specific needs. Backed by over 400 person-years of expertise in machine learning, operations research, and mathematical optimization, we deliver tailored decision support systems where standard packaged applications fall short. Contact us to discover how we can support your business!
Ready to transform your operations?
Contact DecisionBrain today to discover how our solutions can help your business thrive.