Newest Top-Level Projects Enhance Data and AI Workload Management
The Apache Software Foundation (ASF) has announced that two of its projects, Apache Gravitino and Apache StormCrawler, have graduated from the Apache Incubator to become Top-Level Projects (TLPs). The announcement, made on June 3, 2025, in Wilmington, DE, gives organizations two community-vetted options for managing data and AI workloads.
Apache Gravitino: Revolutionizing Metadata Management
Apache Gravitino is an open-source metastore that offers a unified approach to managing metadata across platforms. By bringing together metadata from data warehouses, lakes, lakehouses, streaming platforms, and AI systems, Gravitino addresses the long-standing issue of data silos. This capability is particularly valuable for organizations that need to manage large and varied data and AI assets efficiently.
The key to Gravitino’s success lies in its flexible and centralized architecture, which simplifies modern data infrastructure management. It supports a wide range of ecosystems, including popular platforms like Apache Iceberg, Apache Hive, Apache Kafka, MySQL, and PostgreSQL. This broad compatibility ensures that organizations can leverage Gravitino for intelligent data discovery, effective governance, and efficient lakehouse federation on a large scale.
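To make the idea of a centralized metastore concrete, here is a minimal Python sketch of one registry that indexes tables from several catalogs. It is an illustration only and is not Gravitino's API: the names `UnifiedMetastore`, `TableInfo`, `register_catalog`, and `find` are hypothetical, and the catalog contents are hard-coded rather than read from a live Iceberg or PostgreSQL system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableInfo:
    """One table as seen through the unified view (hypothetical type)."""
    catalog: str
    schema: str
    name: str
    provider: str


@dataclass
class UnifiedMetastore:
    """Toy registry: catalog name -> (provider, {schema: [table names]})."""
    _catalogs: dict = field(default_factory=dict)

    def register_catalog(self, name, provider, schemas):
        # Reject duplicates so every catalog name stays unambiguous.
        if name in self._catalogs:
            raise ValueError(f"catalog already registered: {name}")
        self._catalogs[name] = (
            provider,
            {schema: list(tables) for schema, tables in schemas.items()},
        )

    def list_tables(self):
        # Flatten every catalog into one sorted, provider-tagged listing.
        result = []
        for catalog in sorted(self._catalogs):
            provider, schemas = self._catalogs[catalog]
            for schema in sorted(schemas):
                for table in sorted(schemas[schema]):
                    result.append(TableInfo(catalog, schema, table, provider))
        return result

    def find(self, table_name):
        # Data discovery across silos: search all catalogs by table name.
        return [t for t in self.list_tables() if t.name == table_name]


if __name__ == "__main__":
    store = UnifiedMetastore()
    store.register_catalog("lake", "iceberg", {"sales": ["orders"]})
    store.register_catalog("ops", "postgresql", {"public": ["orders", "users"]})
    for table in store.find("orders"):
        print(f"{table.catalog}.{table.schema}.{table.name} ({table.provider})")
```

The point of the sketch is the single lookup path: a caller asks one place for `orders` and sees every system that holds a table by that name, which is the kind of discovery a unified metastore is meant to enable.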
Jack Song, Director of Uber Data Platform, expressed enthusiasm about Gravitino’s potential: "Gravitino is uniquely designed to bridge data and AI workloads. We’re excited to deploy it across our multi-cloud AI clusters and contribute to many prioritized AI and agent-based use cases." Song’s comments underscore the maturity and readiness of Gravitino to tackle complex data challenges, backed by an active and engaged community.
Apache StormCrawler: Empowering Web Crawling Solutions
Apache StormCrawler is the second project to reach TLP status. It is an open-source software development kit (SDK) for developers building low-latency, scalable, and customizable web crawlers. The project consists of a collection of reusable resources, primarily written in Java, and runs on Apache Storm®.
StormCrawler is particularly adept at handling environments where URLs need to be fetched and parsed continuously over time. It is also well-suited for large-scale, recursive web crawling, especially in scenarios where fast response times are crucial. This capability makes it an invaluable tool for developers and organizations that rely on timely data collection and processing.
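The pattern described here, fetching URLs continuously while discovering new ones recursively, can be sketched with a small scheduling queue. The Python below is a conceptual sketch only and does not use StormCrawler or Apache Storm: the `Frontier` class, its method names, and the fixed `refetch_interval` are all assumptions made for illustration, and no network requests are performed.

```python
import heapq
from urllib.parse import urldefrag, urljoin


class Frontier:
    """Toy crawl frontier: a priority queue of URLs ordered by next fetch time."""

    def __init__(self, refetch_interval=3600.0):
        self.refetch_interval = refetch_interval  # seconds between revisits
        self._heap = []      # entries are (next_fetch_time, url)
        self._known = set()  # every URL ever scheduled, for deduplication

    def add(self, url, now=0.0):
        # Drop the fragment so "page#a" and "page#b" count as one URL.
        url, _fragment = urldefrag(url)
        if url in self._known:
            return False
        self._known.add(url)
        heapq.heappush(self._heap, (now, url))
        return True

    def next_due(self, now):
        # Return the earliest URL whose fetch time has arrived, if any.
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[1]
        return None

    def mark_fetched(self, url, outlinks, now):
        # Continuous crawling: reschedule the page for a later revisit.
        heapq.heappush(self._heap, (now + self.refetch_interval, url))
        # Recursive crawling: enqueue newly discovered links right away.
        for link in outlinks:
            self.add(urljoin(url, link), now)


if __name__ == "__main__":
    frontier = Frontier(refetch_interval=10.0)
    frontier.add("https://example.org/")
    page = frontier.next_due(0.0)
    frontier.mark_fetched(page, ["/a", "b#section"], now=1.0)
    print(frontier.next_due(1.0))
    print(frontier.next_due(1.0))
```

A real crawler would add politeness delays, robots.txt handling, and distributed state. The sketch only shows why a time-ordered queue suits workloads where the same URLs must be revisited over time.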
Julien Nioche, a member of the Apache StormCrawler Project Management Committee, shared his excitement about the project’s new status: "Becoming an Apache Software Foundation Top-Level Project is a significant milestone for an open-source community, and we are extremely proud of the accomplishment." Nioche looks forward to seeing how the StormCrawler community will continue to grow, innovate, and collaborate as a TLP.
The Role of the Apache Software Foundation
The Apache Software Foundation plays a pivotal role in nurturing open-source projects and communities. Founded in 1999, ASF has been at the forefront of open-source innovation, powering some of the world’s most ubiquitous software projects, such as Apache Airflow, Apache Camel, Apache Cassandra, Apache Groovy, Apache HTTP Server, and Apache Kafka. The foundation’s mission is to advance software for the public good, setting industry standards and promoting best practices.
ASF supports projects throughout their lifecycle, offering services and mentorship to build strong and resilient communities. Through the Apache Incubator, the foundation provides services to incoming projects, known as podlings, that aspire to join ASF and adhere to the "Apache Way," a set of practices that emphasize community-driven development and collaboration.
The ASF's annual Community Over Code event is a cornerstone of its efforts to foster collaboration and knowledge sharing among open-source technologists. The event gives participants a platform to exchange best practices, discuss use cases, and build the relationships that drive innovation in the field. More information is available on the Community Over Code website.
Conclusion
The graduation of Apache Gravitino and Apache StormCrawler to Top-Level Project status is a testament to the strength and vitality of the open-source community. These projects exemplify the potential of open-source solutions to address complex challenges in data and AI workload management, offering scalable, efficient, and innovative tools for organizations across the globe.
As data and AI infrastructure continues to evolve, projects like Gravitino and StormCrawler are positioned to influence how organizations approach metadata management and web crawling. Continued support and guidance from the Apache Software Foundation give these projects, and others like them, the resources and community backing they need to thrive.
For more information about the Apache Software Foundation and its projects, visit the Apache Software Foundation website.
This article reflects the latest advancements in open-source software development, highlighting the collaborative efforts of the global tech community to build powerful, accessible, and sustainable solutions for the digital age.