Open source Apache Pinot advances as StarTree boosts real-time analytics and observability | VentureBeat


At its annual Real-Time Analytics Summit today, StarTree announced a series of major product updates aimed at making large-scale, real-time data analytics and observability more accessible in the cloud era.

StarTree is the leading commercial vendor behind the Apache Pinot real-time analytics data store platform. StarTree is led by CEO Kishore Gopalakrishna, who was formerly an engineer at LinkedIn, where the Pinot project was created. The basic idea behind Pinot is to provide a reliable, high-speed data store with the right types of indexes and optimizations to enable real-time analytics at scale. StarTree builds on top of Apache Pinot, providing a cloud-based real-time analytics-as-a-service platform. Among the many large enterprises that rely on Pinot are Stripe, Walmart and DoorDash. StarTree and Apache Pinot compete against multiple technologies, including other open-source options such as the StarRocks online analytical processing (OLAP) database.

The core announcements at the Real-Time Analytics Summit center around the open-source Apache Pinot project as well as StarTree’s commercial offerings that build upon and extend Pinot’s real-time analytics capabilities. Key updates include a new serverless cloud service, native integrations with leading data visualization tools including Grafana and Tableau, general availability of the ThirdEye observability service, vector search support and a new cloud write API.

Real-time data has many different use cases. With the new updates, StarTree is doubling down on helping users with observability.


“Now if you look at metrics, logs and traces that’s another area where we are seeing a lot more pull and people are pushing all these metrics and events into Pinot,” Gopalakrishna told VentureBeat. “Real time is becoming more and more widely applicable and a lot more use cases are popping up.”

To help support observability use cases, StarTree is launching its ThirdEye technology today as a generally available service.

Chinmay Soman, head of product at StarTree, explained to VentureBeat that ThirdEye helps with identifying anomalies in real time as well as with triage and root cause analysis. Soman said that the focus with ThirdEye is on complex business metrics that traditional monitoring systems struggle with. With Apache Pinot at the foundation, the ThirdEye service can compute metrics from data points in real time.

It’s important to note that business metrics are not the same as the system metrics that describe IT infrastructure. For example, Soman said that ride-sharing vendor Uber at one point had all of its system metrics monitored by a PagerDuty system. That technology cannot easily monitor driver supply, a derived, computed metric that Soman said is difficult for a traditional monitoring system to handle.
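To make the distinction concrete, here is a minimal sketch of what a derived business metric and a basic anomaly check might look like. It is not ThirdEye’s method: the metric definition (`driver_supply_ratio`), the sample numbers and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def driver_supply_ratio(active_drivers, open_requests):
    """Hypothetical derived metric: active drivers per open ride request."""
    # Guard against division by zero when there are no open requests.
    return active_drivers / max(open_requests, 1)


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` standard deviations
    from the mean of `history` (a simple z-score check)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Illustrative (made-up) windows of raw counts: (active_drivers, open_requests).
windows = [(120, 100), (118, 98), (125, 102), (121, 99), (119, 101)]
history = [driver_supply_ratio(d, r) for d, r in windows]

# A sudden drop in drivers relative to demand.
latest = driver_supply_ratio(40, 110)
print(is_anomalous(history, latest))
```

The point of the sketch is that the monitored value never exists as a raw data point. It has to be computed from other events before any alerting logic can run, which is the step Soman describes as difficult for traditional monitoring systems.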

The new Apache Pinot 1.1 release is also being highlighted at the conference.

The big highlight in the release is vector index support. Vector indexes are increasingly common in large language model (LLM) and generative AI use cases. In recent months, many database technologies have added vector support. It has become such a core database capability that Google recently announced that all of its cloud databases would support vectors.

There are multiple ways of enabling vector index search in a database. With Apache Pinot 1.1, support has been added for Hierarchical Navigable Small World (HNSW) graphs.
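For readers unfamiliar with vector search, the sketch below shows the underlying operation: finding the stored vectors most similar to a query vector. It uses exact brute-force cosine similarity, which compares the query against every stored vector. It is not HNSW and not Pinot’s implementation. HNSW is an approximate, graph-based index designed to avoid that full scan. The function names and sample embeddings are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Exact nearest-neighbor search: score every vector, keep the best k."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]


# Made-up three-dimensional embeddings keyed by document ID.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}

nearest = top_k([1.0, 0.05, 0.0], embeddings, k=2)
print(nearest)
```

The cost of this exact scan grows linearly with the number of stored vectors, which is why databases typically add an approximate index for large collections.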

Gopalakrishna noted that adding support for vectors wasn’t a particularly hard thing to do in Pinot as the technology already has multiple types of indexes.

“Where Pinot shines is the ability to provide very low latency even at high concurrency,” he said. “We have different types of indexes you can index on the geo, text data, JSON, numerical data and so we have very specialized indexes for all of these data types.”
