How to take advantage of a generative tool fueling Glean’s $260M raise: GraphRAG


September 11, 2024 12:45 PM



When a sales representative at Glean, an innovative enterprise search company, needed to prepare for a crucial client meeting, they turned to their own powerful generative AI tool. Within minutes, the system had combed through years of emails, Slack messages, and recorded calls, providing a comprehensive overview of the client relationship and spotting opportunities that would have taken hours to uncover manually.

This wasn’t just another AI chatbot. It was a sophisticated search system that understood the complex web of relationships within the company’s data. The result? A level of insight that transformed how businesses could operate.

The power of this technology isn’t just theoretical. One of the world’s largest ride-sharing companies experienced its benefits firsthand. After dedicating an entire team of engineers to develop a similar in-house solution, they ultimately decided to transition to Glean’s platform.

“Within a month, they were seeing twice the usage on the Glean platform because the results were there,” says Matt Kixmoeller, CMO at Glean, in an interview with VentureBeat conducted in late August 2024. “They ended up estimating that across all of their employee base, that everyone was saving, on average, two to three hours a week on finding information faster. And that was over $200 million in savings for them globally.”

This staggering ROI isn’t an isolated incident. As businesses rush to integrate generative AI into their operations, a powerful technology is emerging as the secret ingredient for truly transformative applications: knowledge graphs.

A data engineer’s secret weapon

For data engineers, the pressure to optimize data pipelines, improve data quality, and enhance AI performance while operating under tight budget constraints is relentless. Enter knowledge graphs.

By representing complex data relationships in an intuitive, flexible format, knowledge graphs are revolutionizing how businesses handle, understand, and leverage their vast information ecosystems. This technology is proving particularly powerful when combined with Retrieval Augmented Generation (RAG) systems, giving birth to GraphRAG – an approach that significantly improves the accuracy and context-awareness of AI outputs.

The market is taking notice, with Glean securing a massive $260 million in its latest funding round announced yesterday. From turnkey solutions to advanced custom implementations, knowledge graphs are offering data professionals a spectrum of options to transform their data strategies.

While the initial investment can be significant, the long-term benefits in data integration, gen AI performance, and operational efficiency are substantial. As the technology matures and becomes more accessible, knowledge graphs are poised to become an essential tool for data teams looking to build more intelligent, context-aware, and efficient data ecosystems.

To grasp the concept of knowledge graphs, think of them as a complex sentence or paragraph:

Nodes are like nouns, representing entities or concepts. For example, “customer,” “product,” or “sales meeting.”
Edges are like verbs, showing relationships between nodes. For instance, “purchased,” “attended,” or “is interested in.”
Properties are akin to adjectives or adverbs, providing additional information about nodes or edges. They might include details like “purchase date,” “meeting duration,” or “interest level.”
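
The noun, verb, and adjective analogy above can be sketched as a small in-memory data structure. This is a minimal illustration, not how Glean or Neo4j store graphs: the class names, the helper function, and the sample customer, product, and meeting are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str  # the "noun", e.g. "customer"
    properties: dict = field(default_factory=dict)  # the "adjectives"


@dataclass
class Edge:
    source: str
    relation: str  # the "verb", e.g. "purchased"
    target: str
    properties: dict = field(default_factory=dict)


class KnowledgeGraph:
    """Toy property graph: nodes keyed by id, edges kept in a list."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, relation, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, relation, target, properties))

    def outgoing(self, node_id, relation=None):
        # All edges leaving node_id, optionally restricted to one relation.
        return [
            e for e in self.edges
            if e.source == node_id and (relation is None or e.relation == relation)
        ]


def build_example_graph():
    # Hypothetical sample data, invented for illustration only.
    kg = KnowledgeGraph()
    kg.add_node("acme", "customer", name="Acme Corp")
    kg.add_node("widget", "product", name="Widget Pro")
    kg.add_node("q3_review", "sales meeting", meeting_duration_minutes=45)
    kg.add_edge("acme", "purchased", "widget", purchase_date="2024-06-01")
    kg.add_edge("acme", "attended", "q3_review")
    kg.add_edge("acme", "is interested in", "widget", interest_level="high")
    return kg
```

Calling build_example_graph().outgoing("acme", "purchased") returns the single "purchased" edge along with its purchase date property, which is the kind of relationship lookup a flat table or text chunk does not expose directly.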

This added dimensionality lets automated systems surface insights in corporate data that would otherwise be hard to identify, but it also brings extra complexity.

“A knowledge graph allows you to represent and query these complex relationships efficiently,” said Neo4j CTO Philip Rathle. “When you look at trying to do this across every piece of data in your organization, the scale required, the security required, the permissions required, all of that becomes a real issue.”

Retrieval Augmented Generation (RAG) and GraphRAG

RAG is a technique that enhances AI models by providing them with relevant information retrieved from a knowledge base before generating a response. Traditional RAG systems often rely on vector databases to locate chunks of text based on semantic similarity.
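
The retrieval step of vector-based RAG can be sketched roughly as follows. This is a toy under stated assumptions: the embed function is a bag-of-words stand-in for a real embedding model, the list of chunks stands in for a vector database, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for an embedding model: lowercase term counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    # Rank every chunk by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    # The retrieved chunks are placed ahead of the question, so the model
    # sees them before generating a response.
    context = "\n".join(retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Note that nothing in this pipeline knows how the chunks relate to one another; each chunk is scored in isolation against the query.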

GraphRAG takes this concept further by leveraging the structured relationships in knowledge graphs. As Arjun Landes, engineering manager at Glean, explains: “The fact that we were able to build such a sophisticated knowledge graph and combine it with LLMs is where the real power is.”

In practice, GraphRAG allows for more nuanced and context-aware information retrieval than simple vector search by itself. “You’re loading dice with RAG with vectors, but you know, loading dice isn’t good enough if you’re doing equipment maintenance or complex customer service for a high-value customer,” said Rathle.

Instead of just finding similar text chunks, it can traverse relationships between entities, understand hierarchies, and capture complex dependencies that flat text representations might miss. This can dramatically reduce hallucinations and increase explainability when leveraging LLM outputs.
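
Traversing relationships, as opposed to matching similar text, might look like the sketch below: a breadth-first walk outward from a starting entity that collects every fact reachable within a set number of hops. The triples, entity names, and function names are hypothetical, and a production system would typically run this as a query against a graph database.

```python
from collections import deque

# Hypothetical (subject, relation, object) triples, invented for illustration.
TRIPLES = [
    ("Acme Corp", "purchased", "Widget Pro"),
    ("Widget Pro", "depends on", "Sensor Module"),
    ("Sensor Module", "supplied by", "Parts Inc"),
    ("Acme Corp", "attended", "Q3 review"),
]


def neighbors(entity, triples):
    # Outgoing (relation, object) pairs for one entity.
    return [(r, o) for s, r, o in triples if s == entity]


def traverse(start, triples, max_hops=2):
    """Breadth-first walk from start; returns the facts found within max_hops."""
    facts = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand entities at the hop limit
        for relation, obj in neighbors(entity, triples):
            facts.append((entity, relation, obj))
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, depth + 1))
    return facts


def facts_as_context(facts):
    # Each fact becomes one plain sentence that can be shown to an LLM,
    # and cited back to the user as the source of an answer.
    return "\n".join(f"{s} {r} {o}." for s, r, o in facts)
```

With max_hops=2, a question about Acme Corp pulls in the fact that Widget Pro depends on Sensor Module even though no text mentioning Acme says so. Because every returned fact is an explicit edge, the path behind an answer can be inspected, which is one way to read the explainability claim.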

“What ultimately makes GraphRAG the right solution and desirable is: higher accuracy – potentially 100% accuracy in cases where there is an exact answer,” said Rathle. “And explainability and security, because with vector based RAG, and certainly with LLMs, there are limited hooks for being able to apply security rules.”
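
One way to picture a security hook on graph data is to tag each fact with the groups allowed to read it and filter before anything reaches the model. The sketch below is an assumption about how such a rule could be applied, not a description of Neo4j's or Glean's permission model; the facts, group names, and function name are hypothetical.

```python
# Hypothetical facts, each tagged with the groups permitted to read it.
FACTS = [
    {"triple": ("Acme Corp", "purchased", "Widget Pro"),
     "allowed": {"sales", "support"}},
    {"triple": ("Acme Corp", "renewal date", "2025-01-15"),
     "allowed": {"sales"}},
    {"triple": ("Acme Corp", "open ticket", "Ticket 4521"),
     "allowed": {"support"}},
]


def visible_facts(facts, user_groups):
    # Keep only facts whose allowed groups overlap the user's groups.
    groups = set(user_groups)
    return [f["triple"] for f in facts if f["allowed"] & groups]
```

A support user would receive the purchase and the open ticket but not the renewal date, so the restricted fact never enters the prompt in the first place.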

Implementing knowledge graphs on a budget

For many organizations, especially those with tight budgets, implementing knowledge graph technology might seem daunting.

However, there are cost-effective ways to incorporate this technology into existing data infrastructures.

Dexter Tortoriello, co-founder and CTO of MindPalace, a startup building a generative tool that will organize and leverage an individual’s different kinds of personal information, offers some insight: “I think we’re still very early in the consolidation phase [of GraphRAG services]. So I think we’re still on the side where people would rather have building blocks and build their things.” While turnkey solutions like Glean are available, there’s also room for more budget-friendly, DIY approaches.

Open-source tools and community-driven initiatives can significantly reduce implementation costs. Neo4j offers a community edition that is free for smaller-scale projects, Amazon Neptune is integrated with AWS, and projects like NebulaGraph provide open-source frameworks for building knowledge graphs.

Rathle explains the value proposition of Neo4j: “We are the technology provider for anyone who wants a knowledge graph, or has data that, once loaded into a graph database, can be used as a knowledge graph. We provide all the connectors and APIs and query languages, hosted service and tooling for visualizing and querying and natural language to query, and that whole side of things.”

The future of knowledge graphs and enterprise data

As the technology matures, we’re likely to see the automated creation of knowledge graphs become more accessible and cost-effective. Michael Hunger, Neo4j’s head of product innovation, points out: “There will be models that are fine-tuned for entity and relationship extraction. So it will be, I would say, at least two orders of magnitude cheaper to extract entities than it is today with the big LLMs.”

With enterprises adopting knowledge graphs for data management, generative frameworks like LangChain and LlamaIndex are emerging as powerful allies.

By structuring its agentic workflows as interconnected nodes and edges, LangChain facilitates efficient querying and retrieval, improving performance through enhanced data retrieval, contextual understanding, and scalability. Its natural language querying feature allows users to interact with graph databases like Neo4j and Amazon Neptune through intuitive interfaces.

LlamaIndex offers a flexible framework for building and querying knowledge graphs using LLMs, making it ideal for advanced RAG applications. It provides tools and APIs for constructing knowledge graphs from text documents and retrieving information.

Key features include graph construction and storage, natural language querying, and a property graph index that enables richer modeling and querying by categorizing nodes and relationships with metadata, enhancing the accuracy and governance of AI systems.

Challenges and considerations

Despite the promising future, adopting knowledge graph technology comes with its challenges. Data integration issues and the need for specialized skills can be significant hurdles.

Kixmoeller from Glean acknowledges these roadblocks: “One of the things that is still very challenging is that enterprise environments are actually very, very messy and complicated. There is so much information that is spread across many different systems. Connecting and retrieving this knowledge with AI techniques, and the governance of all that knowledge, is still very difficult.”

To overcome these challenges, organizations might need to invest in training programs or partner with knowledge graph experts. As the technology becomes more mainstream, we can expect an increase in skilled professionals and more user-friendly tools to emerge.
