DataHub targets SQL query log context to stop hallucinated joins

AI | May 28, 2026 at 05:00 PM

Source: VentureBeat

TLDR: NEW YORK - Miro’s Snowflake agents missed the right joins more than 65% of the time because context was missing. DataHub now mines SQL query history to index proven joins for agents.

Key Takeaways:

Miro tested Claude-based agents against Snowflake and hit join-routing failures across 10,000 tables when no semantic guide was available.
DataHub will release Context Intelligence Thursday, building a semantic index from validated SQL query history and exposing it through MCP and agent frameworks.
Instead of guessing from raw schema, agents retrieve “semantic anchors” from golden queries, with domain experts resolving conflicts before publishing.

“Hallucinated joins” is starting to sound less like a model flaw and more like bad plumbing. DataHub is basically turning years of analysts’ trial and error into a runtime compass, so agents can stop improvising.

Q&A

If query history reflects past mistakes, how does a semantic index avoid encoding bad joins forever?

DataHub’s approach relies on filtering for “golden queries” and human validation in Context Hub, including flagging conflicting metric definitions for review before updates.
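The review step described above can be pictured with a small sketch. This is not DataHub's implementation or API: the function names, the `(metric_name, sql_expression)` input shape, and the whitespace-and-case normalization are all assumptions made for illustration. It groups validated definitions by metric and holds back any metric with more than one distinct definition for expert review.

```python
from collections import defaultdict


def normalize(expr: str) -> str:
    # Hypothetical normalization: collapse whitespace and case so trivially
    # different spellings of the same expression compare equal.
    return " ".join(expr.lower().split())


def split_by_conflict(definitions):
    """Separate metrics with one agreed definition from conflicting ones.

    `definitions` is an iterable of (metric_name, sql_expression) pairs,
    an assumed shape for entries taken from validated "golden" queries.
    """
    by_metric = defaultdict(set)
    for metric, expr in definitions:
        by_metric[metric].add(normalize(expr))

    # Exactly one distinct definition: safe to publish in this toy model.
    publishable = {m: next(iter(e)) for m, e in by_metric.items() if len(e) == 1}
    # More than one distinct definition: route to a human reviewer.
    needs_review = {m: sorted(e) for m, e in by_metric.items() if len(e) > 1}
    return publishable, needs_review
```

In this sketch nothing in `needs_review` reaches the published index until someone picks a winner, which mirrors the "resolve before publishing" idea in the answer above.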

Why does raw schema fail specifically for “join routing,” even when table names look obvious?

Schema says what exists, not what business question it answers. DataHub maps query history patterns into semantic anchors so agents pick the entities that previously solved the same intent.
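To make the idea of mining joins from query history concrete, here is a minimal sketch under strong assumptions. It is not how DataHub builds its index: the regex only recognizes conditions written with full table names (`orders.customer_id = customers.id`), it does not resolve aliases or distinguish `ON` clauses from `WHERE` filters, and `build_join_index`, `suggest_join`, and `min_count` are hypothetical names. A real system would use a proper SQL parser.

```python
import re
from collections import Counter

# Toy pattern: matches "table.column = table.column" equality conditions.
JOIN_COND = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")


def build_join_index(queries):
    """Count how often each table-to-table join condition appears."""
    counts = Counter()
    for sql in queries:
        for lt, lc, rt, rc in JOIN_COND.findall(sql.lower()):
            if lt == rt:
                continue  # ignore conditions within a single table
            # Sort the two sides so "a.x = b.y" and "b.y = a.x" share one key.
            key = tuple(sorted([(lt, lc), (rt, rc)]))
            counts[key] += 1
    return counts


def suggest_join(index, table_a, table_b, min_count=2):
    """Return the most frequently used condition joining the two tables,
    or None if no condition was seen at least `min_count` times."""
    wanted = {table_a.lower(), table_b.lower()}
    candidates = [
        (n, key)
        for key, n in index.items()
        if {key[0][0], key[1][0]} == wanted and n >= min_count
    ]
    if not candidates:
        return None
    _, ((lt, lc), (rt, rc)) = max(candidates)
    return f"{lt}.{lc} = {rt}.{rc}"
```

An agent using something like this would look up the historically proven condition for a pair of tables rather than inferring one from column names, and would get `None` back when history offers no sufficiently repeated precedent.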

What happens when teams change definitions and the old join logic stops matching reality?

A continuously refreshed semantic layer can reindex newer validated queries, while Context Hub surfaces disagreements and lets experts resolve conflicts before the runtime index is used.

Could platform-specific tools replace this, or is the integration strategy the point?

DataHub positions the layer as platform-neutral by provisioning context into existing endpoints like Snowflake semantic views and Microsoft Fabric IQ rather than forcing a single replacement system.

Why might governance teams care more than developers here, even though the product targets agents?

Lineage tracking and audit readiness depend on trusted context. When agents reuse validated joins, decisions and reported metrics line up more consistently with governance expectations.
