Domain-Specific Data Enrichment Benchmark

Comparing Kirha's domain-specific knowledge against standard web search.

Kirha Score: 87/100

Web Search Score: 61/100

Across all 100 tests, Kirha injected 233,920 tokens into LLM context, compared to 4,604,853 for Web Search.

Kirha uses about 95% fewer tokens.
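The reduction figure can be checked directly from the two token totals reported above. This is a minimal sketch; the variable names are illustrative and the only inputs are the two totals and the test count stated in this report.

```python
# Token totals reported for the full 100-test run.
kirha_tokens = 233_920
web_search_tokens = 4_604_853
num_tests = 100

# Fraction of tokens saved relative to Web Search.
reduction = 1 - kirha_tokens / web_search_tokens

# How many times more tokens Web Search injected.
ratio = web_search_tokens / kirha_tokens

# Average tokens injected per test for each system.
kirha_per_test = kirha_tokens / num_tests
web_search_per_test = web_search_tokens / num_tests

print(f"Token reduction: {reduction:.1%}")
print(f"Web Search / Kirha ratio: {ratio:.1f}x")
print(f"Average per test: {kirha_per_test:.0f} vs {web_search_per_test:.0f}")
```

The reduction comes out just under 95%, which is where the rounded headline figure comes from.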

Methodology

Datasets

This benchmark compares Kirha against standard web search on domain-specific queries where Kirha has specialized knowledge integrations: Company Data, Insurance, and Crypto/Blockchain.

Kirha uses web search as a fallback when it doesn't have domain-specific knowledge. This benchmark tests domains where Kirha's specialized integrations provide structured, accurate, and actionable data compared to generic web results.

Evaluation Process

Each query is executed in parallel against both Kirha and a standard web search. The raw results are then processed through a summarization step using Gemini 2.5 Flash to extract the most relevant information and normalize the output format.
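The parallel execution and summarization step might look roughly like the sketch below. The functions `query_kirha`, `query_web_search`, and `summarize` are hypothetical stand-ins written for this illustration; they are not Kirha's or Gemini's actual APIs, and the real harness may be structured differently.

```python
from concurrent.futures import ThreadPoolExecutor


def query_kirha(query: str) -> str:
    # Hypothetical stub: a real harness would call Kirha here.
    return f"  raw Kirha result for: {query}  "


def query_web_search(query: str) -> str:
    # Hypothetical stub: a real harness would call a web search API here.
    return f"  raw web search result for: {query}  "


def summarize(raw: str) -> str:
    # Hypothetical stub standing in for the Gemini 2.5 Flash summarization
    # step; here it only trims whitespace to normalize the output.
    return raw.strip()


def run_query(query: str) -> dict[str, str]:
    # Submit both backends at once so neither waits on the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        kirha_future = pool.submit(query_kirha, query)
        web_future = pool.submit(query_web_search, query)
        raw = {
            "kirha": kirha_future.result(),
            "web_search": web_future.result(),
        }
    # Summarize each raw result into the normalized form passed to the judge.
    return {name: summarize(text) for name, text in raw.items()}


results = run_query("Find latest fundraising in crypto.")
```

Running both backends on the same query at the same time keeps the comparison fair for time-sensitive data, since neither side sees a later snapshot than the other.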

The summarized responses are evaluated using an LLM-as-Judge approach with Gemini 2.5 Flash and extended thinking enabled. The judge scores each response from 0 to 100 on five criteria and determines a winner based on the total score.

A common best practice with LLM-as-Judge is to cross-reference scores against human evaluation and aim for a high correlation.
For this v1 we took a lighter approach: we asked Claude to review all results alongside their judge scores and flag inconsistencies.
Read the full report.
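For reference, the human-correlation check mentioned above, which this v1 did not run, could be sketched as follows. The score lists are hypothetical values invented for illustration, not data from this benchmark, and Pearson correlation is only one possible choice of measure.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five responses (illustration only).
judge_scores = [94, 48, 79, 70, 63]
human_scores = [90, 55, 75, 72, 60]

r = pearson(judge_scores, human_scores)
print(f"Judge vs. human correlation: {r:.2f}")
```

A value close to 1 would indicate that the judge ranks responses much as human reviewers do; a low value would suggest the judge prompt or criteria need revisiting.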

Evaluation Criteria

Relevance — How well does the response address the query?

Accuracy — Is the information correct and verifiable?

Completeness — Does it cover all aspects of the request?

Freshness — Is the data current and up-to-date?

Actionability — Can the user act on this information directly?
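One way the five criteria could be combined into a single score and a winner is sketched below. It assumes the total is the unweighted mean of the five criterion scores; the report does not state the weighting, so treat that as an assumption. The example scores are hypothetical.

```python
from statistics import mean

CRITERIA = ("relevance", "accuracy", "completeness", "freshness", "actionability")


def total_score(scores: dict[str, float]) -> float:
    # Assumption: unweighted mean of the five criteria, each on a 0-100 scale.
    missing = set(CRITERIA) - scores.keys()
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return mean(scores[name] for name in CRITERIA)


def pick_winner(kirha: dict[str, float], web_search: dict[str, float]) -> str:
    # The higher total wins; equal totals are reported as a tie.
    kirha_total = total_score(kirha)
    web_total = total_score(web_search)
    if kirha_total == web_total:
        return "Tie"
    return "Kirha" if kirha_total > web_total else "Web Search"


# Hypothetical judge output for one query (illustration only).
kirha_scores = {"relevance": 95, "accuracy": 90, "completeness": 85,
                "freshness": 95, "actionability": 90}
web_scores = {"relevance": 80, "accuracy": 70, "completeness": 60,
              "freshness": 50, "actionability": 65}

winner = pick_winner(kirha_scores, web_scores)
```

If some criteria matter more for a given domain, for example freshness for flight delays, a weighted mean would be a natural variation on this sketch.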

Score Comparison

[Charts: Kirha vs. Web Search scores, shown by metric and as a performance profile]

Test Results (100)

#	Query	Kirha	Web Search	Winner
1	list Construction and Real Estate latest 10 tenders in France and Germany above 10m euros.	94%	18%	Kirha
2	Show the volume trend for OpenSea over the past month.	97%	48%	Kirha
3	Give me the last Bitcoin news about the top 3 Bitcoin company holders.	90%	79%	Kirha
4	Find 5 contributors to C++ OCR projects.	89%	70%	Kirha
5	Summarize 5 latest Nvidia SEC filings.	93%	63%	Kirha
6	Find latest fundraising in crypto.	90%	70%	Kirha
7	List the 5 latest authors of research papers on RAG.	84%	70%	Kirha
8	find top 10 wallets that related with Tornado Cash (contract 0x910Cbd523D972eb0a6f4cAe4618aD62622b39DbF).	93%	54%	Kirha
9	Check the delay for flight U25112 on 7 December 2025.	94%	28%	Kirha
10	Insurance Risk related to Weather at Valence (spain), the 29 october 2024.	77%	87%	Web Search