
- Open Deep Search (ODS) is an open-source agentic framework that wraps around an LLM of your choice to close the gap between proprietary search AI solutions and their open-source counterparts
- 75.3% accuracy on the FRAMES benchmark, outperforming OpenAI's GPT-4o Search Preview (65.6%) by nearly 10 percentage points
- 88.3% accuracy on SimpleQA, nearly matching GPT-4o Search Preview's 90.0%
- We invite developers and builders to build upon our framework to innovate, starting from nearly the top of the leaderboards in search AI
There isn't much of a gap between closed and open offerings in LLMs (think of GPT-4 vs. DeepSeek-R1). However, there is a vast gulf between closed and open counterparts for advanced, agentic AI search (think of Gemini or Perplexity vs. no real options). In this work, Sentient Research bridges this gap. We release Open Deep Search (ODS), an open-source agent framework for closing the performance gap between proprietary AI search solutions and their open-source alternatives. ODS removes the restrictions of closed AI by delivering a fully open-source AI search framework that matches, and often surpasses, the performance of proprietary offerings. By democratizing access to state-of-the-art search technology, we're leveling the playing field for developers and builders to advance the state of the art in search AI.
Results: Matching Closed-Source Giants

When combined with the open-source DeepSeek-R1 model, ODS achieves:
- 75.3% accuracy on the FRAMES benchmark, outperforming OpenAI's GPT-4o Search Preview (65.6%) by nearly 10 percentage points
- 88.3% accuracy on SimpleQA, nearly matching GPT-4o Search Preview's 90.0%
The FRAMES evaluation benchmark from DeepMind consists of challenging, multi-hop questions that require both correct retrieval from multiple sources of information and accurate aggregation of the information retrieved. We prefer FRAMES because it is challenging enough that state-of-the-art proprietary solutions still struggle, and the evaluation data has not been exhausted.

Another benchmark, SimpleQA by OpenAI, is simpler but more popular. The above leaderboard contains several self-reported scores found on the web. We did not verify these scores but trust the sources. In contrast, since ODS is open source, anyone can verify our scores. One caveat is that DeepSeek-R1 has memorized many facts and achieves 82.4% accuracy on SimpleQA without access to the web. In that sense, SimpleQA is probably not the best benchmark of the advanced reasoning capabilities of the AI being tested, since each question only tests the factuality of a single piece of information. ODS leverages the latest advances in open-source reasoning LLMs, such as DeepSeek-R1, to boost its performance in search.
Open Deep Search
OpenDeepSearch is a lightweight yet powerful search tool designed for seamless integration with AI agents. It enables deep web search and retrieval, optimized for use with Hugging Face's SmolAgents ecosystem. ODS consists of two components that work with a base large language model (LLM) chosen by the user: the Open Search Tool and the Open Reasoning Agent. The Open Reasoning Agent interprets the given task and completes it by orchestrating a sequence of actions that includes calling tools, one of which is the Open Search Tool. The Open Search Tool is a novel web search tool that outperforms proprietary counterparts. Both are fully open source and available to test via the GitHub repository below.

Open Search Tool
Unlike existing open-source alternatives that simply pass raw search engine results to an LLM, our Open Search Tool implements a sophisticated search process that:
- Intelligently rephrases queries when necessary to capture the user's implicit intent
- Extracts and consolidates context from the top search snippets
- Applies advanced chunking and re-ranking to filter content above relevance thresholds
- Implements custom website handling for major sources like Wikipedia, ArXiv, and PubMed
These improvements dramatically enhance the quality of retrieved information, providing LLMs with much better context to work with.
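The chunking and re-ranking step listed above can be illustrated with a minimal sketch. It is not the Open Search Tool's actual implementation: the function names, the window sizes, the 0.2 threshold, and the bag-of-words cosine score (a stand-in for a learned re-ranker or embedding model) are all assumptions chosen to keep the example self-contained.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, chunks: list[str],
           threshold: float = 0.2, top_k: int = 3) -> list[tuple[float, str]]:
    """Score chunks against the query, drop those below threshold, keep the top_k."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine_similarity(query_vec, Counter(tokenize(c))), c) for c in chunks]
    kept = [(score, chunk) for score, chunk in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)[:top_k]


if __name__ == "__main__":
    passages = [
        "Robert Fripp founded King Crimson in 1968.",
        "Bananas are rich in potassium.",
    ]
    for score, chunk in rerank("King Crimson founder", passages):
        print(f"{score:.2f}  {chunk}")
```

Only the chunks that clear the threshold are passed on as context, so an off-topic passage never reaches the LLM even when few relevant passages are available.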
In the example below, from SimpleQA, ODS leverages the high-quality context retrieved by the Open Search Tool to identify the correct answer by cross-checking multiple sources. Perplexity Sonar Reasoning Pro, on the other hand, fails to retrieve the relevant information with its search engine.

Open Reasoning Agent
We provide two variants of our reasoning agent:
- ODS-v1: Based on the Chain-of-Thought and ReAct frameworks, enhancing the model's reasoning capabilities through step-by-step thought/action/observation cycles and tool integration. The ODS-v1 agent has access to three separate tools: the Open Search Tool, a calculator based on Wolfram-Alpha, and the “continue thinking” tool.
- ODS-v2: Based on the Chain-of-Code and CodeAct frameworks, enabling the model to generate executable Python code for more precise reasoning and tool use. The ODS-v2 agent has access to the Open Search Tool.
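To make the ODS-v2 idea concrete, here is a minimal sketch of a code-as-action step: the model's action is a snippet of Python that calls a search tool, and the printed output becomes the observation. This is an illustration under assumptions, not the ODS-v2 implementation: `web_search`, `run_generated_code`, and the canned search result are hypothetical, and the "generated" code is hard-coded where a real agent would obtain it from the LLM.

```python
import contextlib
import io


def web_search(query: str) -> str:
    """Hypothetical stand-in for the Open Search Tool; returns canned context."""
    canned = {"speed of light in m/s": "The speed of light is 299792458 m/s."}
    return canned.get(query, "no results")


def run_generated_code(code: str, tools: dict) -> str:
    """Execute model-written code with the tools in scope and capture its stdout."""
    namespace = dict(tools)
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # A real agent must sandbox this call; exec on untrusted code is unsafe.
        exec(code, namespace)
    return buffer.getvalue().strip()


# What an LLM might emit as its action (hard-coded here for illustration).
generated = """
context = web_search("speed of light in m/s")
digits = "".join(ch for ch in context if ch.isdigit())
print(int(digits) * 2)
"""

observation = run_generated_code(generated, {"web_search": web_search})

if __name__ == "__main__":
    print(observation)
```

Expressing the action as code lets the model combine retrieval and exact arithmetic in one step instead of asking the LLM to compute the number in free text.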
Both agents can intelligently orchestrate a sequence of actions to adapt to the complexity of each query. ODS-v1 takes advantage of few-shot samples of ReAct reasoning prompts which interweave Thought, Action, and Observation steps as in-context examples to instruct the model to call tools. Unlike fixed approaches that use the same number of searches for every query, both ODS-v1 and ODS-v2 judiciously determine when additional searches are needed.
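The Thought, Action, and Observation cycle, and the way the number of searches adapts to the question, can be sketched as a small loop. The sketch below is illustrative only: `scripted_policy` is a hypothetical stand-in for the LLM, `FAKE_INDEX` is canned data standing in for the Open Search Tool, and none of these names come from the ODS codebase.

```python
from typing import Callable, Optional

# Canned search results standing in for the Open Search Tool (illustrative data).
FAKE_INDEX = {
    "leader of King Crimson": "Robert Fripp",
    "Robert Fripp birth year": "1946",
}

Step = tuple[str, str, str]  # (thought, action, observation)


def search(query: str) -> str:
    """Look up a query in the canned index."""
    return FAKE_INDEX.get(query, "no results")


def scripted_policy(question: str, history: list[Step]) -> tuple[str, str, str]:
    """Hypothetical stand-in for the LLM: returns (thought, action name, argument)."""
    if not history:
        return ("I need to know who leads the band.", "search", "leader of King Crimson")
    if len(history) == 1:
        person = history[-1][2]  # observation from the first search
        return (f"Now I need the birth year of {person}.", "search", f"{person} birth year")
    return ("I have the birth year.", "finish", history[-1][2])


def react_loop(question: str, policy: Callable, tools: dict[str, Callable[[str], str]],
               max_steps: int = 5) -> tuple[Optional[str], list[Step]]:
    """Alternate thought/action/observation until the policy chooses to finish."""
    history: list[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, history)
        if action == "finish":
            return argument, history
        observation = tools[action](argument)
        history.append((thought, f"{action}[{argument}]", observation))
    return None, history  # step budget exhausted without an answer


if __name__ == "__main__":
    answer, trace = react_loop(
        "In what year was the leader of King Crimson born?",
        scripted_policy,
        {"search": search},
    )
    for thought, action, observation in trace:
        print(f"Thought: {thought}\nAction: {action}\nObservation: {observation}")
    print("Answer:", answer)
```

The loop issues a second search only because the first observation did not yet answer the question; a single-hop question would finish after one search.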
These optimizations maximize both efficiency and accuracy, allowing us to surpass the performance of proprietary closed-source counterparts. Where ODS's performance improvements do not come from the Open Search Tool, they come from the Open Reasoning Agent.
In the example below, from FRAMES, ODS realizes that a second search is necessary, searches again to find the birth year of the lead singer of King Crimson, and correctly answers the question with “1946”. On the other hand, Perplexity could not figure out the leader of the band King Crimson.

Conclusion
Open Deep Search represents more than just a technical achievement – it's a fundamental shift in the accessibility of cutting-edge search AI technology. By closing the gap between open and closed-source solutions, we are:
- Democratizing access to powerful search AI capabilities, enabling developers worldwide to build on and innovate with state-of-the-art technology
- Fostering transparency by making the inner workings of search AI systems visible and modifiable
- Encouraging innovation by lowering the barriers to entry for startups and independent developers
The repository for both versions of ODS is available at https://github.com/sentient-agi/OpenDeepSearch, and we invite the community to build upon, improve, and extend our work.
As new open-source reasoning LLMs are released, ODS provides a plug-and-play framework that will seamlessly integrate these advancements, ensuring that open-source search AI solutions remain competitive with – and even surpass – their proprietary counterparts.
The era of closed-source dominance in search AI is coming to an end. Open Deep Search is just the beginning of what's possible when we democratize access to powerful AI technologies and put them in the hands of the global developer community.
Try Open Deep Search today and join us in shaping the future of AI search!
References
- [1] OpenAI. https://github.com/openai/simple-evals, 2025.
- [2] xAI. https://x.ai/blog/grok-3, 2025.
- [3] PerplexityAI. https://www.perplexity.ai/hub/blog/introducing-the-sonar-pro-api, 2025.
- [4] Perplexity AI, Inc. Perplexity AI, 2024. https://www.perplexity.ai.
- [5] OpenAI. https://platform.openai.com/docs/models/gpt-4o-search-preview, 2025.
- [6] Will Bryk. https://exa.ai/blog/api-evals, 2025.
- [7] Philippe Mizrahi. https://www.linkup.so/blog/linkup-establishes-sota-performance-on-simpleqa, 2025.
- [8] PerplexityAI. https://www.perplexity.ai/hub/blog/introducing-perplexity-deep-research, 2025.
- [9] DeepSeek. https://www.deepseek.com, 2025.
- [10] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025.

