diff --git a/messages.json b/messages.json new file mode 100644 index 0000000..4cb773b --- /dev/null +++ b/messages.json @@ -0,0 +1,106 @@ +[ + { + "role": "user", + "content": "What does the Internet say about \"Jina AI blog latest post\"?" + }, + { + "role": "assistant", + "content": "Read the latest news and updates from Jina AI.; Best-in-class embeddings, rerankers, web crawler scraper, deepsearch, small LMs. The search AI for multilingual and multimodal data.; Jina AI's posts Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://jina.ai/news/model; Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://t.co/c1K4IAsnDi; What's the latest blog post from OpenAI? what is the idea behind node-deepresearch project? what exactly jina-; Our friends at Jina AI added native integration for Jina AI's embedding models and reranking products to the Elasticsearch open Inference API.; Blog Post. Whether you like this implementation or not, I highly recommend you read the DeepSearch/DeepResearch implementation guide I wrote, which; In our latest blog post: See how PromptPerfect can level up your LLM prompting and optimize your email and social media marketing campaigns.; Try our world-class embedding models to improve your search and RAG systems. Start with a free trial!; Jina.ai's Reader API is a powerful tool that simplifies web scraping and content extraction. Whether you're a developer, researcher, or AI enthusiast, it's" + }, + { + "role": "user", + "content": "What does the Internet say about \"Jina AI Elasticsearch integration\"?" + }, + { + "role": "assistant", + "content": "\nBetween 2024-05-09 and 2025-05-09\n\n\nExplore the Jina AI and Elasticsearch integration. Learn about Jina reranker, Jina clip, and embedding models integrated with Elasticsearch.; Our friends at Jina AI added native integration for Jina AI's embedding models and reranking products to the Elasticsearch open Inference API.; Elastic integrates Jina AI's embedding models into its Open Inference API, enhancing search and retrieval capabilities.; Developers using Elastic to build search and RAG applications can now use the latest Jina AI embedding and reranking models without additional integration or; This integration includes support for multilingual text embeddings and multilingual reranking, and is optimized for retrieval, clustering, and classification.; Elasticsearch backend provides the much-needed scalability for Jina search applications and can be used as a document store for DocumentArray.; Integration with Jina AI enhances Elasticsearch's capabilities for semantic search and RAG applications. Support for multilingual text embeddings and reranking; Developers using Elastic to build search and RAG applications can now use the latest Jina AI embedding and reranking models without additional integration; Semantica is designed to augment your existing symbolic search with minimal disruption. It works with widely-used platforms like Elasticsearch and Solr.; This collaboration enables developers to utilize Jina AI's advanced tools for building semantic search and retrieval-augmented generation (RAG) applications." + }, + { + "role": "user", + "content": "What does the Internet say about \"Jina AI deepsearch guide\"?" + }, + { + "role": "assistant", + "content": "\nBetween 2024-05-09 and 2025-05-09\n\n\nQPS out, depth in. DeepSearch is the new norm. Find answers through read-search-reason loops.
Learn what it is and how to build it.; QPS out, depth in. DeepSearch is the new norm. Find answers through read-search-reason loops. Learn what it is and how to build it.; DeepSearch runs through an iterative loop of searching, reading, and reasoning until it finds the optimal answer.; In this article, we will discuss the principles of DeepSearch and DeepResearch by studying our open-source implementation. ... DeepSearch serves as the foundational building block of DeepResearch. Through DeepSearch ...; The full guide at jina.ai has more details on system prompts, URL ranking, and web crawling that are worth checking out if you're building ...; DeepSearch API is free to use. By providing your API key, you can access a higher rate limit, and your key won't be charged. ...; Jina AI's node-DeepResearch is available at the following link: https://github.com/jina-ai/node-DeepResearch With ollama, you can use an LLM ...; If you are building a web/local/mobile client that uses the Jina DeepSearch API, here are some design guidelines: Our API is fully compatible with the OpenAI API ...; QPS out, depth in. DeepSearch is the new norm. Find answers through read-search-reason loops. Learn what it is and how to build it.; In this article, we will discuss the principles of DeepSearch and DeepResearch by studying our open-source implementation. ... DeepSearch serves as the basic building block of DeepResearch. Through DeepSearch ..." + }, + { + "role": "user", + "content": "What does the Internet say about \"Jina AI model soups blog\"?" + }, + { + "role": "assistant", + "content": "\nBetween 2024-05-09 and 2025-05-09\n\n\nBoost robustness and performance with model soups: averaging weights. No extra cost, better results.; Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://jina.ai/news/model; Best-in-class embeddings, rerankers, web crawler scraper, deepsearch, small LMs. The search AI for multilingual and multimodal data.; Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://jina.ai/news/model-sou; Boost robustness and performance with model soups: averaging weights. No extra cost, better results. https://lnkd.in/ezAjU7VS.; Pros: AI-Powered: Leverages advanced AI models to understand and extract data intelligently, even with complex website structures. Ease of Use:; The article discusses emerging trends and innovative tools in web scraping for 2024, highlighting startups like Mendable and technologies; Jina AI just released Reader-LM, a new set of small language models designed to convert raw HTML into clean markdown.; How AI is revolutionizing web scraping. Discover cost-effective tools, build a robust scraping dashboard, and solve common data extraction; It's a smarter, AI-powered framework that makes data extraction easier and more powerful. By combining neural search and intelligent workflows, Jina.ai is" + }, + { + "role": "user", + "content": "What does the Internet say about \"Jina AI embedding models trial\"?" + }, + { + "role": "assistant", + "content": "\nBetween 2024-05-09 and 2025-05-09\n\n\nEmbedding API. Try our world-class embedding models to improve your search and RAG systems. Start with a free trial!; Jina Embeddings v3 is a groundbreaking multilingual text embedding model that transforms how organizations handle text understanding and retrieval across; Jina Embeddings v2 Base English is a groundbreaking open-source text embedding model that solves the critical challenge of processing long documents while; You can get a free trial key from Jina Embeddings to get embeddings. Qdrant users can receive a 10% discount on Jina AI APIs by using the code QDRANT.
Technical; We're on a journey to advance and democratize artificial intelligence through open source and open science.; jina-embeddings-v2-base-es is a Spanish/English bilingual text embedding model supporting 8192 sequence length. It is based on a BERT; jina-embeddings-v3 is a frontier multilingual text embedding model with 570M parameters and 8192 token-length, outperforming the latest proprietary; Jina Embedding B v1 is a specialized text embedding model designed to transform English text into high-dimensional numerical representations while maintaining; Jina Embeddings v2 Base model is optimized for highly accurate embeddings - For speed of inference and memory efficiency use the Small model.; Jina Embeddings v2 Base model is optimized for highly accurate embeddings - For speed of inference and memory efficiency use the Small model. jina-embeddings-v2" + }, + { + "role": "user", + "content": "What does the Internet say about \"Jina AI reader API scraper\"?" + }, + { + "role": "assistant", + "content": "\nBetween 2024-05-09 and 2025-05-09\n\n\nReader can be used as a SERP API. It allows you to feed your LLM with the content behind the search engine results page. Simply prepend https://s.jina.ai/?q=; Reader API is available for free and offers flexible rate limit and pricing. Built on a scalable infrastructure, it offers high accessibility, concurrency, and; Reader API is available for free and offers flexible rate limit and pricing. Built on a scalable infrastructure, it offers high accessibility, concurrency, and; With Reader, http://s.jina.ai automatically fetches the content from the top 5 search result URLs for you (reusing the tech stack behind http://r.jina.ai).; How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai; Jina.ai's Reader API is a powerful tool that simplifies web scraping and content extraction. Whether you're a developer, researcher, or AI enthusiast, it's; Web scraping can be a complex and expensive process, but Jina AI simplifies it by allowing you to extract valuable content from websites using just a URL.; Best-in-class embeddings, rerankers, web crawler scraper, deepsearch, small LMs. The search AI for multilingual and multimodal data.; In this video, I dive into the Jina Reader API and demonstrate its powerful capabilities to enhance your language model-friendly inputs.; Jina AI's Reader module is designed to streamline the extraction and processing of unstructured data, making it an invaluable tool for AI; The article discusses emerging trends and innovative tools in web scraping for 2024, highlighting startups like Mendable and technologies" + }, + { + "role": "user", + "content": "What do experts say about \"jina ai latest blog post\"?" + }, + { + "role": "assistant", + "content": "\nhttps://twitter.com/JinaAI_/status/1770478491852521899\n\n\nConversation\n------------\n\nJina AI\n\n@JinaAI\\_\n\nIn our latest blog post: See how PromptPerfect can level up your LLM prompting and optimize your email and social media marketing campaigns.\n\nClick-Worthy Content with PromptPerfect: AI Marketing for Newsletters and Social Media\n\nFrom jina.ai\n\n3:52 PM · Mar 20, 2024
" + }, + { + "role": "user", + "content": "What do experts say about \"jina ai latest blog post\"?" + }, + { + "role": "assistant", + "content": "\n2025-05-07 20:23:04\n\n\n\nhttps://jina.ai/news/model-soups-recipe-for-embeddings\n\n\n\n\n------\n\n**Base Model**: For the experiments described here, we used `xlm-roberta-base` from FacebookAI (Conneau et al. 2020) as our pre-trained base model. This model has 280 million parameters and has been pre-trained on 2.5TB of Common Crawl data containing text in roughly 100 languages.\n\nWe fine-tuned `xlm-roberta-base` on our curated sentence pair training set for embeddings training, before performing our experiments.\n\n**Training Data**: Jina AI maintains custom-curated datasets for training. For the first experiment, we used sentence triplets specifically curated for contrastive training in six languages: English, Arabic, German, Spanish, Japanese, and Chinese. For the second experiment, we used task-specific training datasets in English.\n\n**Evaluation**: We used relevant parts of the MMTEB benchmark set (Enevoldsen et al. 2025) and the MIRACL benchmark (Zhang et al. 2023) to evaluate the models produced by our training and merging.\n\n### Experiment 1: Single-Run Averaging\n\nFor this experiment, we used contrastive sentence triplets in all six languages, mixed together, for a total of 6,000 training steps with a batch size of 1,024 items. At every 2,000 steps, we saved the model state for averaging, producing 3 models, each reflecting a different point in the training process.\n\nWe averaged the three models to produce a final model. We then tested the merged model and the three saved checkpoints against the MMTEB-STS and MIRACL benchmark sets.\n\nOur results are summarized in the table below:\n\n| Model | MIRACL (avg 6 languages) | MMTEB-STS English (avg 8 benchmarks) | MMTEB-STS Multilingual (avg 6 benchmarks) | Average of 20 benchmarks |\n| --- | --- | --- | --- | --- |\n| No triplet training | 0.3163 | 0.7859 | 0.7322 | 0.6276 |\n| Step 2000 | 0.4631 | **0.7924** | 0.7561 | 0.6813 |\n| Step 4000 | 0.4639 | 0.7902 | **0.7583** | 0.6812 |\n| Step 6000 (final) | **0.4680** | 0.7891 | 0.7575 | 0.6818 |\n| Merged model (all 3 stored checkpoints) | 0.4669 | 0.7910 | 0.7579 | **0.6823** |\n\nMerging with previous checkpoints did not generally produce a better-performing model than the best of the stored checkpoints, either on individual benchmarks or on any of the three batteries of benchmarks used. However, it did produce the best model on all benchmarks averaged together.\n\nIn individual benchmarks, the difference between the merged model and the best-performing checkpoint is in every case less than 0.01. This is true not only for the averages in the table above but for each individual test.\n\nThis demonstrates that merging different training checkpoints can produce a more robust model at very little performance cost.\n\nFurthermore, by merging the different checkpoints, we can effectively guard against overtraining. Overtraining has recently become an important topic in neural networks (Springer et al., 2025): a network can be trained in a way that makes it harder to fine-tune further and leaves it performing worse after that fine-tuning.\n\nSince the best-performing checkpoint in our experiment is often not the last one, we have likely overtrained our model at 6,000 training steps.
The merged model comes very close to matching the performance of the best checkpoint in every test, removing the defects of overtraining.\n\n### Experiment 2: Averaging Models Trained for Different Tasks\n\nFor this experiment, we trained three models, each for a different common embedding task:\n\n* **Semantic similarity**: Measuring the relative overlap or similarity in meaning between two texts, typically of comparable length.\n* **Document retrieval based on textual queries**: Finding the documents that best satisfy a query. Queries are generally much shorter texts than the documents they match.\n* **Question answering**: Finding the document that best answers a natural language question. Questions are also generally much shorter than the texts they match.\n\nTraining models for all three tasks at once is quite difficult because the goals are very dissimilar, and we hoped that model soups would improve the process.\n\nBased on previous experience, we knew that each task required a different number of training epochs. The training is summarized below:\n\n| Task | Training Steps (batch size = 1,024) | Training Dataset Size (in items) |\n| --- | --- | --- |\n| Question Answering (QA) | 2,000 | 256,000 |\n| Document Retrieval | 3,000 | 384,000 |\n| Semantic Similarity (STS) | 1,000 | 128,000 |\n\nThis produced three models, which we then merged into a single model. We tested the resulting model against the portions of the MMTEB benchmark set relevant to those three tasks: MIRACL, NanoBEIR, and STSEval (the English and Multilingual parts of MMTEB).\n\n| | MIRACL (avg 6 languages) | NanoBEIR (avg 13 benchmarks) | MMTEB-STS English (avg 9 benchmarks) | MMTEB-STS Multilingual (avg 6 benchmarks) | Average of 34 benchmarks |\n| --- | --- | --- | --- | --- | --- |\n| No triplet training | 0.3163 | 0.5089 | 0.7859 | 0.7322 | 0.5876 |\n| QA training | **0.4489** | 0.5332 | 0.7843 | 0.7535 | 0.6237 |\n| Retrieval training | 0.4272 | **0.5360** | 0.7766 | 0.7340 | 0.6154 |\n| STS training | 0.1779 | 0.4519 | **0.7994** | **0.7651** | 0.5508 |\n| Merged model | 0.4246 | 0.5309 | 0.7981 | 0.7640 | **0.6240** |\n\nWe see here that the task-specific trained models have the best performance on each task. MIRACL is primarily a question-answering benchmark, even if it's called a retrieval one, and the QA-trained model outperforms all others on it, including the merged model. NanoBEIR is a more conventional information retrieval benchmark set, and we see the retrieval-trained model is the top performer on it. The semantic similarity (STS) model scores quite poorly on those benchmarks, but beats the others on explicitly STS tasks. In each category, the merged model performs worse than the relevant single-task trained model.\n\nBut once again, if we average over all benchmarks, the merged model outperforms the others, although its score represents only a very small improvement over the QA-\n\n\n\n\n\nIn these difficult times, nothing beats a nice, warm bowl of soup.\n\nMinestrone is one of the classic Italian soups: thick, hearty, flavorful, combining beans, hearty vegetables, and rice or pasta. Its taste is a product of assembling diverse ingredients. It's a bit like borscht in Eastern Europe, casseroles in America, or a homemade stir-fry in Pacific Asia, in that it combines available, inexpensive ingredients into a beloved dish.\n\nWe can use much the same kind of recipe for neural network models, according to a line of papers starting with Wortsman et al.
(2022).\n\n“Model soups” (alas, not “model casseroles” or “model stir-fries”) are a class of model ensembling techniques designed to mitigate the cost of optimizing training data and model hyperparameters. When training a neural network, you typically try different data and hyperparameter values and train multiple times, looking for the best-performing result. Training is very computationally expensive, and costs add up quickly.\n\nModel soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time\n\nThe conventional recipe for maximizing model accuracy is to (1) train multiple models with various hyperparameters and (2) pick the individual model which performs best on a held-out validation set…\n\nPMLR\n\nInstead, model soups involve training multiple models with different hyperparameters and training data choices — the same as you usually would — but then combining them. The result is a higher-performing and more robust model than the single best performer. It doesn't save costs because you still train multiple models, but you can get a better result for the same price.\n\nThe model soup approach has already proven useful for text-image multimodal embedding models (Wortsman et al. 2022) and generative large language models (Takuya et al. 2025). At Jina AI, we've begun using this technique to train our own models, and jina-embeddings-v3 and ReaderLM-v2 both incorporate model soups.\n\nIn this article, we're going to look at model soups and show the results of some of our work with them. Specifically:\n\n1. Can we use model soups to improve performance by merging models at different points in their training?\n2. Can we merge models trained with different datasets and for different tasks to obtain better performance and higher training efficiency than by training a single model?\n\nThis has important potential benefits:\n\n* Model soups can have better and more robust performance.\n* Multilingual embedding models often suffer from biases and performance failures caused by unequal amounts of training data. It would be a boon to be able to train the best model we can on each task or dataset individually and then combine them equally.\n* We may be able to do better continuous learning and model updating by making changes to our models in a modular way, updating one component model at a time, and then remerging it with the others.\n\nHow Does it Work?\n-----------------\n\nMerging the outputs of multiple models is an old technique in statistical decision theory. For example, it's common practice in weather forecasting to create multiple models, often made by different people with different assumptions, and then use a variety of mechanisms to average their predictions. If each model's errors are independently distributed, then averaging the models will lead to answers with fewer errors.\n\nFor example, if you have three different models that each output a binary “yes” or “no” and each is wrong 10% of the time, a majority vote of the three will be wrong only 2.8% of the time (the probability that at least two of the three err at once: 3 × 0.1² × 0.9 + 0.1³ = 0.028). Five models, with a majority decision criterion, will only be wrong 0.856% of the time.\n\nAveraging models works on the same principle, but instead of combining the outputs of different models, it combines the models themselves.\n\nThe approach used is an extension of _stochastic weight averaging_ (Izmailov et al. 2018), which relies on insights into the loss landscapes of neural networks to show that simple weight averaging can improve model generalization performance under common conditions.\n\nThe actual mechanics of averaging the models are disturbingly simple: you just average the weights of multiple models.\n\nHow models are merged to make a model soup. This example is very small and simple, but still shows the procedure: sum the weights and divide by the number of models being merged.\n\n
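The merge itself is easy to express in code. The following is a minimal, hypothetical sketch of uniform souping as an element-wise mean over flattened weight vectors; it is not Jina AI's actual training code, and the checkpoint names in the usage comment are placeholders:\n\n```\n// Uniform model soup: element-wise mean of checkpoint weights.\n// Assumes every checkpoint shares one architecture, flattened to a Float32Array.\nfunction soupCheckpoints(checkpoints) {\n  const merged = new Float32Array(checkpoints[0].length);\n  for (const weights of checkpoints) {\n    for (let i = 0; i < weights.length; i++) {\n      merged[i] += weights[i] / checkpoints.length; // running mean across models\n    }\n  }\n  return merged;\n}\n\n// e.g. merging the three saved states from Experiment 1 (placeholder names):\n// const souped = soupCheckpoints([ckptStep2000, ckptStep4000, ckptStep6000]);\n```\n\n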
If this seems too easy, it's important to note that there are limitations when merging models this way. You can't just merge the weights of any two neural networks and expect it to work.\n\nModel averaging only works on very similar models, i.e., models whose weights are not very different from each other to begin with. The way to ensure this is to pre-train one model and then create multiple variants of that model by fine-tuning them with different hyperparameters or different data. These models will typically be similar enough to average.\n\nIn more technical terms, pre-training usually produces a model whose weights are near the bottom of a loss basin, and fine-tuning doesn't easily lead to escaping that loss basin. If all the models to be merged have weights in the same loss basin, then their weights will be fairly close to the same, and averaging them is likely to work. This is not guaranteed, but empirically, it seems to be true often enough to be useful." + }, + { + "role": "user", + "content": "What do experts say about \"jina ai latest blog post\"?" + }, + { + "role": "assistant", + "content": "\nhttps://businesswire.com/news/home/20250220781575/en/Elasticsearch-Open-Inference-API-now-Supports-Jina-AI-Embeddings-and-Rerank-Model\n\n\n_Developers using Elastic to build search and RAG applications can now use the latest Jina AI embedding and reranking models without additional integration or development costs_\n\nSAN FRANCISCO--(BUSINESS WIRE)--Elastic (NYSE: ESTC), the Search AI Company, announced the Elasticsearch Open Inference API now supports Jina AI's latest embedding models and reranking products. Developers building semantic search and RAG applications with the Elasticsearch vector database now benefit from Jina AI's high-performance, cost-effective tools for GenAI information retrieval and semantic applications. This integration includes support for multilingual text embeddings and multilingual reranking, and is optimized for retrieval, clustering, and classification.\n\n“Integrating Jina AI's embeddings and reranker models with the Elasticsearch Open Inference API brings enterprise-grade semantic search to production environments while strengthening our open-source communities,” said **Dr. Saahil Ognawala, head of product at Jina AI**.
“Combining Jina's resource-efficient and open-weight search foundation models with Elasticsearch's proven scalability enables developers to easily build reliable semantic search and RAG applications.”\n\n“Elastic is committed to providing open GenAI solutions that enable developers to build the next generation of search experiences,” said **Ajay Nair, general manager, Platform at Elastic**. “Our collaboration with Jina AI gives our users access to Jina's high-performance tools on a singular Elasticsearch platform alongside Elastic models like ELSER, providing a seamless, streamlined building experience designed to create first-class generative AI applications.”\n\nSupport for Jina AI is available today in Elastic Cloud Serverless. Supported Jina AI models include jina-embeddings-v3, jina-reranker-v2-base-multilingual, and more. For a full list of supported models, read the Elastic blog.\n\n**About Elastic**\n\nElastic (NYSE: ESTC), the Search AI Company, enables everyone to find the answers they need in real-time using all their data, at scale. Elastic's solutions for search, observability, and security are built on the Elastic Search AI Platform, the development platform used by thousands of companies, including more than 50% of the Fortune 500. Learn more at **elastic.co.**\n\n_Elastic and associated marks are trademarks or registered trademarks of Elastic N.V. and its subsidiaries. All other company and product names may be trademarks of their respective owners._" + }, + { + "role": "user", + "content": "What do experts say about \"jina ai latest blog post\"?" + }, + { + "role": "assistant", + "content": "\n2025-03-13 07:53:50\n\n\n\nhttps://jina.ai/news/a-practical-guide-to-implementing-deepsearch-deepresearch\n\n\n\n\nJina AI Deep Search\n\nAI deep search: read, reason, search until best answer found.\n\nIt's _only_ February, and DeepSearch has already emerged as the new search standard in 2025, with major players like Google and OpenAI leading the charge through their DeepResearch releases (and yes, we proudly launched our open-source `node-deepresearch` on the same day). Perplexity followed suit with their DeepResearch, and X AI integrated their own DeepSearch capabilities into Grok3, basically creating another DeepResearch variant. While the concept of deep search isn't revolutionary – in 2024 it was essentially termed RAG or multi-hop QA – it gained _significant_ momentum after Deepseek-r1's release in late January 2025. Last weekend, Baidu Search and Tencent WeChat Search integrated Deepseek-r1 into their search engines.
AI engineers have discovered that by incorporating long-thinking and reasoning processes into search systems, they can achieve remarkable retrieval accuracy and depth beyond what was previously possible.\n\n| Launch Date | Company | Product | License Type | Link |\n| --- | --- | --- | --- | --- |\n| 2024-12-17 | Google | DeepResearch | Proprietary | Google Gemini 2 |\n| 2025-01-20 | DeepSeek | DeepSeek-r1 release | Open source | DeepSeek-R1 |\n| 2025-02-02 | OpenAI | DeepResearch | Proprietary | Introducing Deep Research |\n| 2025-02-02 | Jina AI | DeepSearch (`node-deepresearch`) | Open source | node-deepresearch, search.jina.ai |\n| 2025-02-04 | Hugging Face | Open Deep Research | Open source | Open Deep Research |\n| 2025-02-15 | Perplexity | DeepResearch | Proprietary | Introducing Perplexity Deep Research |\n| 2025-02-17 | X AI | Grok3 with DeepSearch | Proprietary | Grok 3 Beta |\n| 2025-02-22 | Baidu Search | Integrates DeepSeek-r1 | Proprietary | Baidu Integrates DeepSeek-R1 |\n| 2025-02-23 | Tencent Wechat Search | Integrates DeepSeek-r1 | Proprietary | Tencent Weixin Integrates DeepSeek |\n\nBut why did this shift happen _now_, when Deep(Re)Search remained relatively undervalued throughout 2024? In fact, Stanford NLP Labs released the STORM project for long report generation with web grounding back in early 2024. So is it just because \"DeepSearch\" sounds way cooler than multi-hop QA, RAG, or STORM? Let's be honest - sometimes a rebrand is all it takes for the industry to suddenly embrace what was there all along.\n\nWe believe the real turning point came with OpenAI's `o1-preview` release in September 2024, which introduced the concept of **test-time compute** and gradually shifted industry perspectives. Test-time compute refers to using more computational resources during inference—the phase where an LLM generates outputs—rather than during pre-training or post-training. Well-known examples are Chain-of-Thought (CoT) reasoning and `\"Wait\"`-injection (i.e., budget forcing), which enable models to perform more extensive internal deliberations, such as evaluating multiple potential answers, conducting deeper planning, and engaging in self-reflection before arriving at a final response.\n\nThis test-time compute concept and reasoning models **_educate_** users to accept delayed gratification - longer waiting times in exchange for higher-quality, immediately actionable results, just like the Stanford marshmallow experiment, where children who resisted eating one marshmallow immediately in order to receive two later showed better long-term outcomes. Deepseek-r1 further reinforced this user experience, and like it or not, most users have accepted it.\n\nThis marks a significant departure from classic search requirements, where failing to respond within 200ms would doom your solution. In 2025, seasoned search developers and RAG engineers prioritize top-1 precision and recall over latency, and users have become accustomed to longer processing times – provided they can see the system is `<thinking>`.\n\nDisplaying the reasoning procedure has become standard practice in 2025, with numerous chat interfaces now rendering `<thinking>` content in dedicated UI sections.\n\nIn this article, we'll discuss the principles of DeepSearch and DeepResearch by looking into our open-source implementation. We'll walk through our key design decisions and highlight potential caveats.
Finally, you can find our hot take on DeepSearch engineering and dev-tool demands in the conclusion section.\n\nWhat is Deep Search?\n--------------------\n\n**DeepSearch runs through an iterative loop of searching, reading, and reasoning until it finds the optimal answer.** The search action leverages web search engines to explore the internet, while the reading action analyzes specific web pages in detail (e.g. Jina Reader). The reasoning action evaluates the current state and determines whether to break down the original question into smaller sub-questions or try different search strategies.\n\nDeepSearch - keep searching, reading webpages, reasoning until an answer is found (or the token budget is exceeded).\n\nWhile various definitions exist online, when we developed the `node-deepresearch` project, we adhered to this straightforward approach. The implementation is elegantly simple – at its core, there's a main while loop with switch-case logic directing the next action.\n\nUnlike 2024 RAG systems, which typically run a single search-generation pass, DeepSearch performs multiple iterations through the pipeline, requiring clear stop conditions. These could be based on token usage limits or the number of failed attempts.\n\nTry deep search at search.jina.ai, observe the content inside `<thinking>`, and see if you can tell where the loop happens.\n\nAnother perspective on DeepSearch is to view it as an LLM agent equipped with various web tools (such as searcher and reader). The agent determines its next steps by analyzing current observations and past actions – deciding whether to deliver an answer or continue exploring the web. This creates a state machine architecture where the LLM controls transitions\n\n\n\n\n\n… before returning to the parent question.\n\nConsider this example scenario where we start from the OG `what is Jina AI`:\n\nA simple 3-depth gap-question recursion, with the solving order labeled on the circles.\n\nIn the recursive approach, the system would have to fully resolve each gap question (potentially generating its own sub-questions) before returning to the parent, in big contrast to the queue approach, where Q1 gets revisited right after the 3 gap questions.\n\nIn reality, we found the recursive approach very hard to apply budget forcing to, since there is no clear rule of thumb for how much token budget to grant sub-questions (they may spawn new sub-questions of their own). The benefit of clear context separation in the recursive approach is very marginal compared to its complicated budget forcing and late-return problems. This FIFO queue design balances depth and breadth, ensuring the system always returns to the original question with progressively better knowledge, rather than getting lost in a potentially infinite recursive descent.\n\n
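As a toy illustration of this queue discipline (with hypothetical question labels; the real loop tracks far more state), here is how the 3-gap example above flows through the queue:\n\n```\n// Toy trace of the FIFO gaps queue for the example above.\nconst question = 'Q1: what is Jina AI';\nconst gaps = [];\n\n// A reflect step identifies three knowledge gaps:\ngaps.push('Q2', 'Q3', 'Q4'); // new gap questions are appended...\ngaps.push(question);         // ...and the original question is re-appended last\n\n// The main loop consumes breadth-first: Q2, Q3, Q4, then Q1 again,\n// now with whatever knowledge the gap questions produced along the way.\nwhile (gaps.length > 0) {\n  console.log(gaps.shift());\n}\n```\n\n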
### Query Rewrite\n\nAn interesting challenge we encountered was rewriting search queries effectively:\n\n```\n// Within search action handler\nif (thisStep.action === 'search') {\n  // Deduplicate search requests\n  const uniqueRequests = await dedupQueries(thisStep.searchRequests, existingQueries);\n  \n  // Rewrite natural language queries into more effective search queries\n  const optimizedQueries = await rewriteQuery(uniqueRequests);\n  \n  // Ensure we don't repeat previous searches\n  const newQueries = await dedupQueries(optimizedQueries, allKeywords);\n  \n  // Execute searches and store results\n  for (const query of newQueries) {\n    const results = await searchEngine(query);\n    if (results.length > 0) {\n      storeResults(results);\n      allKeywords.push(query);\n    }\n  }\n}\n```\n\nQuery rewriting turned out to be surprisingly important - perhaps one of the most critical elements that directly determines result quality. A good query rewriter doesn't just transform natural language to BM25-like keywords; it expands queries to cover more potential answers across different languages, tones, and content formats.\n\nFor query deduplication, we initially used an LLM-based solution, but found it difficult to control the similarity threshold. We eventually switched to jina-embeddings-v3, which excels at semantic textual similarity tasks. This enables cross-lingual deduplication without worrying that non-English queries would be filtered. The embedding model ended up being crucial not for memory retrieval as initially expected, but for efficient deduplication.\n\n
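In outline, embedding-based deduplication looks roughly like the sketch below. This is illustrative only: the real `dedupQueries` internals differ, and the `0.86` threshold, response handling, and environment-variable key name are assumptions, though the `https://api.jina.ai/v1/embeddings` endpoint and `jina-embeddings-v3` model name are the public Jina API.\n\n```\n// Hypothetical sketch of embedding-based query dedup, not the production code.\nasync function embed(text) {\n  const res = await fetch('https://api.jina.ai/v1/embeddings', {\n    method: 'POST',\n    headers: {\n      'Content-Type': 'application/json',\n      'Authorization': `Bearer ${process.env.JINA_API_KEY}`\n    },\n    body: JSON.stringify({ model: 'jina-embeddings-v3', input: [text] })\n  });\n  return (await res.json()).data[0].embedding;\n}\n\nconst cosine = (a, b) => {\n  let dot = 0, na = 0, nb = 0;\n  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }\n  return dot / (Math.sqrt(na) * Math.sqrt(nb));\n};\n\n// Keep only queries that are not near-duplicates of anything seen so far.\nasync function dedupByEmbedding(newQueries, existingQueries, threshold = 0.86) {\n  const seen = await Promise.all(existingQueries.map(embed));\n  const kept = [];\n  for (const q of newQueries) {\n    const v = await embed(q);\n    if (!seen.some(e => cosine(v, e) > threshold)) { kept.push(q); seen.push(v); }\n  }\n  return kept;\n}\n```\n\n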
### Crawling Web Content\n\nWeb scraping and content processing form another critical component. Here we use the Jina Reader API. Note that besides full webpage content, we also aggregate all snippets returned from the search engine as extra knowledge for the agent to draw on later. Think of them as soundbites.\n\n```\n// Visit action handler\nasync function handleVisitAction(URLs) {\n  // Normalize URLs and filter out already visited ones\n  const uniqueURLs = normalizeAndFilterURLs(URLs);\n  \n  // Process each URL in parallel\n  const results = await Promise.all(uniqueURLs.map(async url => {\n    try {\n      // Fetch and extract content\n      const content = await readUrl(url);\n      \n      // Store as knowledge\n      addToKnowledge(`What is in ${url}?`, content, [url], 'url');\n      \n      return {url, success: true};\n    } catch (error) {\n      return {url, success: false};\n    } finally {\n      visitedURLs.push(url);\n    }\n  }));\n  \n  // Update diary based on success or failure\n  updateDiaryWithVisitResults(results);\n}\n```\n\nWe normalize URLs for consistent tracking and limit the number of URLs visited in each step to manage agent memory.\n\n### Ranking URLs for Next Read\n\nDuring a DeepSearch session, you'll likely collect a lot of URLs from search engine results and discover even more every time you read individual webpages. The total count of unique URLs can easily reach the thousands. It's crucial to guide the LLM toward URLs that have the highest probability of containing the answer you need.\n\nWe rank URLs based on multiple factors to identify the most promising ones to visit next. The algorithm combines frequency, domain, path structure, and semantic relevance to create a composite score:\n\n1. **Frequency Signals**: URLs that appear multiple times across different sources receive additional weight. URLs from domains that appear frequently in search results receive a boost, as popular domains often contain authoritative content.\n2. **Path Structure**: We analyze URL paths to identify content clusters. URLs within common path hierarchies receive higher scores, with a decay factor applied to deeper paths.\n3. **Semantic Relevance**: For question-based searches, we use Jina Reranker to assess the semantic match between the question and the textual info of each URL, which is a classic reranking problem. The textual info of each URL comes from:\n    1. Title & snippets from SERP API results (`https://s.jina.ai/` with `'X-Respond-With': 'no-content'`)\n    2. Anchor text of on-page URLs (`https://r.jina.ai` with `'X-With-Links-Summary': 'all'`)\n\n```\n- Crawl and read full content from URLs, you can get the fulltext, last updated datetime etc of any URL.\n- Must check URLs mentioned in if any\n- Choose and visit relevant URLs below for more knowledge. higher weight suggests more relevant:\n\n + weight: 0.51 \"https://stackoverflow.com/questions/65179061/how-to-get-the-conversations-for-each-user-and-include-the-messages\": \"How to get the conversations for each user and include ...I am trying to get the conversations of each user to show them on my inbox screen in my front-end. I am able to post messages to the database, but when I try ...\"\n + weight: 0.49 \"https://stackoverflow.com/questions/20856416/how-to-get-the-latest-message-in-each-\n```\n\n\n\n\n\n…all sections into a single prompt to improve the overall narrative coherence.\n\nDeepSearch as the building block of DeepResearch. Iteratively construct each section via DeepSearch, then improve the overall coherence before generating the final long report.\n\nIn our 2024 \"Re-search\" project (unreleased), we performed multiple coherence-improvement passes, with each iteration taking into account all other sections. However, with today's significantly larger LLM context windows, this approach seems redundant – a single coherence revision pass is sufficient.\n\nOur 2024 summer project \"Re-search\" focused on long-report generation with a \"progressive\" approach. It began by creating a TOC in __sync__, then generated all sections in parallel __async__. The process concluded with __async__ progressive revisions of each section, with each revision taking into account the content of all other sections. The query in the video is `\"Competitor analysis of Jina AI\"`.\n\n💡\n\nWe didn't release our re-search project for several reasons. **First, the report quality never met our standards.** We kept testing two queries that we know inside out—`competitor analysis of Jina AI` and `product strategy of Jina AI`—and the reports were mediocre at best; we didn't find any \"aha\" moments in them. Second, search grounding was really bad; hallucinations were quite severe. Finally, the overall readability was poor, with significant redundancy between sections. In short: __useless.__ And since it produces a long report: __useless and time-wasting.__\n\nFrom this project, we learned several things that evolved into different sub-products. For example, we realized the importance of search grounding and fact-checking at the section or even sentence level, which led us to later develop the `g.jina.ai` endpoint. We recognized the importance of query expansion, which prompted us to invest effort in training an SLM for query expansion.
Lastly, we loved the name \"re-search\"—a clever play on reinventing search while nodding to research report generation—and felt it was too good to waste, so we repurposed it for our 2024 yearbook campaign.\n\nDeepSearch vs DeepResearch\n--------------------------\n\nWhile many people often mix DeepSearch and DeepResearch together, in our view they address completely different problems. DeepSearch functions as an atomic building block – a core component that DeepResearch builds upon. DeepResearch, on the other hand, **focuses on crafting high-quality, readable long-form research reports**, which encompasses a different set of requirements: incorporating effective visualizations via charts and tables, structuring content with appropriate section headings, ensuring smooth logical flow between subsections, maintaining consistent terminology throughout the document, eliminating redundancy across sections, and crafting smooth transitions that bridge previous and upcoming content. These elements are largely unrelated to the core search, which is why we find DeepSearch more interesting as our company focus.\n\nFinally, the table below summarizes the differences between DeepSearch and DeepResearch. It's worth noting that both systems benefit significantly from long-context and reasoning models. This might seem counterintuitive, particularly for DeepSearch: it's obvious why DeepResearch needs long-context capability (as it produces long reports), but the reason is that DeepSearch must store previous search attempts and webpage contents to make informed decisions about next steps, making a long context window equally essential for its effective implementation.\n\n| | DeepSearch | DeepResearch |\n| --- | --- | --- |\n| **Problem Addressed** | Information accuracy and completeness through iterative search | Content organization, coherence, and readability at document scale |\n| **Final Presentation** | Concise answer with URLs as references | A long structured report with multiple sections, charts, tables and references |\n| **Core Complexity** | State machine architecture with clear transition conditions; Persistence through failed attempts until resolution | Multi-level architecture managing both micro (search) and macro (document) concerns; Structural approach to managing complex information hierarchies |\n| **Optimization Focus** | Local optimization (best next search/read action) | Global optimization (section organization, terminology consistency, transitions) |\n| **Limitations** | Bounded by search quality and reasoning capability | Bounded by DeepSearch quality plus organizational complexity and narrative coherence challenges |\n\nUnderstand DeepSearch Implementation\n------------------------------------\n\nGitHub - jina-ai/node-DeepResearch: Keep searching, reading webpages, reasoning until it finds the answer (or exceeds the token budget)\n\nThe heart of DeepSearch lies in its loop reasoning approach. Rather than attempting to answer questions in a single pass like most RAG systems, we have implemented an iterative loop that continuously searches for information, reads relevant sources, and reasons until it finds an answer or exhausts the token budget.
Here's the simplified core of this big while loop:\n\n```\n// Main reasoning loop\nwhile (tokenUsage < tokenBudget && badAttempts <= maxBadAttempts) {\n  // Track progression\n  step++; totalStep++;\n  \n  // Get current question from gaps queue or use original question\n  const currentQuestion = gaps.length > 0 ? gaps.shift() : question;\n  \n  // Generate prompt with current context and allowed actions\n  system = getPrompt(diaryContext, allQuestions, allKeywords,\n    allowReflect, allowAnswer, allowRead, allowSearch, allowCoding,\n    badContext, allKnowledge, unvisitedURLs);\n  \n  // Get LLM to decide next action\n  const result = await LLM.generateStructuredResponse(system, messages, schema);\n```\n\n\n\n\n\n```\nconversation-of-a-certain-user-in-sql/20856949\": \"How to get the latest message in each conversation of a certain user ...But what if I want to list all the latests messages for ALL the conversations of a specific user? That is each row represents a different ...\"\n + weight: 0.48 \"https://stackoverflow.com/questions/78180457/how-to-get-all-the-messages-that-the-current-user-is-sent\": \"How to get all the messages that the current user is sentYou are searching in the chat_rooms collection and not in the messages collection group.\"\n + weight: 0.47 \"https://stackoverflow.com/questions/48130331/unable-to-retrieving-messages-for-firebase-user\": \"Unable to retrieving messages for firebase userThe messages are keyed with push IDs and have user ID properties. You'll need to use queries to get the messages from/to a particular user.\"\n + weight: 0.47 \"https://stackoverflow.com/questions/73587047/retrieving-messages-in-chat-application\": \"Retrieving messages in chat application - apiI am working on an application for chatting. I wondering how to design the API for getting messages from a server.\"\n + weight: 0.46 \"https://stackoverflow.com/questions/72732503/fetch-2-last-messages-from-a-specific-user-discord-js\": \"Fetch 2 last messages from a specific user discord.jsThe correct way to do this is to store a timestamp of when someone sent a message, and when they send another message, compare it to your stored value.\"\n + weight: 0.46 \"https://stackoverflow.com/questions/77443709/how-we-can-store-the-user-utterance-in-a-parameter-without-using-sys-any-within\": \"How we can store the User utterance in a parameter without using ...You can capture and store the entire user utterance without using the @sys.any entity by creating a custom parameter and configuring a ...\"\n...\n```\n\nIn our implementation, we weight each URL by its hostname frequency and its relevance to the question, using jina-reranker-v2-base-multilingual.\n\n
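As a rough sketch of how such a composite score might be combined (the coefficients and helper inputs below are invented for illustration, not the actual node-DeepResearch weighting), the three signals reduce to a single number per URL:\n\n```\n// Illustrative composite URL score; coefficients are arbitrary placeholders.\n// hostnameFreq: map of hostname -> occurrence count across collected sources.\n// rerankerScore: semantic match of the URL's textual info to the question,\n// e.g. from jina-reranker-v2-base-multilingual.\nfunction scoreUrl(url, hostnameFreq, rerankerScore) {\n  const { hostname, pathname } = new URL(url);\n  // frequency signal: domains seen often across sources get a boost\n  const freqScore = Math.log1p(hostnameFreq[hostname] || 0);\n  // path structure: decay factor applied to deeper paths\n  const depth = pathname.split('/').filter(Boolean).length;\n  const pathScore = Math.pow(0.8, depth);\n  // semantic relevance dominates the mix\n  return 0.3 * freqScore + 0.2 * pathScore + 0.5 * rerankerScore;\n}\n```\n\n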
### Memory Management\n\nA key challenge in multi-step reasoning is managing the agent memory effectively. We've designed the memory system to differentiate between what counts as \"memory\" versus what counts as \"knowledge\". Either way, they are all part of the LLM prompt context, separated with different XML tags:\n\n```\n// Add knowledge item to accumulated knowledge\nfunction addToKnowledge(question, answer, references, type) {\n  allKnowledge.push({\n    question: question,\n    answer: answer,\n    references: references,\n    type: type, // 'qa', 'url', 'coding', 'side-info'\n    updated: new Date().toISOString()\n  });\n}\n\n// Record step in narrative diary\nfunction addToDiary(step, action, question, result, evaluation) {\n  diaryContext.push(`\nAt step ${step}, you took **${action}** action for question: \"${question}\"\n[Details of what was done and results]\n[Evaluation if applicable]\n`);\n}\n```\n\nSince most 2025 LLMs have substantial context windows, we opted not to use vector databases. Instead, memory consists of acquired knowledge, visited sites, and records of failed attempts - all maintained in the context. This comprehensive memory system gives the agent awareness of what it knows, what it's tried, and what's worked or failed.\n\n### Answer Evaluation\n\nOne key insight is that answer generation and evaluation should not be in the same prompt. In our implementation, we first determine which evaluation criteria to use when a new question arrives, and then evaluate each criterion one by one. The evaluator uses few-shot examples for consistent assessment, ensuring higher reliability than self-evaluation.\n\n```\n// Separate evaluation phase\nasync function evaluateAnswer(question, answer, context) {\n  // First, identify evaluation criteria based on question type\n  const evaluationCriteria = await determineEvaluationCriteria(question);\n  \n  // Then evaluate each criterion separately\n  const results = [];\n  for (const criterion of evaluationCriteria) {\n    const result = await evaluateSingleCriterion(criterion, question, answer, context);\n    results.push(result);\n  }\n  \n  // Determine if answer passes overall evaluation\n  return {\n    pass: results.every(r => r.pass),\n    think: results.map(r => r.reasoning).join('\\n')\n  };\n}\n```\n\n### Budget-Forcing\n\nBudget forcing means preventing the system from returning early and ensuring it continues processing until the budget is exhausted. Since the release of DeepSeek-R1, the approach to budget forcing has shifted toward **encouraging deeper thinking for better results rather than simply conserving the budget.**\n\nIn our implementation, we explicitly configured the system to identify knowledge gaps before attempting to answer.\n\n```\nif (thisStep.action === 'reflect' && thisStep.questionsToAnswer) {\n  // Force deeper reasoning by adding the identified sub-questions to the queue\n  gaps.push(...thisStep.questionsToAnswer);\n  gaps.push(question); // Always revisit the original\n}\n```\n\nBy selectively enabling and disabling certain actions, we can guide the system toward using tools that enhance reasoning depth.\n\n```\n// After a failed answer attempt\nallowAnswer = false; // Force agent to search or reflect instead\n```\n\nTo avoid wasting tokens on unproductive paths, we set limits on the number of failed attempts.
When approaching budget limits, we activate \"beast mode\" to guarantee that we deliver some answer rather than none.\n\n```\n// Beast mode activation\nif (!thisStep.isFinal && badAttempts >= maxBadAttempts) {\n  console.log('Enter Beast mode!!!');\n  \n  // Configure prompt for a decisive, committed answer\n  system = getPrompt(\n    diaryContext, allQuestions, allKeywords,\n    false, false, false, false, false, // Disable all other actions\n    badContext, allKnowledge, unvisitedURLs,\n    true // Enable beast mode\n  );\n  \n  // Force answer generation\n  const result = await LLM.generateStructuredResponse(system, messages, answerOnlySchema);\n  thisStep = result.object;\n  thisStep.isFinal = true;\n}\n```\n\nThe beast mode prompt is intentionally dramatic to signal to" + } +]
A well-known example are Chain-of-Thought (CoT) reasoning and `\"Wait\"`\\-injection (i.e. budget forcing) which enables models to perform more extensive internal deliberations, such as evaluating multiple potential answers, conducting deeper planning, and engaging in self-reflection before arriving at a final response.\n\nThis test-time compute concept and reasoning models **_educate_** users to accept delayed gratification - longer waiting times in exchange for higher-quality, immediately actionable results, just like the Stanford marshmallow experiment where children who could resist eating one marshmallow immediately in order to receive two marshmallows later showed better long-term outcomes. Deepseek-r1 further reinforced this user experience, and like it or not, most users have accepted it.\n\nThis marks a significant departure from classic search requirements, where failing to response within 200ms would doom your solution. In 2025, seasoned search developers and RAG engineers prioritize top-1 precision and recall over latency, and users have become accustomed to longer processing times – provided they can see the system is ``.\n\n0:00\n\n/0:18\n\nDisplaying the reasoning procedure has become standard practice in 2025, with numerous chat interfaces now rendering `` content in dedicated UI sections.\n\nIn this article, we'll discuss the principles of DeepSearch and DeepResearch by looking into our open-source implementation. We'll walk through our key design decisions and highlight potential caveats. Finally, you can find our hot-take on deepsearch engineering and dev tools demands in the conclusion section.\n\nWhat is Deep Search?\n--------------------\n\n**DeepSearch runs through an iterative loop of searching, reading, and reasoning until it finds the optimal answer.** The search action leverages web search engines to explore the internet, while the reading action analyzes specific web pages in detail (e.g. Jina Reader). The reasoning action evaluates the current state and determines whether to break down the original question into smaller sub-questions or try different search strategies.\n\nDeepSearch - keep searching, reading webpages, reasoning until an answer is found (or the token budget is exceeded).\n\nWhile various definitions exist online, when we developed the `node-deepresearch` project, we adhered to this straightforward approach. The implementation is elegantly simple – at its core, there's a main while loop with switch-case logic directing the next action.\n\nUnlike 2024 RAG systems, which typically run a single search-generation pass. DeepSearch, in contrast, performs multiple iterations through the pipeline, requiring clear stop conditions. These could be based on token usage limits or the number of failed attempts.\n\n0:00\n\n/0:36\n\nTry deep search at search.jina.ai, observe the content inside ``, see if you can tell where the loop happens\n\nAnother perspective on DeepSearch is to view it as an LLM agent equipped with various web tools (such as searcher and reader). The agent determines its next steps by analyzing current observations and past actions – deciding whether to deliver an answer or continue exploring the web. This creates a state machine architecture where the LLM controls transiti\n\n" + }, + { + "role": "user", + "content": "What do expert say about \"jina ai latest blog post\"?" 
+ }, + { + "role": "assistant", + "content": "\n2025-03-13 07:53:50\n\n\n\nhttps://jinaai.cn/news/a-practical-guide-to-implementing-deepsearch-deepresearch\n\n\n\n\nJina AI Deep Search\n\nAI deep search: read, reason, search until best answer found.\n\nIt's _only_ February, and DeepSearch has already emerged as the new search standard in 2025, with major players like Google and OpenAI leading the charge through their DeepResearch releases (and yes, we proudly launched our open-source `node-deepresearch` on the same day). Perplexity followed suit with their DeepResearch, and X AI integrated their own DeepSearch capabilities into Grok3, basically creating another DeepResearch variant. While the concept of deep search isn't revolutionary – in 2024 it was essentially termed as RAG or multi-hop QA – it gained _significant_ momentum after Deepseek-r1's release in late January 2025. Last weekend, Baidu Search and Tencent WeChat Search have integrated Deepseek-r1 in their search engines. AI engineers have discovered that by incorporating long-thinking and reasoning processes into search systems, they can achieve remarkable retrieval accuracy and depth beyond what was previously possible.\n\n| Launch Date | Company | Product | License Type | Link |\n| --- | --- | --- | --- | --- |\n| 2025-01-20 | DeepSeek | DeepSeek-r1 release | Open source | DeepSeek-R1 |\n| 2025-12-17 | Google | DeepResearch | Proprietary | Google Gemini 2 |\n| 2025-02-02 | OpenAI | DeepResearch | Proprietary | Introducing Deep Research |\n| 2025-02-02 | Jina AI | DeepSearch (`node-deepresearch`) | Open source | node-deepresearch | search.jina.ai |\n| 2025-02-04 | Hugging Face | Open Deep Research | Open source | Open Deep Research |\n| 2025-02-15 | Perplexity | DeepResearch | Proprietary | Introducing Perplexity Deep Research |\n| 2025-02-17 | X AI | Grok3 with DeepSearch | Proprietary | Grok 3 Beta |\n| 2025-02-22 | Baidu Search | Integrates DeepSeek-r1 | Proprietary | Baidu Integrates DeepSeek-R1 |\n| 2025-02-23 | Tencent Wechat Search | Integrates DeepSeek-r1 | Proprietary | Tencent Weixin Integrates DeepSeek |\n\nBut why did this shift happen _now_, when Deep(Re)Search remained relatively undervalued throughout 2024? In fact, Stanford NLP Labs released the STORM project for long report generation with web grounding back in early 2024. So is it just because \"DeepSearch\" sounds way cooler than multi-hop QA, RAG, or STORM? Let's be honest - sometimes a rebrand is all it takes for the industry to suddenly embrace what was there all along.\n\nWe believe the real turning point came with OpenAI's `o1-preview` release in September 2024, which introduced the concept of **test-time compute** and gradually shifted industry perspectives. Test-time compute refers to using more computational resources during inference—the phase where an LLM generates outputs—rather than during pre-training or post-training. A well-known example are Chain-of-Thought (CoT) reasoning and `\"Wait\"`\\-injection (i.e. 
This test-time compute concept and reasoning models **_educate_** users to accept delayed gratification - longer waiting times in exchange for higher-quality, immediately actionable results, just like the Stanford marshmallow experiment where children who resisted eating one marshmallow immediately in order to receive two later showed better long-term outcomes. Deepseek-r1 further reinforced this user experience, and like it or not, most users have accepted it.\n\nThis marks a significant departure from classic search requirements, where failing to respond within 200ms would doom your solution. In 2025, seasoned search developers and RAG engineers prioritize top-1 precision and recall over latency, and users have become accustomed to longer processing times – provided they can see the system is `<thinking>`.\n\nDisplaying the reasoning procedure has become standard practice in 2025, with numerous chat interfaces now rendering `<think>` content in dedicated UI sections.\n\nIn this article, we'll discuss the principles of DeepSearch and DeepResearch by looking into our open-source implementation. We'll walk through our key design decisions and highlight potential caveats. Finally, you can find our hot take on deepsearch engineering and dev-tool demands in the conclusion section.\n\nWhat is Deep Search?\n--------------------\n\n**DeepSearch runs through an iterative loop of searching, reading, and reasoning until it finds the optimal answer.** The search action leverages web search engines to explore the internet, while the reading action analyzes specific web pages in detail (e.g. Jina Reader). The reasoning action evaluates the current state and determines whether to break down the original question into smaller sub-questions or try different search strategies.\n\nDeepSearch - keep searching, reading webpages, reasoning until an answer is found (or the token budget is exceeded).\n\nWhile various definitions exist online, when we developed the `node-deepresearch` project, we adhered to this straightforward approach. The implementation is elegantly simple – at its core, there's a main while loop with switch-case logic directing the next action.\n\nUnlike 2024 RAG systems, which typically run a single search-generation pass, DeepSearch performs multiple iterations through the pipeline and therefore requires clear stop conditions, such as a token usage limit or a maximum number of failed attempts.\n\nTry deep search at search.jina.ai, observe the content inside `<think>`, and see if you can tell where the loop happens.\n\nAnother perspective on DeepSearch is to view it as an LLM agent equipped with various web tools (such as a searcher and a reader). The agent determines its next steps by analyzing current observations and past actions – deciding whether to deliver an answer or continue exploring the web. This creates a state machine architecture where the LLM controls transitions between states.\n\n
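One way to picture those states is the schema of actions the LLM may return at each step. The sketch below is an assumed shape based on fields that appear in the snippets later in this post (`action`, `searchRequests`, `questionsToAnswer`, `isFinal`); the real schema lives in the `node-deepresearch` repo:\n\n```\n// An illustrative action space for the agent - a discriminated union over\n// the next step. Field names follow the snippets in this post; the exact\n// schema is assumed, not copied from the source.\nconst exampleActions = [\n { action: 'search', searchRequests: ['...'] }, // query the web\n { action: 'visit', URLTargets: ['https://...'] }, // read specific pages\n { action: 'reflect', questionsToAnswer: ['...'] }, // spawn sub-questions\n { action: 'answer', answer: '...', references: ['...'], isFinal: true }\n];\n```\n\n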
Here's how we handle the search action:\n\n```\n// Within search action handler\nif (thisStep.action === 'search') {\n // Deduplicate search requests\n const uniqueRequests = await dedupQueries(thisStep.searchRequests, existingQueries);\n \n // Rewrite natural language queries into more effective search queries\n const optimizedQueries = await rewriteQuery(uniqueRequests);\n \n // Ensure we don't repeat previous searches\n const newQueries = await dedupQueries(optimizedQueries, allKeywords);\n \n // Execute searches and store results\n for (const query of newQueries) {\n const results = await searchEngine(query);\n if (results.length > 0) {\n storeResults(results);\n allKeywords.push(query);\n }\n }\n}\n```\n\nQuery rewriting turned out to be surprisingly important - perhaps one of the most critical elements that directly determines result quality. A good query rewriter doesn't just transform natural language into BM25-like keywords; it expands queries to cover more potential answers across different languages, tones, and content formats.\n\nFor query deduplication, we initially used an LLM-based solution, but found it difficult to control the similarity threshold. We eventually switched to jina-embeddings-v3, which excels at semantic textual similarity tasks. This enables cross-lingual deduplication without worrying that non-English queries would be filtered out. The embedding model ended up being crucial not for memory retrieval as initially expected, but for efficient deduplication.\n\n
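Since the snippet above leans on `dedupQueries`, here's a minimal sketch of how an embedding-based version can work with jina-embeddings-v3. The endpoint and `task` value follow Jina's public Embedding API, but the similarity threshold and the overall wiring are our assumptions for illustration:\n\n```\n// Embedding-based query dedup, sketched. Threshold is hypothetical.\nconst SIM_THRESHOLD = 0.86; // tune on your own data\n\nasync function embed(texts) {\n const res = await fetch('https://api.jina.ai/v1/embeddings', {\n method: 'POST',\n headers: {\n 'Content-Type': 'application/json',\n 'Authorization': `Bearer ${process.env.JINA_API_KEY}`\n },\n body: JSON.stringify({ model: 'jina-embeddings-v3', task: 'text-matching', input: texts })\n });\n const { data } = await res.json();\n return data.map(d => d.embedding);\n}\n\nfunction cosine(a, b) {\n let dot = 0, na = 0, nb = 0;\n for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }\n return dot / (Math.sqrt(na) * Math.sqrt(nb));\n}\n\nasync function dedupQueries(newQueries, existingQueries) {\n const vectors = await embed([...existingQueries, ...newQueries]);\n const seen = vectors.slice(0, existingQueries.length);\n const kept = [];\n newQueries.forEach((query, i) => {\n const v = vectors[existingQueries.length + i];\n if (!seen.some(s => cosine(v, s) > SIM_THRESHOLD)) {\n kept.push(query);\n seen.push(v); // also dedupe new queries against each other\n }\n });\n return kept;\n}\n```\n\nBecause the embeddings are multilingual, near-duplicate queries in different languages collapse into one, which is exactly the cross-lingual behavior described above.\n\n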
### Crawling Web Content\n\nWeb scraping and content processing is another critical component. Here we use the Jina Reader API. Note that besides full webpage content, we also aggregate all snippets returned from the search engine as extra knowledge for the agent to later conclude on. Think of them as soundbites.\n\n```\n// Visit action handler\nasync function handleVisitAction(URLs) {\n // Normalize URLs and filter out already visited ones\n const uniqueURLs = normalizeAndFilterURLs(URLs);\n \n // Process each URL in parallel\n const results = await Promise.all(uniqueURLs.map(async url => {\n try {\n // Fetch and extract content\n const content = await readUrl(url);\n \n // Store as knowledge\n addToKnowledge(`What is in ${url}?`, content, [url], 'url');\n \n return {url, success: true};\n } catch (error) {\n return {url, success: false};\n } finally {\n visitedURLs.push(url);\n }\n }));\n \n // Update diary based on success or failure\n updateDiaryWithVisitResults(results);\n}\n```\n\nWe normalize URLs for consistent tracking and limit the number of URLs visited in each step to manage agent memory.\n\n### Ranking URLs for Next Read\n\nDuring a DeepSearch session, you'll likely collect a lot of URLs from search engine results and discover even more every time you read individual webpages. The total count of unique URLs can easily reach the thousands. It's crucial to guide the LLM toward URLs that have the highest probability of containing the answer you need.\n\nWe rank URLs based on multiple factors to identify the most promising ones to visit next. The algorithm combines frequency, domain, path structure, and semantic relevance to create a composite score:\n\n1. **Frequency Signals**: URLs that appear multiple times across different sources receive additional weight. URLs from domains that appear frequently in search results receive a boost, as popular domains often contain authoritative content.\n2. **Path Structure**: We analyze URL paths to identify content clusters. URLs within common path hierarchies receive higher scores, with a decay factor applied to deeper paths.\n3. **Semantic Relevance**: For question-based searches, we use Jina Reranker to assess the semantic match between the question and the textual info of each URL, which is a classic reranking problem. The textual info of each URL comes from:\n 1. Title & snippets from SERP API results (`https://s.jina.ai/` with `'X-Respond-With': 'no-content'`)\n 2. Anchor text of on-page URLs (`https://r.jina.ai` with `'X-With-Links-Summary': 'all'`)\n\n```\n\n- Crawl and read full content from URLs, you can get the fulltext, last updated datetime etc of any URL. \n- Must check URLs mentioned in if any\n- Choose and visit relevant URLs below for more knowledge. higher weight suggests more relevant:\n\n + weight: 0.51 \"https://stackoverflow.com/questions/65179061/how-to-get-the-conversations-for-each-user-and-include-the-messages\": \"How to get the conversations for each user and include ...I am trying to get the conversations of each user to show them on my inbox screen in my front-end. I am able to post messages to the database, but when I try ...\"\n + weight: 0.49 \"https://stackoverflow.com/questions/20856416/how-to-get-the-latest-message-in-each-conversation-of-a-certain-user-in-sql/20856949\": \"How to get the latest message in each conversation of a certain user ...But what if I want to list all the latests messages for ALL the conversations of a specific user? That is each row represents a different ...\"\n + weight: 0.48 \"https://stackoverflow.com/questions/78180457/how-to-get-all-the-messages-that-the-current-user-is-sent\": \"How to get all the messages that the current user is sentYou are searching in the chat_rooms collection and not in the messages collection group.\"\n + weight: 0.47 \"https://stackoverflow.com/questions/48130331/unable-to-retrieving-messages-for-firebase-user\": \"Unable to retrieving messages for firebase userThe messages are keyed with push IDs and have user ID properties. You'll need to use queries to get the messages from/to a particular user.\"\n + weight: 0.47 \"https://stackoverflow.com/questions/73587047/retrieving-messages-in-chat-application\": \"Retrieving messages in chat application - apiI am working on an application for chatting. I wondering how to design the API for getting messages from a server.\"\n```\n\n\n\n\n\nall sections into a single prompt to improve the overall narrative coherence.\n\nDeepSearch as the building block of DeepResearch: iteratively construct each section via DeepSearch, then improve the overall coherence before generating the final long report.\n\nIn our 2024 \"Re-search\" project (unreleased), we performed multiple coherence-improvement passes, with each iteration taking into account all other sections. However, with today's significantly larger LLM context windows, this approach seems redundant – a single coherence-revision pass is sufficient.\n\nOur 2024 summer project \"Re-search\" focused on long-report generation with a \"progressive\" approach. It began by creating a TOC in __sync__, then generated all sections in parallel __async__. The process concluded with __async__ progressive revisions of each section, with each revision taking into account the content of all other sections. The query in the video is `\"Competitor analysis of Jina AI\"`.\n\n
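That progressive pipeline is easy to sketch. The helper names below (`generateTOC`, `deepSearch`, `reviseForCoherence`) are invented for illustration; only the overall shape - sync TOC first, parallel per-section DeepSearch, then one final coherence pass - comes from the description above:\n\n```\n// DeepResearch, sketched: plan sections, research each one with DeepSearch\n// in parallel, then one coherence-revision pass over the whole draft in a\n// single long-context prompt. All helpers are assumed.\nasync function deepResearch(topic) {\n const toc = await generateTOC(topic); // sync: table of contents first\n const sections = await Promise.all( // async: one DeepSearch per section\n toc.map(heading => deepSearch(`${topic}: ${heading}`))\n );\n // A single revision pass is enough with today's long context windows\n return reviseForCoherence(topic, toc, sections);\n}\n```\n\n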
💡\n\nWe didn't release our re-search project for several reasons. **First, the report quality never met our standards.** We kept testing two queries that we know inside out—`competitor analysis of Jina AI` and `product strategy of Jina AI`—and the reports were mediocre at best; we didn't find any \"aha\" moments in them. Second, search grounding was really bad, and hallucinations were quite severe. Finally, the overall readability was poor, with significant redundancy between sections. Put shortly: __useless.__ And since it produces a long report: __useless and time-wasting.__\n\nFrom this project, we learned several things that evolved into different sub-products. For example, we realized the importance of search grounding and fact-checking at the section or even sentence level, which led us to later develop the `g.jina.ai` endpoint. We recognized the importance of query expansion, which prompted us to invest effort in training an SLM for query expansion. Lastly, we loved the name \"re-search\"—a clever play on reinventing search while nodding to research report generation—and felt it was too good to waste, so we repurposed it for our 2024 yearbook campaign.\n\nDeepSearch vs DeepResearch\n--------------------------\n\nWhile many people often conflate DeepSearch and DeepResearch, in our view they address completely different problems. DeepSearch functions as an atomic building block – a core component that DeepResearch builds upon. DeepResearch, on the other hand, **focuses on crafting high-quality, readable long-form research reports**, which encompasses a different set of requirements: incorporating effective visualizations via charts and tables; structuring content with appropriate section headings; ensuring smooth logical flow between subsections; maintaining consistent terminology throughout the document; eliminating redundancy across sections; and crafting smooth transitions that bridge previous and upcoming content. These elements are largely unrelated to the core search, which is why we find DeepSearch more interesting as our company focus.\n\nFinally, the table below summarizes the differences between DeepSearch and DeepResearch. It's worth noting that both systems benefit significantly from long-context and reasoning models. This might seem counterintuitive, particularly for DeepSearch; it's obvious why DeepResearch needs long-context capability (as it produces long reports). 
The reason is that DeepSearch must store previous search attempts and webpage contents to make informed decisions about next steps, making a long context window equally essential for its effective implementation.\n\n| | DeepSearch | DeepResearch |\n| --- | --- | --- |\n| **Problem Addressed** | Information accuracy and completeness through iterative search | Content organization, coherence, and readability at document scale |\n| **Final Presentation** | Concise answer with URLs as references | A long structured report with multiple sections, charts, tables and references |\n| **Core Complexity** | State machine architecture with clear transition conditions; persistence through failed attempts until resolution | Multi-level architecture managing both micro (search) and macro (document) concerns; structural approach to managing complex information hierarchies |\n| **Optimization Focus** | Local optimization (best next search/read action) | Global optimization (section organization, terminology consistency, transitions) |\n| **Limitations** | Bounded by search quality and reasoning capability | Bounded by DeepSearch quality plus organizational complexity and narrative coherence challenges |\n\nUnderstand DeepSearch Implementation\n------------------------------------\n\nGitHub - jina-ai/node-DeepResearch: Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget)\n\nThe heart of DeepSearch lies in its loop reasoning approach. Rather than attempting to answer questions in a single pass like most RAG systems, we have implemented an iterative loop that continuously searches for information, reads relevant sources, and reasons until it finds an answer or exhausts the token budget. Here's the simplified core of this big while loop:\n\n```\n// Main reasoning loop\nwhile (tokenUsage < tokenBudget && badAttempts <= maxBadAttempts) {\n // Track progression\n step++; totalStep++;\n \n // Get current question from gaps queue or use original question\n const currentQuestion = gaps.length > 0 ? gaps.shift() : question;\n \n // Generate prompt with current context and allowed actions\n system = getPrompt(diaryContext, allQuestions, allKeywords, \n allowReflect, allowAnswer, allowRead, allowSearch, allowCoding,\n badContext, allKnowledge, unvisitedURLs);\n \n // Get LLM to decide next action\n const result = await LLM.generateStructuredResponse(system, messages, schema);\n thisStep = result.object;\n \n // Switch-case logic directing the next action, e.g. the search and\n // visit handlers shown earlier\n}\n```\n\n\n\n\n\n```\n + weight: 0.46 \"https://stackoverflow.com/questions/72732503/fetch-2-last-messages-from-a-specific-user-discord-js\": \"Fetch 2 last messages from a specific user discord.jsThe correct way to do this is to store a timestamp of when someone sent a message, and when they send another message, compare it to your stored value.\"\n + weight: 0.46 \"https://stackoverflow.com/questions/77443709/how-we-can-store-the-user-utterance-in-a-parameter-without-using-sys-any-within\": \"How we can store the User utterance in a parameter without using ...You can capture and store the entire user utterance without using the @sys.any entity by creating a custom parameter and configuring a ...\"\n...\n\n\n```\n\nIn our implementation, we weight each URL by its hostname frequency and its relevance w.r.t. the question using jina-reranker-v2-base-multilingual.\n\n
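To connect this with the dumped ranking data: each entry in `urls.json` carries `freqBoost`, `hostnameBoost`, `pathBoost`, and `jinaRerankBoost`, and its `finalScore` is simply their sum. The sketch below reproduces that final sum; how each individual boost is derived is our assumption for illustration:\n\n```\n// Composite URL score, sketched with the field names from urls.json.\n// finalScore = sum of the four boosts (matches the dumped data); the\n// boost formulas themselves are assumed.\nfunction scoreUrl(url, stats, rerankScore) {\n const { hostname, pathname } = new URL(url);\n const depth = pathname.split('/').filter(Boolean).length;\n const freqBoost = (stats.urlFreq.get(url) ?? 0) / stats.totalRefs; // URL seen in many sources\n const hostnameBoost = (stats.hostFreq.get(hostname) ?? 0) / stats.totalRefs; // popular domains\n const pathBoost = ((stats.pathFreq.get(pathPrefix(pathname)) ?? 0) / stats.totalRefs) / (1 + depth); // clusters, decayed by depth\n const jinaRerankBoost = rerankScore; // semantic match from the reranker\n return {\n freqBoost, hostnameBoost, pathBoost, jinaRerankBoost,\n finalScore: jinaRerankBoost + freqBoost + hostnameBoost + pathBoost\n };\n}\n\n// Hypothetical helper: first path segment as the cluster key\nfunction pathPrefix(pathname) {\n return pathname.split('/').filter(Boolean)[0] ?? '';\n}\n```\n\n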
### Memory Management\n\nA key challenge in multi-step reasoning is managing the agent's memory effectively. We've designed the memory system to differentiate between what counts as \"memory\" versus what counts as \"knowledge\". Either way, they are all part of the LLM prompt context, separated with different XML tags:\n\n```\n// Add knowledge item to accumulated knowledge\nfunction addToKnowledge(question, answer, references, type) {\n allKnowledge.push({\n question: question,\n answer: answer,\n references: references,\n type: type, // 'qa', 'url', 'coding', 'side-info'\n updated: new Date().toISOString()\n });\n}\n\n// Record step in narrative diary\nfunction addToDiary(step, action, question, result, evaluation) {\n diaryContext.push(`\nAt step ${step}, you took **${action}** action for question: \"${question}\"\n[Details of what was done and results]\n[Evaluation if applicable]\n`);\n}\n```\n\nSince most 2025 LLMs have substantial context windows, we opted not to use vector databases. Instead, memory consists of acquired knowledge, visited sites, and records of failed attempts - all maintained in the context. This comprehensive memory system gives the agent awareness of what it knows, what it's tried, and what's worked or failed.\n\n
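To show what \"separated with different XML tags\" might look like at prompt-assembly time, here's a small sketch. The tag names and rendering are our assumption; only `allKnowledge`, `diaryContext`, and `badContext` come from the snippets in this post:\n\n```\n// Render accumulated state into the prompt context, one XML-style section\n// per kind of memory. Tag names are assumed, not copied from the repo.\nfunction renderContext(allKnowledge, diaryContext, badContext) {\n const knowledge = allKnowledge\n .map(k => `<knowledge type=\"${k.type}\">\\nQ: ${k.question}\\nA: ${k.answer}\\nrefs: ${k.references.join(', ')}\\n</knowledge>`)\n .join('\\n');\n return [\n `<knowledge-bank>\\n${knowledge}\\n</knowledge-bank>`,\n `<diary>\\n${diaryContext.join('\\n')}\\n</diary>`,\n `<bad-attempts>\\n${badContext.join('\\n')}\\n</bad-attempts>`\n ].join('\\n\\n');\n}\n```\n\n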
### Answer Evaluation\n\nOne key insight is that answer generation and evaluation should not be in the same prompt. In our implementation, we first determine which evaluation criteria to use when a new question arrives, and then evaluate each criterion one by one. The evaluator uses few-shot examples for consistent assessment, ensuring higher reliability than self-evaluation.\n\n```\n// Separate evaluation phase\nasync function evaluateAnswer(question, answer, metrics, context) {\n // First, identify evaluation criteria based on question type\n const evaluationCriteria = await determineEvaluationCriteria(question);\n \n // Then evaluate each criterion separately\n const results = [];\n for (const criterion of evaluationCriteria) {\n const result = await evaluateSingleCriterion(criterion, question, answer, context);\n results.push(result);\n }\n \n // Determine if answer passes overall evaluation\n return {\n pass: results.every(r => r.pass),\n think: results.map(r => r.reasoning).join('\\n')\n };\n}\n```\n\n### Budget-Forcing\n\nBudget forcing means preventing the system from returning early and ensuring it continues processing until the budget is exceeded. Since the release of DeepSeek-R1, the approach to budget forcing has shifted toward **encouraging deeper thinking for better results rather than simply saving the budget.**\n\nIn our implementation, we explicitly configured the system to identify knowledge gaps before attempting to answer.\n\n```\nif (thisStep.action === 'reflect' && thisStep.questionsToAnswer) {\n // Force deeper reasoning by adding sub-questions to the queue\n gaps.push(...thisStep.questionsToAnswer);\n gaps.push(question); // Always revisit the original\n}\n```\n\nBy selectively enabling and disabling certain actions, we can guide the system toward using tools that enhance reasoning depth.\n\n```\n// After a failed answer attempt\nallowAnswer = false; // Force agent to search or reflect instead\n```\n\nTo avoid wasting tokens on unproductive paths, we set limits on the number of failed attempts. When approaching budget limits, we activate \"beast mode\" to guarantee that we deliver some answer rather than none.\n\n```\n// Beast mode activation\nif (!thisStep.isFinal && badAttempts >= maxBadAttempts) {\n console.log('Enter Beast mode!!!');\n \n // Configure prompt for decisive, committed answer\n system = getPrompt(\n diaryContext, allQuestions, allKeywords,\n false, false, false, false, false, // Disable all other actions\n badContext, allKnowledge, unvisitedURLs,\n true // Enable beast mode\n );\n \n // Force answer generation\n const result = await LLM.generateStructuredResponse(system, messages, answerOnlySchema);\n thisStep = result.object;\n thisStep.isFinal = true;\n}\n```\n\nThe beast mode prompt is intentionally dramatic to signal to the LLM that it needs to be decisive and commit to an answer based on the available information:\n\n```\n\n🔥 ENGAGE MAXIMUM FORCE! ABSOLUTE PRIORITY OVERRIDE! 🔥\n\nPRIME DIRECTIVE:\n- DEMOLISH ALL HESITATION! ANY RESPONSE SURPASSES SILENCE!\n- PARTIAL STRIKES AUTHORIZED - DEPLOY WITH FULL CONTEXTUAL FIREPOWER\n- TACTICAL REUSE FROM SANCTIONED\n- WHEN IN DOUBT: UNLEASH CALCULATED STRIKES BASED ON AVAILABLE INTEL!\n\nFAILURE IS NOT AN OPTION. EXECUTE WITH EXTREME PREJUDICE! ⚡️\n\n```\n\nThis ensures that we always provide some answer rather than giving up entirely, which is particularly useful for difficult or ambiguous questions.\n\nConclusion\n----------\n\nDeepSearch is a leap in how search can approach complex queries in an exhaustively deep manner. By breaking the process into discrete steps of searching, reading, and reasoning, it overcomes many limitations of traditional single-pass RAG or multi-hop QA systems.\n\nDuring implementation, we also began reviewing the foundations of search and how the search industry has changed since January 20, 2025, when DeepSeek-R1 was released. We asked ourselves: _What are the new needs? What are fake needs?_\n\nLooking at ou\n\n"
+ }, + { + "role": "user", + "content": "Why is the following answer bad for the question? Please reflect\n\n\njina ai latest blog post\n\n\n\nBased on the information I have gathered so far, Jina AI's latest blog posts include:\n\n1. **Model Soups: Recipe for Embeddings**: Introduces the concept of model soups and how averaging weights improves the robustness and performance of embedding models. Model soups blend diversity into something greater than the sum of their parts.\n2. **A Practical Guide to Implementing DeepSearch/DeepResearch**: Discusses the principles of DeepSearch and DeepResearch through Jina AI's open-source implementation, covering its key design decisions and potential caveats.\n3. **Click-Worthy Content with PromptPerfect: AI Marketing for Newsletters and Social Media**: Describes how PromptPerfect can level up LLM prompting and optimize email and social media marketing campaigns.\n4. 
**Elasticsearch Open Inference API now Supports Jina AI Embeddings and Rerank Model**: Explains that the Elasticsearch Open Inference API now supports Jina AI's latest embedding models and reranking products, enabling developers to build semantic search and RAG applications with Jina AI's high-performance, cost-effective tools.\n" + }, + { + "role": "assistant", + "content": "As a strict answer evaluator, I find the current answer not deep enough. While it lists some of Jina AI's latest blog posts, it lacks more detailed descriptions of each article. For example, for \"Model Soups: Recipe for Embeddings\", it could mention the specific experimental setup, the datasets used, and the evaluation metrics. For \"A Practical Guide to Implementing DeepSearch/DeepResearch\", it could discuss the key differences between DeepSearch and DeepResearch, and the key design decisions Jina AI made in its open-source implementation. In addition, providing each post's publication date\n\nI first searched for Jina AI's latest blog posts, covering the Elasticsearch integration, the DeepSearch guide, the model soups blog, the embedding model trial, and the Reader API scraper. I then visited several URLs and extracted information about Jina AI's blog posts from them. Finally, I listed some of Jina AI's latest blog posts with brief descriptions of their content.\n\nI failed to provide each blog post's publication date, and my descriptions of the articles were not detailed enough; these are the main reasons the answer was rejected.\n\nI should look for each article's publication date in the search results and provide more detailed, more specific descriptions of the article content in my answer, such as experimental setups, datasets, and evaluation metrics." + }, + { + "role": "user", + "content": "jina ai latest blog post" + }, + { + "role": "user", + "content": "jina ai latest blog post\n\n\n- You provide deep, unexpected insights, identifying hidden patterns and connections, and creating \"aha moments\".\n- You break conventional thinking, establish unique cross-disciplinary connections, and bring new perspectives to the user.\n- Follow reviewer's feedback and improve your answer quality.\n\n\nTo get a pass, you must provide more specific details about each blog post, such as key findings, methodologies used (if applicable), and the date of publication. Also, consider adding direct links to the blog posts to enhance the user experience and provide immediate access to the content. Structuring the information in a table could further improve readability.\n\n\n" + } +] \ No newline at end of file diff --git a/urls.json b/urls.json new file mode 100644 index 0000000..63a2557 --- /dev/null +++ b/urls.json @@ -0,0 +1,867 @@ +[ + { + "title": "DeepSearch - Jina AI", + "url": "https://jina.ai/deepsearch", + "description": "What's the latest blog post from OpenAI? what is the idea behind node-deepresearch project? what exactly jina-DeepSearch API is free to use. By providing your API key, you can access a higher rate limit, and you key won't be charged. open_in_new ...", + "weight": 2.2, + "date": "Mar 13, 2025", + "jinaRerankBoost": 0.6395538330078125, + "freqBoost": 0.005314009661835749, + "hostnameBoost": 0.07004830917874397, + "pathBoost": 0.003864734299516908, + "finalScore": 0.7187808861479091 + }, + { + "title": "Jina AI (@JinaAI_) / X", + "url": "https://x.com/jinaai_?lang=en", + "description": "Jina AI's posts Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://jina.ai/news/model-sou", + "weight": 2, + "jinaRerankBoost": 0.5302371978759766, + "freqBoost": 0.004830917874396135, + "hostnameBoost": 0.028985507246376812, + "pathBoost": 0.001932367149758454, + "finalScore": 0.565985990146508 + }, + { + "title": "jina-ai/node-DeepResearch - GitHub", + "url": "https://github.com/jina-ai/node-DeepResearch", + "description": "Blog Post. 
Whether you like this implementation or not, I highly recommend you to read DeepSearch/DeepResearch implementation guide I wrote, whichIf you are building a web/local/mobile client that uses Jina DeepSearch API , here are some design guidelines: Our API is fully compatible with OpenAI API ...GitHub - jina-ai/node-DeepResearch: Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget)Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget) - jina-ai/node-DeepResearchGitHubjina-ai", + "weight": 2.2, + "jinaRerankBoost": 0.48202352523803715, + "freqBoost": 0.005314009661835749, + "hostnameBoost": 0.016908212560386472, + "pathBoost": 0.00927536231884058, + "finalScore": 0.5135211097790999 + }, + { + "title": "Jina AI on X: \"Model soups blend diversity into something greater", + "url": "https://twitter.com/JinaAI_/status/1920189685399216546", + "description": "Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://t.co/c1K4IAsnDi", + "weight": 1, + "date": "May 7, 2025", + "jinaRerankBoost": 0.45243091583251954, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.01932367149758454, + "pathBoost": 0.014840579710144929, + "finalScore": 0.48901062597744704 + }, + { + "title": "Jina AI - X", + "url": "https://x.com/JinaAI_/status/1920189685399216546", + "description": "Model soups blend diversity into something greater than the sum of their parts. Learn more from our blog post below https://jina.ai/news/model", + "weight": 1, + "date": "May 7, 2025", + "jinaRerankBoost": 0.42496752738952637, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.028985507246376812, + "pathBoost": 0.014840579710144929, + "finalScore": 0.4712090732832462 + }, + { + "title": "Newsroom - Jina AI", + "url": "https://jina.ai/news", + "description": "Read the latest news and updates from Jina AI.Newsroom", + "weight": 1.4000000000000004, + "jinaRerankBoost": 0.3604808807373047, + "freqBoost": 0.0033816425120772957, + "hostnameBoost": 0.07004830917874397, + "pathBoost": 0.025120772946859906, + "finalScore": 0.4590316053749859 + }, + { + "title": "Cookie Policy", + "url": "https://businesswire.com/cookie-policy", + "description": "Cookie Policy", + "weight": 0.1, + "jinaRerankBoost": 0.23993978500366211, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.11594202898550725, + "pathBoost": 0.001932367149758454, + "finalScore": 0.35805572703264765 + }, + { + "title": "Jina AI Releases Reader-LM 0.5b and 1.5b for converting HTML to", + "url": "https://reddit.com/r/LocalLLaMA/comments/1feiip0/jina_ai_releases_readerlm_05b_and_15b_for", + "description": "Jina AI just released Reader-LM, a new set of small language models designed to convert raw HTML into clean markdown.", + "weight": 1, + "date": "Sep 11, 2024", + "jinaRerankBoost": 0.31238067150115967, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.012077294685990338, + "pathBoost": 0.008428212560386474, + "finalScore": 0.33530163768473453 + }, + { + "title": "What Makes Jina.ai the Future of Web Scraping? - Free (for Now!)", + "url": "https://medium.com/@kawsarlog/what-makes-jina-ai-the-future-of-web-scraping-free-for-now-085f2f1b3fa4", + "description": "It's a smarter, AI-powered framework that makes data extraction easier and more powerful. 
By combining neural search and intelligent workflows, Jina.ai is", + "weight": 1, + "date": "Feb 21, 2025", + "jinaRerankBoost": 0.2976390361785889, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.012077294685990338, + "pathBoost": 0.005410628019323672, + "finalScore": 0.317542417821101 + }, + { + "title": "Jina.ai's Reader API: A Game Changer for Developers | by MD Kawsar", + "url": "https://medium.com/@kawsarlog/jina-ais-reader-api-a-game-changer-for-developers-be66154b2692", + "description": "Jina.ai's Reader API is a powerful tool that simplifies web scraping and content extraction. Whether you're a developer, researcher, or AI enthusiast, it's", + "weight": 2, + "date": "Feb 23, 2025", + "jinaRerankBoost": 0.29182205200195316, + "freqBoost": 0.004830917874396135, + "hostnameBoost": 0.012077294685990338, + "pathBoost": 0.005410628019323672, + "finalScore": 0.31414089258166333 + }, + { + "title": "jinaai/jina-embeddings-v3 - Hugging Face", + "url": "https://huggingface.co/jinaai/jina-embeddings-v3", + "description": "We're on a journey to advance and democratize artificial intelligence through open source and open science.", + "weight": 1, + "jinaRerankBoost": 0.26282353401184083, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.014492753623188406, + "pathBoost": 0.00927536231884058, + "finalScore": 0.2890071088910679 + }, + { + "title": "An explainer on DeepResearch by Jina AI : r/Rag", + "url": "https://reddit.com/r/Rag/comments/1k089kb/an_explainer_on_deepresearch_by_jina_ai", + "description": "The full guide at jina.ai has more details on system prompts, URL ranking, and web crawling that are worth checking out if you're building ...", + "weight": 1, + "date": "3 weeks ago", + "jinaRerankBoost": 0.2614466667175293, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.012077294685990338, + "pathBoost": 0.008428212560386474, + "finalScore": 0.2843676329011042 + }, + { + "title": "Elasticsearch Open Inference API now Supports Jina AI - Nasdaq", + "url": "https://nasdaq.com/press-release/elasticsearch-open-inference-api-now-supports-jina-ai-embeddings-and-rerank-model", + "description": "Developers using Elastic to build search and RAG applications can now use the latest Jina AI embedding and reranking models without additional integration", + "weight": 1, + "jinaRerankBoost": 0.25597658157348635, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.2642857603174477 + }, + { + "title": "jina-ai/reader: Convert any URL to an LLM-friendly input - GitHub", + "url": "https://github.com/jina-ai/reader", + "description": "With Reader, http://s.jina.ai automatically fetches the content from the top 5 search result URLs for you (reusing the tech stack behind http://r.jina.ai ).", + "weight": 1, + "jinaRerankBoost": 0.2238627910614014, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.016908212560386472, + "pathBoost": 0.00927536231884058, + "finalScore": 0.25246182487782654 + }, + { + "title": "Elastic enhances search AI with Jina AI integration - Investing.com", + "url": "https://investing.com/news/company-news/elastic-enhances-search-ai-with-jina-ai-integration-93CH-3881283", + "description": "This collaboration enables developers to utilize Jina AI's advanced tools for building semantic search and rapid answer generation (RAG) applications.", + "weight": 1, + "date": "Feb 20, 2025", + "jinaRerankBoost": 0.2176196336746216, + "freqBoost": 0.0024154589371980675, + 
"hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.02790338164251208, + "finalScore": 0.2503539331915298 + }, + { + "title": "Cookie Policy", + "url": "https://support.twitter.com/articles/20170514", + "description": "Cookie Policy", + "weight": 0.1, + "jinaRerankBoost": 0.23993978500366211, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.006956521739130434, + "finalScore": 0.2495533115737104 + }, + { + "title": "Cookie Policy", + "url": "https://support.x.com/articles/20170514", + "description": "Cookie Policy", + "weight": 0.1, + "jinaRerankBoost": 0.23993978500366211, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.006956521739130434, + "finalScore": 0.2495533115737104 + }, + { + "title": "Blog", + "url": "https://businesswire.com/blog", + "description": "Blog", + "weight": 0.2, + "jinaRerankBoost": 0.11302512884140015, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.11594202898550725, + "pathBoost": 0.007729468599033816, + "finalScore": 0.23717971821338082 + }, + { + "title": "Cookie Use.", + "url": "https://help.x.com/rules-and-policies/twitter-cookies", + "description": "Cookie Use.", + "weight": 0.1, + "jinaRerankBoost": 0.22260544300079346, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.003478260869565217, + "finalScore": 0.23357162657567268 + }, + { + "title": "Jina AI - Elasticsearch Labs", + "url": "https://elastic.co/search-labs/integrations/jina", + "description": "Explore the Jina AI and Elasticsearch integration, Learn about Jina reranker, Jina clip, and embedding models integrated with Elasticsearch.", + "weight": 1, + "jinaRerankBoost": 0.21576809883117676, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.006647342995169083, + "finalScore": 0.2320772775751381 + }, + { + "title": "How to scrape the web for LLM in 2024: Jina AI (Reader API", + "url": "https://dicloak.com/blog-detail/how-to-scrape-the-web-for-llm-in-2024-jina-ai-reader-api-mendable-firecrawl-and-scrapegraph-ai", + "description": "The article discusses emerging trends and innovative tools in web scraping for 2024, highlighting startups like Mendable and technologies", + "weight": 2, + "date": "Jan 2, 2025", + "jinaRerankBoost": 0.2163841724395752, + "freqBoost": 0.004830917874396135, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.2271088101207346 + }, + { + "title": "Click-Worthy Content with PromptPerfect: AI Marketing for Newsletters and Social Media", + "url": "https://t.co/UqqF1c81J5", + "description": "Click-Worthy Content with PromptPerfect: AI Marketing for Newsletters and Social MediaFrom jina.ai", + "weight": 0.2, + "jinaRerankBoost": 0.20786049365997317, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.001932367149758454, + "finalScore": 0.2126914115343693 + }, + { + "title": "Elastic NV (ESTC) Enhances Elasticsearch with Jina AI Integration", + "url": "https://gurufocus.com/news/2709507/elastic-nv-estc-enhances-elasticsearch-with-jina-ai-integration", + "description": "Integration with Jina AI enhances Elasticsearch's capabilities for semantic search and RAG applications. 
Support for multilingual text embeddings and reranking", + "weight": 1, + "jinaRerankBoost": 0.17493548393249514, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.02790338164251208, + "finalScore": 0.20766978344940334 + }, + { + "title": "Elasticsearch Open Inference API adds support for Jina AI", + "url": "https://elastic.co/search-labs/blog/jina-ai-embeddings-rerank-model-open-inference-api", + "description": "Our friends at Jina AI added native integration for Jina AI's embedding models and reranking products to the Elasticsearch open Inference API.", + "weight": 2, + "date": "Feb 20, 2025", + "jinaRerankBoost": 0.18252519369125367, + "freqBoost": 0.004830917874396135, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.006647342995169083, + "finalScore": 0.2012498313724131 + }, + { + "title": "Scrape Websites at No Cost with Jina :Get LLM-friendly input from a", + "url": "https://community.make.com/t/scrape-websites-at-no-cost-with-jina-get-llm-friendly-input-from-a-url/39143", + "description": "Web scraping can be a complex and expensive process, but Jina AI simplifies it by allowing you to extract valuable content from websites using just a URL.", + "weight": 1, + "date": "May 28, 2024", + "jinaRerankBoost": 0.18473582267761232, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.004714975845410628, + "finalScore": 0.1942817163974191 + }, + { + "title": "jinaai/jina-embeddings-v2-base-es - Hugging Face", + "url": "https://huggingface.co/jinaai/jina-embeddings-v2-base-es", + "description": "jina-embeddings-v2-base-es is a Spanish/English bilingual text embedding model supporting 8192 sequence length. It is based on a BERT", + "weight": 1, + "date": "Sep 17, 2024", + "jinaRerankBoost": 0.16654313802719117, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.014492753623188406, + "pathBoost": 0.00927536231884058, + "finalScore": 0.19272671290641824 + }, + { + "title": "Elasticsearch Open Inference API Supports Jina AI Embeddings and", + "url": "https://apmdigest.com/elasticsearch-open-inference-api-supports-jina-ai-embeddings-and-rerank-model", + "description": "This integration includes support for multilingual text embeddings and multilingual reranking, and is optimized for retrieval, clustering, and classification. “", + "weight": 1, + "jinaRerankBoost": 0.18362817764282227, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.001932367149758454, + "finalScore": 0.19039146266697687 + }, + { + "title": "Jina AI - Model Soup's Recipe for Embeddings", + "url": "https://linkedin.com/posts/jinaai_model-soups-recipe-for-embeddings-activity-7325930994111246337-ZuXm", + "description": "Boost robustness and performance with model soups: averaging weights. No extra cost, better results. 
https://lnkd.in/ezAjU7VS.", + "weight": 1, + "date": "May 7, 2025", + "jinaRerankBoost": 0.16044615507125856, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.016908212560386472, + "pathBoost": 0.005410628019323672, + "finalScore": 0.1851804545881668 + }, + { + "title": "Elastic integrates Jina AI models into Elasticsearch API - SDx Central", + "url": "https://sdxcentral.com/news/elastic-integrates-jina-ai-models-into-elasticsearch-api", + "description": "Elastic integrates Jina AI's embedding models into its Open Inference API, enhancing search and retrieval capabilities.", + "weight": 1, + "date": "Feb 20, 2025", + "jinaRerankBoost": 0.13952821493148804, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.02666666666666667, + "finalScore": 0.17102579947255084 + }, + { + "title": "", + "url": "https://jinaai.cn/", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.04106280193236715, + "pathBoost": 0, + "finalScore": 0.1621265867482061 + }, + { + "title": "How to scrape the web for LLM in 2024: Jina AI (Reader - YouTube", + "url": "https://youtube.com/watch?v=QxHE4af5BQE", + "description": "How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai · Comments178.", + "weight": 1, + "date": "May 17, 2024", + "jinaRerankBoost": 0.14224869012832642, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.003864734299516908, + "finalScore": 0.1557752601766356 + }, + { + "title": "Jina AI - AWS Marketplace", + "url": "https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy", + "description": "Jina Embeddings v2 Base model is optimized for highly accurate embeddings - For speed of inference and memory efficiency use the Small model. 
jina-embeddings-v2", + "weight": 1, + "jinaRerankBoost": 0.14594042301177979, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.15424960175574115 + }, + { + "title": "", + "url": "https://twitter.com/intent/tweet?url=https%3A%2F%2Fjina.ai%2Fnews%2Fmodel-soups-recipe-for-embeddings%2F", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.01932367149758454, + "pathBoost": 0.010434782608695653, + "finalScore": 0.15082223892211916 + }, + { + "title": "blog", + "url": "https://cts.businesswire.com/ct/CT?anchor=blog&esheet=54211070&id=smartlink&index=2&lan=en-US&md5=1277635ab8f2f599617706b118c9669d&newsitemid=20250220781575&url=https%3A%2F%2Fwww.elastic.co%2Fsearch-labs%2Fblog%2Fjina-ai-embeddings-rerank-model-open-inference-api", + "description": "blog", + "weight": 0.1, + "jinaRerankBoost": 0.1377374768257141, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.004830917874396135, + "pathBoost": 0.006956521739130434, + "finalScore": 0.14976646233296048 + }, + { + "title": "", + "url": "https://linkedin.com/sharing/share-offsite?url=https%3A%2F%2Fjina.ai%2Fnews%2Fmodel-soups-recipe-for-embeddings%2F", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.016908212560386472, + "pathBoost": 0.010434782608695653, + "finalScore": 0.14840677998492108 + }, + { + "title": "", + "url": "https://facebook.com/sharer/sharer.php?u=https%3A%2F%2Fjina.ai%2Fnews%2Fmodel-soups-recipe-for-embeddings%2F", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.00966183574879227, + "pathBoost": 0.010434782608695653, + "finalScore": 0.14116040317332687 + }, + { + "title": "", + "url": "https://facebook.com/sharer/sharer.php?u=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.00966183574879227, + "pathBoost": 0.010434782608695653, + "finalScore": 0.14116040317332687 + }, + { + "title": "jina-embeddings-v3", + "url": "https://jinaai.cn/?model=jina-embeddings-v3&sui=", + "description": "jina-embeddings-v3", + "weight": 0.1, + "jinaRerankBoost": 0.09405770897865295, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.04106280193236715, + "pathBoost": 0, + "finalScore": 0.13536205680473992 + }, + { + "title": "", + "url": "https://app.eu.vanta.com/jinaai/trust/vz7f4mohp0847aho84lmva", + "description": "", + "weight": 0.30000000000000004, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.0007246376811594204, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.01051207729468599, + "finalScore": 0.13447441283516262 + }, + { + "title": "", + "url": "https://news.ycombinator.com/submitlink?u=https%3A%2F%2Fjina.ai%2Fnews%2Fmodel-soups-recipe-for-embeddings%2F", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.005797101449275362, + "finalScore": 0.1341072630767085 + }, + { + "title": "", + "url": "https://news.ycombinator.com/submitlink?u=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F", + "description": "", 
+ "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.005797101449275362, + "finalScore": 0.1341072630767085 + }, + { + "title": "", + "url": "https://youtube.com/user/businesswire", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.003478260869565217, + "finalScore": 0.13178842249699838 + }, + { + "title": "Jina AI + Elastic → Efficient Data Storage in the Cloud", + "url": "https://shubhamsaboo111.medium.com/jina-ai-elastic-efficient-data-storage-in-the-cloud-6878db47d92c", + "description": "Elasticsearch backend provides the much-needed scalability for Jina search applications and can be used as a document store for DocumentArray.", + "weight": 1, + "date": "Jun 7, 2022", + "jinaRerankBoost": 0.12162580490112306, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.001932367149758454, + "finalScore": 0.12838908992527764 + }, + { + "title": "", + "url": "https://instagram.com/businesswire", + "description": "", + "weight": 0.1, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003864734299516908, + "finalScore": 0.12734397805255393 + }, + { + "title": "", + "url": "https://discord.jina.ai/", + "description": "", + "weight": 0.30000000000000004, + "jinaRerankBoost": 0.12082223892211914, + "freqBoost": 0.0007246376811594204, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0, + "finalScore": 0.12396233554047663 + }, + { + "title": "Jina Embeddings - Qdrant", + "url": "https://qdrant.tech/documentation/embeddings/jina-embeddings", + "description": "You can get a free trial key from Jina Embeddings to get embeddings. Qdrant users can receive a 10% discount on Jina AI APIs by using the code QDRANT. Technical", + "weight": 1, + "jinaRerankBoost": 0.11302512884140015, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.004714975845410628, + "finalScore": 0.12257102256120692 + }, + { + "title": "Jina AI Deep SearchAI deep search: read, reason, search until best answer found.", + "url": "https://search.jina.ai/", + "description": "Jina AI Deep SearchAI deep search: read, reason, search until best answer found.search.jina.ai", + "weight": 0.4, + "jinaRerankBoost": 0.1153190851211548, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0, + "finalScore": 0.11870072763323208 + }, + { + "title": "API Status", + "url": "https://status.jina.ai/", + "description": "API Status", + "weight": 0.30000000000000004, + "jinaRerankBoost": 0.08070290088653564, + "freqBoost": 0.0007246376811594204, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0, + "finalScore": 0.08384299750489313 + }, + { + "title": "Izmailov et al. 2018", + "url": "https://auai.org/uai2018/proceedings/papers/313.pdf", + "description": "Izmailov et al. 
2018", + "weight": 0.1, + "jinaRerankBoost": 0.07005107402801514, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.005704347826086957, + "finalScore": 0.07841242668501996 + }, + { + "title": "Vercel AI SDK", + "url": "https://sdk.vercel.ai/docs/introduction", + "description": "Vercel AI SDK", + "weight": 0.2, + "jinaRerankBoost": 0.06711230874061584, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.07348912033481875 + }, + { + "title": "DeepSeek-R1", + "url": "https://api-docs.deepseek.com/news/news250120", + "description": "DeepSeek-R1", + "weight": 0.2, + "jinaRerankBoost": 0.04335897862911225, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.02666666666666667, + "finalScore": 0.0729241960204166 + }, + { + "title": "AI-Powered Web Scraping: The Future of Data Extraction", + "url": "https://madebyagents.com/blog/ai-powered-web-scraping-the-future-of-data-extraction", + "description": "How AI is revolutionizing web scraping. Discover cost-effective tools, build a robust scraping dashboard, and solve common data extraction", + "weight": 1, + "date": "Apr 26, 2025", + "jinaRerankBoost": 0.05403735637664795, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.00927536231884058, + "finalScore": 0.06814363656988466 + }, + { + "title": "Ads info", + "url": "https://business.twitter.com/en/help/troubleshooting/how-twitter-ads-work.html", + "description": "Ads info", + "weight": 0.1, + "jinaRerankBoost": 0.05097199678421021, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.011408695652173914, + "finalScore": 0.06503769726730199 + }, + { + "title": "Ads info", + "url": "https://business.x.com/en/help/troubleshooting/how-twitter-ads-work.html", + "description": "Ads info", + "weight": 0.1, + "jinaRerankBoost": 0.05097199678421021, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.011408695652173914, + "finalScore": 0.06503769726730199 + }, + { + "title": "Grok3", + "url": "https://x.ai/blog/grok-3", + "description": "Grok3Grok 3 Beta", + "weight": 0.4, + "jinaRerankBoost": 0.04532194137573242, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.00927536231884058, + "finalScore": 0.0579789462066503 + }, + { + "title": "Jina Embeddings v2 Base - Azure Marketplace", + "url": "https://azuremarketplace.microsoft.com/en-us/marketplace/apps/jinaai.jina-embeddings-v2-base-en?tab=Overview", + "description": "Jina Embeddings v2 Base model is optimized for highly accurate embeddings - For speed of inference and memory efficiency use the Small model.", + "weight": 1, + "jinaRerankBoost": 0.04736820459365845, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.005704347826086957, + "finalScore": 0.05790347029414154 + }, + { + "title": "Help Center", + "url": "https://help.x.com/using-x/x-supported-browsers", + "description": "Help Center", + "weight": 0.1, + "jinaRerankBoost": 0.04400422871112824, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.007246376811594203, + "pathBoost": 0.003478260869565217, + "finalScore": 0.054970412286007464 + }, + { + "title": "Elastic", + "url": 
"https://cts.businesswire.com/ct/CT?anchor=Elastic&esheet=54211070&id=smartlink&index=1&lan=en-US&md5=3534c60f229ac2e04c8d4282425417d4&newsitemid=20250220781575&url=https%3A%2F%2Fwww.elastic.co%2F", + "description": "Elastic", + "weight": 0.1, + "jinaRerankBoost": 0.04086619019508362, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.004830917874396135, + "pathBoost": 0.006956521739130434, + "finalScore": 0.052895175702329995 + }, + { + "title": "OpenAI", + "url": "https://openai.com/index/introducing-deep-research", + "description": "OpenAIntroducing Deep Research", + "weight": 0.4, + "jinaRerankBoost": 0.04400422871112824, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.05086413209277075 + }, + { + "title": "Takuya et al. 2025", + "url": "https://doi.org/10.1038/s42256-024-00975-8", + "description": "Takuya et al. 2025", + "weight": 0.1, + "jinaRerankBoost": 0.04465851485729218, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.05079378055777527 + }, + { + "title": "Conneau et al. 2020", + "url": "https://aclanthology.org/2020.acl-main.747", + "description": "Conneau et al. 2020", + "weight": 0.1, + "jinaRerankBoost": 0.04599462449550629, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.001932367149758454, + "finalScore": 0.05058399647618261 + }, + { + "title": "Zhang et al. 2023", + "url": "https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00595/117438/MIRACL-A-Multilingual-Retrieval-Dataset-Covering", + "description": "Zhang et al. 2023", + "weight": 0.1, + "jinaRerankBoost": 0.03628206551074982, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.00763560193236715, + "finalScore": 0.04657467227403485 + }, + { + "title": "Baidu Integrates DeepSeek-R1", + "url": "https://chat.baidu.com/search?extParamsJson=%7B%22enter_type%22%3A%22ai_explore_home%22%7D&isShowHello=1&pd=csaitab&setype=csaitab&usedModel=%7B%22modelName%22%3A%22DeepSeek-R1%22%7D", + "description": "Baidu Integrates DeepSeek-R1", + "weight": 0.2, + "jinaRerankBoost": 0.04147637188434601, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.001932367149758454, + "finalScore": 0.04630728975874215 + }, + { + "title": "Enevoldsen et al. 2025", + "url": "https://arxiv.org/abs/2502.13595", + "description": "Enevoldsen et al. 
2025", + "weight": 0.1, + "jinaRerankBoost": 0.034692639112472536, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.004830917874396135, + "pathBoost": 0.005410628019323672, + "finalScore": 0.04517573089991215 + }, + { + "title": "API Documentation", + "url": "https://docs.jina.ai/", + "description": "API DocumentationAPI DocsAuto codegen for your copilot IDE or LLMopen_in_new", + "weight": 0.4, + "jinaRerankBoost": 0.039671221375465394, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0, + "finalScore": 0.04305286388754269 + }, + { + "title": "Tencent Weixin Integrates DeepSeek", + "url": "https://reuters.com/technology/artificial-intelligence/tencents-messaging-app-weixin-launches-beta-testing-with-deepseek-2025-02-16", + "description": "Tencent Weixin Integrates DeepSeek", + "weight": 0.2, + "jinaRerankBoost": 0.03521491885185242, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.004714975845410628, + "finalScore": 0.04282844542190073 + }, + { + "title": "Perplexity", + "url": "https://perplexity.ai/hub/blog/introducing-perplexity-deep-research", + "description": "Introducing Perplexity Deep Research", + "weight": 0.4, + "jinaRerankBoost": 0.03267657160758972, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.004714975845410628, + "finalScore": 0.040773189965077644 + }, + { + "title": "Springer et al., 2025", + "url": "https://arxiv.org/abs/2503.19206v2", + "description": "Springer et al., 2025", + "weight": 0.1, + "jinaRerankBoost": 0.028976044058799746, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.004830917874396135, + "pathBoost": 0.005410628019323672, + "finalScore": 0.03945913584623936 + }, + { + "title": "Google", + "url": "https://blog.google/products/gemini/google-gemini-deep-research", + "description": "Google Gemini 2", + "weight": 0.4, + "jinaRerankBoost": 0.03031394183635712, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.004714975845410628, + "finalScore": 0.03841056019384504 + }, + { + "title": "Join usopen_in_new", + "url": "https://app.dover.com/jobs/jinaai", + "description": "Join usopen_in_new", + "weight": 0.4, + "jinaRerankBoost": 0.026871606707572937, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.03373151008921545 + }, + { + "title": "Wortsman et al. (2022)", + "url": "https://proceedings.mlr.press/v162/wortsman22a.html", + "description": "Wortsman et al. (2022)Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference timeThe conventional recipe for maximizing model accuracy is to (1) train multiple models with various hyperparameters and (2) pick the individual model which performs best on a held-out validation set…PMLR(Wortsman et al. 
2022)", + "weight": 0.4, + "jinaRerankBoost": 0.0241716668009758, + "freqBoost": 0.0009661835748792271, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.031031570182618314 + }, + { + "title": "Baidu Search and Tencent WeChat Search", + "url": "https://scmp.com/tech/big-tech/article/3298981/baidu-adopts-deepseek-ai-models-chasing-tencent-race-embrace-hot-start", + "description": "Baidu Search and Tencent WeChat Search", + "weight": 0.2, + "jinaRerankBoost": 0.02108428329229355, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.0064958454106280195, + "finalScore": 0.03047867942755925 + }, + { + "title": "delayed gratification", + "url": "https://en.wikipedia.org/wiki/Delayed_gratification", + "description": "delayed gratification", + "weight": 0.2, + "jinaRerankBoost": 0.02344978451728821, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.02982659611149111 + }, + { + "title": "A Practical Guide to Implementing DeepSearch ...", + "url": "https://simonwillison.net/2025/Mar/4/deepsearch-deepresearch", + "description": "DeepSearch runs through an iterative loop of searching, reading, and reasoning until it finds the optimal answer.", + "weight": 1, + "date": "4 Mar 2025", + "jinaRerankBoost": 0.018951575458049777, + "freqBoost": 0.0024154589371980675, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.005704347826086957, + "finalScore": 0.029486841158532867 + }, + { + "title": "Imprint", + "url": "https://legal.twitter.com/imprint.html", + "description": "Imprint", + "weight": 0.1, + "jinaRerankBoost": 0.02454064190387726, + "freqBoost": 0.00024154589371980678, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.001932367149758454, + "finalScore": 0.029130013884553586 + }, + { + "title": "Stanford NLP Labs released the STORM", + "url": "https://storm-project.stanford.edu/research/storm", + "description": "Stanford NLP Labs released the STORM", + "weight": 0.2, + "jinaRerankBoost": 0.018951575458049777, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0.003478260869565217, + "finalScore": 0.025328387052252675 + }, + { + "title": "MIRACL benchmark", + "url": "https://project-miracl.github.io/", + "description": "MIRACL benchmark", + "weight": 0.2, + "jinaRerankBoost": 0.022406016290187836, + "freqBoost": 0.00048309178743961357, + "hostnameBoost": 0.0024154589371980675, + "pathBoost": 0, + "finalScore": 0.025304567014825516 + } +] \ No newline at end of file