Which is the Best Deep Search Tool for Chinese Content?
An In-Depth Comparison of Leading AI Search Engines and Their Performance with Chinese Language Data
Note: This is a casual, subjective evaluation.
TL;DR Version:
DeepSeek R1 + Web Search (official version) is the simplest and most effective tool.
If you want slightly longer answers, Google, a long-established player, still provides a smoother search experience.
Flowith’s Oracle mode is surprisingly good, feeling similar to ChatGPT O1. It also seems to handle both domestic and international information sources well, possibly due to search engine tuning.
ChatGPT’s Deep Search yielded mediocre results in my tests. It doesn’t live up to the hype from various online influencers, perhaps because it struggles with Chinese content. Unfortunately, it is too expensive for me to test repeatedly; I could only borrow access once.
Very Subjective Scoring Results:
Although casual and subjective, I still set up a few evaluation criteria:
【Accuracy】: I checked how many of the 10 AI tools I asked about it could find (my final list has 12 tools, but I initially gave only 10 during the search). If it couldn’t get “ten ai’s deep search,” I gave it the lowest score outright. If it provided no information sources, I deducted 5 points by default.
【Breadth】: Whether it covers the content I requested — product introductions and technical approaches. Basically, all of them did, although some were incorrect.
【Depth】: This is more subjective, scored based on my personal understanding. I personally feel there’s a lot of hallucination here.
【Length】: Nothing much to say here, just the word count.
【Interaction】: Whether it allows follow-up questions, plus pricing and barriers to use.
【Export】: This is actually very important. For example, some models can only export links or images, which is ridiculous. At the very least, it should allow full-text copying and PDF export.
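The accuracy rule above can be written down as a small function. This is only a sketch of how I read my own rule: the one-point-per-tool scale, the floor at zero, and the parameter names are assumptions for illustration, not the exact arithmetic behind the scores.

```python
def accuracy_score(tools_found, cites_sources=True, found_key_product=True,
                   total_tools=10, source_penalty=5):
    """Toy accuracy score: one point per requested tool that was found.

    Assumptions (not from the article): the score is floored at 0, and
    "lowest score" for missing the must-find product means 0.
    """
    if not found_key_product:
        return 0  # missing the must-find product gives the lowest score
    score = min(tools_found, total_tools)
    if not cites_sources:
        score -= source_penalty  # default deduction when no sources are given
    return max(score, 0)


# A tool that found 6 of the 10 products and cited its sources
print(accuracy_score(6))
# A tool that found 8 products but gave no sources
print(accuracy_score(8, cites_sources=False))
```

Keeping the deduction as a parameter makes it easy to see how much the missing-sources rule moves a tool’s ranking.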
1. Doubao (豆包)
Total word count: 2918
Doubao has the best engineering. Apart from not being able to find information about Nami Search (n.cn), there are really no major flaws.
The exported document even has a table of contents. The overall feeling is very comfortable and highly polished — it lives up to its reputation as an “APP super factory.”
Doubao’s only weakness is that it doesn’t yet have a super large model with outstanding intelligence. Therefore, its content also appears relatively shallow, basically feeling like a reskin.
2. n.cn (纳米搜索)
Total word count: 1606
Nami Search (n.cn) comes from the infamous 360 and is a legendary “Frankenstein” of a product. At first glance it has a lot of features, and it has even added DeepSeek R1. Its opening description of OpenAI is well done. However, the introductions to the AI deep search products I requested are incomplete, and the content is very short. On the other hand, it highlights each product’s unique features better than other tools, and its summarization is quite good. It also includes some search products I didn’t know about, which may not be very AI-focused but are inspiring.
However, Nami Search doesn’t support follow-up questions, and sharing is only supported via links and images (not including the full text). It’s clear that this product has a strong marketing focus.
3. Ima.Copilot from Tencent
Total word count: 1417
Tencent released a search + knowledge base tool a long time ago. At that time, the Hunyuan large model’s intelligence was average, but fortunately, the information sources were relatively valuable because they were all from WeChat Official Accounts. Now, with the addition of DeepSeek R1 for deep search, the quality is noticeably different.
The biggest advantage is that you can find a series of related WeChat Official Accounts and add them to the knowledge base. Then you can ask questions based on the knowledge base, which is great. WeChat Official Accounts are its greatest asset, because when we use products from other companies, we often need to click on the WeChat Official Account, jump out, and then save it.
However, information from WeChat Official Accounts lags somewhat behind the public web. Also, because of strict censorship, and especially because links don’t circulate freely, searches for newer topics sometimes run into problems. It feels a bit closed off when it comes to the world outside of WeChat Official Accounts.
So, this tool was actually the most disappointing in this evaluation, because many of the things it searched were completely irrelevant. Especially since I was mainly looking for AI deep search, it gave me a lot of materials on traditional search architecture.
For specific fields, this is a must-have, but for open challenges, I think it needs to be more aggressive and distinctive.
Also, export is only supported via copy and paste.
4. iflow (心流 AI 助手)
Total word count: 1399
Rumored to be a product of Alibaba, it provides quite a few features.
For example, it starts with a mind map and can also generate a male-female dialogue podcast similar to NotebookLM, which is very suitable for making AI podcasts.
Although it doesn’t list many search products, at least the names are mostly correct. The comparison data in the table isn’t very accurate, but compared to other tools it’s already a good start.
Although the word count isn’t very high, the generated styles are quite rich, with tables and images, so it seems very long. It’s just that some of the images are not very relevant, too abstract.
The thinking process is long and fully visible, and the source information is well-marked.
The biggest problem is that sharing and exporting are not very convenient. When copying the content with both text and images, the formatting is completely messed up.
5. ChatGPT O3 Deep Search
Total word count: 2865
The genuine Deep Search is the second most disappointing here. The output content is still relatively short, and it doesn’t feel worth the $200/month membership fee.
Of course, when communicating with the friend who helped me ask the questions, it was mentioned that there might be two reasons:
Putting too many constraints on the reasoning large model might actually hinder its performance. The prompt wasn’t optimized well.
GPT is inherently not very good at handling Chinese information. It might be better to, for example, write the search in English and have it answer in Chinese.
However, the genuine Deep Search still has a few redeeming qualities:
When asking a question, it will first ask you a few questions to get a more specific direction. This avoids wasting resources or going in the wrong direction. For example, my original prompt was very simple, and after it asked me back, I refined it a lot. I then packaged these two parts of the prompt together as a new benchmark to provide to all AI deep search tools. So, what impressed me the most was that these follow-up questions were really well-done. I think this could be a standard practice for future AI search engineering.
The content output by O3 is more like a complete article, with a more coherent logic. You can see that long text + reasoning is a very high barrier to entry. Many searches now incorporate DeepSeek R1 for deep thinking, but because DeepSeek R1’s context is extremely limited, only 32K, it’s obvious that everyone is basically filling in the blanks in an outline. Although this is understandable, it would be much more comfortable if it could be logically coherent like O3.
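To make the “filling in the blanks in an outline” point concrete, here is a rough back-of-the-envelope sketch. The 32K figure is the one I quoted above; the amounts reserved for the prompt and for the model’s own output are made-up assumptions, and real tools may budget very differently.

```python
def per_section_budget(context_tokens, sections,
                       prompt_tokens=2_000, output_tokens=8_000):
    """Tokens of source material left for each section of the outline.

    prompt_tokens and output_tokens are illustrative guesses, not
    measured values for any particular product.
    """
    available = context_tokens - prompt_tokens - output_tokens
    if available <= 0 or sections <= 0:
        raise ValueError("no room left for source material")
    return available // sections


# 10 products in a 32K context vs. a 1M context
print(per_section_budget(32_000, 10))      # 2200
print(per_section_budget(1_000_000, 10))   # 99000
```

With only a couple of thousand tokens of source text per product, a model can do little more than fill in a short paragraph under each heading, which matches what most of the R1-based tools produced.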
6. Official DeepSeek
Total word count: 1625
DeepSeek’s deep thinking + web connection results are quite good, especially its resource matching. It can find niche and new software. But it’s clear that due to context limitations, not all products are fully explored, although their characteristics are well-presented and match my impressions.
With the official version gradually stabilizing, I personally think that DeepSeek’s official R1 + web connection is currently the lowest-barrier option for ordinary people to get relatively good answers.
However, DeepSeek’s hallucinations are actually quite significant. If the official version could strengthen source labeling and increase the context, I think it would be better. Of course, the speed still needs to be optimized.
7. Flowith.ai’s Oracle Mode
Total word count: 5369
Flowith.ai is a whiteboard + knowledge base service that I introduced before. Actually, its early promotion mainly focused on the Oracle mode. This is an Agent where you pose a question, and the AI breaks it down into several questions and steps. You then modify and confirm them before it proceeds with searching and organizing.
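The flow described above (break the question down, let the user edit and confirm, then search each step) can be sketched in a few lines. This is not Flowith’s implementation, which I have no visibility into; the function names and the stub planner, reviewer, and searcher are all hypothetical.

```python
from typing import Callable, Dict, List


def plan_confirm_execute(
    question: str,
    plan: Callable[[str], List[str]],
    review: Callable[[List[str]], List[str]],
    search: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Break a question into steps, let the user confirm them, then search."""
    draft_steps = plan(question)           # the AI proposes sub-questions
    confirmed = review(draft_steps)        # the user edits or drops steps
    return {step: search(step) for step in confirmed}


# Stubs standing in for the model, the user, and the search backend
def toy_plan(q: str) -> List[str]:
    return [f"What is {q}?", f"Which products offer {q}?", f"How is {q} priced?"]

def toy_review(steps: List[str]) -> List[str]:
    return steps[:2]  # pretend the user removed the pricing step

def toy_search(step: str) -> List[str]:
    return [f"result for: {step}"]


report = plan_confirm_execute("AI deep search", toy_plan, toy_review, toy_search)
for step, hits in report.items():
    print(step, "->", hits)
```

The only point where the user has real control in this sketch is the review step, before any searching starts.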
When Flowith ran a broader search in the second row, it evidently used a model with a relatively large context; I don’t know which one, possibly Gemini. It’s the only tool that fully covered the introductions of the 10 AI tools I requested, which deserves praise. Moreover, its initial follow-up questions are similar to the way OpenAI Deep Search interacts with us at the beginning.
However, there aren’t many things you can control during the process. In fact, most tools don’t offer much control during the process, but here, the process is displayed, giving the illusion that we can deeply participate.
Also, I think it still didn’t search for OpenAI’s Deep Search correctly. It feels more like searching for a single keyword without associating it with OpenAI. This is a pity, because other tools basically answered it easily. Here, it further demonstrates the importance of OpenAI’s own O3 long text + reasoning large model.
I look forward to continued optimization of engineering with Claude 4.0, O3, or the future DeepSeek R2 API, which can provide us with greater imagination.
8. Genspark
Total word count: 3406
Genspark became popular about a year ago for its AI Agent + search, presenting results like visually appealing notes similar to Xiaohongshu (Little Red Book). But at that time, the model’s capabilities were still too poor, the content quality was very low, and it wasn’t very timely. Almost a year later, they recently launched their own Deep Search.
Looking back, I found that its capabilities have indeed improved a lot. Their product has always been relatively mature and easy to use. For example, it thinks for a long time, searches for a lot of things, and then notifies you by email when it’s finished. The introduction to the O3 version of Deep Search is quite good, but overall, it still belongs to the toy category. The presented content has a lot of nonsense, and the product introductions I wanted are also lacking, possibly due to a lack of Chinese information.
It’s worth mentioning that it’s the only one that provides links to videos and thumbnail previews, although the videos from YouTube cannot be played directly with a single click and require opening external links.
It’s also worth noting that it’s not possible to directly export files or copy; you can only share them as pages on their website.
9. Kimi
Total word count: 1400
Kimi has a curious quirk. Because I chose different routes, it kept answering me in English, and I had to insist that it answer in Chinese.
Kimi’s answer is actually okay. It correctly answered 5 out of the 10 AI tools, and their organization is good. The introduction to Deep Search at the beginning is also decent. It’s a pity that it didn’t look at many of the products I mentioned, and it ignored the links I provided.
Also, Kimi cannot directly export to a document.
I used to really like this company’s long-text feature. The model was silly back then, but having unlimited long text was quite interesting. Now the intelligence has increased a lot, and it seems to have multimodality. I hope the intelligence can be further improved.
10. Storm
Total word count: 733
This Storm architecture from Stanford University has been around for a while. It seems to have been optimized recently, but it’s clear that their capabilities are a bit behind the times. First of all, the output word count is too small, and the content required for each part is written very generally, without details.
Perhaps because it relies on free public APIs and is free to use, it is not as aggressive as other companies’ products.
I can only say, quite disappointing.
It’s worth mentioning that you can only enter a topic of up to 20 words, and then write what you want to use it for.
11. Metaso.cn (秘塔搜索)
Total word count: 1259
If you include the links, it’s actually close to 10,000 words, but that wouldn’t be fair.
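For reference, here is roughly how a word count that leaves out links could be computed for mixed Chinese and English text. This is a sketch, not the exact method behind the numbers in this post: the counting rule (each Chinese character counts as one word, each run of Latin letters or digits counts as one word) is an assumption.

```python
import re

# URLs are matched up to the next whitespace or Chinese character
URL_RE = re.compile(r"https?://[^\s\u4e00-\u9fff]+")
CJK_RE = re.compile(r"[\u4e00-\u9fff]")
LATIN_RE = re.compile(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*")


def count_words(text: str, include_links: bool = False) -> int:
    """Count Chinese characters plus Latin words, optionally skipping URLs."""
    if not include_links:
        text = URL_RE.sub(" ", text)
    return len(CJK_RE.findall(text)) + len(LATIN_RE.findall(text))


sample = "秘塔搜索 looked at 374 pages https://metaso.cn/abc"
print(count_words(sample))                      # 8
print(count_words(sample, include_links=True))  # 12
```

Even in this tiny sample the link adds four “words”, which is why a report with hundreds of cited pages can balloon toward 10,000 words if links are counted.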
Metaso is still quite good, especially as the first tool that could read so many web pages. This time, it even looked at 374 web pages.
Some niche products are correct, but the quantity is still too small.
The funniest thing is that a WeChat group QR code is placed very prominently in the article.
However, overall, the depth of the article is still very shallow. It’s a bit of a pity for such a large amount of reading. It’s like the embarrassment of having read so much but still not understanding.
12. Gemini & AI Studio
Total word count: 8690
Speaking of search, how can we not mention Google? (Of course, we can skip Baidu)
Overall, Google’s answer is very good, but it still only found 6 out of the 10 AI tools. Although it’s above average, I think Google can do better.
Google’s new model is still very powerful, for example:
It supports multimodal models with context windows of 1 or 2 million tokens and can also output much more content than other models (except ChatGPT O1 and O3).
It can support web searches of Google’s ecosystem, such as YouTube.
And the response speed is very fast.
But Google made two big mistakes here:
Sometimes it doesn’t output in a good format. For example, in the screenshot, the text is output in a code format, and the formatting is all messed up.
It surprisingly didn’t display external links and related YouTube recommendations.
There’s an interesting detail. You can click on the three dots to have the AI recheck the question. But I don’t think it’s very useful.
13. Perplexity
Total word count: 1931
Perplexity has the most user-friendly export function. It embeds links within the text itself, preventing them from appearing separately. This suggests that Perplexity may have the most refined markdown optimization.
Perplexity handles widely-known products effectively, but it has limited coverage of niche products and largely ignores Chinese information sources.
Summary
With DeepSeek R1, each company can quickly deploy a seemingly impressive AI deep search. They provide the search, and DeepSeek provides the depth. However, combining them still requires a lot of engineering effort. If you don’t want to do so much, you still need a large model that can “brute force” its way through.
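As a rough illustration of the “they provide the search, DeepSeek provides the depth” pattern, the glue code can be as simple as numbering the search results and packing them into a prompt. This is a minimal sketch under my own assumptions: the result format, the character budget, and the prompt wording are hypothetical, and no real product’s pipeline is shown here.

```python
from typing import List, Tuple


def build_prompt(question: str,
                 results: List[Tuple[str, str, str]],
                 max_chars: int = 6000,
                 answer_language: str = "Chinese") -> str:
    """Pack (title, url, snippet) search results into one numbered prompt.

    Results are added in order until the assumed character budget is used up.
    """
    entries = []
    used = 0
    for i, (title, url, snippet) in enumerate(results, start=1):
        entry = f"[{i}] {title} ({url})\n{snippet}"
        if used + len(entry) > max_chars:
            break  # stop once the budget is exhausted
        entries.append(entry)
        used += len(entry)
    sources = "\n\n".join(entries)
    return (
        f"Answer in {answer_language}. Cite sources as [n], and say so "
        "when the sources do not cover something.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


results = [
    ("A", "https://example.com/a", "alpha"),
    ("B", "https://example.com/b", "beta"),
]
print(build_prompt("Which AI deep search tools exist?", results))
```

Everything that makes the difference in practice (which pages get fetched, how they are ranked and trimmed, how citations are checked) sits outside this function, which is where the engineering effort goes.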
DeepSeek will not make your content more accurate; it just makes your content appear more accurate.
At this stage, as of February 16, 2025, and even for the next few months, I personally believe that it still takes a lot of effort to quickly obtain and organize information from the internet.
Finally, it would be great if DeepSeek R2 had a context of 1 million or more, supported multimodality, and had a faster response speed.
















