[Paper][Tongyi]: Tongyi DeepResearch: A New Era of Open-Source AI Researchers
Links
https://tongyi-agent.github.io/zh/blog/introducing-tongyi-deep-research/
https://github.com/Alibaba-NLP/DeepResearch
System papers (11 in total)
One-sentence summary
WebWalker: Benchmarking LLMs in Web Traversal
https://arxiv.org/pdf/2501.07572
https://github.com/Alibaba-NLP/DeepResearch/tree/main/WebAgent/WebWalker
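Builds WebWalkerQA, a challenging web-traversal benchmark that requires multi-hop, multi-source reasoning and deep exploration within each source site (680 queries across 4 scenarios, covering 1,373 web pages), and proposes WebWalker, a multi-agent framework (ExplorerAgent + CriticAgent). Even within this framework, the best accuracy of several mainstream LLMs on the benchmark is below 40%.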
https://huggingface.co/spaces/callanwu/WebWalkerQALeaderboard
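To make the explorer + critic split above more concrete, here is a minimal sketch of the idea, not the paper's implementation. It assumes a hypothetical in-memory toy site (`TOY_SITE`) and replaces both LLM agents with simple keyword matching. The names `explore`, `critic`, and `required_keywords` are illustrative and do not come from the WebWalker repository.

```python
from collections import deque

# Hypothetical toy site: page URL -> (page text, outgoing links).
# Illustrative data only; it is not taken from WebWalkerQA.
TOY_SITE = {
    "/": ("Welcome to the conference site.", ["/dates", "/venue"]),
    "/dates": ("Submission deadline: 15 February.", ["/dates/camera-ready"]),
    "/venue": ("The venue is in Vienna.", []),
    "/dates/camera-ready": ("Camera-ready deadline: 30 May.", []),
}


def critic(memory, required_keywords):
    """Stand-in critic: answer once the memory covers every keyword.

    A real critic would be an LLM judging whether the collected
    observations are sufficient; keyword coverage is an assumption here.
    """
    joined = " ".join(memory).lower()
    if all(k.lower() in joined for k in required_keywords):
        return " ".join(memory)
    return None


def explore(site, root, required_keywords, max_steps=10):
    """Stand-in explorer: breadth-first link clicking with a step budget."""
    queue, visited, memory = deque([root]), set(), []
    steps = 0
    while queue and steps < max_steps:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        steps += 1
        text, links = site[url]
        # Keep only observations that mention at least one keyword.
        if any(k.lower() in text.lower() for k in required_keywords):
            memory.append(text)
        # Ask the critic after every visited page whether we can stop.
        answer = critic(memory, required_keywords)
        if answer is not None:
            return answer, steps
        queue.extend(link for link in links if link not in visited)
    return None, steps


if __name__ == "__main__":
    print(explore(TOY_SITE, "/", ["camera-ready", "Vienna"]))
```

The toy query needs information from two different branches of the site, one of them two clicks deep. That mirrors, at a very small scale, the multi-source and deep-exploration properties the benchmark is described as testing. Lowering `max_steps` shows how a limited click budget leads to an unanswered query.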