AI Research Papers

Qwen3 Technical Report

Qwen Team

Evaluating LLM Metrics Through Real-World Capabilities

University of Sydney

R-Bench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation

Tsinghua University, Stanford University, Carnegie Mellon University, University of Pennsylvania, Tencent Hunyuan X, Fitten

VideoLLM Benchmarks and Evaluation: A Survey

Indian Institute of Technology Jodhpur

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks

Fudan University, Nanyang Technological University, Singapore Management University, Tsinghua University, Singapore University of Technology and Design, University of California Davis, National University of Singapore, University of Illinois Urbana-Champaign, Australian National University

HalluLens: LLM Hallucination Benchmark

FAIR at Meta, GenAI at Meta, HKUST

Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions

University of Georgia, University of Texas at Arlington, Harvard University, Carnegie Mellon University, Vanderbilt University, Mayo Clinic Arizona, Augusta University

Self-Correction Makes LLMs Better Parsers

Soochow University

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Shanghai AI Laboratory, SenseTime Research, Tsinghua University, Nanjing University, Fudan University, The Chinese University of Hong Kong, Shanghai Jiao Tong University

KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs

University of Southern California, Independent Researcher, University of California, Riverside

DeepSeek-R1 Thoughtology: Let’s Think About LLM Reasoning

Mila - Quebec AI Institute, McGill University, University of Copenhagen, Canada CIFAR AI Chair

Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents

Multimodal Artificial Intelligence Systems, Institute of Automation, University of Chinese Academy of Science, Wuhan AI Research

Qwen2.5-Omni Technical Report

Qwen Team

Survey on Evaluation of LLM-based Agents

The Hebrew University of Jerusalem, IBM Research, Yale University

A Survey on Post-training of Large Language Models

Huazhong University, Lehigh University, University of Hong Kong, Jilin University, Southern University, Worcester Polytechnic Institute, LinkedIn, Squirrel Ai Learning, University of Georgia, Duke University, Michigan State University, Salesforce, University of Illinois, Microsoft
