Research Papers | Affarico 3/5

AI Research Papers

Qwen3 Technical Report

Qwen Team

Evaluating LLM Metrics Through Real-World Capabilities

University of Sydney

R-Bench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation

Tsinghua University, Stanford University, Carnegie Mellon University, University of Pennsylvania, Tencent Hunyuan X, Fitten

VideoLLM Benchmarks and Evaluation: A Survey

Indian Institute of Technology Jodhpur

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks

Fudan University, Nanyang Technological University, Singapore Management University, Tsinghua University, Singapore University of Technology and Design, University of California Davis, National University of Singapore, University of Illinois Urbana-Champaign, Australian National University

HalluLens: LLM Hallucination Benchmark

FAIR at Meta, GenAI at Meta, HKUST

Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions

University of Georgia, University of Texas at Arlington, Harvard University, Carnegie Mellon University, Vanderbilt University, Mayo Clinic Arizona, Augusta University

Self-Correction Makes LLMs Better Parsers

Soochow University

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Shanghai AI Laboratory, SenseTime Research, Tsinghua University, Nanjing University, Fudan University, The Chinese University of Hong Kong, Shanghai Jiao Tong University

KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs

University of Southern California, Independent Researcher, University of California, Riverside

DeepSeek-R1 Thoughtology: Let’s Think About LLM Reasoning

Mila - Quebec AI Institute, McGill University, University of Copenhagen, Canada CIFAR AI Chair

Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents

Multimodal Artificial Intelligence Systems, Institute of Automation, University of Chinese Academy of Science, Wuhan AI Research

Qwen2.5-Omni Technical Report

Qwen Team

Survey on Evaluation of LLM-based Agents

The Hebrew University of Jerusalem, IBM Research, Yale University

A Survey on Post-training of Large Language Models

Huazhong University, Lehigh University, University of Hong Kong, Jilin University, Southern University, Worcester Polytechnic Institute, LinkedIn, Squirrel Ai Learning, University of Georgia, Duke University, Michigan State University, Salesforce, University of Illinois, Microsoft
