
Executive Summary
The research paper A Survey on Large Language Models for Mathematical Reasoning reveals that while today’s large language models reason well in the short term, they are surprisingly poor at remembering information over time. This limitation directly affects how reliably they can serve as digital assistants or enterprise knowledge systems. Developed by researchers at Shanghai AI Lab and Fudan University, the study introduces MEMBench, the first standardized benchmark for measuring long-term memory in AI systems. It shows that even leading models such as GPT-4o and Claude 3.5 lose up to 40% of their recall accuracy across extended sessions. For business and government leaders, the research matters because it exposes a key weakness in AI trust and continuity: without robust memory, AI cannot sustain contextual understanding, learn from ongoing user interactions, or manage long-term enterprise knowledge. The findings mark a step toward persistent, personalized, and reliable AI systems that can integrate into daily operations.
_____
Key point: This paper introduces MEMBench, the first benchmark to measure how well AI models retain information over time. It reveals that even top-performing systems suffer significant long-term memory loss, a critical barrier to building truly reliable, persistent AI assistants.
A Survey on Large Language Models for Mathematical Reasoning
Institutions:
Nanjing University, Nanyang Technological University, Skywork AI, Chinese University of Hong Kong



