BEIJING (ANN/CHINA DAILY) – Artificial intelligence, despite its strides in various domains, encountered significant hurdles in the realm of mathematics during the preliminary round of the 2024 Alibaba Global Mathematics Competition.
A total of 563 AI teams grappled with exam questions over a 48-hour span, covering multiple-choice, problem-solving, and proof-based queries.
Surprisingly, none managed to secure scores high enough to progress to the finals, underscoring AI’s current limitations in complex reasoning and rigorous mathematical thinking.
The competition’s organising committee revealed that AI teams averaged a modest score of 18, on par with the average for human contestants. However, the highest AI score was just 34, far below the leading human score of 113.
Chen Tianchu from Zhejiang University’s Computer Architecture Laboratory noted that current large language models (LLMs), while proficient in predicting text, struggle with iterative problem-solving and nuanced mathematical analysis.
This limitation, he emphasised, highlights the irreplaceable role of trained human mathematicians.
Participants from renowned institutions like Peking University, Tsinghua University, and international entities such as the University of Oxford and Amazon Web Services fielded AI teams.
Some teams adapted open-source LLMs to tackle advanced maths, while others applied prompt engineering to proprietary models like GPT-4 to enhance their problem-solving capabilities.
Tu Jinhao from Jianping High School in Shanghai excelled with an AI approach akin to self-debate, iterating across various models to find the best problem-solving strategies.
Despite the low scores overall, the top three AI teams earned monetary prizes, highlighting ongoing efforts to harness AI’s potential and explore its boundaries in mathematical applications.
Organisers affirmed their commitment to integrating AI in future competitions, aiming to spur innovation and expand the frontiers of AI capabilities in mathematics.
Yin Wotao, a committee member, emphasised the competition’s role in pushing the envelope for AI research and application possibilities.