AI News Roundup – Reinforcement learning pioneers win Turing Award, McDonald’s brings AI to the fast-food restaurant, “cheating” on AI benchmarks, and more

To help you stay on top of the latest news, our AI practice group has compiled a roundup of the developments we are following.

    • WIRED reports that two pioneers of a key concept in machine learning that underpins nearly all AI systems have been awarded this year’s Turing Award. Andrew Barto, formerly a professor at the University of Massachusetts Amherst, and Rich Sutton, a professor at the University of Alberta, won the highest award in the field of computer science for their work in developing reinforcement learning, a technique that trains machine learning models through positive or negative feedback. Since 1998, reinforcement learning has been used in a variety of computing systems, but it has received renewed interest in recent years with the advent of large language models and other AI systems, for which reinforcement learning is often a key part of the training process. Barto and Sutton will share a $1 million prize along with the award.
    • The Wall Street Journal reports that the fast-food chain McDonald’s is rolling out AI tools to help manage its restaurants. In an interview, the company’s chief information officer, Brian Rice, said that technology can help “alleviate the stress” that comes along with the operations of a McDonald’s restaurant. One application is predicting failures in kitchen equipment, such as fryers or ice cream machines, by using sensors and AI analysis to detect early signs of an issue. The company is also looking into using computer vision in restaurant cameras to help confirm that orders are correct before they are given to customers. To enable these applications, the company is partnering with Google Cloud to provide “edge computing” resources as opposed to purely cloud-based solutions. McDonald’s has been an early adopter of this sort of technology in the fast-food industry, though it remains to be seen how quickly the technology will be rolled out at its restaurants.
    • The Atlantic reports on a curious phenomenon in measuring the performance of AI models: many AI systems are trained on the very questions that they are tested on. The process of testing the performance of an AI model, called benchmarking, often involves measuring the model’s ability to generalize, that is, to answer questions it hasn’t been specifically trained to answer. Thus, benchmark tests should ideally contain questions that AI models have never been trained on. However, recent studies have shown that nearly every major AI model has been trained on the text of major benchmark tests, which are often freely available on the Internet and thus liable to be scraped and folded into training data. This may call into question the results of benchmark testing, given that the AI system being tested has “cheated” by knowing the answers beforehand. The problem of benchmark contamination has yet to be addressed fully by creators of AI systems, who may need to develop alternative methods of measuring the performance of their models.
    • CNBC reports on Microsoft’s new AI-powered medical assistant. The tool, Dragon Copilot, is a voice-activated AI system that combines a medical dictation tool, Dragon Medical One, and an ambient listening tool, DAX Copilot, into one solution. Dragon Copilot is intended to help doctors retrieve medical information and draft the clerical documents that take up much of a medical professional’s time. In a news briefing, a Microsoft executive told reporters that “through this technology, clinicians will have the ability to focus on the patient rather than the computer.” The tool also integrates with several electronic health record providers and will allow a medical professional to draft and edit documents through conversations with the assistant. The system has been tested over the past few months at several hospitals and clinics in Pennsylvania and Maryland. A doctor at one of the test sites told CNBC that the new tool was easy to use and that it “allows us to get back to that so we can focus on the patient, truly think about what’s needed.” Dragon Copilot’s pricing structure is not public, but the company said it will be available to U.S. and Canadian customers this coming May.
    • The South China Morning Post reports on Alibaba’s newest open-source AI model, which the company claims outperforms both its Chinese rival DeepSeek’s R1 model and OpenAI’s o1 reasoning model. QwQ-32B, the latest member of Alibaba Cloud’s Qwen family of AI models, was shown to outperform a variant of DeepSeek’s R1 model that has over 20 times as many parameters as QwQ-32B, a result that the company said in a blog post “underscores the effectiveness of [reinforcement learning].” Indeed, reinforcement learning was the focus of the training of this newest model. The company said that this model was the first step in a series of reinforcement learning-based models, saying that “we are confident that combining stronger foundation models with RL powered by scaled computational resources will propel us closer to achieving Artificial General Intelligence (AGI).” Alibaba is one of many major players in China’s rapidly growing AI industry, which also features ByteDance (better known for TikTok) and Tencent, both of which are seeking to catch up to DeepSeek’s progress in developing efficient AI models.