Alibaba researchers unveil Marco-o1, an LLM with advanced reasoning capabilities
The recent release of OpenAI o1 has brought great attention to large reasoning models (LRMs) and is inspiring new models aimed at solving complex problems that classic language models often struggle with. Building on the success of o1 and the concept of LRMs, researchers at Alibaba have introduced Marco-o1, which enhances reasoning capabilities and tackles problems with open-ended solutions, where clear standards and quantifiable rewards are absent.

OpenAI o1 uses “inference-time scaling” to improve the model’s reasoning ability by giving it “time to think.” Basically, the model uses more compute cycles during inference to generate more tokens and review its responses, which improves its performance on tasks that require reasoning.

o1 is renowned for its impressive reasoning capabilities, especially in tasks with standard answers such as mathematics, physics and coding. However, many applications involve open-ended problems that lack clear solutions and quantifiable rewards.

“We aimed to push the boundaries of LLMs even further, enhancing their reasoning abilities …
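To make the inference-time scaling idea concrete, here is a minimal, hypothetical sketch of one common form of it: sampling several candidate answers, letting the model critique each one, and keeping the best-scored candidate. The `generate_with_reflection` stub stands in for calls to a real LLM API, and neither it nor `best_of_n` is code from o1 or Marco-o1; the point is simply that spending more compute at inference time (more samples, more self-review) can buy better answers.

```python
import random

# Hypothetical stand-in for a language model call; a real system would
# query an LLM API here. Returns a (candidate_answer, self_assessed_score) pair.
def generate_with_reflection(prompt: str, temperature: float) -> tuple[str, float]:
    candidate = f"answer sampled at T={temperature:.2f} for: {prompt}"
    score = random.random()  # stand-in for the model's own critique of its answer
    return candidate, score

def best_of_n(prompt: str, n: int = 8) -> str:
    """Spend extra inference-time compute by sampling n candidate answers,
    having the model review each one, and returning the highest-scored answer."""
    candidates = [generate_with_reflection(prompt, temperature=0.8) for _ in range(n)]
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer

if __name__ == "__main__":
    print(best_of_n("How many ways can 5 books be arranged on a shelf?"))
```

Raising `n` trades latency and cost for more “thinking,” which is the basic dial that inference-time scaling turns.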