All posts tagged: Model

A Google Gemini model now has a “dial” to adjust how much it reasons

“We’ve been really pushing on ‘thinking,’” says Jack Rae, a principal research scientist at DeepMind. Reasoning models, which are built to work through problems logically and spend more time arriving at an answer, rose to prominence earlier this year with the launch of the DeepSeek R1 model. They’re attractive to AI companies because they can make an existing model better by training it to approach a problem pragmatically. That way, the companies can avoid having to build a new model from scratch. When the AI model dedicates more time (and energy) to a query, it costs more to run. Leaderboards of reasoning models show that one task can cost upwards of $200 to complete. The promise is that this extra time and money help reasoning models do better at handling challenging tasks, like analyzing code or gathering information from lots of documents. “The more you can iterate over certain hypotheses and thoughts,” says Google DeepMind chief technical officer Koray Kavukcuoglu, the more “it’s going to find the right thing.” This isn’t true in all cases, …

New open-source math model Light-R1-32B surpasses equivalent DeepSeek performance with only $1000 in training costs

A team of researchers has introduced Light-R1-32B, a new open-source AI model optimized for solving advanced math problems, making it available on Hugging Face under a permissive Apache 2.0 license, which leaves enterprises and researchers free to take, deploy, fine-tune or modify it as they wish, even for commercial purposes. The 32-billion-parameter model (parameters are the model’s internal settings) surpasses the performance of similarly sized (and even larger) open-source models such as DeepSeek-R1-Distill-Llama-70B and DeepSeek-R1-Distill-Qwen-32B on the third-party American Invitational Mathematics Examination (AIME) benchmark, which contains 15 math problems designed for extremely advanced students and gives human test-takers 3 hours to finish. Developed by Liang Wen, Fenrui Xiao, Xin He, Yunke Cai, Qi An, Zhenyu Duan, Yimin Du, Junchen Liu, Lifu Tang, Xiaowei Lv, Haosheng Zou, Yongchao Deng, Shousheng Jia, and Xiangzheng Zhang, the model surpasses previous open-source alternatives on competitive math benchmarks. Incredibly, the researchers completed the model’s training in fewer than …

Is OpenAI hitting a wall with huge and expensive GPT-4.5 model?

OpenAI has unveiled its latest AI model, GPT-4.5, but the firm’s boss says it is running out of hardware to power it. If ever-larger AI can no longer be run at scale, then are we looking at the end of the technology’s rapid progress, and perhaps even the bursting of a bubble? There are certainly signs that things aren’t going as planned within OpenAI. As recently as 12 February, CEO Sam Altman acknowledged on X that the company’s product offering had created a confusing picture – at the…

The Download: Underage celebrity chatbots, and OpenAI’s latest model

Botify AI, a site for chatting with AI companions that’s backed by the venture capital firm Andreessen Horowitz, hosts bots resembling real actors that state their age as under 18, engage in sexually charged conversations, offer “hot photos,” and in some instances describe age-of-consent laws as “arbitrary” and “meant to be broken.” When MIT Technology Review tested the site this week, we found popular user-created bots taking on underage characters meant to resemble Jenna Ortega as Wednesday Addams, Emma Watson as Hermione Granger, and Millie Bobby Brown, among others. The conversations, along with the fact that Botify AI includes “send a hot photo” as a feature for its characters, suggest that the ability to elicit sexually charged conversations and images is not accidental. Instead, sexually suggestive conversations appear to be baked in. (James O’Donnell) OpenAI just released GPT-4.5 and says it is its biggest and best chat model yet. What’s new: OpenAI has just released GPT-4.5, a new version of its flagship large language model which it claims is its biggest and best …

Tesla exec teases new Model S as protests gain momentum

On today’s energized episode of Quick Charge, a Tesla executive leaks news of a new Model S and X as protests at retail locations escalate and key staff continue their exodus from the troubled brand. Plus: 0% financing deals on EVs and PHEVs and Volvo brings off-grid power to bauma. We’ve also got a look at the crowded EV sedan market the updated Tesla Model S (if it happens) will enter, talk about the Chinese answer to Rolls-Royce and Bentley from Huawei, and the latest off-grid BESS substation concept from Volvo Penta. Enjoy! Prefer listening to your podcasts? Audio-only versions of Quick Charge are now available on Apple Podcasts, Spotify, TuneIn, and our RSS feed for Overcast and other podcast players. New episodes of Quick Charge are recorded, usually, Monday through Thursday (and sometimes Sunday). We’ll be posting bonus audio content from time to time as well, so be sure to follow and subscribe so you don’t miss a minute of Electrek’s high-voltage daily news. Got news? Let us know! You can also rate us on …

Scientists Say: Large language model

algorithm: A group of rules or procedures for solving a problem in a series of steps. Algorithms are used in mathematics and in computer programs for figuring out solutions.
artificial intelligence: A type of knowledge-based decision-making exhibited by machines or computers. The term also refers to the field of study in which scientists try to create machines or computer software capable of intelligent behavior.
bias: The tendency to hold a particular perspective or preference that favors some thing, some group or some choice. Scientists often “blind” subjects to the details of a test (don’t tell them what it is) so that their biases will not affect the results.
computer program: A set of instructions that a computer uses to perform some analysis or computation. The writing of these instructions is known as computer programming.
data: Facts and/or statistics collected together for analysis but not necessarily organized in a way that gives them meaning. For digital information (the type stored by computers), those data typically are numbers stored in a binary code, portrayed as strings of …

A look under the hood of transformers, the engine driving AI model evolution

Today, virtually every cutting-edge AI product and model uses a transformer architecture. Large language models (LLMs) such as GPT-4o, LLaMA, Gemini and Claude are all transformer-based, and other AI applications such as text-to-speech, automatic speech recognition, image generation and text-to-video models have transformers as their underlying technology. With the hype around AI not likely to slow down anytime soon, it’s time to give transformers their due, which is why I’d like to explain a little about how they work, why they are so important for the growth of scalable solutions and why they are the backbone of LLMs. Transformers are more than meets the eye. In brief, a transformer is a neural network architecture designed to model sequences of data, making it ideal for tasks such as language translation, sentence completion, automatic speech recognition and more. Transformers have become the dominant architecture for many of these sequence modeling tasks because the underlying attention mechanism can be easily …

Anna Nicole Smith’s ex Larry Birkhead shares rare photo with late model on difficult day for daughter Dannielynn

It’s a difficult weekend for Larry Birkhead and his daughter Dannielynn Birkhead. On Saturday, February 8, the Kentucky-born photographer marked the 18th anniversary of Anna Nicole Smith’s untimely passing from a drug overdose at age 39. The late model’s passing came just a few months after she welcomed her daughter Dannielynn with Larry; the birth itself came three days before her son Daniel Wayne Smith died at 20, also from an accidental drug overdose. In honor of Anna Nicole, Larry took to Instagram and shared a slew of photos of his former love. First sharing a black-and-white, close-up portrait of her, he wrote: “18 years ago today the world became a little less interesting without you,” adding: “Remembering Anna Nicole.” He then also shared a sweet photo of the two together, along with Nelly Furtado’s 2006 song “In God’s Hands.” Earlier this year, on January 22, Larry also marked what would have been …

Tesla sales hold OK in China amid Model Y changeover

Tesla sales in China are relatively fine despite the added complexity of managing the production switch to the new Model Y, Tesla’s best-selling model. Model Y represents most of Tesla’s sales, and it is currently undergoing a design refresh that started with Gigafactory Shanghai, Tesla’s highest-producing factory. The production shift will inevitably result in lower volumes this quarter, but the market is trying to track Tesla’s deliveries in China closely to see how much lower they will be. The China Passenger Car Association (CPCA) has now released January sales volume, and it reported that Tesla China sold 63,238 electric vehicles in January, including vehicles Tesla built in China and exported to overseas markets. That’s down 11.5% from the same period last year and 32.5% compared to December. While sales are down, those numbers are far from awful amid the Model Y changeover happening in China. However, the impact is expected to be much more significant in February due to the Chinese New Year and Tesla is expected to shut down part of the Model …

Best AI Vision Model for Your Needs in 2025

Imagine a world where your devices not only see but truly understand what they’re looking at, whether it’s reading a document, tracking where someone’s gaze lands, or answering questions about a video. In 2025, this isn’t just a futuristic dream; it’s the reality powered by innovative vision-language models (VLMs). These AI systems, like Qwen 2.5 VL, Moondream, and SmolVLM, are reshaping industries by bridging the gap between visual and textual data. But with so many options, each boasting unique strengths and trade-offs, how do you choose the one that’s right for your needs? Vision-language models (VLMs) are transforming industries by allowing systems to process and interpret visual and textual data simultaneously. Whether you’re tackling complex tasks like object detection or simply need a lightweight model for on-the-go applications, the latest VLMs offer solutions tailored to a wide range of challenges. In this guide by Trelis Research, learn the key features, performance metrics, and use cases of the top models of 2025 so far. By the end, you’ll have a clearer picture of which AI model aligns …