Slacking on the job!
Attention Span
It turns out that not even AI models are immune to a little procrastination.
While its developers were trying to record a coding demonstration, the latest version of Claude 3.5 Sonnet — Anthropic’s current flagship AI — got off track and produced some “amusing” moments, the company said in an announcement.
It’s perilous to anthropomorphize machine learning models, but if this were a human employee, we’d diagnose them with a terminal case of being bored on the job. As seen in a video, Claude decides to blow off writing code, opens Google, and inexplicably browses through beautiful photos of Yellowstone National Park.
In another demo attempt, Claude accidentally stopped a lengthy screen recording in progress, Anthropic said, causing all the footage to be lost. We’re sure that wasn’t intentional on the AI’s part.
Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost.
Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park.
— Anthropic (@AnthropicAI) October 22, 2024
Special Agents
The upgraded Claude 3.5 Sonnet is Anthropic’s foray into developing an “AI agent,” a broad term for productivity-focused AI models designed to perform tasks autonomously. A bevy of companies are working on expanding their AI models beyond just serving as chatbots and assistants, including Microsoft, which just released AI agent capabilities of its own.
With Claude, the Amazon-backed startup brags that its latest model can now use “computers the way people do,” such as moving a cursor and inputting keystrokes and mouse clicks. That means Claude can potentially control your entire desktop, interacting with any software and applications you have installed.
It’s clearly far from perfect. As with any AI model, reliability remains elusive, and frequent hallucinations are simply a fact of life, as Anthropic itself admits.
“Even though it’s the current state of the art, Claude’s computer use remains slow and often error-prone,” the company said. “There are many actions that people routinely do with computers (dragging, zooming, and so on) that Claude can’t yet attempt.”
Desktop Danger
The example errors that Anthropic shared were mostly harmless. But given the level of autonomy that Claude purportedly has, it’s more than fair to ask questions about its safety. What happens when the AI agent gets sidetracked not by googling photos, but by opening your social media, for example?
There’s also the obvious potential for it to be misused by humans — risks that Anthropic wants you to know it’s addressing.
“Because computer use may provide a new vector for more familiar threats such as spam, misinformation, or fraud, we’re taking a proactive approach to promote its safe deployment,” Anthropic said. This includes implementing new classifiers to identify when the AI is being used to perform flagged activities, like posting on social media and accessing government websites.
As more people try out the new and improved Claude, though, we expect to see more examples of its computer use gone awry.