OpenAI Lets AI Take the Wheel
OpenAI has released GPT-5.4, a new version of its flagship model that differs significantly from previous iterations. What makes the launch particularly noteworthy is that, for the first time, the model has the built-in ability to operate a computer directly — without the user needing to do anything other than provide a task, according to The Verge.
Specifically, this means GPT-5.4 can open programs, click within user interfaces, fill out forms, and navigate between applications. The company particularly highlights use cases related to spreadsheets, documents, and presentations — in other words, everyday office work.
GPT-5.4 is one of the clearest signs yet that AI companies are moving from answering questions to actually doing the work.
What Does "Native Computer Use" Mean?
The term "native computer use" refers to the model not relying on external add-ons or API calls to interact with an operating system. It handles this natively. This is a technical distinction with significant practical implications: the agent can operate on older software and local applications, not just cloud-based services with open interfaces.
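OpenAI has not published implementation details, but computer-use agents of this kind are typically described as an observe-decide-act loop: capture what is on screen, let the model choose the next UI action, execute it, repeat. The sketch below is purely illustrative; `capture_screen`, `plan_next_action`, and the `Action` type are hypothetical stand-ins for this loop, not any real OpenAI API.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # e.g. "type", "click", "done"
    target: str = ""   # UI element the action applies to
    text: str = ""     # text to enter, if any

def capture_screen(step: int) -> str:
    # Hypothetical stand-in for a real screenshot or
    # accessibility-tree capture of the desktop.
    screens = ["login form visible", "form filled", "confirmation page"]
    return screens[min(step, len(screens) - 1)]

def plan_next_action(observation: str) -> Action:
    # Stand-in for the model's decision step: map what is on
    # screen to the next UI action. A real agent would call the
    # model here instead of using hard-coded rules.
    if "login form" in observation:
        return Action("type", target="username", text="alice")
    if "form filled" in observation:
        return Action("click", target="submit")
    return Action("done")

def run_agent(max_steps: int = 10) -> list[str]:
    # Observe -> decide -> act until the task is reported done.
    log = []
    for step in range(max_steps):
        observation = capture_screen(step)
        action = plan_next_action(observation)
        if action.kind == "done":
            break
        log.append(f"{action.kind}:{action.target}")
    return log

print(run_agent())  # e.g. ['type:username', 'click:submit']
```

The key point of "native" operation is that the `capture_screen`/`execute` layer works at the pixel and input-event level, so the loop does not depend on the target application exposing an API.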

Not Alone in the Race
OpenAI is far from the only company investing in this type of capability. According to research overviews of the field, Anthropic has already enabled its Claude model to function as a desktop operator — it can control native applications and the entire operating system, not just the browser. Benchmark tests are said to show strong performance on complex desktop tasks.
Microsoft, for its part, has launched a "computer use" preview in Copilot Studio, which allows AI agents to interact with programs just as a human would — including legacy software that lacks modern API support. Google announced Project Mariner at Google I/O 2025, an experimental Gemini-based agent that can perform tasks across the web.

The Market is Growing Rapidly
Figures from analysis firms underscore that this is not a niche phenomenon. Gartner estimates that 40 percent of enterprise applications will have integrated agent-based AI by the end of 2026, according to the research overview underlying this article. That figure suggests the technology is moving out of the experimental phase and into the core operations of large organizations.
Oversight Questions Remain
It is worth noting that while the functionality has been confirmed by OpenAI and reported by The Verge, many details regarding performance, limitations, and security have not yet been independently verified. Questions about what happens when an AI agent makes mistakes in a production environment, who is responsible, and how user data is handled during agent operations are not answered in the available source material.
Frameworks like Microsoft's Semantic Kernel explicitly emphasize compliance, governance, and observability for large-scale deployment — indicating that the industry itself is aware that autonomy and control must be balanced.
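One common way to balance autonomy and control, in Semantic Kernel and similar frameworks, is to gate high-risk agent actions behind a policy check and record every decision for audit. The sketch below is a minimal illustration of that pattern under assumed names; `RISKY_KINDS`, `requires_approval`, and `dispatch` are hypothetical, not any framework's real API.

```python
# Action kinds considered irreversible or externally visible;
# these are held for human review instead of running automatically.
RISKY_KINDS = {"delete", "send_email", "purchase"}

def requires_approval(action_kind: str) -> bool:
    # Policy check applied before every agent action.
    return action_kind in RISKY_KINDS

def dispatch(action_kind: str, audit_log: list[str]) -> str:
    # Every decision is appended to an audit log so that
    # compliance reviewers can reconstruct what the agent did.
    if requires_approval(action_kind):
        audit_log.append(f"held:{action_kind}")
        return "held_for_review"
    audit_log.append(f"executed:{action_kind}")
    return "executed"

audit: list[str] = []
print(dispatch("click", audit))       # low-risk: runs immediately
print(dispatch("send_email", audit))  # high-risk: held for a human
```

The design choice is that the agent never decides its own risk policy: the gate and the log live outside the model, which is exactly the kind of observability the frameworks emphasize.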
Regardless, GPT-5.4 is a concrete signal of where things are headed: towards AI systems that not only advise but act.
