Amazon Unveils Nova Act, an AI Agent that can Control a Web Browser

On March 31, 2025, Amazon launched a new addition to its AI platforms: Nova Act, a general-purpose AI agent that can autonomously operate a web browser. The newest product from Amazon's AGI Lab in San Francisco, Nova Act is a notable step in agentic AI: systems built not just to answer queries but also to perform actions independently on behalf of users. Alongside the model, Amazon has introduced the Nova Act SDK, a toolkit that lets developers create custom agent prototypes. Nova Act is expected to power key features of the upcoming Alexa+ upgrade, and Amazon says it outperforms agents from OpenAI and Anthropic on some internal tests as the company competes in the emerging field of AI agents. What, then, is Nova Act? How does it work? And what does it imply for the future of technology?

 

A New Breed of AI: What is Nova Act?

 

Nova Act is not your typical chatbot. Unlike traditional AI models that excel at generating text or answering questions, Nova Act is built to act. It can navigate web pages, click buttons, fill out forms, and even select dates on a calendar—all within a browser environment. Think of it as a digital assistant that doesn’t just talk but performs tasks like a human would. Amazon positions Nova Act as a “general-purpose AI agent,” meaning it’s designed to handle a variety of simple, browser-based activities without needing constant supervision.

This capability stems from its training, which focuses on understanding and interacting with user interfaces dynamically. In a demo, Amazon showcased Nova Act searching for apartments within biking distance of a train station, sorting results, and completing the task seamlessly. Such functionality hints at its potential to automate everyday online chores—ordering food, booking reservations, or managing schedules—making it a practical tool for both consumers and businesses.

Nova Act’s debut comes as part of a broader push by Amazon’s AGI Lab, led by former OpenAI researchers David Luan and Pieter Abbeel. The lab’s mission is to advance toward artificial general intelligence—AI that rivals human cognitive abilities. While ordering a salad might seem mundane for an AGI project, Luan argues that mastering these “atomic” tasks is a critical stepping stone to more complex, autonomous systems.

 

The Nova Act SDK: Empowering Developers

 

Accompanying Nova Act is the Nova Act SDK, a developer toolkit available at nova.amazon.com. This research preview allows programmers to experiment with the AI agent, building prototypes tailored to specific needs. The SDK breaks down workflows into manageable commands—like “search,” “checkout,” or “fill form”—offering developers fine-grained control. It also integrates with tools like Playwright for direct browser manipulation and supports Python scripting for added flexibility.
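The idea of splitting a workflow into small commands can be illustrated with a minimal sketch. It does not use the Nova Act SDK itself: `StubAgent`, `StepResult`, and `run_workflow` are hypothetical names invented for this example, and the stub only records prompts instead of driving a browser. What it shows is the control flow that short steps make possible: run them in order and stop at the first one that fails.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StepResult:
    prompt: str
    ok: bool


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records prompts."""
    fail_on: str = ""  # substring that makes a step "fail" (demo only)
    log: List[str] = field(default_factory=list)

    def act(self, prompt: str) -> bool:
        self.log.append(prompt)
        return not (self.fail_on and self.fail_on in prompt)


def run_workflow(agent: StubAgent, steps: List[str]) -> List[StepResult]:
    """Run short steps in order and stop at the first failure."""
    results: List[StepResult] = []
    for prompt in steps:
        ok = agent.act(prompt)
        results.append(StepResult(prompt, ok))
        if not ok:
            break  # later steps depend on this one, so do not continue
    return results


steps = [
    "search for a desk lamp",
    "select the first result",
    "add it to the cart",
]
results = run_workflow(StubAgent(), steps)
failed = run_workflow(StubAgent(fail_on="select"), steps)
```

Because each step is small, a failure is easy to locate: in the `failed` run above, the list ends at the step that went wrong, so a developer could retry or adjust only that step.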

Amazon’s decision to release the SDK reflects a strategy to foster an ecosystem around Nova Act. Developers can create agents for niche applications, such as automating corporate leave requests, conducting QA testing for web apps, or even streamlining e-commerce processes. The toolkit’s “headless mode” lets agents run silently in the background, while parallelization capabilities enable multiple agents to tackle tasks simultaneously—potentially revolutionizing workflows that demand scale and speed.
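As a rough illustration of the parallelization idea, the sketch below runs several independent tasks at once using Python's standard `concurrent.futures` module. The function `run_agent_task` and its `headless` parameter are hypothetical placeholders for a single agent session and are not taken from the SDK; a real session would open a browser where this one simply returns a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_agent_task(query: str, headless: bool = True) -> Dict[str, object]:
    """Hypothetical stand-in for one agent session working on one query."""
    # A real agent would drive a browser here; this stub just reports back.
    return {"query": query, "headless": headless, "status": "done"}


def run_in_parallel(queries: List[str], max_workers: int = 3) -> List[Dict[str, object]]:
    """Run one stand-in session per query, several at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(run_agent_task, queries))


queries = [
    "apartments near the north station",
    "apartments near the central station",
    "apartments near the south station",
]
reports = run_in_parallel(queries)
```

Threads suit this kind of work because each session spends most of its time waiting on pages to load rather than computing.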

This move mirrors the early days of app stores, where opening a platform to developers sparked innovation. By inviting external builders to experiment, Amazon ensures Nova Act’s reach extends beyond its products, potentially amplifying its impact across industries.

 

Powering Alexa+ and Competing with Giants

 

Nova Act isn’t just a standalone project—it’s a cornerstone of Amazon’s upcoming Alexa+ upgrade. Set to launch later in 2025, Alexa+ will integrate generative AI to enhance the voice assistant’s capabilities, and Nova Act will drive its browser-based functionalities. Imagine saying, “Alexa, book me a table for two tonight,” and having Nova Act navigate a restaurant’s website to secure the reservation. This integration could redefine Alexa as a proactive assistant, strengthening Amazon’s dominance in the smart home and e-commerce arenas.

Amazon claims Nova Act outperforms rival AI agents from OpenAI (Operator) and Anthropic (Claude 3.7 Sonnet) in internal tests. On the ScreenSpot Web Text benchmark, which measures interaction with on-screen text, Nova Act scored 94%, ahead of OpenAI’s 88% and Anthropic’s 90%. Amazon has not reported results on broader agent benchmarks such as WebVoyager, but these figures suggest a focus on reliability, a key challenge for early AI agents, which often falter with inconsistent performance or complex tasks.

The competitive landscape is crowded, with OpenAI’s Operator and Anthropic’s Computer Use feature also vying for supremacy in agentic AI. Google, too, is testing browser control with its Gemini model. Yet Amazon’s vast user base, bolstered by Alexa and its e-commerce empire, could give Nova Act an edge in real-world adoption.

 

The Bigger Picture: Reliability and the Road to AGI

 

Early AI agents from competitors have struggled with reliability—slow execution, frequent errors, and difficulty adapting to diverse scenarios. Amazon aims to address this with Nova Act’s design, emphasizing the dependable execution of short, simple tasks. Developers can specify when human intervention is needed, striking a balance between autonomy and oversight. This pragmatic approach contrasts with flashier demos that prioritize capability over consistency.
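The balance between autonomy and oversight can be sketched as a simple approval gate. Everything here is an assumption made for illustration: the keyword list `SENSITIVE_WORDS`, the helper `needs_human`, and the `execute` and `approve` callbacks are hypothetical and do not describe how Nova Act decides when to hand control back to a person.

```python
from typing import Callable, List, Tuple

# Hypothetical policy: steps containing these words need a person's approval.
SENSITIVE_WORDS = ("pay", "purchase", "submit")


def needs_human(step: str) -> bool:
    lowered = step.lower()
    return any(word in lowered for word in SENSITIVE_WORDS)


def run_with_oversight(
    steps: List[str],
    execute: Callable[[str], None],
    approve: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Execute steps in order, pausing for approval on sensitive ones."""
    outcomes: List[Tuple[str, str]] = []
    for step in steps:
        if needs_human(step) and not approve(step):
            outcomes.append((step, "needs_human"))
            break  # stop and hand control back to the person
        execute(step)
        outcomes.append((step, "done"))
    return outcomes


executed: List[str] = []
outcomes = run_with_oversight(
    ["open the booking page", "fill in party size", "submit the reservation"],
    execute=executed.append,
    approve=lambda step: False,  # simulate a user who declines
)
```

In this run the first two steps execute on their own, and the workflow stops at the reservation step because approval was withheld.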

David Luan sees agents as “the last missing piece” on the path to AGI. By training Nova Act on practical, browser-based environments, Amazon is laying the groundwork for systems that could one day handle multi-step, open-ended challenges, such as planning a wedding or managing IT workflows. The lab plans to incorporate reinforcement learning, moving beyond supervised fine-tuning to create smarter, more adaptable agents.

 

Implications and What's Next

 

Nova Act’s arrival raises intriguing possibilities. For consumers, it promises a future where mundane online tasks are offloaded to AI, freeing up time and mental bandwidth. For businesses, it offers tools to automate processes without relying on brittle scripts or extensive APIs. And for Amazon, it’s a chance to solidify its AI credentials, catching up to rivals after a slower start in the generative AI race.

Still, challenges remain. The research preview status indicates Nova Act is a work in progress, and its real-world performance will depend on how developers and users push its limits. Privacy concerns could also emerge as AI agents gain access to personal browsing data, though Amazon has yet to detail its safeguards.

As Nova Act rolls out, its success will hinge on delivering on its reliability promise. If it can outshine competitors in practical applications, it might not just power Alexa+ but redefine how we interact with the web. For now, Amazon has fired a bold shot in the AI agent race, one that could reshape the digital landscape in 2025 and beyond.