OpenAI has launched a new macOS app for agentic coding


AI is already having a seismic impact on how software is written, with much of the grunt work of programming now being done by swarms of agents and subagents. But as developers experiment with new interfaces and form factors for human-AI collaboration, it’s becoming difficult for even the most advanced AI labs to keep up.

The current trend is agentic software development, in which AI agents work independently on coding tasks, as demonstrated by the Claude Code and Cowork apps. Meanwhile, OpenAI has been gradually building out its Codex tool, which launched as a command line tool in April and expanded to a web interface a month later.

Now OpenAI is taking an important step towards catching up. On Monday, the company launched a new macOS app for Codex, which combines several agentic practices that have become popular in the past year. The new app is designed to run multiple agents in parallel, integrate agent skills, and support other state-of-the-art workflows. The launch also comes less than two months after the launch of GPT-5.2-Codex, OpenAI’s most powerful coding model, which the company hopes will be enough to tempt Claude Code users.

“If you want to do sophisticated work on something complex, 5.2 is the strongest model to date,” CEO Sam Altman told reporters in a press call. “However, it’s very difficult to use, so taking that level of model capability and putting it into a more accessible interface, we think is pretty important.”

While Altman’s confidence in GPT-5.2 is understandable, the coding benchmarks tell a more complicated story. GPT-5.2 holds the top spot on TerminalBench (a test that measures how well AI handles command-line programming tasks), at least at the time of publication. But agents using Gemini 3 and Claude Opus logged scores that were lower but within the benchmark’s margin of error. Results from SWE-bench, another coding benchmark that tests AI’s ability to fix real-world software bugs, similarly show no clear advantage for GPT-5.2. However, agentic use cases are difficult to benchmark effectively, and state-of-the-art models may differ in user experience.

The Codex app also comes with a variety of new features that OpenAI says will help it reach parity with, or in some cases surpass, the various Claude apps. The Codex app supports automations that can be set to run in the background on an automatic schedule, with results placed in a queue to be reviewed when the user returns. Users can also choose different personalities for the agent, from pragmatic to empathetic, depending on their working style.

But for the company, its biggest selling point is the speed of progress made possible by AI. “You can use it from a clean sheet of paper, fresh, to create a very sophisticated piece of software in a matter of hours,” Altman said. “At the speed I can type new ideas, that’s the limit of what can be built.”
