OpenAI released an upgraded version of its AI coding agent, Codex, on Monday, built on a new model called GPT-5-Codex. The company says the model spends its "thinking" time dynamically, devoting anywhere from a few seconds to several hours to a single coding task, and that it performs better on agentic coding benchmarks as a result.
GPT-5-Codex is now rolling out across Codex products, which are accessible through the terminal, integrated development environments (IDEs), GitHub, and ChatGPT, to all ChatGPT Plus, Pro, Business, Edu, and Enterprise subscribers. OpenAI says it plans to make the model available to API customers in the future.
The update makes Codex more competitive in a crowded and fast-growing market for AI coding tools, where rivals include Anthropic's Claude Code, Anysphere's Cursor, and Microsoft's GitHub Copilot. Cursor surpassed $500 million in annual recurring revenue (ARR) earlier in 2025, and Windsurf, a similar code editor, was the subject of an acquisition attempt that ended with its team split between Google and Cognition.
OpenAI says GPT-5-Codex outperforms GPT-5 on SWE-bench Verified, a benchmark that measures agentic coding ability, as well as on a separate benchmark of code refactoring tasks drawn from large, established repositories. The company also trained GPT-5-Codex to conduct code reviews. Experienced software engineers who evaluated the model's review comments reportedly found that it produced fewer incorrect comments and more "high-impact comments."
Alexander Embiricos, OpenAI's Codex product lead, attributed much of the performance gain to GPT-5-Codex's dynamic "thinking abilities." Unlike models that rely on a fixed router to decide up front how much compute a task receives, GPT-5-Codex can adjust how long it works on a task in real time. Embiricos noted that the model might decide five minutes into a problem, for instance, that it needs another hour, and that it has been observed taking up to seven hours on certain tasks.