OpenAI’s cloud-based AI coding agent ‘Codex’ can autonomously write new features, run tests, and fix bugs in parallel.
OpenAI on Friday, May 16, introduced a new AI tool called Codex that is designed to handle multiple software engineering-related tasks at the same time, from generating code for new features to answering queries about a user’s codebase, fixing bugs, and suggesting pull requests for code review.
The cloud-based, AI agent-driven coding tool runs these tasks in its own cloud sandbox environment that has been preloaded with a user’s code repository.
Codex has been launched as a research preview. However, all ChatGPT Pro, Enterprise, and Team users already have access to the AI coding tool. “Users will have generous access at no additional cost for the coming weeks so you can explore what Codex can do, after which we’ll roll out rate-limited access and flexible pricing options that let you purchase additional usage on-demand,” OpenAI said in a blog post.
ChatGPT Plus and Edu users will be given access at a later date, the Microsoft-backed AI startup added.
OpenAI’s latest offering comes at a time when AI is poised to disrupt the software engineering sector, raising widespread fears of job displacement. Microsoft CEO Satya Nadella recently said that 30% of the company’s code is now AI-generated. A few weeks later, the tech giant announced it is laying off 6,000 employees, or 3% of its workforce, with programmers reportedly being affected the most.
“It still remains important for users to manually review and validate all agent-generated code before integration and execution,” OpenAI noted in its Codex announcement blog post.
What is Codex?
With Codex, developers can delegate simple programming tasks to an AI agent. It has its own distinct interface that can be accessed from the sidebar in the ChatGPT web app.
Codex is powered by codex-1, an AI model that is a version of OpenAI’s o3 reasoning model. Unlike o3, however, codex-1 has been specifically trained on a wide range of real-world coding tasks to analyse and generate code “that closely mirrors human style and PR preferences, adheres precisely to instructions.”
Its outputs have further been fine-tuned using reinforcement learning so that codex-1 can “iteratively run tests until it receives a passing result.” In terms of performance and accuracy, OpenAI said that codex-1 fared better than its o3 AI model when evaluated on its internal SWE benchmark as well as on SWE-bench Verified, the company’s human-validated version of SWE-bench.
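The idea of iterating on tests until they pass can be illustrated with a short Python sketch. This is not OpenAI’s implementation, and nothing here reflects how codex-1 works internally: the function name `run_until_passing`, the `attempt_fix` callback, and the `max_attempts` cap are all assumptions made for the example. It simply shows the general loop of running a test command, handing the failure log to some fixing step, and trying again.

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_until_passing(
    test_cmd: Sequence[str],
    attempt_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> Optional[int]:
    """Run test_cmd repeatedly until it exits with status 0.

    attempt_fix is a hypothetical callback that receives the combined
    test output and is expected to change the code under test.
    Returns the attempt number that passed, or None if none did.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt  # tests passed on this attempt
        # Hand the failure log to the fixing step, then loop and re-run.
        attempt_fix(result.stdout + result.stderr)
    return None  # gave up after max_attempts failing runs
```

In practice, `test_cmd` might be something like `["pytest", "-q"]`, while `attempt_fix` stands in for whatever step proposes a code change. The cap on attempts is a design choice for the sketch so that a test suite that never passes cannot loop forever.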
How does Codex work?
Codex can read and edit files as well as run commands, including test harnesses, linters, and type checkers. It typically takes anywhere between one and 30 minutes to complete a task depending on the difficulty level, according to OpenAI.
The AI coding agent performs each task in a separate, isolated environment that is preloaded with the user’s codebase, which serves as context. “Like human developers, Codex agents perform best when provided with configured dev environments, reliable testing setups, and clear documentation,” OpenAI said.
Users can make Codex work more effectively for them by including AGENTS.md files within their repository. “These are text files, akin to README.md, where you can inform Codex how to navigate your codebase, which commands to run for testing, and how best to adhere to your project’s standard practices,” OpenAI further said.
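To give a rough sense of what such a file might contain, the Python sketch below assembles a small AGENTS.md covering the three kinds of guidance OpenAI mentions. The section titles, the helper name `render_agents_md`, and the sample entries are illustrative assumptions only; OpenAI describes AGENTS.md as an ordinary text file and the source does not prescribe any particular layout.

```python
from typing import List


def render_agents_md(
    layout_notes: List[str],
    test_commands: List[str],
    conventions: List[str],
) -> str:
    """Build the text of an illustrative AGENTS.md file.

    The three sections mirror the guidance described in the article:
    how to navigate the codebase, which commands run the tests, and
    which project practices to follow. The headings are assumptions.
    """

    def section(title: str, items: List[str], as_code: bool = False) -> str:
        lines = [f"## {title}"]
        for item in items:
            # Commands are wrapped in backticks so they read as literal text.
            lines.append(f"- `{item}`" if as_code else f"- {item}")
        return "\n".join(lines)

    parts = [
        "# AGENTS.md",
        section("Navigating the codebase", layout_notes),
        section("Running tests", test_commands, as_code=True),
        section("Project conventions", conventions),
    ]
    return "\n\n".join(parts) + "\n"
```

Calling `render_agents_md(["API code lives in src/api"], ["pytest -q"], ["Use type hints"])` returns a string that could be saved as AGENTS.md at the repository root. Most teams would simply write the file by hand; the helper only makes the structure explicit.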
Another unique feature of Codex is that it shows its thinking and work at every step as it goes about completing the task(s). In the past, several developers have pointed out that AI coding agents produce code that doesn’t follow standards and is hard to debug.
“Codex provides verifiable evidence of its actions through citations of terminal logs and test outputs, allowing you to trace each step taken during task completion,” OpenAI said.
Once Codex completes a task, it commits its changes in its environment. Users can then review the results, request further revisions, open a GitHub pull request, or directly make changes in their local development environment.
How to use Codex? What are its use cases?
For Codex to begin generating code, users need to enter a prompt and click on ‘code’. If they want the AI coding agent to answer questions or offer suggestions, they need to choose the ‘ask’ option before submitting the prompt.
When OpenAI opened up early access to Codex for external partners, they used the AI coding agent to accelerate feature development, debug issues, write and execute tests, and refactor large codebases. Another early tester used Codex to speed up small but repetitive tasks like improving test coverage and fixing integration failures.
It can also be used to write debugging tools and help developers understand unfamiliar parts of the codebase by surfacing relevant context and past changes.
OpenAI developers are also using Codex internally for refactoring, renaming, and writing tests, as well as scaffolding new features, wiring components, fixing bugs, and drafting documentation.
“Based on learnings from early testers, we recommend assigning well-scoped tasks to multiple agents simultaneously, and experimenting with different types of tasks and prompts to explore the model’s capabilities effectively,” the company said.
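The pattern of handing several well-scoped tasks to agents at once can be sketched in Python as follows. This does not call Codex or any OpenAI API: `run_agent` is a hypothetical placeholder for whatever function submits a task and waits for its result, and `dispatch_tasks` and `max_parallel` are names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def dispatch_tasks(
    tasks: List[str],
    run_agent: Callable[[str], str],
    max_parallel: int = 4,
) -> Dict[str, str]:
    """Run one agent call per task concurrently and collect the results.

    run_agent is a stand-in (an assumption, not a real API) that takes a
    task description and returns a summary of what was done.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so results line up with tasks.
        results = list(pool.map(run_agent, tasks))
    return dict(zip(tasks, results))
```

Keeping each task description narrow, for instance “add tests for the date parser” rather than “improve the codebase”, is what makes this kind of fan-out useful, since every result can then be reviewed on its own.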
What is the difference between Codex and Codex CLI?
In April this year, OpenAI launched another AI coding agent tool called Codex CLI. It is said to be an open-source, command-line tool capable of analysing, editing, and running code locally on a user’s terminal.
The coding agent integrates OpenAI’s models with the user’s command-line interface (CLI), which is used to run programmes, manage files, and more.
Codex CLI is powered by OpenAI’s latest o4-mini model by default. However, users can select their preferred OpenAI model via the Responses API option. Codex CLI can only run on macOS and Linux systems for now, with support for Windows still in the experimental stage.
The company has also simplified the developer log-in process for Codex CLI. Instead of having to manually generate and configure an API token, developers can now use their ChatGPT account to sign in to Codex CLI and pick the API organisation they want to use. “Plus and Pro users who sign in to Codex CLI with ChatGPT can also begin redeeming $5 and $50 in free API credits, respectively, later today for the next 30 days,” OpenAI said.