OpenAI announced on Wednesday that GPT-4.1 now sits in the ChatGPT model picker for Plus, Pro and Team accounts. According to OpenAI, the chat interface already carries GPT-4o, the o-series reasoning models and Deep Research. Adding GPT-4.1 brings the total to nine main choices, a lineup even Sam Altman admitted looks “complicated”.

OpenAI tweeted, “By popular request, GPT-4.1 will be available directly in ChatGPT starting today. GPT-4.1 is a specialised model that excels at coding tasks & instruction following. Because it’s faster, it’s a great alternative to OpenAI o3 & o4-mini for everyday coding needs.”

The company keeps GPT-4o as the default because it mixes speed with accuracy and a friendly tone. GPT-4.1 sits in the “more models” menu and targets coding or long-form work. Free users will slide to GPT-4.1 mini once they reach the daily GPT-4o usage cap, while paid plans can open the full or mini versions at will.

Altman wrote that a future GPT-5 will tidy the range, but that pledge arrives after GPT-4.1, so the crowded list stays for now.

How Does GPT-4.1 Help With Code?

OpenAI’s product post shows GPT-4.1 scoring 54.6% on the SWE-bench Verified benchmark, against 33.2% for GPT-4o, so the new model resolves more real GitHub issues and passes more unit checks. The post also notes an 80% win rate over GPT-4o in head-to-head website builds judged by paid human graders.

Three editions let builders trade capability for speed and price. The full model gives the best raw performance; mini cuts latency roughly in half and lowers token prices by 83%; nano is tuned for quick jobs such as classification or autocomplete. All three are trained to avoid extraneous edits in pull requests, with unwanted changes dropping from 9% under GPT-4o to 2%.
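To make that trade-off concrete, here is a minimal sketch of how a developer might route different tasks to the three editions through the OpenAI Python SDK, assuming the API identifiers gpt-4.1, gpt-4.1-mini and gpt-4.1-nano. The task labels and routing table are illustrative, not part of OpenAI’s post.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative routing: full for hard refactors, mini for routine
# coding, nano for quick classification-style jobs.
MODEL_BY_TASK = {
    "refactor": "gpt-4.1",
    "everyday_coding": "gpt-4.1-mini",
    "classify": "gpt-4.1-nano",
}

def ask(task: str, prompt: str) -> str:
    response = client.chat.completions.create(
        model=MODEL_BY_TASK[task],
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("classify", "Label this commit message as fix, feat or chore: 'fix: null check'"))
```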

Early testers back those numbers. Windsurf measured code acceptance rising 30%, and Qodo found GPT-4.1 wrote the stronger code review in 55% of trials.

Does The Huge Context Window Change Daily Work?

Each GPT-4.1 edition can read up to one million tokens in a single prompt, roughly 3,000 pages. That reach lets lawyers, auditors and researchers load entire case files or portfolios without chopping them into chunks. OpenAI’s long-context tests show the model picking out hidden text at any point in the input, and a new Graphwalks benchmark rates it at 61.7% on breadth-first search across giant graphs, beating GPT-4o.
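The page figure is easy to sanity-check with a back-of-envelope calculation, assuming the common rules of thumb of roughly 0.75 words per token and 250 words per printed page (neither number comes from OpenAI’s post):

```python
# Back-of-envelope: pages that fit in a one-million-token window.
# Assumptions (rules of thumb, not OpenAI figures):
#   ~0.75 words per token, ~250 words per printed page.
TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 250

pages = TOKENS * WORDS_PER_TOKEN / WORDS_PER_PAGE
print(f"{pages:,.0f} pages")  # -> 3,000 pages
```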

Real-world teams have started leaning on that reach. Thomson Reuters saw a 17% jump in multi-document legal review accuracy in its CoCounsel tool. Carlyle recorded 50% better extraction of dense financial data from very large PDFs and spreadsheets.

The context stretch also supports images and video. GPT-4.1 tops the Video-MME long-form test at 72.0%, handling 60-minute clips with no subtitles.

What Happens To Older Models And Price Tags?

OpenAI plans to retire GPT-4.5 Preview on 14 July 2025. Developers have two months to shift workloads to GPT-4.1 or the o-series. The company argues that GPT-4.1 matches or beats GPT-4.5 while costing less and answering faster.

For API calls, input tokens cost $2 per million on the full model, $0.40 on mini and $0.10 on nano. Cached prompts earn a 75% discount on input, making long chats cheaper. Output tokens run $8, $1.60 and $0.40 per million for full, mini and nano, in that order.
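Those rates make a quick cost estimate straightforward. Below is a small sketch that prices a single call from the figures quoted above; the assumption that the 75% discount applies per cached input token is ours, so check OpenAI’s pricing page before relying on it.

```python
# Cost sketch using the per-million-token rates quoted above.
# Assumes the 75% cached-input discount applies only to cached input tokens.
RATES = {                 # (input $, output $) per million tokens
    "gpt-4.1":      (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-4.1-nano": (0.10, 0.40),
}
CACHE_DISCOUNT = 0.75

def call_cost(model, input_tokens, output_tokens, cached_tokens=0):
    inp, out = RATES[model]
    fresh = input_tokens - cached_tokens
    return (fresh * inp
            + cached_tokens * inp * (1 - CACHE_DISCOUNT)
            + output_tokens * out) / 1_000_000

# A long chat: 200k input tokens, half of them cached, 5k output, full model.
print(f"${call_cost('gpt-4.1', 200_000, 5_000, cached_tokens=100_000):.3f}")
# -> $0.290
```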

Everyday users may barely notice the swap: GPT-4o still greets them first, and casual tasks usually fit inside its shorter context window. Power users who juggle large code bases or legal dossiers can pick GPT-4.1, while teams chasing the lowest bill can lean on nano.

OpenAI says all models still occasionally invent facts, so users should double-check any answer before shipping code or signing contracts. That reminder applies whether the chat box shows one model or nine.
