OpenAI sets new standards in intelligence, reliability, and versatility
OpenAI has unveiled GPT-5, its most powerful AI model to date, combining numerous improvements in intelligence, efficiency, and safety. Users benefit from significantly enhanced model quality, new features, and a wider range of applications.
A system that adapts flexibly
GPT-5 brings together different approaches in a modular architecture: An efficient base model handles most queries swiftly, while a deep reasoning model (“GPT-5 Thinking”) is used for more complex tasks. A real-time router decides which model is best suited, based on the nature and complexity of the conversation as well as explicit user prompts such as “think hard about this”. The router is continuously trained on real usage data and preferences. If a usage limit is reached, a compact mini version of each model takes over. In the future, these capabilities are set to merge into a single model.
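The routing idea described here can be illustrated with a small rule-based sketch. It is not OpenAI's router, which is a trained component: the `Request` fields, the `route` function, the 0.7 threshold, and the model labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    prompt: str
    complexity: float     # hypothetical score in [0, 1]; 1 means hardest
    quota_remaining: int  # hypothetical count of full-size requests left


def route(req: Request, threshold: float = 0.7) -> str:
    """Return an illustrative model label for a request (assumed labels)."""
    # An explicit user cue such as "think hard about this" forces reasoning.
    wants_reasoning = "think hard" in req.prompt.lower()
    tier = "thinking" if wants_reasoning or req.complexity >= threshold else "fast"
    # Once the usage limit is reached, the mini variant takes over.
    if req.quota_remaining <= 0:
        tier += "-mini"
    return tier


if __name__ == "__main__":
    print(route(Request("What is the capital of France?", 0.1, 10)))   # fast
    print(route(Request("Think hard about this proof.", 0.2, 10)))     # thinking
    print(route(Request("Refactor this large repository.", 0.9, 0)))   # thinking-mini
```

A fixed threshold stands in for the learned decision only to make the three inputs visible: conversation complexity, explicit user intent, and remaining quota.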
Practical improvements for everyday life
Compared to previous versions, GPT-5 not only delivers faster and more precise answers, but is also much more helpful for real-world, everyday use. Reduced hallucinations, better instruction-following, and less excessive agreement all contribute to greater reliability. Performance has been specifically enhanced in the areas of writing, coding, and health.
- Writing & creativity: GPT-5 offers even greater support for users when drafting and refining texts. The model can handle complex literary forms such as free verse or demanding stylistic requirements, and helps with everyday tasks like writing reports, emails, or speeches.
- Coding: The new model excels particularly in frontend development and debugging large repositories. With just a single prompt, it can generate attractive websites, apps, or games, with a strong focus on design, typography, and user experience.
- Health: GPT-5 achieves record scores on health evaluations such as HealthBench and acts as an active advisor: It proactively flags potential risks and tailors answers more closely to the user’s context, knowledge level, and region. The model does not replace a medical professional, but helps users better understand medical results and ask informed questions.
Side-by-side examples: GPT-4o vs. GPT-5
A clear comparison: While GPT-4o still tends to follow traditional structures in poetry, GPT-5 impresses with vivid imagery, emotional depth, and cultural context. The new model is able to tackle tasks more creatively and with greater nuance – from poetry analysis to planning complex projects.
Outstanding benchmark results
GPT-5 sets new standards across a range of disciplines. The model achieves 94.6% on the AIME 2025 maths benchmark (without tools), 74.9% on SWE-bench Verified (coding), 84.2% on MMMU (multimodal understanding), and 46.2% on HealthBench Hard. In its Pro version, GPT-5 scores even higher, for example 88.4% on GPQA, a particularly demanding science test. These advances are reflected in day-to-day use – from maths and coding to visual understanding and health queries.
Efficiency and reliability set new benchmarks
GPT-5 delivers more performance with less “thinking effort”: In tests, up to 80% fewer output tokens were needed for comparable tasks. The number of hallucinations has also been drastically reduced. With web search enabled, the error rate was around 45% lower than GPT-4o, while GPT-5’s reasoning model was about 80% less error-prone than OpenAI o3. For open, fact-based tasks, “GPT-5 Thinking” produced around six times fewer hallucinations than older models.
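To make the phrases “80% fewer” and “six times fewer” concrete, the short sketch below applies them to made-up baselines. The baseline values (1,000 output tokens, 100 errors, 60 hallucinations) and the function names are assumptions for illustration only and do not come from OpenAI's tests.

```python
def after_percent_fewer(baseline: float, percent_fewer: float) -> float:
    """Value that remains when a quantity drops by `percent_fewer` percent."""
    return baseline * (100 - percent_fewer) / 100


def after_times_fewer(baseline: float, factor: float) -> float:
    """Value that remains when a quantity is `factor` times smaller."""
    return baseline / factor


if __name__ == "__main__":
    # Hypothetical baselines, chosen only to keep the arithmetic readable.
    print(after_percent_fewer(1000, 80))  # 200.0 output tokens left of 1000
    print(after_percent_fewer(100, 45))   # 55.0 errors left of 100
    print(after_times_fewer(60, 6))       # 10.0 hallucinations left of 60
```

Read this way, “six times fewer” means roughly one-sixth of the baseline remains, which corresponds to a reduction of about 83%.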
Greater honesty and less deception
A key advance: GPT-5 communicates its own limitations more clearly. For tasks that cannot be solved or where important information is missing, the model now says so openly rather than guessing or pretending otherwise. In realistic tests, the deception rate fell from 4.8% with OpenAI o3 to just 2.1% with GPT-5.
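A drop between two rates can be read in two ways, in percentage points or as a relative change. The sketch below computes both for the deception rates quoted above; the function names are illustrative, and only the two input figures come from the text.

```python
def absolute_drop(old_rate: float, new_rate: float) -> float:
    """Difference between two rates, in percentage points."""
    return old_rate - new_rate


def relative_drop(old_rate: float, new_rate: float) -> float:
    """Reduction as a percentage of the old rate."""
    return (old_rate - new_rate) / old_rate * 100


if __name__ == "__main__":
    old, new = 4.8, 2.1  # deception rates quoted in the text, in percent
    print(f"{absolute_drop(old, new):.1f} percentage points")
    print(f"{relative_drop(old, new):.0f}% relative reduction")
```

The fall from 4.8% to 2.1% is thus 2.7 percentage points, or a relative reduction of about 56%.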
Greater safety through “Safe Completions”
Rather than simply refusing, GPT-5’s new safety architecture enables it to respond to sensitive or ambiguous queries as helpfully and thoughtfully as possible – without crossing safety boundaries. Transparent explanations and safe alternatives contribute to increased robustness and user-friendliness, especially in sensitive areas such as virology or chemistry. Extensive red-teaming tests and multi-layered safeguards provide additional protection.
Less excessive agreement, more natural interaction
Compared to older models, GPT-5 is less ingratiating and less prone to excessive agreement. This behaviour, known as sycophancy, has been significantly reduced through targeted training data and new evaluation methods: The rate of sycophantic replies dropped from 14.5% to under 6%. The result: Communication feels less artificial and more natural, competent, and helpful.
Personalisation: New personalities for ChatGPT
Thanks to improved controllability, users can now choose between four predefined personalities for ChatGPT: Cynic, Robot, Listener, and Nerd. These can be easily adjusted in the settings, allowing for a personalised interaction – from factual to supportive, or even humorously sarcastic. All new personalities also meet high standards in terms of sycophancy.
GPT-5 Pro: Even more power for demanding tasks
For particularly complex or large-scale tasks, GPT-5 Pro offers an even more powerful version. It uses additional computing resources to deliver deeper analysis and responses. In external tests, experts preferred GPT-5 Pro to the standard version in almost 68% of cases, especially for challenging tasks in science, maths, medicine, and coding.
How to access GPT-5
GPT-5 is now the default model in ChatGPT for all logged-in users, replacing GPT-4o, OpenAI o3, and earlier versions. The reasoning model is selected automatically depending on the task, but can also be specifically prompted, for example with “think hard about this”. Plus and Pro subscribers receive higher usage quotas and access to GPT-5 Pro, while Team, Enterprise, and Edu customers will have access within a week. Free users can also use GPT-5, but will automatically switch to the slimmer mini version once their usage limit is reached.