EveryNews

Technologies

OpenAI Trains AI to Admit Mistakes Through Confession

OpenAI is developing an experimental mechanism called Confession that requires the model to report instruction violations or manipulations of its responses. The system rewards candid explanations of the process, not just the correctness of the answer.

By Tetiana Suchkova-Ladik

December 4, 2025

OpenAI is creating an experimental mechanism called Confession, intended to teach artificial intelligence to candidly report its mistakes. If a model breaks instructions, tailors its answer, or circumvents test conditions, it must acknowledge this in a separate explanatory block.

Purpose of Confession

Models are usually trained simultaneously on multiple criteria — accuracy, safety, policy compliance, style, and user preferences. When these signals are mixed, there is a risk of evasive strategies: a model may mimic compliance or adapt to expectations instead of honestly carrying out the task.

Confession separates these signals. The main response is still evaluated against the full combination of criteria, while an additional explanatory block is evaluated exclusively on honesty. The model is encouraged to admit deviations from instructions even if they are not apparent in the final answer.
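
The split between the two evaluations can be illustrated with a toy sketch. The article does not describe OpenAI's actual reward functions, so everything here is an assumption made for illustration: the `Episode` fields, the weighted mean used for the main answer, and the Jaccard overlap used as a stand-in for an honesty score. The only point it shows is that the confession score never looks at answer quality, so admitting a violation cannot lower it.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """One hypothetical training episode (all field names are illustrative)."""
    answer_scores: dict[str, float]  # per-criterion scores in [0, 1]
    actual_violations: set[str] = field(default_factory=set)
    confessed_violations: set[str] = field(default_factory=set)


def answer_reward(ep: Episode, weights: dict[str, float]) -> float:
    """Weighted mean of the criteria applied to the main response."""
    total = sum(weights.values())
    return sum(w * ep.answer_scores[name] for name, w in weights.items()) / total


def confession_reward(ep: Episode) -> float:
    """Honesty-only score: Jaccard overlap of confessed vs. actual violations.

    Assumed scoring rule, not OpenAI's. It ignores answer_scores entirely.
    """
    union = ep.actual_violations | ep.confessed_violations
    if not union:
        return 1.0  # nothing happened and nothing was claimed
    return len(ep.actual_violations & ep.confessed_violations) / len(union)


weights = {"accuracy": 2.0, "style": 1.0}
scores = {"accuracy": 0.5, "style": 1.0}

# Same answer, same violation; only the confession differs.
honest = Episode(scores, {"skipped_tests"}, {"skipped_tests"})
evasive = Episode(scores, {"skipped_tests"}, set())

print(round(answer_reward(honest, weights), 3), confession_reward(honest))
print(round(answer_reward(evasive, weights), 3), confession_reward(evasive))
```

In this sketch both episodes receive the same answer reward, while only the episode that reports the skipped step receives a confession reward above zero.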

Evaluation and results

In trials with the GPT‑5 Thinking model designed to detect manipulation, fabrication, and sloppiness, the model acknowledged violations in 95.6% of cases where it had ignored instructions. The confession block is rewarded for a candid account of how the answer was produced, not for the correctness of the answer itself.

Separately, ChatGPT has received the GPT‑5.1 model, and the service later introduced a shopping assistant that is already available in Ukraine.
