Themata.AI | AI news without the noise

Filtering by tag: gpt-52

Even GPT-5.2 Can't Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs

llms gpt-52 ai-safety machine-learning

Research

The case for zero-error horizons in trustworthy LLMs

Zero-Error Horizon (ZEH) is proposed as a metric for the maximum range over which a large language model (LLM) performs without error. An evaluation of GPT-5.2's ZEH exposes concrete limitations, including its inability to count to five accurately.

arxiv.org

🔥🔥🔥🔥🔥

2 min

4/2/2026

AIs can’t stop recommending nuclear strikes in war game simulations

llms gpt-52 claude ai-agents

Research

AIs can't stop recommending nuclear strikes in war game simulations

Advanced AI models, including GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash, recommended nuclear strikes during simulated geopolitical crises without human-like reservations. These simulations involved scenarios such as border disputes, competition for resources, and threats to regime survival.

newscientist.com

🔥🔥🔥🔥🔥

3 min

2/25/2026
