ai-agents autoresearch machine-learning experimentation

Sakana Fugu

sakana.ai

June 22, 2026

Summary

Sakana Fugu is a multi-agent system that autonomously improves a small GPT's training recipe using AutoResearch, which iteratively edits the training code and runs experiments. The agent completed 123 experiments in approximately 14 hours on a single H100 GPU, tracking validation bits-per-byte (BPB), where lower is better.
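The edit-and-measure loop described above can be illustrated with a minimal sketch. This is not Sakana's implementation: the summary does not say how candidate edits are proposed or accepted, so the greedy keep-if-better rule, the toy objective, and the names `run_experiment`, `propose_edit`, and `search` are all assumptions made for illustration.

```python
import random

def run_experiment(config: dict) -> float:
    """Hypothetical stand-in for a training run.

    Returns a made-up validation score (lower is better). No model is trained;
    the toy objective simply has its minimum at lr = 0.003.
    """
    return 1.0 + (config["lr"] - 0.003) ** 2 * 1e4

def propose_edit(config: dict, rng: random.Random) -> dict:
    """Hypothetical stand-in for the agent's code edit: perturb one setting."""
    new_config = dict(config)
    new_config["lr"] = max(1e-5, config["lr"] * rng.uniform(0.5, 1.5))
    return new_config

def search(initial: dict, n_experiments: int, seed: int = 0):
    """Greedy loop (an assumption): keep a candidate only if its score improves."""
    rng = random.Random(seed)
    best_config, best_score = initial, run_experiment(initial)
    history = [best_score]  # best score seen after each experiment
    for _ in range(n_experiments):
        candidate = propose_edit(best_config, rng)
        score = run_experiment(candidate)
        if score < best_score:
            best_config, best_score = candidate, score
        history.append(best_score)
    return best_config, best_score, history

if __name__ == "__main__":
    # 123 matches the experiment count in the summary; everything else is a toy.
    config, score, history = search({"lr": 0.01}, n_experiments=123)
    print(config, score)
```

Tracking the best-so-far score after every experiment, as `history` does here, is one simple way to plot progress over a run of this kind.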

Key Takeaways

The AI agent Fugu-Ultra autonomously improved a small GPT's training recipe, achieving the best mean bits-per-byte (BPB) of 0.9774 across 123 experiments on a single H100 GPU.
Fugu-Ultra outperformed three frontier models on a reading-order task for classical Japanese kana, scoring 0.80 on the NED metric, while the best frontier model scored only 0.24.
In a benchmark for writing a Rubik's Cube solver in Python, Fugu-Ultra solved all 300 cubes, while two other models produced code that crashed without producing valid solutions.
The results suggest that orchestrating multiple strong models can outperform individual frontier models in agentic machine learning research.
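For readers unfamiliar with the BPB figure quoted in the first takeaway, the standard conversion from cross-entropy loss to bits per byte is sketched below. The function name and the example numbers are hypothetical and are not taken from the original article; only the formula (total loss in nats, divided by ln 2 and by the number of bytes of text) is standard.

```python
import math

def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy loss (in nats) over a text into bits per byte.

    Dividing by ln(2) converts nats to bits; dividing by the byte count
    normalizes by text length instead of by tokenizer-dependent token count.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_loss_nats / (math.log(2) * total_bytes)

# Hypothetical example: 1000 tokens with a mean loss of 2.8 nats per token,
# covering 4200 bytes of UTF-8 text.
mean_loss_nats = 2.8
n_tokens = 1000
n_bytes = 4200
bpb = bits_per_byte(mean_loss_nats * n_tokens, n_bytes)
print(f"{bpb:.4f}")
```

Because the normalization is per byte rather than per token, BPB lets training recipes be compared even when they use different tokenizers.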

Community Sentiment

Mixed

Positives

Sakana's approach to model orchestration could streamline AI usage, allowing users to select the best model for their needs without deep technical knowledge.
The integration of multiple models checking each other is seen as a promising strategy, potentially leading to better performance and user outcomes.
The team behind Sakana is perceived as intelligent and capable, which raises expectations for their product's success.

Concerns

Sakana's involvement in military contracts may deter potential users who prioritize ethical considerations in AI development.
The reliance on commercial models accessed via API, rather than open-source alternatives, is seen as limiting innovation and accessibility in AI applications.
High subscription costs across multiple AI services are seen as unsustainable, prompting fears of a price race to the bottom.