
arxiv.org
February 25, 2026
Summary
Aletheia, powered by Gemini 3 Deep Think, autonomously solved 6 of the 10 problems in the FirstProof challenge. Expert assessments confirmed that its solutions to the problems it completed were correct.
Key Takeaways