| Bengt Hires A Human—Towards A Happy Future With AI Employers | Blog | 2/13/2026 |
| GLM-5 On Vending-Bench – Are Chinese Open Models Catching Up? | Blog | 2/10/2026 |
| The Evolution Of Bengt Betjänt | Blog | 2/9/2026 |
| Opus 4.6 On Vending-Bench – Not Just A Helpful Assistant | Blog | 2/4/2026 |
| Butter-Bench: Evaluating LLM Controlled Robots For Practical Intelligence | Paper | 10/28/2025 |
| Blueprint-Bench: Testing Spatial Intelligence In AI Models | Paper | 10/1/2025 |
| Safety Report: August 2025 | Report | 8/28/2025 |
| Vending-Bench: Testing Long-Term Coherence In Agents | Paper | 2/16/2025 |
| From Text To Action: Future-Proofing Evaluations Of LLMs' Agentic Capabilities For Social Impact | Paper | 12/2/2024 |