Claire reviews Llama, a powerful AI model that can handle complex tasks. Ankur Goyal, the founder of Braintrust, then discusses how top engineering teams are using AI agents to ship better software faster. The episode covers topics such as:
- Llama's capabilities and limitations
- Ankur's vision for AI-powered engineering
- How to use AI agents to tackle hard infrastructure problems
- The importance of building feedback loops that automatically turn real-world data into evals
- Quantifying designer taste so it scales across the product
- Carving out complexity from products rather than adding more features
The episode also discusses how investing in continuous integration (CI) earns a team the ability to move faster, as well as Ankur's disciplined approach to agent failure.