Google's open-weight model family, Gemma 4, can now run directly on iPhones with full local inference and offline capability. This marks a significant step for edge AI deployment, signaling that it is no longer a future priority but a current reality. In benchmarks, the 31B variant of Gemma 4 roughly matches Qwen 3.5 while carrying about 4 billion more parameters.

The more compelling aspect of Gemma 4, however, is its smaller variants, E2B and E4B, which are engineered for mobile deployment and prioritize efficiency over raw capability. Google's own app, AI Edge Gallery, supports these variants natively and steers users toward them because they run faster and use less memory.

To get started with Gemma 4 on iPhone, users need only download the Google AI Edge Gallery from the App Store and choose a model variant. The gallery includes image recognition, voice interaction, and an extensible Skills framework, positioning it as a platform for experimentation rather than a mere demo.
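For developers who want to go beyond the app itself, the kind of fully local inference the gallery performs can be sketched with Google's MediaPipe LLM Inference API for iOS. This is an illustrative assumption, not a workflow the article describes: the model filename, token limit, and prompt below are all hypothetical placeholders.

```swift
import MediaPipeTasksGenAI  // Google's on-device LLM inference library for iOS

// Hypothetical model bundle shipped inside the app; an E2B/E4B-class
// .task file is assumed here, not a name confirmed by the article.
guard let modelPath = Bundle.main.path(forResource: "gemma-e2b", ofType: "task") else {
    fatalError("Model bundle not found in app resources")
}

// Configure the inference engine; everything runs locally, no network required.
let options = LlmInference.Options(modelPath: modelPath)
options.maxTokens = 512  // cap generation length to keep latency and memory predictable

do {
    let llm = try LlmInference(options: options)
    // Single-shot generation with an illustrative prompt.
    let reply = try llm.generateResponse(inputText: "Summarize today's field notes in two sentences.")
    print(reply)
} catch {
    print("On-device inference failed: \(error)")
}
```

Because the model weights live on the device, a sketch like this keeps prompts and outputs entirely off the network, which is the same property that makes the offline use cases below possible.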

Offline capability has significant implications for enterprise use cases, particularly field applications and healthcare settings where data privacy is a major concern. Google's move to bring Gemma 4 to iPhone marks the arrival of the on-device AI era, and it signals that this technology is becoming commercially viable.