ChatGPT’s new Images 2.0 model is surprisingly good at generating text

ChatGPT's new image generation model, ChatGPT Images 2.0, is significantly better at creating realistic, detailed images. Two years ago, AI models could not generate a restaurant menu without inventing fictional dishes. The latest model can produce menu items that are almost indistinguishable from real ones.

The new model uses an autoregressive approach: it builds an image by repeatedly predicting what should come next, which makes it function more like a large language model (LLM). This has enabled ChatGPT Images 2.0 to generate images with complex details such as small text, iconography, and subtle stylistic constraints.
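
To make the autoregressive idea concrete, here is a toy sketch in Python. It is not OpenAI's implementation. The function names, the tiny four-symbol vocabulary, and the `toy_weights` rule are all hypothetical stand-ins for a trained model. The only point it illustrates is the loop itself: each new piece of the image is sampled conditioned on everything generated so far, the same way an LLM picks its next word.

```python
import random


def generate_tokens(num_tokens, vocab_size, next_token_weights, seed=0):
    """Sample a sequence one token at a time, conditioning on the prefix."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    tokens = []
    for _ in range(num_tokens):
        # The "model" sees every token produced so far and scores each candidate.
        weights = next_token_weights(tokens, vocab_size)
        token = rng.choices(range(vocab_size), weights=weights, k=1)[0]
        tokens.append(token)
    return tokens


def toy_weights(prefix, vocab_size):
    """Hypothetical stand-in for a trained model: favor repeating the last token."""
    if not prefix:
        return [1.0] * vocab_size
    return [5.0 if t == prefix[-1] else 1.0 for t in range(vocab_size)]


def to_grid(tokens, width):
    """Lay the flat token sequence out as rows, like patches of an image."""
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]


grid = to_grid(generate_tokens(16, 4, toy_weights), width=4)
for row in grid:
    print(row)
```

In a real system the weights would come from a neural network and the tokens would be decoded into pixels. Here they are just small integers arranged in a 4-by-4 grid.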

OpenAI's new model has "thinking capabilities" that allow it to search the web, make multiple images from one prompt, and double-check its creations. This enables Images 2.0 to create marketing assets in various sizes, as well as multi-paneled comic strips. The company also claims that the model is better at rendering non-Latin text in languages such as Japanese, Korean, Hindi, and Bengali.

ChatGPT Images 2.0 will be available to all users starting Tuesday, with paid users able to generate more advanced outputs. The gpt-image-2 API will also be made available, with pricing dependent on the quality and resolution of outputs.