GLM-Image is Z.ai’s flagship open-source image model built for creators who need images that communicate, not just look pretty. It uses a hybrid “autoregressive + diffusion decoder” design that’s especially strong at instruction following, detail control, and generating knowledge-heavy visuals like posters, slides, and infographic-style layouts.
· Sharp Prompt Understanding: The AR module is initialized from GLM-4-9B-0414, helping it stay aligned with complex instructions.
· Crisp High-Res Output: A diffusion decoder based on a single-stream DiT architecture helps refine details and structure.
· Better Text in Images: The decoder includes a Glyph Encoder module to improve accurate text rendering inside images.
GLM-Image supports a rich set of image-to-image workflows, including:
· Image editing (instruction-based changes).
· Identity-preserving generation.
· Multi-subject consistency.
Generate: posters, covers, thumbnails, product mockups, UI concepts, diagrams, science-popularization visuals, and anything with a structured layout.
Edit: upload an image and ask for changes like “replace background”, “change outfit”, “make it cinematic”, “keep the same character but swap the scene”.
To get started:
1. Choose GLM-Image in the model list.
2. Pick Text to Image to generate, or Image Editing to upload and transform.
3. Write a clear prompt, add layout details if needed, then generate.
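If you are calling the model programmatically rather than through the app, the same flow maps onto a request payload. The sketch below only illustrates the shape of such a call: the endpoint URL, the `"glm-image"` model id, and every field name are assumptions for illustration, not Z.ai's documented API — check the official API reference for the real schema.

```python
import json

# Placeholder endpoint -- NOT a real Z.ai URL.
API_URL = "https://api.example.com/v1/images/generations"

def build_generation_request(prompt: str, mode: str = "text-to-image",
                             size: str = "1024x1024") -> dict:
    """Assemble a JSON-serializable request body for an image call.

    All field names here ("model", "mode", "prompt", "size") are assumed
    for illustration; consult the provider's docs for the actual schema.
    """
    if mode not in ("text-to-image", "image-editing"):
        raise ValueError(f"unsupported mode: {mode}")
    return {
        "model": "glm-image",  # assumed model identifier
        "mode": mode,
        "prompt": prompt,
        "size": size,
    }

body = build_generation_request(
    'A minimalist tech conference poster, headline "AI FRONTIERS" '
    "centered at the top, dark blue background, generous negative space"
)
print(json.dumps(body, indent=2))
```

Keeping the payload builder separate from the HTTP call makes it easy to inspect or log the exact request before sending it.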
Prompting tips:
· Put any exact on-image text in quotes, and specify placement (top headline, center title, footer).
· Describe layout like a designer: grid, margins, typography, negative space.
· For edits, say what to keep and what to change: “keep face and pose, change background to…”
