Advanced LLM Flux Prompt

2025-03-11 02:49:26 Update

#FLUX

This ComfyUI workflow uses a modular, task-oriented architecture to generate cohesive, detailed prompts for the CLIP-L and T5XXL text encoders. It chains three specialized Large Language Model (LLM) modules: an analysis module runs first, and its output feeds two prompt generators, one per encoder, whose results are then checked against each other.
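The data flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual node graph: the three modules are modeled as plain callables (the `LLM` type, `run_workflow`, and the stub lambdas are all hypothetical names), and the two generators are run on a thread pool purely to show that they depend on the analysis but not on each other.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stand-in for an LLM module: takes text, returns text.
LLM = Callable[[str], str]


def run_workflow(description: str, analyze: LLM,
                 make_clip_l: LLM, make_t5xxl: LLM) -> dict[str, str]:
    """Run the analysis first, then both prompt generators on its output."""
    analysis = analyze(description)  # step 1: must finish before the rest
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Steps 2 and 3 only depend on the analysis, so they can overlap.
        clip_future = pool.submit(make_clip_l, analysis)
        t5_future = pool.submit(make_t5xxl, analysis)
        return {"clip_l": clip_future.result(), "t5xxl": t5_future.result()}


# Stub modules so the sketch runs without any model behind it.
result = run_workflow(
    "a woman on a beach at sunset",
    analyze=lambda text: text.upper(),
    make_clip_l=lambda a: f"keywords({a})",
    make_t5xxl=lambda a: f"description({a})",
)
print(result)
```

In a real graph each callable would wrap whichever LLM node the workflow uses; the stubs only make the ordering visible.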

Workflow Overview
Input Analysis Module:

The workflow begins with a general-purpose LLM that parses the input description.
It extracts the semantic meaning, identifies the key visual and contextual elements, and routes the result into two pathways: CLIP-L prompt generation and T5XXL prompt generation.
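One way to picture the analysis output is as a small structured record that is then split into one brief per pathway. The sketch below assumes such a record; the `SceneAnalysis` fields and the `split_pathways` helper are hypothetical names chosen for illustration, and the values are filled in by hand rather than produced by an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class SceneAnalysis:
    """Hypothetical container for what the analysis LLM extracts."""
    subjects: list[str] = field(default_factory=list)
    style: str = ""
    setting: str = ""
    lighting: str = ""
    palette: str = ""
    mood: str = ""


def split_pathways(analysis: SceneAnalysis) -> dict[str, dict[str, object]]:
    """Route one analysis into two briefs, one per downstream generator."""
    shared = {
        "subjects": analysis.subjects,
        "style": analysis.style,
        "setting": analysis.setting,
        "lighting": analysis.lighting,
        "palette": analysis.palette,
    }
    return {
        # The keyword pathway only needs the compact visual facts.
        "clip_l": {**shared, "format": "keywords"},
        # The descriptive pathway also receives the mood.
        "t5xxl": {**shared, "mood": analysis.mood, "format": "natural_language"},
    }


analysis = SceneAnalysis(
    subjects=["portrait of a woman"],
    style="photorealistic",
    setting="beach at sunset",
    lighting="warm backlight",
    palette="warm tones",
    mood="calm",
)
briefs = split_pathways(analysis)
print(briefs["clip_l"])
print(briefs["t5xxl"])
```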

CLIP-L Prompt Generator:

A second LLM module turns the structured output of the analysis phase into a concise, keyword-driven CLIP-L prompt.
This module prioritizes compactness and relevance, which suits CLIP-L's short 77-token context window.
The output lists the main subjects, art style, setting, lighting, and color palette in comma-separated form (e.g., portrait, photorealistic, sunset, warm tones, detailed shadows).
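The comma-separated format can be illustrated with a small assembly helper. This is a sketch of the output shape only: in the workflow an LLM writes the keywords, whereas `build_clip_l_prompt` and its `max_keywords` cap are hypothetical and simply normalize, deduplicate, and join components in the order listed above.

```python
def build_clip_l_prompt(subjects, style, setting, lighting, palette,
                        max_keywords=20):
    """Join prompt components into one comma-separated keyword string."""
    parts = [*subjects, style, setting, lighting, palette]
    seen = set()
    keywords = []
    for part in parts:
        keyword = part.strip().lower()
        # Skip empty fields and repeated keywords to keep the prompt compact.
        if keyword and keyword not in seen:
            seen.add(keyword)
            keywords.append(keyword)
    # max_keywords is an arbitrary illustrative cap, not a real token count.
    return ", ".join(keywords[:max_keywords])


clip_prompt = build_clip_l_prompt(
    subjects=["portrait", "Portrait "],  # duplicate on purpose
    style="photorealistic",
    setting="sunset",
    lighting="detailed shadows",
    palette="warm tones",
)
print(clip_prompt)
# portrait, photorealistic, sunset, detailed shadows, warm tones
```

A keyword cap is only a rough proxy for prompt length; an exact limit would have to be measured with the CLIP tokenizer itself.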

T5XXL Prompt Generator:

In parallel with the CLIP-L generator, a third LLM module writes a richly detailed natural-language description tailored to T5XXL.
This module generates up to 512 tokens of descriptive content, covering:
Subject details (appearance, pose, expression, clothing).
Environmental settings (time of day, architectural specifics, props).
Lighting and color dynamics (intensity, contrast, harmony).
Scene composition (foreground, middle ground, background elements).
Atmosphere and mood (emotive and symbolic nuances).
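The five aspects above and the 512-token ceiling can be sketched as an ordered assembly step with a budget check. The code below is an assumption-laden illustration: `build_t5xxl_prompt` and `approx_token_count` are hypothetical helpers, and the token count is approximated by whitespace-separated words, which undercounts real T5 subword tokens.

```python
T5_TOKEN_BUDGET = 512  # ceiling stated in the workflow description

# Section order mirrors the list of aspects above.
SECTION_ORDER = ["subject", "environment", "lighting", "composition", "atmosphere"]


def approx_token_count(text):
    """Rough proxy: whitespace-separated words, NOT real T5 tokens."""
    return len(text.split())


def build_t5xxl_prompt(sections, budget=T5_TOKEN_BUDGET):
    """Concatenate sections in order, stopping before the budget is exceeded."""
    sentences = []
    used = 0
    for key in SECTION_ORDER:
        sentence = sections.get(key, "").strip()
        if not sentence:
            continue
        cost = approx_token_count(sentence)
        if used + cost > budget:
            break  # drop this and all later sections
        sentences.append(sentence)
        used += cost
    return " ".join(sentences)


sections = {
    "subject": "A woman stands on the shore.",
    "atmosphere": "The mood is calm.",
}
t5_prompt = build_t5xxl_prompt(sections)
print(t5_prompt)
```

Placing subject details first means that, if the budget runs out, the atmosphere and composition notes are the ones dropped rather than the main subject.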

Validation and Synchronization:

Both outputs are checked for semantic and stylistic alignment, so that the CLIP-L and T5XXL prompts describe the same scene.
This keeps the two prompts complementary and yields a cohesive result in downstream image generation.
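The description does not say how the alignment check is performed. As a minimal stand-in, assuming a purely lexical test is acceptable, the sketch below flags CLIP-L keywords whose words never appear in the T5XXL text; `missing_keywords` is a hypothetical helper, and a real check would more likely rely on an LLM or embedding similarity to catch paraphrases.

```python
import re


def missing_keywords(clip_prompt, t5_prompt):
    """Return CLIP-L keywords that the T5XXL description does not mention."""
    text_words = set(re.findall(r"[a-z0-9]+", t5_prompt.lower()))
    missing = []
    for keyword in clip_prompt.split(","):
        words = re.findall(r"[a-z0-9]+", keyword.lower())
        # A keyword counts as covered only if every one of its words appears.
        if words and not all(word in text_words for word in words):
            missing.append(keyword.strip())
    return missing


clip_prompt = "portrait, photorealistic, sunset, warm tones"
t5_prompt = ("A photorealistic portrait of a woman at sunset, "
             "lit in warm golden tones.")

print(missing_keywords(clip_prompt, t5_prompt))
print(missing_keywords("portrait, neon lights", t5_prompt))
```

An empty list means every keyword is reflected in the description; any returned keyword marks a point where the two prompts have drifted apart.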

Key Features
Hierarchical Prompt Engineering: Uses a multi-step design in which each module has a single role, for modularity and precision.
Task-Oriented Workflow: Separates keyword extraction (CLIP-L) from detailed scene description (T5XXL) to play to each model's strengths.
Inter-model Alignment: Keeps both prompts semantically and thematically synchronized, which improves image-generation fidelity.
Scalability: The architecture is adaptable for additional tasks, such as fine-tuning outputs for specific artistic styles or domains.