Train character LoRAs from a single reference image and generate character-consistent images across diverse scenes.
- Character Sheet Generation: Generate a diverse character sheet from a single reference image
- Automatic Captioning: Generate detailed captions for training images
- LoRA Training: Train high-quality character LoRAs
- Easy Inference: Generate images of your character in various scenarios
- Python 3.10 or higher
- GPU with at least 48GB VRAM
- At least 60GB RAM
- At least 100GB of free disk space
- Clone the repository:

  ```bash
  git clone https://github.com/RishiDesai/CharForge.git
  cd CharForge
  ```
- Set these API keys and variables in your `.env`, and add funds where appropriate:

  ```
  HF_TOKEN
  HF_HOME
  CIVITAI_API_KEY
  TOGETHER_API_KEY
  FAL_KEY
  OPENAI_API_KEY
  ```
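For reference, a filled-in `.env` might look like the following (placeholder values, not real keys):

```
HF_TOKEN=hf_xxxxxxxxxxxxxxxx
HF_HOME=/path/to/huggingface/cache
CIVITAI_API_KEY=xxxxxxxxxxxxxxxx
TOGETHER_API_KEY=xxxxxxxxxxxxxxxx
FAL_KEY=xxxxxxxxxxxxxxxx
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxx
```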
- Log into Hugging Face and accept their terms of service to download Flux.1-dev.
- Run the setup script. This will:
  - Install submodules, including ComfyUI, all required ComfyUI custom nodes, LoRACaptioner, MVAdapter, and ai-toolkit.
The following ComfyUI custom nodes are installed:
- comfyui_essentials
- comfyui-advancedliveportrait
- comfyui-ic-light
- comfyui-impact-pack
- comfyui-custom-scripts
- rgthree-comfy
- comfyui-easy-use
- comfyui-impact-subpack
- was-node-suite-comfyui
- ComfyUI_UltimateSDUpscale
- ComfyUI-PuLID-Flux-Enhanced
- comfy-image-saver
- ComfyUI-Image-Filters
- ComfyUI-Detail-Daemon
- ComfyUI-KJNodes
  - Download all necessary models to `HF_HOME`
  - Set up the character sheet generation pipeline
Activate the virtual environment:

```bash
source .venv/bin/activate
```
```bash
python train_character.py --name "character_name" --input "path/to/reference_image.png"
```
All training options:

```bash
python train_character.py \
  --name "character_name" \
  --input "path/to/reference_image.png" \
  [--work_dir WORK_DIR] \
  [--steps STEPS] \
  [--batch_size BATCH_SIZE] \
  [--lr LEARNING_RATE] \
  [--train_dim TRAIN_DIM] \
  [--rank_dim RANK_DIM] \
  [--pulidflux_images PULID_FLUX_IMAGES]
```
- --name (str): Character name (used for folder and model naming)
- --input (str): Path to input image
- --work_dir (str, optional): Working directory (defaults to ./scratch/{name}/)
- --steps (int, optional): Number of training steps (default: 800)
- --batch_size (int, optional): Training batch size (default: 1)
- --lr (float, optional): Learning rate (default: 8e-4)
- --train_dim (int, optional): Training image dimension (default: 512)
- --rank_dim (int, optional): LoRA rank dimension (default: 8)
- --pulidflux_images (int, optional): Number of PuLID-Flux images to include (default: 0)
This command will:
- Generate a character sheet from your input image
- Caption the generated images
- Train a LoRA on Flux.1-dev using the generated dataset
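If you script several training runs, the documented options can be assembled programmatically. This is a hypothetical helper, not part of the repo; `DEFAULTS` mirrors the defaults listed above:

```python
# Hypothetical helper for scripting runs; mirrors the documented CLI defaults.
DEFAULTS = {
    "steps": 800,           # training steps
    "batch_size": 1,        # training batch size
    "lr": 8e-4,             # learning rate
    "train_dim": 512,       # training image dimension
    "rank_dim": 8,          # LoRA rank dimension
    "pulidflux_images": 0,  # PuLID-Flux images to include
}

def build_train_command(name, input_path, **overrides):
    """Assemble the train_character.py CLI call, rejecting unknown options."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    opts = {**DEFAULTS, **overrides}
    cmd = ["python", "train_character.py", "--name", name, "--input", input_path]
    for key, value in opts.items():
        cmd += [f"--{key}", str(value)]
    return cmd
```

For example, `build_train_command("hero", "refs/hero.png", steps=1200)` returns an argument list you could hand to `subprocess.run`.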
```bash
python test_character.py --character_name "character_name" --prompt "A detailed prompt here"
```
All inference options:

```bash
python test_character.py \
  --character_name "character_name" \
  --prompt "A detailed prompt here" \
  [--work_dir WORK_DIR] \
  [--lora_weight LORA_WEIGHT] \
  [--test_dim TEST_DIM] \
  [--do_optimize_prompt/--no_optimize_prompt] \
  [--output_filenames FILE1 FILE2 ...] \
  [--batch_size BATCH_SIZE] \
  [--num_inference_steps STEPS] \
  [--fix_outfit/--no_fix_outfit] \
  [--safety_check/--no_safety_check] \
  [--face_enhance/--no_face_enhance]
```
- --character_name (str): Name of the character (used to find LoRA and work_dir)
- --prompt (str): The prompt to use for generation
- --work_dir (str, optional): Working directory (defaults to ./scratch/{character_name}/)
- --lora_weight (float, optional): LoRA strength (default: 0.73)
- --test_dim (int, optional): Image width/height (default: 1024)
- --do_optimize_prompt / --no_optimize_prompt: Whether to optimize the prompt using LoRACaptioner (default: enabled)
- --output_filenames (str, optional): Filenames for output images (space separated list)
- --batch_size (int, optional): Number of images to generate (default: 4)
- --num_inference_steps (int, optional): Steps for generation (default: 30)
- --fix_outfit / --no_fix_outfit: Use the reference image flag in prompt optimization (default: disabled)
- --safety_check / --no_safety_check: Run safety checks on generated images (default: enabled)
- --face_enhance / --no_face_enhance: Enable or disable face enhancement (default: disabled)
This command will:
- Load your LoRA, prompt it, and generate the image(s)
- Optionally optimize the prompt, enhance faces with FaceEnhance, and run a safety check.
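When passing --output_filenames, the number of names presumably needs to line up with --batch_size (one per generated image). A hypothetical helper that enforces this; the fallback naming scheme here is an assumption, not the script's actual behavior:

```python
def output_filenames(character_name, batch_size=4, provided=None):
    """Validate user-provided filenames or fall back to a simple numbered scheme."""
    if provided is not None:
        if len(provided) != batch_size:
            raise ValueError("need exactly one filename per generated image")
        return list(provided)
    # Assumed fallback naming; the real script may name outputs differently.
    return [f"{character_name}_{i}.png" for i in range(batch_size)]
```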
Note: The first run of train_character.py and test_character.py will take longer, as the remaining models are downloaded.
- The training script runs a ComfyUI server ephemerally.
- All character images and character data are saved in ./scratch/{character_name} for easy access and organization.
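When post-processing results, this layout can be mirrored in code. A minimal sketch, assuming only the documented default of `./scratch/{character_name}/`:

```python
from pathlib import Path

def resolve_work_dir(character_name, work_dir=None):
    """Return the working directory, defaulting to ./scratch/{character_name}/."""
    return Path(work_dir) if work_dir is not None else Path("scratch") / character_name
```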
- fal.ai is used for upscaling and generating PuLID-Flux images; Together AI is used for image captioning and prompt optimization (via LoRACaptioner); GPT-4o is used for generating prompts for PuLID-Flux.
- The character sheet generation is partly based on Mickmumpitz's Flux character consistency workflow, specifically the image upscaling, facial expression, and lighting condition stages.
- Sections of the workflow were broken into modular pieces; I used the ComfyUI-to-Python-Extension to re-engineer components for efficiency and functionality.
- The character sheet includes multi-view images, varied facial expressions, lighting conditions, and (optionally) PuLID-Flux images.
- Images are autocaptioned using LoRACaptioner.
- LoRA is trained using ai-toolkit.
- Inference is handled by diffusers with some speed improvements from the Modal Flux inference guide.
- Training: a LoRA rank of 8 and a resolution fixed at 512x512 strike the right balance of quality and speed.
- The entire training pipeline takes 30-40 minutes on one L40S.
- Inference: a resolution of 1024x1024 and a LoRA weight of 0.65-0.85 give the best results.
- A batch size of 4 takes 60 seconds on one L40S if the models are already loaded in memory, and 120 seconds otherwise.
- If FaceEnhance is enabled, you will likely need more than 48GB of VRAM.
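To find the best weight inside the recommended 0.65-0.85 range, it can help to sweep it across a few runs; an illustrative helper (not part of the repo), whose values could each be passed as --lora_weight to test_character.py:

```python
def lora_weight_sweep(low=0.65, high=0.85, n=5):
    """Evenly spaced LoRA weights across the recommended range."""
    step = (high - low) / (n - 1)
    return [round(low + i * step, 3) for i in range(n)]
```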
- Training Parameters: You can modify training parameters by passing the relevant CLI arguments to train_character.py, or by editing the YAML config `scripts/character_lora.yaml`.
- Public LoRA Serving: Use `python scripts/serve_lora.py` to serve LoRA weights via a FastAPI server, making them publicly accessible (e.g., for fal.ai inference).
- Run ComfyUI Server: Use `python scripts/run_comfy.py` to launch a ComfyUI server, useful for doing inference manually.
- Symlink LoRAs for ComfyUI: Use `bash scripts/symlink_loras.sh` to symlink trained LoRA weights from `scratch/{character_name}/` to the ComfyUI LoRA directory for easy access.
- Model download issues: Check your Hugging Face or CivitAI credentials
- Out of memory: use `--batch_size 1` for GPUs with less than 48GB of VRAM