AI image editing is getting closer to Photoshop
AI image editors can now work in layers! Qwen-Image-Layered, open-sourced by Alibaba's Qwen team, decomposes a single RGB image into multiple semantically independent RGBA layers, enabling lossless, high-consistency end-to-end image editing.
Input an RGB image, output multiple independent layers with alpha channels.
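To make "layers with alpha channels" concrete, here is a minimal pure-Python sketch of the standard Porter-Duff "over" operator, which is how a stack of RGBA layers is usually flattened back into one image. It is an illustration of ordinary alpha compositing, not code from the Qwen release; the helper names `over` and `flatten`, the float pixel format, and the bottom-first layer order are all assumptions made for this example.

```python
from functools import reduce

# A pixel is (r, g, b, a), each channel a float in [0, 1],
# using straight (non-premultiplied) alpha.

def over(top, bottom):
    """Composite one pixel over another with the Porter-Duff "over" operator."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent, so the colour is irrelevant.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)

def flatten(pixels_bottom_to_top):
    """Collapse a stack of pixels (bottom layer first) into a single pixel."""
    return reduce(lambda below, above: over(above, below), pixels_bottom_to_top)

background = (0.0, 0.0, 1.0, 1.0)  # opaque blue
glass = (1.0, 0.0, 0.0, 0.5)       # half-transparent red
empty = (0.0, 0.0, 0.0, 0.0)       # fully transparent layer region
result = flatten([background, glass, empty])
print(result)
```

A fully transparent pixel leaves whatever is below it unchanged, which is why deleting or moving one layer does not disturb the others.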
Qwen-Image-Layered is not just a generative model. By simulating the layer logic of professional design software, it addresses the consistency problems that traditional diffusion-based editing runs into.
Decomposes ordinary images into semantically meaningful RGBA layers. Supports precise separation of complex scenes, text, and semi-transparent objects.
Flexible layer count (3, 8, etc.). Any single layer can be decomposed again, recursively, for arbitrarily deep hierarchical control.
Each layer can be independently moved, scaled, deleted, or recolored. Because the layers are physically separate, editing one leaves the background unchanged.
Supports generating multi-layered images directly from text prompts, providing ready-to-use layered assets for creative design.
Integrates with Qwen-Image-Edit for advanced inpainting and replacement operations on specific layers.
Apache 2.0 license. Model weights, codebase, and paper are all public. Available on Hugging Face and ModelScope.
The example above shows the decomposition of a complex image: the input is on the left and the resulting RGBA layers (with transparency) are on the right, separating semantic elements such as background, objects, and text.
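The recursive decomposition listed among the features can be pictured as building a tree: decompose the image, then decompose any resulting layer again. The sketch below is only a structural illustration under stated assumptions. `decompose` is a hypothetical callable that takes one image and returns a list of layers (in practice it might wrap the pipeline call shown in the usage code later in this post), and a string-splitting stand-in is used here so the example runs without the model.

```python
def decompose_recursively(image, decompose, depth):
    """Build a tree of layers by re-decomposing each layer `depth` times.

    `decompose` is a hypothetical callable: image -> list of layers.
    """
    node = {"image": image, "children": []}
    if depth > 0:
        for layer in decompose(image):
            node["children"].append(
                decompose_recursively(layer, decompose, depth - 1)
            )
    return node

def count_leaves(node):
    """Count the layers at the bottom of the tree."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])

# Stand-in for the model: "decompose" a string into its two halves.
def fake_decompose(text):
    middle = len(text) // 2
    return [text[:middle], text[middle:]]

tree = decompose_recursively("abcdefgh", fake_decompose, depth=2)
print(count_leaves(tree))
```

Keeping the intermediate nodes, rather than only the leaves, means a group of sub-layers can still be treated as one unit when editing.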
A Variational Autoencoder that unifies RGB and RGBA images, giving a shared latent space that can represent transparency.
Built on Qwen2.5-VL, the model adopts a Diffusion Transformer architecture that supports a variable number of layers.
Training starts from a large-scale pre-trained image generation model, which is then fine-tuned with carefully designed strategies for multi-layer decomposition.
The training data is built from real PSD files, so the model learns to handle real-world challenges such as semi-transparent occlusion and complex layouts.
from diffusers import QwenImageLayeredPipeline
import torch
from PIL import Image
# 1. Load model
pipeline = QwenImageLayeredPipeline.from_pretrained("Qwen/Qwen-Image-Layered")
pipeline = pipeline.to("cuda", torch.bfloat16)
# 2. Prepare image
image = Image.open("test.png").convert("RGBA")
# 3. Set params & infer
inputs = {
    "image": image,
    "layers": 4,  # Specify 4 layers
    "num_inference_steps": 50,
    "resolution": 640,
    "true_cfg_scale": 4.0,
}
with torch.inference_mode():
    output = pipeline(**inputs)
    layers = output.images[0]  # Returns list of layers
# 4. Save layers
for i, layer in enumerate(layers):
    layer.save(f"layer_{i}.png")
Requires transformers>=4.51.3 and the latest diffusers.
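Once the layers are saved, the edits described earlier (moving or deleting a layer, then rebuilding the picture) need nothing beyond Pillow. The following is a minimal sketch, not part of the Qwen release. It assumes the layers are ordered bottom first, which the snippet above does not state, and `shift_layer` and `flatten` are illustrative helper names. Small stand-in images replace the saved `layer_{i}.png` files so the example runs without the model.

```python
from PIL import Image

def shift_layer(layer, dx, dy):
    """Move a layer's content by (dx, dy) pixels; vacated pixels become transparent."""
    canvas = Image.new("RGBA", layer.size, (0, 0, 0, 0))
    canvas.paste(layer, (dx, dy))  # no mask: copies RGBA values as they are
    return canvas

def flatten(layers):
    """Alpha-composite RGBA layers of equal size, first layer at the bottom (assumed order)."""
    result = Image.new("RGBA", layers[0].size, (0, 0, 0, 0))
    for layer in layers:
        result = Image.alpha_composite(result, layer)
    return result

# Stand-ins for Image.open("layer_0.png") and Image.open("layer_1.png").
background = Image.new("RGBA", (8, 8), (0, 0, 255, 255))  # opaque blue
subject = Image.new("RGBA", (8, 8), (0, 0, 0, 0))         # transparent canvas
subject.paste((255, 0, 0, 255), (0, 0, 2, 2))             # 2x2 red square, top-left

# Move the subject 4 pixels right and down, then rebuild the image.
edited = flatten([background, shift_layer(subject, 4, 4)])

# Deleting a layer is simply leaving it out of the list.
without_subject = flatten([background])
```

Because the background is its own layer, the pixels the subject used to cover are already there when the subject moves, so no inpainting is needed for this kind of edit.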