
stable diffusion xl 1.0 inpainting 0.1

by diffusers

Open source · 208k downloads · 364 likes

Rating: 3.2 (364 reviews) · Image · API & Local

About

Stable Diffusion XL 1.0 Inpainting 0.1 is an AI image-generation model that creates photorealistic visuals from text descriptions and adds targeted retouching. Using a mask, it modifies or fills in specific regions of an image while preserving the rest of the content, which allows precise adjustments. It suits artists, designers, and content creators who need to change elements such as backgrounds, objects, or details without altering the rest of the composition. The model blends text-prompted changes into the image while keeping overall visual coherence. Its applications include artistic creation, professional image editing, and visual experimentation, although it guarantees neither perfect accuracy nor absolute realism.

Documentation

license: openrail++
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:

  • stable-diffusion-xl
  • stable-diffusion-xl-diffusers
  • text-to-image
  • diffusers
  • inpainting

inference: false

SD-XL Inpainting 0.1 Model Card

[Image: inpaint-example]

SD-XL Inpainting 0.1 is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.

SD-XL Inpainting 0.1 was initialized with the stable-diffusion-xl-base-1.0 weights. The model was trained for 40k steps at resolution 1024x1024, with 5% dropping of the text-conditioning to improve classifier-free guidance sampling. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and, in 25% of cases, mask everything.
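
The zero-initialization described above can be illustrated with a small sketch. It is a simplification, not the training code: a 1x1 convolution at a single pixel stands in for the UNet's first convolution, the helper names (`conv1x1`, `extend_input_channels`) and the 3 output channels are hypothetical, and the channel order is only illustrative. It shows that appending zero-weight input channels leaves the layer's output identical to that of the restored checkpoint.

```python
import random


def conv1x1(weight, bias, pixel):
    """Apply a 1x1 convolution at one spatial location.

    weight: one row per output channel, one value per input channel.
    pixel:  one value per input channel.
    """
    return [
        sum(w * x for w, x in zip(row, pixel)) + b
        for row, b in zip(weight, bias)
    ]


def extend_input_channels(weight, extra):
    """Return a copy of `weight` with `extra` zero-valued input channels appended."""
    return [row + [0.0] * extra for row in weight]


rng = random.Random(0)
OUT_CH = 3    # hypothetical; the real layer has many more output channels
BASE_IN = 4   # latent channels of the non-inpainting checkpoint
EXTRA_IN = 5  # 1 mask channel + 4 masked-image latent channels

base_weight = [[rng.uniform(-1, 1) for _ in range(BASE_IN)] for _ in range(OUT_CH)]
bias = [rng.uniform(-1, 1) for _ in range(OUT_CH)]
inpaint_weight = extend_input_channels(base_weight, EXTRA_IN)

latent = [rng.uniform(-1, 1) for _ in range(BASE_IN)]
mask = [1.0]
masked_image_latent = [rng.uniform(-1, 1) for _ in range(4)]

base_out = conv1x1(base_weight, bias, latent)
inpaint_out = conv1x1(inpaint_weight, bias, latent + mask + masked_image_latent)

# The extra inputs are multiplied by zero, so both outputs agree.
print(all(abs(a - b) < 1e-12 for a, b in zip(base_out, inpaint_out)))
```

Because the new weights start at zero, the inpainting model initially behaves like the base checkpoint, and the mask and masked-image inputs only gain influence as training updates those weights.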

How to use

from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image
import torch

pipe = AutoPipelineForInpainting.from_pretrained("diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16, variant="fp16").to("cuda")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

image = load_image(img_url).resize((1024, 1024))
mask_image = load_image(mask_url).resize((1024, 1024))

prompt = "a tiger sitting on a park bench"
generator = torch.Generator(device="cuda").manual_seed(0)

image = pipe(
  prompt=prompt,
  image=image,
  mask_image=mask_image,
  guidance_scale=8.0,
  num_inference_steps=20,  # steps between 15 and 30 work well for us
  strength=0.99,  # make sure to use `strength` below 1.0
  generator=generator,
).images[0]
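
The `guidance_scale` argument in the call above controls classifier-free guidance, which relies on the model having seen some prompts dropped during training (5% of the time for this model). The sketch below shows the usual guidance combination on plain Python lists, together with a hypothetical `maybe_drop_prompt` helper. Replacing a dropped prompt with an empty string is an assumption, and neither function is taken from the diffusers source.

```python
import random


def maybe_drop_prompt(prompt, rng, drop_prob=0.05):
    """Training side: replace the prompt with an empty one with probability `drop_prob`."""
    return "" if rng.random() < drop_prob else prompt


def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Sampling side: push the prediction away from the unconditional one.

    guidance_scale = 1.0 returns the text-conditioned prediction unchanged;
    larger values follow the prompt more strongly.
    """
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


# Toy noise predictions standing in for two UNet forward passes.
noise_uncond = [0.0, 1.0]
noise_text = [1.0, 3.0]

print(apply_guidance(noise_uncond, noise_text, 1.0))  # [1.0, 3.0]
print(apply_guidance(noise_uncond, noise_text, 8.0))  # [8.0, 17.0]

# Roughly 5% of training prompts end up empty.
rng = random.Random(0)
dropped = sum(maybe_drop_prompt("a tiger", rng) == "" for _ in range(10_000))
print(dropped / 10_000)
```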

How it works:

image | mask_image
(image) | (image)

prompt | Output
a tiger sitting on a park bench | (image)

Model Description

  • Developed by: The Diffusers team
  • Model type: Diffusion-based text-to-image generative model
  • License: CreativeML Open RAIL++-M License
  • Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L).

Uses

Direct Use

The model is intended for research purposes only. Possible research areas and tasks include:

  • Generation of artworks and use in design and other artistic processes.
  • Applications in educational or creative tools.
  • Research on generative models.
  • Safe deployment of models which have the potential to generate harmful content.
  • Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.

Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.

Limitations and Bias

Limitations

  • The model does not achieve perfect photorealism.
  • The model cannot render legible text.
  • The model struggles with more difficult tasks that involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”.
  • Faces and people in general may not be generated properly.
  • The autoencoding part of the model is lossy.
  • When the strength parameter is set to 1 (i.e. starting in-painting from a fully masked image), the quality of the image is degraded. The model retains the non-masked contents of the image, but images look less sharp. We're investigating this and working on the next version.
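
The effect of `strength` on the number of denoising steps can be sketched as follows. This mirrors the step arithmetic commonly used by image-to-image style pipelines in diffusers as far as we understand it; treat the exact formula as an assumption and check it against your installed version. Both helper names (`effective_steps`, `clamp_strength`) are hypothetical.

```python
def effective_steps(num_inference_steps, strength):
    """Estimate how many denoising steps actually run for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Assumed convention: skip the first (1 - strength) share of the schedule.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start


def clamp_strength(strength, upper=0.99):
    """Keep strength just below 1.0, following the limitation noted above."""
    return min(strength, upper)


print(effective_steps(20, clamp_strength(1.0)))  # 19
print(effective_steps(20, 1.0))                  # 20
print(effective_steps(20, 0.5))                  # 10
```

With 20 steps, a strength of 0.99 costs only one step compared with 1.0, while keeping the run on the side of the limit that the model card recommends.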

Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
