by Minthy
Open source · 8k downloads · 34 likes
RouWei 0.8 is an AI model specialized in generating artistic images, particularly anime-style art and digital illustrations. Trained extensively on a large dataset of more than 13 million images, it excels at prompt adherence, detail fidelity, and stylistic diversity, covering more than 50,000 artists. The model stands out for its ability to avoid biases and tag leaks, producing aesthetic and varied results without compromising stability. It comes in two versions, one of which is optimized for smoother generation, and it allows great flexibility in styles and concepts, from characters to cultural universes. Its clean approach, free of watermarks and artifacts, makes it well suited to artists and creators looking for high-end, precise renders.

Dataset of 13M unique pictures (~4M with natural text captions) picked and balanced from over 25M pieces of anime art, covers, digital illustrations, western media and other sources, including private datasets. A more detailed description is available on Civitai.
Vpred version is now available. It works flawlessly out of the box without any burning or related issues. Consider using a lower CFG (3..5); other generation settings are the same. Some exotic and experimental samplers/schedulers are untested.
Dataset cut-off: end of April 2025.
When you are prompting artist styles, especially when mixing several, their tags MUST BE in a separate CLIP chunk. Add BREAK after them (for A1111 and derivatives), use a conditioning concat node (for Comfy), or at least put them at the very end. Otherwise, significant degradation of results is likely.
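As a rough illustration of the rule above for A1111-style prompts, the sketch below keeps artist tags in their own chunk by placing them first and adding `BREAK` after them, with a fallback that puts them at the very end. The function name `build_prompt` and the sample artist tag are hypothetical; only the `BREAK` keyword and the two placements come from the text above.

```python
def build_prompt(content_tags, artist_tags, use_break=True):
    """Join content and artist tags, isolating the artist tags.

    use_break=True  -> "<artists> BREAK <content>" (A1111 and derivatives)
    use_break=False -> "<content>, <artists>" (fallback: artists at the very end)
    """
    content = ", ".join(t.strip() for t in content_tags if t.strip())
    artists = ", ".join(t.strip() for t in artist_tags if t.strip())
    if not artists:
        return content
    if not content:
        return artists
    if use_break:
        return f"{artists} BREAK {content}"
    return f"{content}, {artists}"


# "by some artist" is a placeholder, not a real tag from the artist list.
print(build_prompt(["1girl", "smile"], ["by some artist"]))
```

For Comfy the same separation would be done with nodes rather than a keyword, so this helper only applies to UIs that parse `BREAK`.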
The model is designed to work both with short booru tag-based prompts and with long, complex natural text prompts. The best results can be achieved by combining tags with some natural text phrases. For tags, classic danbooru-style comma-separated tags without underscores were used.
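If your tags come straight from a booru export with underscores, a small conversion step can bring them into the comma-separated, underscore-free form described above. This is a minimal sketch; `normalize_tags` is a hypothetical helper, and it blindly replaces every underscore, so tags where the underscore is part of the tag itself (emoticon-style tags, for example) would need special handling.

```python
def normalize_tags(tags):
    """Turn raw booru tags into a comma-separated string without underscores."""
    cleaned = (t.strip().replace("_", " ") for t in tags)
    # Drop entries that were empty or whitespace-only.
    return ", ".join(t for t in cleaned if t)


print(normalize_tags(["long_hair", " blue_eyes ", "", "looking_at_viewer"]))
```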
~1..1.5 megapixels for txt2img, any AR with a resolution that is a multiple of 64 (1024x1024, 1152x, 1216x832,...). Euler_a, CFG 4..8 for epsilon / 3..5 for vpred, 20..28 steps. LCM/PCM/DMD are untested, cfg++ samplers work fine, some schedulers do not work. Highresfix: x1.5 latent + denoise 0.6, or any GAN + denoise 0.3..0.55.
Please note that vpred version requires a lower CFG value.
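The resolution and CFG guidance above can be sketched as a small helper. The numeric ranges are copied from the settings listed here; everything else is an assumption made for illustration: the names `pick_resolution` and `cfg_in_range`, treating one megapixel as 1024x1024 pixels, and rounding each side to the nearest multiple of 64.

```python
import math

# Ranges taken from the recommended settings; the dict layout is illustrative.
CFG_RANGE = {"epsilon": (4.0, 8.0), "vpred": (3.0, 5.0)}
STEPS_RANGE = (20, 28)


def pick_resolution(aspect_ratio, megapixels=1.0, multiple=64):
    """Return (width, height) near the pixel budget, both multiples of `multiple`.

    aspect_ratio is width / height. One megapixel is assumed to be 1024 * 1024.
    """
    target_pixels = megapixels * 1024 * 1024
    width = math.sqrt(target_pixels * aspect_ratio)
    height = width / aspect_ratio

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


def cfg_in_range(variant, cfg):
    """Check a CFG value against the range suggested for 'epsilon' or 'vpred'."""
    low, high = CFG_RANGE[variant]
    return low <= cfg <= high


print(pick_resolution(1.0))          # square image at about 1 megapixel
print(pick_resolution(1216 / 832))   # landscape image at about 1 megapixel
print(cfg_in_range("vpred", 7.0))    # 7 is above the 3..5 range for vpred
```

Because of the snapping, the final pixel count can drift slightly from the requested budget, which is fine within the ~1..1.5 megapixel window.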
Examples can be found in repo, more on civitai.
There are only 4 quality tags:
masterpiece, best quality
for positive and
low quality, worst quality
for negative
Nothing else. All except low quality in the negative prompt can be omitted.
Meta tags like lowres have been removed; do not use them. Low-resolution images have been either removed or upscaled and cleaned with DAT, depending on their importance.
worst quality, low quality, watermark
For best results keep it as clean as possible. Spamming of popular sequences will not improve results, since all related flaws have been solved, but will only lead to unwanted effects, biases and poor quality.
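Putting the quality-tag and negative-prompt advice together, a prompt pair might be assembled as in the sketch below. The tag lists are taken from the text above; the helper `build_prompts`, the choice to put quality tags first, and the filtering of the removed `lowres` meta tag are illustrative assumptions rather than the author's tooling.

```python
POSITIVE_QUALITY = ["masterpiece", "best quality"]
NEGATIVE_QUALITY = ["low quality", "worst quality"]
# Meta tags the text says were removed from training and should not be used.
REMOVED_META = {"lowres"}


def build_prompts(tags, extra_negative=()):
    """Return (positive, negative) prompt strings with a minimal negative."""
    kept = [t for t in tags if t not in REMOVED_META]
    positive = ", ".join(POSITIVE_QUALITY + kept)
    negative = ", ".join(NEGATIVE_QUALITY + list(extra_negative))
    return positive, negative


positive, negative = build_prompts(["1girl", "lowres", "smile"], ["watermark"])
print(positive)
print(negative)
```

Keeping `extra_negative` short follows the advice to keep the negative prompt as clean as possible.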
The model knows over 35k artist styles. A list and grids with examples are on Mega. Artist tags are used with the "by " prefix and will not work properly without it.
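A small sketch of adding the `by ` prefix to artist names, assuming the prefix is the literal string "by " followed by the artist name with underscores replaced by spaces. `artist_tag` is a hypothetical helper and the sample name is a placeholder, not an entry from the artist list.

```python
def artist_tag(name):
    """Return an artist tag in 'by <name>' form, adding the prefix if missing."""
    name = name.strip().replace("_", " ")
    if name.lower().startswith("by "):
        return name
    return f"by {name}"


print(artist_tag("some_artist"))
print(artist_tag("by some artist"))  # already prefixed, returned unchanged
```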
2.5d, anime screencap, bold line, sketch, cgi, digital painting, flat colors, smooth shading, minimalistic, ink style, oil style, pastel style
Use it in combination with booru tags, works great. Use only natural text after typing styles and quality tags. Use just booru tags and forget about it, it's all up to you. About 4M pictures from the dataset have hybrid natural-text captions made by Claude, GPT, Gemini and ToriiGate. Version 0.8 comes with an advanced understanding of natural text prompts, providing state-of-the-art performance among SDXL anime models. This doesn't mean that you are obliged to use natural-language prompts; tags only are completely fine, especially because the understanding of tag combinations has also improved.
You can use extra meta tags to control brightness, saturation, gamma and colors:
low brightness, high brightness, low saturation, high saturation, low gamma, high gamma, sharp colors, soft colors, hdr, sdr
Vpred version for RouWei-0.8 will come soon.
You can use the FP32 version for more accurate merging, or to get some benefit from running the text encoders in fp32 mode with Comfy. The epsilon and vpred versions here have had a brief aesthetic polishing pass after the main training to improve small details and coherence. If you want to merge, extract from, or finetune RouWei without that final polishing, you can use the BASE VERSION of RouWei: FP16 FP32
The model tends to generate NSFW images for corresponding prompts; consider adding extra filtering. Outputs may be inaccurate and provocative and must not be used as a reference.
Same as Illustrious; please check the original page for limitations. Feel free to use it in your merges, finetunes, etc.; just please leave a link.
A number of anonymous persons, Bakariso, dga, Fi., ello, K., LOL2024, NeuroSenko, rred, Soviet Cat, Sv1., T., TekeshiX and other fellow brothers that helped.
BTC bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c
ETH/USDT(e) 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db
XMR 47F7JAyKP8tMBtzwxpoZsUVB8wzg2VrbtDKBice9FAS1FikbHEXXPof4PAb42CQ5ch8p8Hs4RvJuzPHDtaVSdQzD6ZbA5TZ