A Complete Guide to Generating Images and Video with Local Uncensored AI (with step-by-step instructions and examples; the examples are somewhat risqué)

By popular request, from my last thread on this topic ( https://www.uscardforum.com/t/topic/475986/14 )

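(I originally posted this in the Sex section, but on reflection it can be used for other things too, so I moved it to Academic. If the mods don't allow that, it can be moved back.)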

Step 0:

If you have time, I still recommend watching this whole tutorial: https://www.youtube.com/watch?v=HkoRkNLWQzY . It really is long, but watching it fully will help you understand what you’re doing much better, and let you explore and try new things on your own.

Step 1:

Download ComfyUI-Easy-Install from GitHub (Tavris1/ComfyUI-Easy-Install: Portable ComfyUI for Windows, macOS and Linux, Nvidia GPUs, Pixaroma Community Edition).

(direct link to the current latest version: https://github.com/Tavris1/ComfyUI-Easy-Install/releases/download/2.01.12/ComfyUI-Easy-Install.zip)

Step 2:

Extract to a folder of your choice. I recommend an SSD with at least 1TB of free space, the faster the better.

Step 3:

Run ComfyUI-Easy-Install.bat. If you get a "Git not found" error, install winget first; see "Use WinGet to install and manage applications" on Microsoft Learn.
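If you'd like to check the two things that tend to go wrong here (Git missing, not enough disk space) before running the installer, a minimal Python sketch follows. The function name `preflight` and the 1000 GiB default are my own placeholders for "about 1TB"; the installer itself does not use this script.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3


def preflight(install_dir: str, min_free_gib: float = 1000.0) -> dict:
    """Report whether Git is on PATH and how much space is free on the target drive.

    min_free_gib is an assumed stand-in for the "at least 1TB" suggestion.
    """
    usage = shutil.disk_usage(Path(install_dir))
    free_gib = usage.free / GIB
    return {
        "git_found": shutil.which("git") is not None,
        "free_gib": round(free_gib, 1),
        "enough_space": free_gib >= min_free_gib,
    }


if __name__ == "__main__":
    # Check the current folder; pass your extraction folder instead.
    print(preflight("."))
```

If `git_found` comes back `False`, that is the same situation as the "Git not found" error above.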

Step 4:

(optional if you want to download models that can generate NSFW images)

While ComfyUI is installing, go to https://civitai.com/ . Register an account, and then click your name in the upper right, and then select account settings:

Under “Content Moderation”, enable everything:

Step 5:

When ComfyUI-Easy-Install.bat finishes, it creates a “ComfyUI-EZi” shortcut on the Desktop. Open that shortcut. A console window will open and eventually start a local server, and then it will open a browser window connected to that local server. This is the main interface:

Step 6:

Click “Workflow” on the left panel, and then expand “Getting Started”. Click on “5b Z-Image Turbo Fp8 text2img.json”

Step 7:

This is the main workflow you’ll be using. Use your mouse wheel to zoom in and out. Click and drag an empty area to pan.

This workflow needs three models: one for image generation, one for text encoding, and one for latent space conversion.

Follow the links in the note on the very left to download an abliterated (uncensored) Qwen3 model as the text encoder (CLIP). Be sure to place it in the correct subdirectory under your ComfyUI install folder. Also download and place the latent space converter (VAE). See the full tutorial video for details on what this means.

Step 8:

For the main image generation model, you can either use the suggested model, or download a more flexible one (a model that can generate NSFW) from Civitai.

If you want a more flexible model, go to the Civitai home page, click Models:

And then on the right, filter for the most downloaded Z Image Turbo checkpoints recently:

(results will be NSFW)

Choose one that you like. Choose an fp8 model if you have less than roughly 12-16GB of VRAM. Choose an fp16 model (a model without an fp8 label is probably fp16) if you have more than that.

For this example I’ll choose “Moody Porn Mix” ZIT-V6. But you can try other ones.

Moody Porn Mix - ZIT-V6 | ZImageTurbo Checkpoint | Civitai (NSFW link)

Download and put the model’s safetensors file in ComfyUI/models/diffusion_models/z-image
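Putting the file in the wrong folder is the most common reason a model doesn't show up in the dropdown, so here is a small hedged sketch that moves a downloaded file into place. The two subfolders are the ones named in this guide; the `place_model` helper and its `kind` labels are made up for illustration.

```python
import shutil
from pathlib import Path

# Subfolders named in this guide, relative to the ComfyUI folder.
# The dictionary keys are illustrative labels, not ComfyUI terms.
MODEL_SUBDIRS = {
    "z-image": Path("models") / "diffusion_models" / "z-image",
    "wan": Path("models") / "checkpoints" / "WAN",
}


def place_model(downloaded: Path, comfy_root: Path, kind: str) -> Path:
    """Move a downloaded .safetensors file into the matching model folder."""
    if downloaded.suffix != ".safetensors":
        raise ValueError(f"expected a .safetensors file, got {downloaded.name}")
    target_dir = comfy_root / MODEL_SUBDIRS[kind]
    target_dir.mkdir(parents=True, exist_ok=True)  # create the subfolder if missing
    target = target_dir / downloaded.name
    shutil.move(str(downloaded), str(target))
    return target
```

Usage would look like `place_model(Path("Downloads/some-model.safetensors"), Path("ComfyUI"), "z-image")`, where both paths are placeholders for your own.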

Step 9:

Go back to your browser window with the ComfyUI workflow open. Press F5 in your browser to reload the page. Now click the model name field in the “Load Diffusion Model” box and (optional) click again to select the Moody Porn Mix model instead of the pre-filled default:

CLIP and VAE should already be correct.

The “Empty Latent” box adjusts the resolution of the output:

There’s no real need to go above 1024x1024 because you can add an upscaler to your workflow that’s much more efficient than generating a higher resolution image from the diffuser model.

Click run in the top right.

If everything worked correctly, the model will output a red robot, the default prompt for this sample workflow.

Step 10:

Now it finally gets interesting. Change the prompt to something better and click run again. Both Chinese and English work. You can use another LLM to help you write prompts.

For example:

Photorealistic, detailed image of a woman, age 20, petite, skinny, with long hair, black hair, twin tails, east asian features, innocent looking, very cute, brown eyes, raised inner eyebrows, standing outside in an urban alleyway at night. she’s taking off her top. she’s wearing tight pink shorts, a tight black crop shirt, showing her midriff. she’s facing the viewer. use rim light, dramatic shadow, detailed skin, detailed eyes.

Output:

(If you download and drag this image into a ComfyUI browser window, you will load the workflow and prompt, which are saved as metadata in generated images by default. You can download any image from Civitai and load it into ComfyUI to see how it was generated.)
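If you're curious what that embedded metadata looks like, the sketch below pulls it out of a PNG using only the Python standard library. It assumes the data sits in uncompressed PNG `tEXt` chunks under the keywords `workflow` and `prompt`; that is an assumption about how ComfyUI writes its files, so check it against one of your own images.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length
    return chunks


def embedded_workflow(data: bytes):
    """Parse the 'workflow' text chunk as JSON, or return None if it is absent."""
    raw = read_png_text_chunks(data).get("workflow")
    return json.loads(raw) if raw is not None else None
```

Call it with something like `embedded_workflow(open("image.png", "rb").read())`, where `image.png` is a placeholder file name. Images that were re-encoded by a website or chat app may have had these chunks stripped, in which case you get `None`.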

The more specific you are, the more detailed the image will be. This image model is very good and can incorporate text in the output with very high fidelity. Here's another one:

photo of a woman, age 20, petite, skinny, with long hair, dyed platinum hair, innocent looking, very cute, east asian features, alluring, brown eyes, big eyes, raised inner eyebrows. she’s sitting on a pink bed in a in a highrise. city lights are visible outside through a window on the left. on the right is a table with a computer monitor. the monitor is displaying the large characters “美卡论坛” in a browser window with a dark gray background. she’s wearing white sweatpants and a tight white crop shirt and showing off her midriff. her hands are behind her head. she has large breasts. she’s sitting facing the viewer with her legs wide open. she has an innocent open mouth smile. use rim light, dramatic shadow, detailed skin, detailed eyes.

Output:

Step 10 (continued):

By default, the seed changes every generation, so the features will be different. If you are very specific in your prompt, the output will be more similar. However, if you have an output you really like, you can also try keeping the seed the same and only change the prompt. You can adjust seed settings by clicking on “control after generate” in the KSampler box:

So if we like a particular result and want to do more with it, we can fix the seed and change the prompt. But we can also use a more detailed prompt to try to keep features similar, and let the seed continue to change for the outcome to be more creative.
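To make the seed options concrete, here is a toy Python model of what the “control after generate” setting does between runs. The mode names follow the dropdown choices (fixed, increment, decrement, randomize); the function itself and the seed range are my own illustration, not ComfyUI's code.

```python
import random

MAX_SEED = 2**64 - 1  # assumed upper bound, for illustration only


def next_seed(seed: int, mode: str, rng=random) -> int:
    """Return the seed the next run would use under a given control mode."""
    if mode == "fixed":
        return seed  # same seed: only prompt changes alter the output
    if mode == "increment":
        return (seed + 1) % (MAX_SEED + 1)
    if mode == "decrement":
        return (seed - 1) % (MAX_SEED + 1)
    if mode == "randomize":
        return rng.randint(0, MAX_SEED)  # fresh seed: new composition each run
    raise ValueError(f"unknown mode: {mode}")
```

With `fixed`, editing one phrase in the prompt tends to change only what that phrase touches; with `randomize`, every run starts from different noise, so only the prompt holds the features together.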

Let’s continue to develop this prompt:

photo of a woman, age 20, petite, skinny, with long hair, dyed platinum hair, innocent looking, very cute, east asian features, alluring, brown eyes, big eyes, raised inner eyebrows. she’s sitting on a pink bed in a in a highrise. city lights are visible outside through a window on the left. on the right is a table with a computer monitor. the monitor is displaying the large characters “美卡论坛” in a browser window with a dark gray background. she’s wearing is wearing pink lace panties  and a tight white crop shirt and showing off her midriff. she is taking off her top, flashing her bare breasts.  she has large breasts. she’s sitting facing the viewer with her legs wide open. she has an innocent open mouth smile. use rim light, dramatic shadow, detailed skin, detailed eyes.

Output:

https://files.catbox.moe/ho21m3.png

(NSFW)

Where this one goes next is up to you.

Step 11:

I'm just getting started with image-to-video generation myself, but I found a fairly simple workflow:

Download this model:

https://huggingface.co/Phr00t/WAN2.2-14B-Rapid-AllInOne/blob/main/Mega-v12/wan2.2-rapid-mega-aio-nsfw-v12.2.safetensors

Put it into ComfyUI\models\checkpoints\WAN

Download this workflow

https://huggingface.co/Phr00t/WAN2.2-14B-Rapid-AllInOne/blob/main/Mega-v3/Rapid-AIO-Mega.json

And put it into ComfyUI\user\default\workflows, or just drag the file into the ComfyUI browser window.

By default, the workflow generates a 5-second, 768x768 video (81 frames at 16 fps) from text. We need to enable generation from an image.
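The 81 figure is one starting frame plus 5 x 16 more. If you want a different length, a quick Python sketch of that arithmetic is below; the "+1" rule is inferred from the default above and is an assumption, not something the workflow documents.

```python
def frame_count(seconds: float, fps: int = 16) -> int:
    """Frames needed for a clip: one start frame plus seconds * fps more."""
    return round(seconds * fps) + 1


def duration_seconds(frames: int, fps: int = 16) -> float:
    """Inverse of frame_count: clip length implied by a frame count."""
    return (frames - 1) / fps
```

For example, `frame_count(5)` gives 81, matching the default, and `frame_count(8)` gives 129.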

First, make sure the correct model version is loaded. (The workflow defaults to version 3 of that particular model series, but we want to select the version 12.2 file that we just downloaded.)

Next, click the specified icon to Unbypass the Start Frame and WanVideoWrapper boxes.

Then load an image you previously generated with the image model (or any other image) into Start Frame.

Don’t forget to change the prompt into something more interesting, like

a woman sitting on a bed squeezes her breasts with her hands. she smiles and blows a kiss at the camera. smooth and fluid motion, static camera, high quality, detailed lighting, consistent character identity, clean background, stable animation. High quality, sharp details.

Then click run:

Note that video generation will take much longer. Also, I noticed that video generation time/VRAM requirements scale linearly with resolution, but quadratically with video length. It’s hard to go over 8 or 9 seconds even with 24GB VRAM.
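To get a feel for that scaling before committing to a long render, here is a rough Python sketch. It encodes only my observation above (linear in resolution, quadratic in length), reads "resolution" as pixel count, and uses the 768x768, 5-second default as the baseline; all of that is an assumption, not a measured model.

```python
def relative_cost(width: int, height: int, seconds: float,
                  base=(768, 768, 5.0)) -> float:
    """Estimated time/VRAM multiplier relative to the default clip.

    Assumes cost is linear in pixel count and quadratic in clip length.
    """
    base_w, base_h, base_s = base
    pixel_ratio = (width * height) / (base_w * base_h)
    length_ratio = seconds / base_s
    return pixel_ratio * length_ratio ** 2
```

By this estimate, doubling the length to 10 seconds costs about 4x the default, while doubling the pixel count at the same length costs about 2x.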

Output:

output

full version

That’s it for now. Thank you for reading, and please share your thoughts, tips, workflows, and best creations in this thread!

If this helped you and you want to help me, please tell me where I can buy a 5090 FE for MSRP. Thank you!

Really, posting this in this section? :distorted_face:

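I moved it to a different section, since this can actually be used for a lot of other things. If the mods don't allow it, it can go back to the Sex section.

This looks familiar. I worked on similar plugins for a long time, but it still didn't beat having a regular job.

Also, AI output really doesn't do it for me: the results aren't great and everything looks the same. Human-drawn art still works better.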

I found an approach I really enjoy: set up a workflow and scene that you like, set the seed to random, and then have it generate 64 or 128 images. You’ll get a slightly different woman every few seconds, all doing whatever you want. I really like it.
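For anyone who would rather script a batch like this than click through the UI, here is a hedged Python sketch. It assumes you exported the workflow in API format, that the local server accepts queued jobs via POST to `/prompt` at `http://127.0.0.1:8188`, and that the sampler's seed lives under `inputs.seed`; the node id you pass in is specific to your own export.

```python
import copy
import json
import random
import urllib.request


def seeded_payloads(workflow: dict, sampler_id: str, count: int, rng=None) -> list:
    """Build `count` request bodies, each a copy of the workflow with a new seed.

    sampler_id is the id of the sampler node in YOUR exported workflow.
    """
    rng = rng or random.Random()
    payloads = []
    for _ in range(count):
        wf = copy.deepcopy(workflow)  # never mutate the caller's workflow
        wf[sampler_id]["inputs"]["seed"] = rng.randrange(2**32)
        payloads.append({"prompt": wf})
    return payloads


def queue_all(payloads: list, server: str = "http://127.0.0.1:8188") -> None:
    """POST each payload to the server's /prompt endpoint (assumed address)."""
    for payload in payloads:
        request = urllib.request.Request(
            f"{server}/prompt",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(request).close()
```

Building the payloads is separate from sending them, so you can print and inspect a couple before queueing 64 or 128 jobs.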

I clicked through the OP's link and found the following:

NSFW Prompt

人物设定 & 发型设计 & OOTD 全解 19岁韩国Kpop偶像coser,皮肤光滑细腻,通体雪白,极致冷白皮,脸蛋精致甜美如少女,象牙白冷白皮肤光滑透亮泛着亲密潮红细腻光泽,妆容干净可爱,棕色大地眼影柔和晕染,眼线细腻贴合眼形,睫毛浓密卷翘自然上扬,唇色玫瑰豆沙湿润带晶莹光泽,轻微咬唇痕迹显粉嫩齿痕,腮红饱和桃粉晕染成小猫脸庞。长直黑发顺滑披散至腰际,几缕发丝因汗水微微粘在脸颊和颈侧。脸型精致瓜子脸,妆容清透自然:大地色眼影轻晕染,眼线细致贴合双眼皮,浓密自然睫毛,唇色玫瑰豆沙带湿润光泽。身材丰满火辣,巨乳挺拔,腰肢纤细,臀部浑圆翘挺,全身赤裸,乳头粉嫩挺立,阴部光洁无毛。男性为身材健硕的65岁白人男性老头,肌肉线条分明,腹肌清晰,阴茎粗壮勃起,青筋凸显,皮肤略带古铜色。 表情管理 女性嘴唇微张呈惊讶与渴望混合的O形,眼神向上仰视男性,瞳孔放大,带着一丝紧张与兴奋的俏皮。男性低头俯视女性,表情专注而充满欲望,嘴唇紧抿。 肢体状态与细节 女性跪坐在粉色床单上面对男性,双膝分开,臀部坐在脚跟上,上身前倾,双手握住男性勃起的阴茎,掌心包裹茎身,指尖轻轻扣住根部,指甲涂抹裸粉色。男性站立面对女性,双腿微分,左手搭在女性后颈,右手自然垂落。她的巨乳因前倾而垂坠晃动,乳晕清晰可见。 场景与构图 卧室内部,粉色床单凌乱铺展,白色窗帘背景模糊。中景为两人面对面的亲密互动,前景是女性握住阴茎的手部和垂落的长发丝,背景是柔和白色墙面与床头柜。镜头采用低角度仰视构图,强调女性仰头仰视的视角与男性俯视的压迫感,视觉引导从女性脸部向上延伸至男性阴茎。 光影设计 & 拍摄要点 柔和暖白自然光从右侧窗帘方向洒入,形成侧逆光,在女性脸颊、肩线、乳房轮廓勾勒出金边高光,男性身体左侧阴影加深肌肉立体感。床头灯补充微弱暖光,从上方偏左打下,在床单上形成浅浅阴影。画面呈现真实近距离手机拍摄质感,轻微颗粒感与自然色差,超高细节,8k,真实摄影风格。

How does anyone come up with a prompt like this? :distorted_face: I didn't put this much effort into my gaokao essay.

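Everyone who plays with open-source diffusion models does this, because the models' capabilities are limited.

Good prompts like this come out of a lot of trial and error by earlier users. Frankly, they're just copied.

Some people also tune prompts to generate good material and sell it on Patreon.

It wears off after heavy use. I started looking into this in January 2023. The main issue is that the art styles and filters are fairly uniform, so aesthetic fatigue sets in quickly.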

But the models are improving. ZIT is much better than SD1.5/SDXL, and there may be something even better by the end of the year.

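Porn ("yellow" in Chinese slang) really is the engine of human progress.
No wonder ancient emperors all wore yellow.

OP is a good person; may you have a peaceful life.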

I was just about to ask whether there are any prerequisites,

such as CPU and GPU RAM requirements...

or whether there's a one-click deployment tool…

Great post, bumping it.

Looks like the training data is still heavily skewed toward Korean women.

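CPU doesn't matter much; something around an Intel i5-12600 or Ryzen 5800 should be enough.

RAM: ideally 32GB.

GPU requirements are high. An nVidia card is best, and VRAM matters most: 8GB VRAM minimum for image generation (fp8), ideally 16GB+ VRAM (allows fp16 for image generation and also video generation). More CUDA cores means faster generation. The baseline is probably around an RTX 3070; ideally an RTX 3090/4080 or better.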
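As a quick way to read those VRAM numbers, here is a tiny Python sketch that maps a card's VRAM to what it should handle. The thresholds are just the 8GB and 16GB figures above, and the function name is made up; treat the result as a rough guide rather than a guarantee.

```python
def vram_tier(vram_gb: float) -> str:
    """Map VRAM size to the capability tier described in this reply."""
    if vram_gb < 8:
        return "below the stated minimum"
    if vram_gb < 16:
        return "fp8 image generation"
    return "fp16 image generation and video generation"
```

For example, `vram_tier(12)` returns "fp8 image generation", while `vram_tier(24)` returns the top tier.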

I actually have hardware like that: a 3080 Ti, or a laptop 4090, with 64GB of RAM. How long would it take to generate a 3x3 grid, or a single 1080p image? :thinking:

that’s a good setup. it will only take several seconds per image, or 5 minutes per 5 second video.

keep the image resolution below 1280x800 for diffusion generation. use a separate upscaling node if you want higher resolution. you can find sample workflows for this.

Looks like it's still resource-hungry (I didn't expect video to be so much slower than images).

video takes a lot of resources because you are generating (# of seconds x fps) image frames all at the same time, and the model has to apply some sort of self-attention mechanism to all the frames to keep them coherent. something like that. so it’s much more complex than images, with less reliable results.

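Putting this in the Academic section is right; the video example would be better as something like stretching. I fully support everyone running uncensored models locally, free of restrictions from OpenAI, Anthropic, or the Cyberspace Administration of China.

Someone @ 小林 so he can study this properly.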
