Versatile Backgrounds: Controllable AIGC Image Generation for E-commerce

2024-12-23 20:55

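With the rapid progress of AI-generated content (AIGC) techniques such as diffusion models, anyone can now use text-to-image models and tools like Stable Diffusion (SD) [1] to turn an idea into a concrete image from a text description (prompt) alone. This raises a natural question: can the technique be pushed further so that users describe the background they want for a product, helping merchants produce their ideal product images quickly and easily? Examples include generating an indoor tabletop scene as the background for a bag, or creating a sweet-looking model standing by the sea for a particular dress.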

5.1 White-Background Generation

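To generate white-background images with good lighting and shadows, we use Shuffle ControlNet, which follows the style of its reference image strongly, as the control, with a pure white image as the reference input. To avoid affecting the generated foreground, we introduce a mask so that Shuffle ControlNet acts only on the background region, in the same way as the Masked Canny ControlNet described earlier. We also tuned the generation prompt so that the result keeps some light-and-shadow effect on top of the white background.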
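
One way to picture "acts only on the background region" is to zero the ControlNet branch's residual features wherever the product foreground lies. The sketch below is an illustration under that assumption, not the authors' implementation: the function name `mask_control_residual`, the block-max downsampling, and the array shapes are all hypothetical choices.

```python
import numpy as np

def mask_control_residual(residual, fg_mask):
    """Zero a ControlNet residual inside the foreground (hypothetical helper).

    residual: (C, h, w) feature-map residual produced by the ControlNet branch.
    fg_mask:  (H, W) binary mask at image resolution, 1 = product foreground.
              H and W are assumed to be integer multiples of h and w.
    """
    c, h, w = residual.shape
    H, W = fg_mask.shape
    fh, fw = H // h, W // w
    # Downsample with a block-wise max so that any feature cell touching
    # the foreground is treated as foreground and left uncontrolled.
    blocks = fg_mask[: h * fh, : w * fw].reshape(h, fh, w, fw)
    fg_small = blocks.max(axis=(1, 3))
    bg_small = 1.0 - fg_small
    # Broadcast the background mask over the channel dimension.
    return residual * bg_small[None, :, :]
```

Applied to every residual the control branch adds to the denoiser, this leaves the foreground driven only by the base model and any other controls, while the white reference image influences the background alone.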

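To make the white-background results more stable, we also use a LoRA-based approach. We collected a large number of visually appealing studio shots with solid-color backgrounds. To improve the LoRA's compatibility with the Inpainting ControlNet, we segment each image into the person foreground and the background, and train the LoRA together with the Inpainting ControlNet to generate solid-color backgrounds.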
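
As a rough illustration of how a segmented studio shot could become a training pair for this setup, the sketch below builds an inpainting condition image in which the foreground is kept and the background is marked as the region to generate. The helper name `make_inpaint_sample` and the choice of -1.0 as the "to be generated" marker are assumptions made for the example; the article does not describe the actual data format.

```python
import numpy as np

def make_inpaint_sample(image, fg_mask):
    """Build an (inpainting condition, target) pair from one studio shot.

    image:   (H, W, 3) uint8 studio photo with a solid-color background.
    fg_mask: (H, W) mask from a segmentation model, nonzero = person foreground.
    """
    target = image.astype(np.float32) / 255.0
    condition = target.copy()
    # Assumed convention: pixels the model must generate are set to -1.0,
    # so only the foreground pixels carry real image content.
    condition[~fg_mask.astype(bool)] = -1.0
    return condition, target
```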

5.2 Post-Processing to a Specified Color

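Building on this, to generate a background in a specified color, we first segment the foreground of the white-background image and then transform the background's color with a color matcher [10]. Specifically, the color matcher takes a color reference image and the background of the white-background image, and maps that background to the reference image's colors through a linear transformation. Finally, we composite the foreground with the transformed background to obtain the final image.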
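
The three steps above (segment, linearly recolor the background, composite) can be sketched as follows. This is a simplified stand-in, not the method of [10]: it matches each channel's mean and standard deviation independently, whereas the linear mapping in [10] works with full color covariance matrices. The function name `recolor_background` and its parameters are hypothetical, and the foreground mask is assumed to come from a separate segmentation model.

```python
import numpy as np

def recolor_background(image, fg_mask, reference, eps=1e-6):
    """Recolor only the background of `image` toward `reference`.

    image:     (H, W, 3) float array in [0, 1], the white-background result.
    fg_mask:   (H, W) mask, nonzero = foreground (kept untouched).
    reference: (h, w, 3) float array in [0, 1], the color reference image.
    """
    bg = ~fg_mask.astype(bool)
    src = image[bg]                          # (N, 3) background pixels
    ref = reference.reshape(-1, 3)
    mu_s, sd_s = src.mean(axis=0), src.std(axis=0)
    mu_t, sd_t = ref.mean(axis=0), ref.std(axis=0)
    # Per-channel affine map: align the background statistics with the reference.
    scale = sd_t / (sd_s + eps)
    mapped = (src - mu_s) * scale + mu_t
    out = image.copy()
    # Composite: transformed background, original foreground.
    out[bg] = np.clip(mapped, 0.0, 1.0)
    return out
```

Because the map is affine in the background pixels, relative brightness differences in the white background are rescaled rather than discarded, as long as the reference image itself has some variation.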

5.3 Results

Team paper:

Hongyu Chen, Yiqi Gao, Min Zhou, Peng Wang, Xubin Li, Tiezheng Ge, Bo Zheng. Enhancing Prompt Following with Visual Control Through Training-Free Mask-Guided Diffusion. arXiv preprint arXiv:2404.14768, 2024.

Other references:

[1] https://stability.ai/news/stable-diffusion-public-release

[2] Lvmin Zhang, Anyi Rao, Maneesh Agrawala. Adding Conditional Control to Text-to-Image Diffusion Models. ICCV 2023: 3813-3824.

[3] Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, and Ferhan Ture. What the DAAM: Interpreting Stable Diffusion Using Cross Attention. ACL 2023: 5644-5659.

[4] Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, and Daniel Cohen-Or. Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models. ACM Transactions on Graphics (TOG), 42(4):1–10, 2023.

[5] Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, and Gal Chechik. Linguistic binding in diffusion models: Enhancing attribute correspondence through attention map alignment. NeurIPS 2023.

[6] Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, and William Yang Wang. Training-free structured diffusion guidance for compositional text-to-image synthesis. arXiv preprint arXiv:2212.05032, 2022.

[7] https://civitai.com/models/56519/negativehand-negative-embedding

[8] https://civitai.com/models/200255/hands-xl-sd-15

[9] Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang. Common Diffusion Noise Schedules and Sample Steps are Flawed. WACV 2024: 5392-5399.

[10] F. Pitie and A. Kokaram, "The linear Monge-Kantorovitch linear colour mapping for example-based colour transfer," 4th European Conference on Visual Media Production, London, 2007, pp. 1-9, doi: 10.1049/cp:20070055.
