BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

Created by MG96

External Public cs.CV cs.AI cs.MM

Statistics

Citations
0
References
0
Last updated
Loading...
Authors

Yaowei Li Lingen Li Zhaoyang Zhang Xiaoyu Li Guangzhi Wang Hongxiang Li Xiaodong Cun Ying Shan Yuexian Zou
Project Resources

Name Type Source Actions
ArXiv Paper Paper arXiv
Abstract

Element-level visual manipulation is essential in digital content creation, but current diffusion-based methods lack the precision and flexibility of traditional tools. In this work, we introduce BlobCtrl, a framework that unifies element-level generation and editing using a probabilistic blob-based representation. By employing blobs as visual primitives, our approach effectively decouples and represents spatial location, semantic content, and identity information, enabling precise element-level manipulation. Our key contributions include: 1) a dual-branch diffusion architecture with hierarchical feature fusion for seamless foreground-background integration; 2) a self-supervised training paradigm with tailored data augmentation and score functions; and 3) controllable dropout strategies to balance fidelity and diversity. To support further research, we introduce BlobData for large-scale training and BlobBench for systematic evaluation. Experiments show that BlobCtrl excels in various element-level manipulation tasks while maintaining computational efficiency, offering a practical solution for precise and flexible visual content creation. Project page: https://liyaowei-stu.github.io/project/BlobCtrl/

Note:

No note available for this project.

No note available for this project.
Contact:

No contact available for this project.

No contact available for this project.