When we combine an IP adapter with a segmentation model, we can make fine-tuned adjustments to specific areas of an image. These adjustments can look significantly better than those achieved through traditional in-painting coupled with a ControlNet method.
The workflow featured in the video is broken into three components:
- Basic workflow (loading checkpoint, prompts, KSampler, etc.)
- IP Adapter nodes
- Segmentation nodes
IP Adapter Nodes
This part of the workflow uses four nodes: Load Image, IPAdapter Unified Loader, Load CLIP Vision, and IPAdapter Advanced.
The goal is to capture the style of the reference image and pass that information along with the model, so the sampler can apply that style to the output.
While the example in the video uses a Hawaiian shirt, the input can be any image whose style you want to extract. This could be another shirt or fabric style, a painting, a monogram, etc.
🧐 Power-up: You can learn more about IPAdapters in this in-depth guide provided by HuggingFace.
Segment Anything Nodes
Borrowing from the popular SD WebUI Segment Anything extension, the Segment Anything custom nodes for ComfyUI provided by storyicon take a text prompt and, if the described object is found, segment it from the image. Both extensions are based on Grounding DINO.
The video demonstrates how to connect a Load Image node along with SAMModelLoader, GroundingDinoModelLoader, and the GroundingDinoSAMSegment nodes.
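As a rough picture of that wiring, the sketch below models the four nodes as a plain Python dictionary and lists everything that feeds the segment node. It is not the ComfyUI workflow format: the node ids, the slot names (`image`, `sam_model`, `grounding_dino_model`), and the `"shirt"` prompt are assumptions made up for illustration. Only the node names and the 0.30 threshold come from this guide.

```python
# Illustrative wiring only. Slot names and node ids are assumptions,
# not the extension's confirmed socket names.
graph = {
    "load_image": {"type": "Load Image", "inputs": {}},
    "sam_loader": {"type": "SAMModelLoader", "inputs": {}},
    "dino_loader": {"type": "GroundingDinoModelLoader", "inputs": {}},
    "segment": {
        "type": "GroundingDinoSAMSegment",
        "inputs": {
            "image": "load_image",
            "sam_model": "sam_loader",
            "grounding_dino_model": "dino_loader",
        },
        # Hypothetical prompt; 0.30 is the default threshold mentioned in this guide.
        "params": {"prompt": "shirt", "threshold": 0.30},
    },
}


def upstream_of(graph, node_id):
    """Return the sorted ids of every node that feeds into node_id."""
    seen = set()
    stack = [node_id]
    while stack:
        current = stack.pop()
        for source in graph[current]["inputs"].values():
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return sorted(seen)


print(upstream_of(graph, "segment"))
```

The point of the picture is that the segment node has three upstream dependencies: the image and the two model loaders. If any of them is left unconnected, the segment node has nothing to run on.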
One important setting that wasn’t covered in the video is the threshold value in the GroundingDinoSAMSegment node. A lower value may select more of the image, whereas a higher value is more selective. The catch is that a value set too high may result in nothing being selected at all.
Generally, the default value of 0.30 works well for most cases. Use the Convert Mask to Image node if you want to review the segment.
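To build intuition for the threshold, here is a simplified sketch that treats it as a cutoff on detection confidence. This is an assumed mental model, not the node's actual code: the `keep_detections` helper and the scores are invented for illustration.

```python
def keep_detections(detections, threshold=0.30):
    """Keep only detections whose confidence score exceeds the threshold."""
    return [d for d in detections if d["score"] > threshold]


# Made-up candidate regions for a hypothetical "shirt" prompt.
candidates = [
    {"label": "shirt", "score": 0.62},
    {"label": "shirt", "score": 0.34},
    {"label": "shirt", "score": 0.12},
]

for t in (0.10, 0.30, 0.70):
    kept = keep_detections(candidates, t)
    print(f"threshold={t:.2f} keeps {len(kept)} region(s)")
```

With these invented scores, a threshold of 0.10 keeps all three regions, 0.30 keeps two, and 0.70 keeps none, which mirrors the empty-selection problem described above.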
Important Notes
- Inpainting Checkpoint: Using an inpainting model, preferably SDXL, will improve results.
- VRAM Requirements: The segmentation models used in this workflow can be VRAM intensive, so ensure sufficient VRAM is available.
- Style Transfer Limitations: Transfers are not perfect reproductions of the reference, and details will not necessarily land in the same place (e.g., a flower on the sleeve of the Hawaiian shirt may not appear on the model's sleeve).
- Limited by Segment Bounds: Only the segmented area receives changes. If you use this method to change outfits, the new style is applied within the existing garment's outline, so the reference's length, cut, and other details outside that outline will not be transferred.
- Textual Prompt: Keep the prompt simple and just describe the item whose style is being transferred.
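The segment-bounds limitation noted above can be pictured as a masked blend: new pixels are used where the mask is 1, and the original pixels are kept everywhere else. The sketch below is a toy illustration on tiny grayscale grids, assuming a hard 0/1 mask. The `composite` helper and the pixel values are made up and are not part of the workflow.

```python
def composite(original, generated, mask):
    """Blend two same-sized grayscale images pixel by pixel using a 0..1 mask."""
    return [
        [m * g + (1 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


original = [[10, 10, 10], [10, 10, 10]]         # untouched photo
generated = [[200, 200, 200], [200, 200, 200]]  # restyled pixels
mask = [[0, 1, 1], [0, 0, 1]]                   # 1 = inside the segment

result = composite(original, generated, mask)
print(result)
```

Every pixel where the mask is 0 keeps its original value of 10, so anything the segment does not cover, such as a longer hem on the reference garment, has no way of appearing in the output.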
