A TouchDesigner implementation of Meta's SAM 2 and SAM 3 models and the YOLO11-seg models. A flexible backend handles both realtime segmentation via YOLO11-seg (TD integration in progress) and slower, automatic whole-image segmentation via Meta's SAM.
Multi-backend segmentation toolkit for TouchDesigner — YOLO11, Meta SAM 2 & SAM 3 in one component
---
The Vision
Projection mapping has always felt like magic to me — watching light reshape physical objects, giving static surfaces a living, breathing quality. But there's a bottleneck that every projection artist knows too well: masking.
You photograph your subject. You trace each element by hand. You create dozens of masks for different surfaces. And if you want to apply different effects to different parts? More manual work.
I asked myself: What if AI could do the segmentation, and I could focus on the art?
segment-everything-td is my answer. Take a photo of your projection target, pass it through a single TouchDesigner component, and receive individual masks for every detected element — ready to route through StreamDiffusion, apply unique effects, and composite back together.
Imagine photographing a building facade and instantly having separate masks for every window, door, and architectural detail. Or capturing a performer and segmenting their silhouette from the background in one click. That's the workflow this enables.
---
What It Does
At its core, Segment Everything TD provides automatic mask generation through three different AI backends:
| Backend | Speed | Best For |
|---------|-----------|-------------------------------------------------|
| YOLO11 | ~30ms | Realtime segmentation, 80 object classes |
| SAM 2 | 2-10 min (CPU) | Zero-shot, segment anything with visual prompts |
| SAM 3 | ~30ms GPU | Text prompts — "segment the person in red" |
You provide an image TOP. The component saves it, runs Python externally (keeping TD responsive), and dynamically creates MovieFileIn + OUT TOPs for each detected mask. Progress reporting shows you exactly where processing stands: "Point 128/256 (50%)".
When it's done, you have clean binary masks ready for compositing, effects routing, or feeding into StreamDiffusion.
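That hand-off can be sketched with the standard library alone. The `output/<run-id>/request.json` layout and the worker script name `segment_worker.py` are assumptions for illustration; the component's actual paths and filenames may differ:

```python
import json
import subprocess
import sys
import uuid
from pathlib import Path

RUNS = Path("output")  # hypothetical root for per-run output directories

def launch_run(image_path: str, backend: str = "sam2") -> Path:
    """Write a request file and spawn the external worker without blocking TD."""
    run_id = uuid.uuid4().hex
    run_dir = RUNS / run_id
    run_dir.mkdir(parents=True, exist_ok=True)
    request = {"image": image_path, "backend": backend, "run_id": run_id}
    (run_dir / "request.json").write_text(json.dumps(request))
    # Popen (not run) returns immediately, so the TD cook loop stays responsive
    # while the worker segments the image and drops masks into run_dir.
    subprocess.Popen([sys.executable, "segment_worker.py", str(run_dir)])
    return run_dir
```

The key design point is that TD only ever touches the filesystem: it writes a request, and later watches the run directory for mask files to wire into MovieFileIn TOPs.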
---
Phase 1: Foundation
Started with YOLO11-seg for realtime webcam segmentation. Got it running at 30-60 FPS on Apple Silicon. The Python side was solid, but TD integration? That's where the real work began.
Phase 2: Meta SAM Integration
Integrated SAM 2 and SAM 3 through Ultralytics. SAM 2 gives you incredible zero-shot segmentation — it can segment objects it's never seen before. SAM 3 adds text prompting, so you can literally type "person with red shirt" and it finds every matching instance.
The tradeoff: SAM is slow. A 16x16 point grid (256 points) takes ~2.5 minutes on CPU. But for projection mapping prep work, that's acceptable — you're doing this once per venue, not per frame.
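For intuition, the 16x16 grid is just evenly spaced point prompts across the image. A minimal sketch of building one in normalized coordinates (Ultralytics handles this internally; the exact spacing here is an assumption):

```python
def point_grid(n: int = 16):
    """Return n*n (x, y) point prompts, evenly spaced in [0, 1] coordinates."""
    step = 1.0 / n
    # Offset by half a step so points sit at cell centers, not cell corners.
    return [((i + 0.5) * step, (j + 0.5) * step)
            for j in range(n) for i in range(n)]
```

Each of the 256 points becomes one SAM query, which is why the grid size drives the runtime almost linearly.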
Phase 3: TouchDesigner Integration (The Hard Part)
This is where I learned that TouchDesigner's Python environment is its own beast.
The first subprocess calls failed silently. No output, no errors, nothing. After hours of debugging, I discovered TD's Python paths were polluting the subprocess environment — my venv was trying to load TD's bundled OpenCV instead of its own.
Other challenges solved:
- UUID mismatch: TD generated one ID, Python generated another. Output files went to different directories than the file watcher expected.
- Progress reporting: SAM can take minutes. Users need feedback. Implemented file-based IPC with progress.json polling every frame.
- Dynamic operator creation: Masks are created at runtime. Had to generate MovieFileIn + OUT TOP pairs on the fly, arranged in a clean vertical layout.
Phase 4: Polish & StreamDiffusion Integration
The dream workflow came together: photograph subject → segment → route each mask through StreamDiffusionTD → composite. Each element of the projection target getting its own AI-generated treatment, then reassembled into a cohesive whole.
---
Features
SegmentationCOMP
- Drag-and-drop .tox — works in any TD project
- Backend selection — YOLO (fast), SAM2 (accurate), SAM3 (text prompts)
- Segment button — one click to process
- Clear Masks button — remove all generated TOPs
- Progress display — live updates during processing
- Dynamic outputs — masks appear as OUT TOPs automatically
Technical
- Non-blocking — TD stays responsive during processing
- UUID isolation — each run gets its own output directory
- Clean subprocess environment — no TD Python conflicts
- Cross-platform — works on Mac, Linux, Windows (Mac tested extensively)
---
Demo: Projection Mapping Workflow
1. Photograph your projection target
2. Load image into TD, connect to SegmentationCOMP
3. Select backend (SAM2 for best quality)
4. Click Segment
5. Wait for masks to appear (~2-3 min)
6. Route each out_mask through StreamDiffusionTD
7. Apply unique prompts per segment
8. Composite everything back together
9. Project onto original surface
Each architectural element, each body part, each object — transformed independently, unified in the final output.
---
What's Next
On the roadmap:
- Interactive point selection in TD (click to segment)
- Text prompt input field (type what to segment)
- Realtime YOLO integration (script works, TD hookup remaining)
- Video segmentation with temporal consistency
The bigger vision:
This component is designed to be backend-agnostic. Today it's YOLO and SAM. Tomorrow it could be any segmentation model. The architecture — subprocess isolation, file-based IPC, dynamic operator creation — supports whatever comes next.
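One way such a backend-agnostic design might look, as a sketch (the registry decorator and stub backend are illustrative, not the component's actual code):

```python
from typing import Callable, Dict, List

# A backend maps an image path to a list of mask file paths on disk.
Backend = Callable[[str], List[str]]

BACKENDS: Dict[str, Backend] = {}

def register(name: str):
    """Decorator: new segmentation backends plug in without touching core code."""
    def wrap(fn: Backend) -> Backend:
        BACKENDS[name] = fn
        return fn
    return wrap

@register("yolo")
def yolo_backend(image: str) -> List[str]:
    # Stub: a real backend would run YOLO11-seg and write one PNG per mask.
    return [f"{image}.mask0.png"]

def segment(image: str, backend: str) -> List[str]:
    return BACKENDS[backend](image)
```

Because the TD side only consumes mask files from a run directory, swapping or adding a backend never touches the dynamic operator creation or the progress IPC.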
---
Technical Specs
| Metric | Value |
|------------|--------------------------------|
| Languages | Python, TouchDesigner |
| ML Models | YOLO11-seg, SAM 2, SAM 3 |
| Framework | Ultralytics |
| TD Version | 2023.10000+ |
| Platforms | macOS (tested), Windows, Linux |
---
Why I Built This
The creative coding community has given me so much. Every TouchDesigner tutorial, every Discord answer, every shared .tox file — it all adds up.
Segment Everything TD is my contribution back. It's the tool I wished existed when I started exploring projection mapping. It's designed to be:
- Accessible — drag in the .tox, configure two paths, go
- Extensible — clean architecture for adding new backends
- Documented — comprehensive guides included
---
Credits & Acknowledgments
- Daydream.live — for hosting this hackathon and making StreamDiffusion accessible
- Ultralytics — YOLO11 and SAM integration that made this possible
- Meta AI — SAM 2 and SAM 3 models
- Derivative — TouchDesigner, the canvas for all of this
- The TD Discord — countless answers to my subprocess questions
- Torin and Andrew — for the immensely valuable support and direction
---
Try It
GitHub: https://github.com/ehfazrezwan/segment-everything-td
Includes:
- Complete Python source
- SegmentationCOMP.tox
- Full documentation
- Example TD project
---
Final Thoughts
There's something poetic about using AI to break images apart so we can put them back together more beautifully. Segmentation isn't the end — it's the beginning. It's the moment where a single image becomes a palette of possibilities.
I built Segment Everything TD because I believe the future of projection mapping is intelligent — where the tedious work is automated and artists can focus on what matters: the vision.
Take a picture. Segment everything. Create something new.
---
Project Status: v1.0 — Image segmentation functional, realtime mode in progress
Difficulty: Beginner-friendly setup, intermediate TD knowledge helpful
Last Updated: January 2026