How ComfyUI Is Becoming the Core of My Modular AI Backend (And Why It Should Be Yours Too)

After months of creative deep-diving and system design, I’m finally pulling back the curtain on how I’m evolving my AI infrastructure. At the heart of it all? ComfyUI: a tool that started as a visual image-generation interface but is growing into a modular, GPU-accelerated powerhouse that can drive nearly every major local AI function I use.

This isn’t just another AI stack. It’s Samaritan—my custom Linux-based AI backend. And ComfyUI is the hub.


🚀 Why ComfyUI?

Originally built for node-based Stable Diffusion workflows, ComfyUI has exploded in capability:

  • Supports Image → Video, upscaling, enhancement, interpolation, and even video inpainting.
  • New node packs now support local LLM calls, TTS/STT, face restoration, segmentation, motion modeling, and more.
  • Highly customizable, GPU-accelerated, and runs great in containerized environments.
  • At this point, ComfyUI is limited only by our dreams… and maybe our hardware. 😄

For someone like me, juggling storytelling, music production, AI experimentation, and more, it’s the first tool that feels like it was built for the future, thanks to its innovative pipeline/workflow design. One interface to rule them all.


🧱 A Modular AI Backend Built for the Real World

Here’s how I’m structuring my AI system:

Main Host: Ubuntu Linux-based system with Docker and GPU passthrough

ComfyUI Containers by Function:

  • 8188: Visual Generation (images, inpainting, vid2vid)
  • 8189: Face Restoration, Upscaling (Real-ESRGAN, CodeFormer, etc.)
  • 8190: Voice + TTS workflows
  • 8191: LLM API chaining (Ollama, GPTQ, etc.)
  • 8192: Storyboarding/Comic layout design
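
The per-function containers above can be sketched with plain `docker run` commands. This is a hypothetical sketch, not my exact Samaritan configuration: the image name (`comfyui:latest`) and the GPU device assignments are placeholders, and the `--gpus` flag assumes the NVIDIA Container Toolkit is installed. ComfyUI listens on 8188 inside each container, so the host port is what distinguishes the role:

```shell
# One ComfyUI container per function; host port = role, container port = 8188.
docker run -d --name comfy-visual     --gpus '"device=0"' -p 8188:8188 comfyui:latest
docker run -d --name comfy-restore    --gpus '"device=0"' -p 8189:8188 comfyui:latest
docker run -d --name comfy-voice     --gpus '"device=1"' -p 8190:8188 comfyui:latest
docker run -d --name comfy-llm       --gpus '"device=1"' -p 8191:8188 comfyui:latest
docker run -d --name comfy-storyboard                    -p 8192:8188 comfyui:latest
```

Pinning specific GPU devices per container is what makes the "dedicated GPU resources per container" benefit below work in practice.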

Benefits of this setup:

  • Dedicated GPUs or CPU resources per container
  • Easy updates + rollback
  • Reusable workflows per task
  • Port-forwarded access from my Windows-based creative workstation (‘The Machine’)
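
That last bullet, reaching the containers from a Windows workstation, can be done with a plain SSH tunnel. A minimal sketch, assuming the Linux host is reachable as `samaritan` (a placeholder hostname) and you want the visual-generation container:

```shell
# Forward host port 8188 to the workstation; -N means "tunnel only, no shell".
# Windows 10+ ships an OpenSSH client, so this also works from PowerShell.
ssh -N -L 8188:localhost:8188 user@samaritan
# Then open http://localhost:8188 in a browser on the workstation.
```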

🎤 From Creation to Command Center

What started as a tool to design character imagery now lets me:

  • Generate full music video storyboards
  • Run local LLM-based character scripts
  • Generate TTS vocals for placeholder parts
  • Create visual effects to blend into DaVinci Resolve edits
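
The local-LLM step above can be sketched as a call to Ollama’s HTTP API (Ollama serves on port 11434 by default). The model name, prompt template, and function names here are placeholders for illustration, not my actual workflow:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_character_prompt(character: str, scene: str, model: str = "llama3") -> dict:
    """Assemble a payload for Ollama's /api/generate endpoint.
    The prompt template is a placeholder."""
    return {
        "model": model,
        "prompt": f"Write a short monologue for {character} in this scene: {scene}",
        "stream": False,  # return one JSON object instead of a token stream
    }

def run_script_beat(character: str, scene: str) -> str:
    """POST the prompt and return the generated text (needs a running Ollama)."""
    payload = json.dumps(build_character_prompt(character, scene)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The same pattern generalizes: each container in the stack is just an HTTP endpoint, so chaining storyboard → script → TTS is a matter of wiring these calls together.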

And the best part? All of it lives in a system I control. Private. Reproducible. Modular.


🔗 How This Fits the Bigger Picture

I’ve realized I don’t need to choose between being a creative and being a techie. By building a modular backend like this, I’ve unlocked the ability to:

  • Switch tasks fast without reconfiguring my system
  • Delegate heavier tasks to the cloud when needed
  • Customize tools that align with the story I’m trying to tell

If you’re a technical creator or visionary hybrid like me, this isn’t just workflow optimization—it’s about owning your creative future.


Follow me on LinkedIn, Patreon, YouTube, pcSHOWme.net, and more to see how this system (Samaritan) grows and powers much of what I create, from AI music videos to fully self-hosted infrastructure. More behind-the-scenes, more insight, more heart… “To Infinity and Beyond”, “To go boldly where no man has gone before”…

Jim Bodden
pcSHOWme | Synergistic Harmony | Faith-in-Action Studio Life
