AI Toolbox
Unified Interface for ML Pipelines
A web-based dashboard that consolidates disparate AI tools—text-to-speech, image generation, transcription—into a single, manageable interface. Eliminates CLI complexity for creative workflows.
Problem
AI/ML tools require command-line expertise, have inconsistent interfaces, and create fragmented workflows. Users waste time switching between terminals, managing file paths, and remembering command syntax instead of focusing on creative output.
CLI Complexity
Every tool has different commands, flags, and syntax to remember
Fragmented Workflows
Switching between terminals, managing paths, organizing outputs
Inconsistent Setup
Each tool requires separate installation and configuration
Architecture
The system follows a modular plugin architecture: a FastAPI web frontend provides the UI and API, while the tool registry pattern enables dynamic tool discovery. Each tool defines its parameters declaratively, allowing auto-generated forms. Jobs execute asynchronously with real-time status tracking and organized output management.
Web Frontend
- FastAPI with Jinja2 templating
- HTMX for dynamic interactions
- Auto-generated forms from schemas
- File upload handling with validation (see the sketch after this list)
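The upload path itself is not part of the code excerpt further down. As a rough sketch of the validation step, a route along these lines could check type and size before staging a file for a job; the route, size limit, allowed extensions, and uploads/ layout are assumptions, not the project's actual code.

# Hypothetical sketch: validate and stage an uploaded input file for a job.
# The route, size limit, allowed extensions, and uploads/ layout are illustrative.
from pathlib import Path
from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()

UPLOAD_DIR = Path("uploads")
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".png", ".jpg", ".txt"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50 MB cap

@app.post("/jobs/{job_id}/files")
async def save_upload(job_id: str, upload: UploadFile) -> dict:
    """Reject unexpected file types and oversized payloads, then stage the file."""
    safe_name = Path(upload.filename or "upload").name
    if Path(safe_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise HTTPException(status_code=400, detail="Unsupported file type")
    data = await upload.read()
    if len(data) > MAX_UPLOAD_BYTES:
        raise HTTPException(status_code=413, detail="File too large")
    dest = UPLOAD_DIR / job_id / safe_name
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return {"saved": str(dest)}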
Tool Registry
- Declarative tool definitions
- Parameter validation rules
- Category-based organization
- Zero-code tool addition
Job System
- Async job execution with asyncio
- Persistent job state in a JSON file (see the sketch after this list)
- Real-time status polling API
- Output file organization by job ID
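The job store behind get_job in the run_job excerpt below is not shown; a minimal JSON-backed version might look roughly like this. The jobs.json file name and the in-memory mirror are assumptions.

# Hypothetical sketch of a JSON-backed job store. Jobs are kept in memory and
# mirrored to jobs.json so state survives a restart; names here are illustrative.
import json
from pathlib import Path

JOBS_FILE = Path("jobs.json")

def load_jobs() -> dict:
    """Read the persisted job table, or start empty on first run."""
    if JOBS_FILE.exists():
        return json.loads(JOBS_FILE.read_text())
    return {}

def save_jobs(jobs: dict) -> None:
    """Write the full job table back to disk."""
    JOBS_FILE.write_text(json.dumps(jobs, indent=2))

JOBS: dict[str, dict] = load_jobs()  # in-memory table shared by the web routes

def get_job(job_id: str) -> dict:
    """Return the mutable in-memory record for a job."""
    return JOBS[job_id]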
Tool Runners
- •Docker-based isolation (Kokoro TTS)
- •Direct process execution
- •Working directory isolation per tool
- •Environment variable management
Technical Approach
Built a FastAPI backend with a plugin-style tool registry pattern. Each tool defines its parameters, validation rules, and execution command in a declarative configuration. Jobs run asynchronously with status tracking and output management. The web interface auto-generates forms from tool schemas, eliminating manual UI development for new tools.
from dataclasses import dataclass
from datetime import datetime
import asyncio


@dataclass
class Tool:
    id: str
    name: str
    description: str
    category: str
    parameters: list[ToolParameter]
    working_dir: str
    command_template: str
    output_type: str = "file"
    output_pattern: str = "*"
# Tool registry - declarative definitions
TOOLS: dict[str, Tool] = {
    "kokoro-tts": Tool(
        id="kokoro-tts",
        name="Kokoro TTS",
        description="High-quality text-to-speech with multiple voices",
        category="Audio",
        working_dir="/projects/kokoro-tts",
        command_template=(
            "docker run --rm -v {output_dir}:/out "
            "kokoro-tts:0.2 --voice {voice} --text '{text}' "
            "--out /out/{output_name}"
        ),
        parameters=[
            ToolParameter(name="text", type="textarea", ...),
            ToolParameter(name="voice", type="select", ...),
            ToolParameter(name="speed", type="number", ...),
        ],
    ),
    # ... more tools
}


async def run_job(job_id: str, tool: Tool) -> None:
"""Execute a tool job asynchronously."""
job = get_job(job_id)
job["status"] = "running"
job["started_at"] = datetime.now().isoformat()
# Build command from template
cmd, work_dir = build_command(tool, job["parameters"], job_id)
# Run in subprocess with asyncio
process = await asyncio.create_subprocess_shell(
cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=work_dir,
)
stdout, stderr = await process.communicate()
if process.returncode == 0:
job["status"] = "completed"
job["output"] = {"files": find_output_files(job_id)}
else:
job["status"] = "failed"
job["error"] = stderr.decode()Key Technical Decisions
Tool Registry Pattern
Instead of hardcoding each tool, I created a declarative registry where each tool defines its parameters, validation rules, and command template. This enables adding new tools without writing UI code.
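As an illustration of how new tools avoid UI work, a form can be rendered straight from the registry entry. This sketch assumes ToolParameter exposes name, type, and choices fields, which is not confirmed by the excerpt above.

# Hypothetical sketch: turn a tool's declared parameters into form markup.
# Assumes ToolParameter has .name, .type, and .choices; real field names may differ.
def render_field(param) -> str:
    if param.type == "textarea":
        return f'<textarea name="{param.name}"></textarea>'
    if param.type == "select":
        options = "".join(f'<option value="{c}">{c}</option>' for c in param.choices)
        return f'<select name="{param.name}">{options}</select>'
    if param.type == "number":
        return f'<input type="number" name="{param.name}">'
    return f'<input type="text" name="{param.name}">'

def render_form(tool: Tool) -> str:
    """Build a submission form for any registered tool without hand-written UI."""
    fields = "".join(f"<label>{p.name}</label>{render_field(p)}" for p in tool.parameters)
    return f'<form hx-post="/tools/{tool.id}/run">{fields}<button>Run</button></form>'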
Async Job System
AI/ML operations can take minutes (image generation) or seconds (TTS). Using asyncio subprocesses allows handling multiple concurrent jobs without blocking the web server.
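One way this can look in practice is a run endpoint that schedules the run_job coroutine on the event loop and returns immediately; the route shape, uuid-based job IDs, and reuse of the job-store sketch above are assumptions.

# Hypothetical sketch: start a job without blocking the request handler.
# Reuses app, TOOLS, JOBS, save_jobs, and run_job from the excerpts above.
import asyncio
import uuid

_background_tasks: set[asyncio.Task] = set()  # keep references so tasks aren't GC'd

@app.post("/tools/{tool_id}/run")
async def start_job(tool_id: str, params: dict) -> dict:
    tool = TOOLS[tool_id]
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued", "tool": tool_id, "parameters": params}
    save_jobs(JOBS)
    task = asyncio.create_task(run_job(job_id, tool))  # runs concurrently with other jobs
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return {"job_id": job_id, "status": "queued"}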
Working Directory Isolation
Each tool runs in its own working directory with its own Python virtual environment. This prevents dependency conflicts between tools using different library versions.
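build_command is called in run_job but not shown above; here is a rough sketch under the assumption that each tool keeps a .venv inside its working directory and outputs land in outputs/<job_id>.

# Hypothetical sketch of build_command: fill the declarative template and pin the
# job to the tool's own directory and interpreter. Paths and naming are assumptions.
from pathlib import Path

OUTPUT_ROOT = Path("outputs")

def build_command(tool: Tool, params: dict, job_id: str) -> tuple[str, str]:
    output_dir = OUTPUT_ROOT / job_id
    output_dir.mkdir(parents=True, exist_ok=True)

    work_dir = Path(tool.working_dir)
    venv_python = work_dir / ".venv" / "bin" / "python"  # per-tool interpreter

    # Real code should shell-escape user-supplied values (e.g. with shlex.quote)
    # before interpolating them into a shell command.
    cmd = tool.command_template.format(
        python=venv_python,               # non-Docker tools can reference the venv
        output_dir=output_dir.resolve(),
        output_name=job_id,               # extension handling omitted for brevity
        **params,
    )
    return cmd, str(work_dir)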
HTMX Over React
For a tool dashboard with mostly forms and status updates, HTMX provides server-rendered HTML with minimal JavaScript. Faster to develop, easier to maintain.
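For example, job status can be polled with a small server-rendered fragment; the route, the two-second interval, and the markup here are illustrative rather than the project's actual templates.

# Hypothetical sketch: an HTML fragment that HTMX keeps re-requesting until the
# job finishes. Reuses app and get_job from the sketches above.
from fastapi.responses import HTMLResponse

@app.get("/jobs/{job_id}/status", response_class=HTMLResponse)
async def job_status(job_id: str) -> str:
    job = get_job(job_id)
    if job["status"] in ("queued", "running"):
        # hx-trigger="every 2s" re-polls; hx-swap="outerHTML" replaces this div in place.
        return (
            f'<div hx-get="/jobs/{job_id}/status" hx-trigger="every 2s" '
            f'hx-swap="outerHTML">Status: {job["status"]}</div>'
        )
    return f'<div>Status: {job["status"]}</div>'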
Integrated Tools
Kokoro TTS
High-quality text-to-speech with 20+ voices, prosody control, and Docker isolation
Whisper Transcription
Local speech-to-text using OpenAI Whisper server for privacy and speed
Image Generation
DALL-E, Gemini, and Qwen-Image-2512 integration for AI image creation
PostFlow Integration
Schedule social media posts directly from the toolbox workflow
TextOverlay
Auto-fitting text overlays on images via JSON configuration
Voice Cloning
Qwen TTS-based voice cloning with time-stretching and pitch preservation
Tech Stack
Python, FastAPI, Jinja2, HTMX, asyncio, Docker
Results & Outcomes
Unified 5+ AI tools under a single interface
Async job processing with progress tracking
Auto-generated forms from tool schemas
Organized output management with download links
Zero-downtime tool additions via registry pattern
Live Demo
Explore the interactive dashboard showing real-time job status, tool selection, and output management. This is a read-only demonstration of the system interface.
Open Dashboard Demo