AI Toolbox
Unified Interface for ML Pipelines
A web-based dashboard that consolidates disparate AI tools—text-to-speech, image generation, transcription—into a single, manageable interface. Eliminates CLI complexity for creative workflows.
Problem
AI/ML tools require command-line expertise, have inconsistent interfaces, and create fragmented workflows. Users waste time switching between terminals, managing file paths, and remembering command syntax instead of focusing on creative output.
CLI Complexity
Every tool has different commands, flags, and syntax to remember
Fragmented Workflows
Switching between terminals, managing paths, organizing outputs
Inconsistent Setup
Each tool requires separate installation and configuration
Architecture
The system follows a modular plugin architecture: a FastAPI web frontend provides the UI and API, while the tool registry pattern enables dynamic tool discovery. Each tool defines its parameters declaratively, allowing auto-generated forms. Jobs execute asynchronously with real-time status tracking and organized output management.
Web Frontend
- FastAPI with Jinja2 templating
- HTMX for dynamic interactions
- Auto-generated forms from schemas
- File upload handling with validation (see the sketch after this list)
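The upload path itself is not part of the code excerpt further down. As a rough sketch of the validation step, a route along these lines could check type and size before staging a file for a job; the route, size limit, allowed extensions, and uploads/ layout are assumptions, not the project's actual code.

# Hypothetical sketch: validate and stage an uploaded input file for a job.
# The route, size limit, allowed extensions, and uploads/ layout are illustrative.
from pathlib import Path
from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()

UPLOAD_DIR = Path("uploads")
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".png", ".jpg", ".txt"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50 MB cap

@app.post("/jobs/{job_id}/files")
async def save_upload(job_id: str, upload: UploadFile) -> dict:
    """Reject unexpected file types and oversized payloads, then stage the file."""
    safe_name = Path(upload.filename or "upload").name
    if Path(safe_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise HTTPException(status_code=400, detail="Unsupported file type")
    data = await upload.read()
    if len(data) > MAX_UPLOAD_BYTES:
        raise HTTPException(status_code=413, detail="File too large")
    dest = UPLOAD_DIR / job_id / safe_name
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return {"saved": str(dest)}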
Tool Registry
- Declarative tool definitions
- Parameter validation rules
- Category-based organization
- Zero-code tool addition
Job System
- Async job execution with asyncio
- Persistent job state in a JSON file (see the sketch after this list)
- Real-time status polling API
- Output file organization by job ID
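The job store behind get_job in the run_job excerpt below is not shown; a minimal JSON-backed version might look roughly like this. The jobs.json file name and the in-memory mirror are assumptions.

# Hypothetical sketch of a JSON-backed job store. Jobs are kept in memory and
# mirrored to jobs.json so state survives a restart; names here are illustrative.
import json
from pathlib import Path

JOBS_FILE = Path("jobs.json")

def load_jobs() -> dict:
    """Read the persisted job table, or start empty on first run."""
    if JOBS_FILE.exists():
        return json.loads(JOBS_FILE.read_text())
    return {}

def save_jobs(jobs: dict) -> None:
    """Write the full job table back to disk."""
    JOBS_FILE.write_text(json.dumps(jobs, indent=2))

JOBS: dict[str, dict] = load_jobs()  # in-memory table shared by the web routes

def get_job(job_id: str) -> dict:
    """Return the mutable in-memory record for a job."""
    return JOBS[job_id]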
Tool Runners
- •Docker-based isolation (Kokoro TTS)
- •Direct process execution
- •Working directory isolation per tool
- •Environment variable management
Technical Approach
Built a FastAPI backend with a plugin-style tool registry pattern. Each tool defines its parameters, validation rules, and execution command in a declarative configuration. Jobs run asynchronously with status tracking and output management. The web interface auto-generates forms from tool schemas, eliminating manual UI development for new tools.
from dataclasses import dataclass
from datetime import datetime
import asyncio


@dataclass
class Tool:
    id: str
    name: str
    description: str
    category: str
    parameters: list[ToolParameter]
    working_dir: str
    command_template: str
    output_type: str = "file"
    output_pattern: str = "*"
# Tool registry - declarative definitions
TOOLS: dict[str, Tool] = {
    "kokoro-tts": Tool(
        id="kokoro-tts",
        name="Kokoro TTS",
        description="High-quality text-to-speech with multiple voices",
        category="Audio",
        working_dir="/projects/kokoro-tts",
        command_template=(
            "docker run --rm -v {output_dir}:/out "
            "kokoro-tts:0.2 --voice {voice} --text '{text}' "
            "--out /out/{output_name}"
        ),
        parameters=[
            ToolParameter(name="text", type="textarea", ...),
            ToolParameter(name="voice", type="select", ...),
            ToolParameter(name="speed", type="number", ...),
        ],
    ),
    # ... more tools
}


async def run_job(job_id: str, tool: Tool) -> None:
"""Execute a tool job asynchronously."""
job = get_job(job_id)
job["status"] = "running"
job["started_at"] = datetime.now().isoformat()
# Build command from template
cmd, work_dir = build_command(tool, job["parameters"], job_id)
# Run in subprocess with asyncio
process = await asyncio.create_subprocess_shell(
cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=work_dir,
)
stdout, stderr = await process.communicate()
if process.returncode == 0:
job["status"] = "completed"
job["output"] = {"files": find_output_files(job_id)}
else:
job["status"] = "failed"
job["error"] = stderr.decode()Key Technical Decisions
Tool Registry Pattern
Instead of hardcoding each tool, I created a declarative registry where each tool defines its parameters, validation rules, and command template. This enables adding new tools without writing UI code.
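As an illustration of how new tools avoid UI work, a form can be rendered straight from the registry entry. This sketch assumes ToolParameter exposes name, type, and choices fields, which is not confirmed by the excerpt above.

# Hypothetical sketch: turn a tool's declared parameters into form markup.
# Assumes ToolParameter has .name, .type, and .choices; real field names may differ.
def render_field(param) -> str:
    if param.type == "textarea":
        return f'<textarea name="{param.name}"></textarea>'
    if param.type == "select":
        options = "".join(f'<option value="{c}">{c}</option>' for c in param.choices)
        return f'<select name="{param.name}">{options}</select>'
    if param.type == "number":
        return f'<input type="number" name="{param.name}">'
    return f'<input type="text" name="{param.name}">'

def render_form(tool: Tool) -> str:
    """Build a submission form for any registered tool without hand-written UI."""
    fields = "".join(f"<label>{p.name}</label>{render_field(p)}" for p in tool.parameters)
    return f'<form hx-post="/tools/{tool.id}/run">{fields}<button>Run</button></form>'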
Async Job System
AI/ML operations can take minutes (image generation) or seconds (TTS). Using asyncio subprocesses allows handling multiple concurrent jobs without blocking the web server.
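One way this can look in practice is a run endpoint that schedules the run_job coroutine on the event loop and returns immediately; the route shape, uuid-based job IDs, and reuse of the job-store sketch above are assumptions.

# Hypothetical sketch: start a job without blocking the request handler.
# Reuses app, TOOLS, JOBS, save_jobs, and run_job from the excerpts above.
import asyncio
import uuid

_background_tasks: set[asyncio.Task] = set()  # keep references so tasks aren't GC'd

@app.post("/tools/{tool_id}/run")
async def start_job(tool_id: str, params: dict) -> dict:
    tool = TOOLS[tool_id]
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued", "tool": tool_id, "parameters": params}
    save_jobs(JOBS)
    task = asyncio.create_task(run_job(job_id, tool))  # runs concurrently with other jobs
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return {"job_id": job_id, "status": "queued"}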
Working Directory Isolation
Each tool runs in its own working directory with its own Python virtual environment. This prevents dependency conflicts between tools using different library versions.
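build_command is called in run_job but not shown above; here is a rough sketch under the assumption that each tool keeps a .venv inside its working directory and outputs land in outputs/<job_id>.

# Hypothetical sketch of build_command: fill the declarative template and pin the
# job to the tool's own directory and interpreter. Paths and naming are assumptions.
from pathlib import Path

OUTPUT_ROOT = Path("outputs")

def build_command(tool: Tool, params: dict, job_id: str) -> tuple[str, str]:
    output_dir = OUTPUT_ROOT / job_id
    output_dir.mkdir(parents=True, exist_ok=True)

    work_dir = Path(tool.working_dir)
    venv_python = work_dir / ".venv" / "bin" / "python"  # per-tool interpreter

    # Real code should shell-escape user-supplied values (e.g. with shlex.quote)
    # before interpolating them into a shell command.
    cmd = tool.command_template.format(
        python=venv_python,               # non-Docker tools can reference the venv
        output_dir=output_dir.resolve(),
        output_name=job_id,               # extension handling omitted for brevity
        **params,
    )
    return cmd, str(work_dir)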
HTMX Over React
For a tool dashboard with mostly forms and status updates, HTMX provides server-rendered HTML with minimal JavaScript. Faster to develop, easier to maintain.
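For example, job status can be polled with a small server-rendered fragment; the route, the two-second interval, and the markup here are illustrative rather than the project's actual templates.

# Hypothetical sketch: an HTML fragment that HTMX keeps re-requesting until the
# job finishes. Reuses app and get_job from the sketches above.
from fastapi.responses import HTMLResponse

@app.get("/jobs/{job_id}/status", response_class=HTMLResponse)
async def job_status(job_id: str) -> str:
    job = get_job(job_id)
    if job["status"] in ("queued", "running"):
        # hx-trigger="every 2s" re-polls; hx-swap="outerHTML" replaces this div in place.
        return (
            f'<div hx-get="/jobs/{job_id}/status" hx-trigger="every 2s" '
            f'hx-swap="outerHTML">Status: {job["status"]}</div>'
        )
    return f'<div>Status: {job["status"]}</div>'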
Integrated Tools
Kokoro TTS
High-quality text-to-speech with 20+ voices, prosody control, and Docker isolation
Whisper Transcription
Local speech-to-text using OpenAI Whisper server for privacy and speed
Image Generation
DALL-E, Gemini, and Qwen-Image-2512 integration for AI image creation
PostFlow Integration
Schedule social media posts directly from the toolbox workflow
TextOverlay
Auto-fitting text overlays on images via JSON configuration
Voice Cloning
Qwen TTS-based voice cloning with time-stretching and pitch preservation
Tech Stack
Python, FastAPI, Jinja2, HTMX, asyncio, Docker
Results & Outcomes
Unified 5+ AI tools under a single interface
Async job processing with progress tracking
Auto-generated forms from tool schemas
Organized output management with download links
Zero-downtime tool additions via registry pattern
Live Demo
Explore the interactive dashboard showing real-time job status, tool selection, and output management. This is a read-only demonstration of the system interface.
Open Dashboard Demo