Undertow Engine API: Headless Short-Form Video at the API Layer

A standalone Python microservice that takes a JSON brief and outputs a fully composited, captioned, posted short-form video — built on FastAPI, Celery, MoviePy, and Playwright.

Undertow Engine API is a video pipeline you call with a POST. Submit a brief — assets, template, captions, target platforms — and it returns a job ID. The service composites the video, renders it, posts (or schedules) it to TikTok / Instagram Reels / YouTube Shorts, and webhooks you when each stage completes.
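To make the shape of a brief concrete, here is a minimal sketch that assembles one as a plain dict and serializes it as the body for POST /jobs. The field names (assets, template, captions, targets, webhook_url), the platform identifiers, and the example URLs are all assumptions for illustration; the actual schema is not published in this post.

```python
import json

# Hypothetical platform identifiers; the real service may name them differently.
KNOWN_TARGETS = {"tiktok", "instagram_reels", "youtube_shorts"}


def build_brief(assets, template, captions, targets, webhook_url=None):
    """Assemble a job brief and reject unknown target platforms early."""
    unknown = set(targets) - KNOWN_TARGETS
    if unknown:
        raise ValueError(f"unknown targets: {sorted(unknown)}")
    brief = {
        "assets": list(assets),
        "template": template,
        "captions": captions,
        "targets": list(targets),
    }
    if webhook_url is not None:
        brief["webhook_url"] = webhook_url
    return brief


brief = build_brief(
    assets=["s3://example-bucket/raw/intro.mp4", "s3://example-bucket/raw/broll.mp4"],
    template="default",
    captions={"mode": "auto"},
    targets=["tiktok", "youtube_shorts"],
    webhook_url="https://example.com/hooks/undertow",
)
body = json.dumps(brief)  # this string would be the POST /jobs request body
```

Validating targets on the caller side is optional; it just surfaces a typo before the job is queued.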

It's deliberately a standalone microservice, not an embedded library: the upstream caller stays clean (a Next.js app, a scheduler, an n8n flow, another Python service), and all the heavy MoviePy + Chromium dependencies live in one place.

End-to-end

1. POST /jobs              ── brief JSON (assets, template, captions, targets)
        │
        ▼
2. FastAPI queues          ── Celery task, returns job_id immediately
        │
        ▼
3. MoviePy renders         ── intro / B-roll / overlay / outro composited to MP4
        │
        ▼
4. Captioning              ── Whisper transcription → styled caption burn-in
        │
        ▼
5. Playwright posts        ── headless browser uploads to each target platform
        │                     (TikTok, Reels, Shorts) with the right metadata
        ▼
6. Webhook callback        ── per-stage status + final platform URLs

The split matters. MoviePy handles rendering (CPU-bound, ffmpeg under the hood). Playwright handles posting (network + DOM-bound). Celery keeps both off the FastAPI request path, so the API stays responsive and long-running work is not lost when the API process restarts.
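The queue handoff can be sketched without any of the real dependencies. The example below uses an in-memory queue as a stand-in for Celery and Redis, and a plain callback as a stand-in for the webhook; the function names, stage names, and event fields are hypothetical, not the service's actual code.

```python
import queue
import uuid

# In-memory stand-ins: the real service uses Celery with a Redis broker.
job_queue = queue.Queue()
job_status = {}

# Assumed stage names, loosely following the pipeline steps above.
STAGES = ("render", "caption", "post")


def submit_job(brief):
    """What the request handler does: record, enqueue, return right away."""
    job_id = uuid.uuid4().hex
    job_status[job_id] = "queued"
    job_queue.put((job_id, brief))
    return job_id


def run_next_job(notify):
    """What a worker does: pull one job and walk it through the stages."""
    job_id, brief = job_queue.get()
    for stage in STAGES:
        job_status[job_id] = stage
        # Real work (MoviePy, Whisper, Playwright) would happen here.
        notify({"job_id": job_id, "stage": stage, "status": "completed"})
    job_status[job_id] = "done"
    return job_id


events = []
job_id = submit_job({"template": "default"})
run_next_job(events.append)  # the callback stands in for a webhook POST
```

The point of the pattern is that submit_job never waits on the stages: the caller gets a job ID immediately and learns about progress through the per-stage notifications.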

Stack

Layer             Tech
API               FastAPI (Python 3.11), async request handlers
Job queue         Celery + Redis broker
Video rendering   MoviePy + ffmpeg
Captioning        OpenAI Whisper (transcription) → Pillow caption layer
Headless posting  Playwright (Chromium), per-platform DOM automation
Storage           AWS S3 — raw assets in, rendered MP4 out
Deploy            Docker → ECR → SSM-managed EC2 (test + prod environments)
CI                pytest + ruff; image tagged test-{SHA} / prod-{SHA} for immutable prod images

Why headless Playwright for the post step

Every major short-form platform has a posting API… on paper. In practice, the official APIs are slow to gain features (TikTok captions, music tracks, scheduling) and have rate limits that don't match creator-tool needs. A Playwright-driven post — using a signed-in session — matches what a human poster would do, including the platform-specific niceties the API lags on.

This approach is fragile, and the service is built around that fact: when a platform changes its uploader UI, the script breaks loudly rather than failing silently. Each per-platform poster is a swappable adapter, so a broken TikTok flow doesn't take down Reels.
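One way to picture the adapter idea is a registry of per-platform poster functions, each called in isolation. This is a sketch under assumptions: the registry, the decorator, the function names, and the result shape are invented for illustration, and the Playwright calls are replaced by stubs so the isolation logic is the only thing on display.

```python
from typing import Callable, Dict

# Registry of platform name -> poster function. Names are illustrative.
POSTERS: Dict[str, Callable[[str, dict], str]] = {}


def poster(platform):
    """Decorator that registers a poster under a platform name."""
    def register(fn):
        POSTERS[platform] = fn
        return fn
    return register


@poster("tiktok")
def post_tiktok(video_path, metadata):
    # Simulates an uploader UI change breaking the automation.
    raise RuntimeError("upload button selector not found")


@poster("instagram_reels")
def post_reels(video_path, metadata):
    # Placeholder URL; a real adapter would drive Playwright here.
    return "https://example.com/reels/123"


def post_to_all(video_path, metadata, targets):
    """Run each adapter in isolation and collect per-platform results."""
    results = {}
    for target in targets:
        try:
            url = POSTERS[target](video_path, metadata)
            results[target] = {"ok": True, "url": url}
        except Exception as exc:  # one broken adapter must not stop the rest
            results[target] = {"ok": False, "error": str(exc)}
    return results


results = post_to_all("out.mp4", {"title": "demo"}, ["tiktok", "instagram_reels"])
```

Here the TikTok stub fails and the failure is recorded with its error message, while the Reels stub still returns its URL.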

Why a microservice vs. an in-process module

  • Heavy deps in one place. MoviePy pulls in ffmpeg; Playwright pulls in Chromium. Callers shouldn't have to.
  • Independent scale. Render jobs spike at content-publish time. The rest of the system shouldn't auto-scale on that signal.
  • Versioned API. The video pipeline can evolve (new captioning model, new platform adapter) without forcing every caller to redeploy.

Deployment

Test and prod each have their own ECR repository and EC2 instance, with images tagged test-{SHA} and prod-{SHA} respectively. Environment configuration is read from SSM; nothing is static. Deploys are immutable: every prod release is a new image, never an in-place mutation.
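The tagging convention is simple enough to express as a small helper. The sketch below assumes SHA means a git commit hash in lowercase hex, short (7 characters) or full (40); the post does not say which form is used, and the function name is hypothetical.

```python
import re

ENVIRONMENTS = ("test", "prod")

# Assumption: {SHA} is a git commit hash, 7 to 40 lowercase hex characters.
SHA_PATTERN = re.compile(r"[0-9a-f]{7,40}")


def image_tag(env, sha):
    """Build an image tag of the form test-{SHA} or prod-{SHA}."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    if not SHA_PATTERN.fullmatch(sha):
        raise ValueError(f"not a commit hash: {sha!r}")
    return f"{env}-{sha}"


tag = image_tag("prod", "3f9c2ab")
```

Because the tag is derived from the commit, the same commit always maps to the same tag in a given environment, which is what makes a release traceable to an exact image.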