Job-Shorts: rendering every chapter of the Book of Job as a 60-second AI video

The King James Bible is public domain. AI video gen is here. Short-form Bible content has a real audience. I wanted to know if I could turn the Book of Job into a 42-episode YouTube/TikTok series, one episode per chapter, rendered entirely on my desktop.

I'm calling it Job-Shorts. The pipeline is local-first, commercial-safe, and routes every text-generation call through my Claude Code subscription, so the only marginal cost per chapter is the electricity my 3090 draws while rendering.

The pipeline

chapter_number + KJV text
        │
        ▼
1. SCRIPT GEN     (Claude via subscription, ~10s)
        → 150-word narration (hook / setup / tension / payoff / turn / CTA)
2. BREAKDOWN      (Claude, ~15s)
        → 4–6 visual beats + character lock + style lock
3. KEYFRAMES      (ComfyUI / Flux Dev / Z-Image)
        → 1 preview image per beat
4. NARRATION      (F5-TTS, local)
        → audio + word-level timestamps
5. VIDEO GEN      (ComfyUI / LTX 2.3 or HunyuanVideo)
        → each beat at narration-matched length, 2 takes
6. EVALUATOR      (Claude vision)
        → picks the best take by scoring rendered frames
7. CAPTIONS       (faster-whisper)
        → burned word-by-word captions from narration
8. ASSEMBLE       (FFmpeg)
        → concat + narration + music bed (sidechain-ducked) + verse overlays
9. PUBLISHING     (Claude)
        → title + description + hashtags + thumbnail
        │
        ▼
output/chapter_N/final.mp4 (1080x1920, vertical, ready to upload)

End-to-end on the 3090: roughly 30–60 minutes per chapter, mostly unattended.
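
The sidechain ducking in the assemble step can be sketched with FFmpeg's `sidechaincompress` filter. This is a minimal sketch, not the project's actual filter graph: the helper name, the file paths, and the threshold/ratio/attack/release values are all assumptions picked for illustration.

```python
from typing import List


def build_duck_command(video: str, narration: str, music: str, out: str) -> List[str]:
    """Return an ffmpeg argv that ducks the music bed under the narration.

    Hypothetical helper: paths and compressor settings are illustrative only.
    """
    filter_graph = ";".join([
        # The narration is needed twice: once as the sidechain key, once in the mix.
        "[1:a]asplit=2[key][voice]",
        # Compress the music (input 2) whenever the key signal is loud.
        "[2:a][key]sidechaincompress=threshold=0.05:ratio=8:attack=20:release=300[bed]",
        # Mix voice and ducked bed; end when the first input (the voice) ends.
        "[voice][bed]amix=inputs=2:duration=first[aout]",
    ])
    return [
        "ffmpeg", "-y",
        "-i", video, "-i", narration, "-i", music,
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy", "-c:a", "aac",
        out,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with paths that contain spaces.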

The LLM routing trick

There's a job_shorts.llm module that fronts every text call. It picks one of three backends:

Backend	When picked	Cost
`claude_code`	default if `claude` is on PATH	uses your Claude Code subscription
`claude_api`	only if `llm_backend=claude_api`	pay-per-token
`ollama`	fallback	free, local

`claude_code` works by invoking `claude -p` as a subprocess, which reuses your existing Claude Code login. So all script generation, scene breakdown, evaluator scoring, and publishing metadata go through my Max plan, with no per-token spend.
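
Here is a rough sketch of how that routing could look. It follows the selection rules in the table above, but the function names, the timeout, and the way the `llm_backend` setting is passed in are assumptions, not the actual `job_shorts.llm` code.

```python
import shutil
import subprocess
from typing import Callable, Optional


def pick_backend(
    llm_backend: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Choose a backend name using the rules from the table.

    `which` is injectable so the PATH lookup can be faked in tests.
    """
    if llm_backend == "claude_api":
        return "claude_api"      # pay-per-token, only when explicitly requested
    if which("claude"):
        return "claude_code"     # default when the CLI is on PATH
    return "ollama"              # free local fallback


def run_claude_code(prompt: str, timeout: int = 120) -> str:
    """Send one prompt through `claude -p` (non-interactive print mode).

    Hypothetical helper: the timeout value is an assumption.
    """
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the CLI exits non-zero
    )
    return result.stdout.strip()
```

Keeping the choice in one function means the rest of the pipeline only ever asks for text and never needs to know which backend answered.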

That single decision turns the math on its head. Without it, generating 42 chapters at GPT-4-class quality would mean paying per token for every script, breakdown, evaluation, and metadata call. With it, the only marginal cost is the electricity ComfyUI burns rendering the video.

Patterns I borrowed (and the ones I had to invent)

Borrowed from other text-to-film projects — these are well-established now:

Character lock — full physical description injected verbatim into every prompt
Style lock — 20–40 word visual style string locked across all prompts
World reconstruction — every prompt fully self-contained, no inter-clip memory assumed
Storyboard-before-video — cheap keyframe preview before expensive video gen
Multiple takes + AI evaluator — generate 2–3 takes, LLM scores PASS/FAIL
Duration calc from word count — narration WPM dictates clip length
Resume-from-crash state file — JSON state after every step
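
Two of the borrowed patterns are small enough to sketch: the character/style lock and the duration calculation. The class and function names here are hypothetical, and the 150 WPM default is an assumption that simply follows from a 150-word script filling a 60-second video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locks:
    character: str  # full physical description, reused verbatim
    style: str      # 20-40 word visual style string


def build_prompt(beat: str, locks: Locks) -> str:
    """Make a self-contained prompt: beat text plus both locks, verbatim."""
    return " ".join(part.strip() for part in (beat, locks.character, locks.style))


def clip_seconds(word_count: int, wpm: int = 150, minimum: float = 2.0) -> float:
    """Clip length implied by the narration's word count at a fixed pace.

    The 2-second floor is an assumption so very short beats still render.
    """
    return max(minimum, word_count * 60 / wpm)
```

Because every prompt carries both locks in full, no clip depends on the model remembering anything from an earlier clip, which is the point of the world-reconstruction rule.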

What I had to add for Bible content specifically:

Whisper caption timing — accurate word-level burned captions for muted viewing (this is non-negotiable on Shorts/TikTok)
Series-wide consistency — series.json keeps Job + style identical across all 42 episodes
KJV auto-fetch — pulls public-domain Bible text from bible-api.com so I never type a verse
Verse chyron overlay — quoted scripture appears on screen with proper formatting
Music bed with sidechain ducking — auto-select + duck under narration
Batch mode — process N chapters overnight unattended
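
For the caption timing, faster-whisper can return per-word start and end times when transcription is run with `word_timestamps=True`. The sketch below assumes those words have already been flattened into plain `(start, end, text)` tuples and turns them into one SRT cue per word. The function names are hypothetical, and the real pipeline may well use a different subtitle format for burning in.

```python
from typing import Iterable, Tuple

Word = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: Iterable[Word]) -> str:
    """Emit one numbered SRT cue per word, in the order given."""
    cues = []
    for index, (start, end, text) in enumerate(words, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)
```

Running Whisper over the generated narration, rather than trusting the script text, keeps the captions aligned to what the TTS voice actually said and when.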

The CLI surface

# See your hardware tier and recommended models
python -m job_shorts.cli info

# Verify the LLM backend works
python -m job_shorts.cli test-llm

# Just write the script — fast, free, no rendering
python -m job_shorts.cli script 1

# Generate one chapter end-to-end (supervised)
python -m job_shorts.cli chapter 1

# Batch chapters 1 through 10
python -m job_shorts.cli batch 1-10

# Fully autonomous overnight: auto-launch services, vision-evaluator, no gates
python -m job_shorts.cli auto-batch 1-42

# Resume a crashed run
python -m job_shorts.cli resume output/chapter_03

The fully-autonomous mode is the one I actually use. Start it and walk away: the rig knows how to relaunch ComfyUI or Ollama if either dies, and it writes a JSON state file after every step so a power blip doesn't cost a chapter.
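
The resume behaviour boils down to "skip whatever the state file says is done". A minimal sketch, assuming a `state.json` per chapter directory and step names taken from the nine pipeline stages. Both the file name and the step names are hypothetical, as is the `handlers` mapping.

```python
import json
import os
from pathlib import Path
from typing import Callable, Dict

# Assumed step names, in pipeline order.
STEPS = ["script", "breakdown", "keyframes", "narration", "video",
         "evaluate", "captions", "assemble", "publish"]


def load_state(path: Path) -> dict:
    """Read the state file, or start fresh if none exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then swap it in, so a crash never leaves half a file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    os.replace(tmp, path)


def run_chapter(chapter_dir: Path, handlers: Dict[str, Callable[[], None]]) -> dict:
    """Run each step once, persisting progress after every step."""
    state_path = chapter_dir / "state.json"
    state = load_state(state_path)
    for step in STEPS:
        if step in state["completed"]:
            continue  # finished in an earlier run
        handlers[step]()  # may raise; state on disk stays at the last good step
        state["completed"].append(step)
        save_state(state_path, state)
    return state
```

Calling `run_chapter` again on the same directory after a crash picks up at the first step that is not yet recorded as complete.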

Where it's at

Phase 0 (23 modules) is complete. Phase 1 — first end-to-end render — is the next step. The interesting open questions are around the evaluator: how do you teach a vision model to spot when LTX has slipped into uncanny-valley territory before you commit to a take? The current heuristic is naive (luminance variance + character consistency check). I think there's a smarter version that compares each frame back to the keyframe storyboard.
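
To make the keyframe-comparison idea concrete, here is a toy sketch in pure Python. Frames are assumed to be small grids of RGB tuples (already downscaled), and the "distance" is just the mean absolute luminance difference from the storyboard keyframe. All names and the variance threshold are hypothetical. This is a crude stand-in for a real perceptual metric and is not the current evaluator.

```python
from statistics import mean, pvariance


def luma(pixel):
    """Rec. 709 luminance of one (r, g, b) pixel."""
    r, g, b = pixel
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def luma_values(frame):
    """Flatten a frame (rows of RGB tuples) into a list of luminances."""
    return [luma(p) for row in frame for p in row]


def luminance_variance(frame):
    """Low variance suggests a flat, black, or blown-out frame."""
    return pvariance(luma_values(frame))


def keyframe_distance(frame, keyframe):
    """Mean absolute luminance difference; both frames must share dimensions."""
    return mean(abs(a - b) for a, b in zip(luma_values(frame), luma_values(keyframe)))


def pick_take(takes, keyframe, min_variance=1.0):
    """Return the index of the take whose frames stay closest to the keyframe.

    Takes containing a near-flat frame are skipped; returns None if all fail.
    """
    best_index, best_score = None, None
    for index, frames in enumerate(takes):
        if any(luminance_variance(f) < min_variance for f in frames):
            continue
        score = mean(keyframe_distance(f, keyframe) for f in frames)
        if best_score is None or score < best_score:
            best_index, best_score = index, score
    return best_index
```

One obvious limitation: a raw pixel difference also penalizes legitimate motion, so a usable version would need to compare something more stable than pixels, or only sample frames near the start of each clip.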

If you want to read the actual code or watch progress, it lives on my GitHub. The whole point of this project is that anyone with a 12 GB+ GPU should be able to fork it and render their own Bible (or any other public-domain text). The pipeline doesn't care that it's Job.