
Text Descriptions

Text descriptions are the simplest way to tell Onset Engine what you want in your video. Instead of writing a JSON driver file, you describe low-energy and high-energy content in natural language, and the engine generates a full 4-tier driver automatically.

In Studio Mode’s Clip Direction section, you’ll find two text fields:

Field | Maps To | Example
----- | ------- | -------
“During quiet parts, focus on:” | 1_LOW tier descriptions | calm landscapes, still water, gentle close-ups
“On the heavy drops, focus on:” | 3_HIGH + 4_MAX tier descriptions | explosive action, fast motion, @Hero moments

When you hit Generate (or when Autopilot renders), the engine calls PromptParser._generate_driver(), which:

  1. Splits your comma-separated text into individual descriptions
  2. Extracts @Tag references into the subjects array
  3. Creates a blended 2_MED tier from the last low description + first high description
  4. Saves the result to drivers/_autopilot_<hash>.json (content-hashed for cache reuse)
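Steps 1 to 3 above can be sketched in Python roughly as follows. This is a minimal illustration, not the engine's actual code: the function names, the tag pattern, and the choice to drop a description that contains an @Tag (keeping only the tag in subjects) are assumptions chosen to match the behaviour described on this page.

```python
import re

# Hypothetical tag pattern: "@" followed by word characters.
TAG_RE = re.compile(r"@\w+")


def split_descriptions(text):
    """Step 1: split comma-separated text into trimmed, non-empty items."""
    return [part.strip() for part in text.split(",") if part.strip()]


def separate_tags(items):
    """Step 2: pull @Tag references out of the items.

    Assumption: an item that contains a tag contributes only the tag
    and is not kept as a plain description.
    """
    descriptions, subjects = [], []
    for item in items:
        tags = TAG_RE.findall(item)
        if not tags:
            descriptions.append(item)
            continue
        for tag in tags:
            if tag not in subjects:
                subjects.append(tag)
    return descriptions, subjects


def make_tier(descriptions, subjects):
    """Build one tier, omitting "subjects" when there are none."""
    tier = {"descriptions": list(descriptions)}
    if subjects:
        tier["subjects"] = list(subjects)
    return tier


def generate_driver(low_text, high_text):
    """Build a 4-tier driver dict from the two text fields."""
    low, low_tags = separate_tags(split_descriptions(low_text))
    high, high_tags = separate_tags(split_descriptions(high_text))
    # Step 3: blend the last low description with the first high one.
    med = low[-1:] + high[:1]
    return {
        "name": "Auto-generated from text descriptions",
        "version": 3,
        "tiers": {
            "1_LOW": make_tier(low, low_tags),
            "2_MED": make_tier(med, []),
            "3_HIGH": make_tier(high, high_tags),
            "4_MAX": make_tier(high, high_tags),
        },
    }
```

Slicing with `low[-1:]` and `high[:1]` keeps the blend safe when either field is empty: the 2_MED tier then simply holds fewer descriptions.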

Low field: calm establishing shots, serene landscapes
High field: explosive fights, @Goku power-ups, fast sword combat

Generates:

{
  "name": "Auto-generated from text descriptions",
  "version": 3,
  "tiers": {
    "1_LOW": {
      "descriptions": ["calm establishing shots", "serene landscapes"]
    },
    "2_MED": {
      "descriptions": ["serene landscapes", "explosive fights"]
    },
    "3_HIGH": {
      "descriptions": ["explosive fights", "fast sword combat"],
      "subjects": ["@Goku"]
    },
    "4_MAX": {
      "descriptions": ["explosive fights", "fast sword combat"],
      "subjects": ["@Goku"]
    }
  }
}
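A driver like the one above is saved under a content-hashed filename so that identical input can reuse a cached file. The sketch below shows one way such a scheme could work. The hash function (SHA-256), the 12-character digest length, hashing the driver dict rather than the raw text, and both function names are assumptions; the page only states that the file is named drivers/_autopilot_<hash>.json.

```python
import hashlib
import json
from pathlib import Path


def autopilot_driver_path(driver, drivers_dir="drivers"):
    """Return a content-addressed path for a driver dict."""
    # Serialize canonically so that equal drivers always hash the same,
    # regardless of key order.
    canonical = json.dumps(driver, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return Path(drivers_dir) / f"_autopilot_{digest}.json"


def save_driver(driver, drivers_dir="drivers"):
    """Write the driver only if no file with the same hash exists yet."""
    path = autopilot_driver_path(driver, drivers_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(driver, indent=2), encoding="utf-8")
    return path
```

Because the filename depends only on content, regenerating with unchanged text resolves to the same path and skips the write.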

Onset Engine resolves clip direction in this order:

  1. Explicit driver JSON — if a .json driver file is loaded, it takes full control
  2. Text descriptions — if the text fields have content, a driver is auto-generated
  3. No driver — falls back to motion-score selection (high-motion clips for high energy)
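The resolution order above amounts to a simple priority check. The following sketch is illustrative only: the function name, parameter names, and returned labels are hypothetical and do not reflect the engine's real API.

```python
def resolve_clip_direction(driver_path=None, low_text="", high_text=""):
    """Return which clip-direction source wins, in priority order."""
    # 1. A loaded driver JSON file takes full control.
    if driver_path:
        return "explicit_driver"
    # 2. Otherwise, any non-blank text field triggers an auto-generated driver.
    if low_text.strip() or high_text.strip():
        return "text_descriptions"
    # 3. With neither, fall back to motion-score selection.
    return "motion_score"
```

For example, `resolve_clip_direction("drivers/custom.json", low_text="calm water")` returns `"explicit_driver"`, since a loaded driver overrides the text fields.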

When a driver is loaded in Studio Mode, the text fields dim to indicate they’re overridden. Clearing the driver (✕ button) re-enables text descriptions.

When you open the ✨ Create wizard with text descriptions filled in, the wizard auto-populates:

  • Your low text → 1_LOW tier description box
  • Your high text → 3_HIGH and 4_MAX description boxes
  • @Tag references → subjects fields

This gives you a starting point to refine rather than building from scratch.

Because the text fields steer the engine's clip selection directly, a few guidelines help you get the most out of them:

  • Write 3–6 descriptions per field for best results
  • Use @Tag references only after tagging clips via the GUI’s tagging system
  • Separate descriptions with commas, for example: calm water, gentle breeze, sunset
  • Text descriptions are saved per-project in the job JSON file
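The first two tips can be checked mechanically before rendering. The helper below is a hypothetical sketch, not part of Onset Engine: `check_field`, its parameters, and the warning wording are invented for illustration, and `known_tags` stands in for whatever set of tags you have already created in the GUI.

```python
import re

# Hypothetical tag pattern: "@" followed by word characters.
TAG_RE = re.compile(r"@\w+")


def check_field(text, known_tags, min_count=3, max_count=6):
    """Return a list of warnings for one text-description field."""
    warnings = []
    items = [part.strip() for part in text.split(",") if part.strip()]
    # Tip: 3 to 6 descriptions per field tend to work best.
    if not min_count <= len(items) <= max_count:
        warnings.append(
            f"{len(items)} descriptions found; "
            f"{min_count}-{max_count} recommended"
        )
    # Tip: only reference tags that already exist on tagged clips.
    for tag in TAG_RE.findall(text):
        if tag not in known_tags:
            warnings.append(f"{tag} is not a tagged subject")
    return warnings
```

An empty list means the field follows both tips, as in `check_field("calm water, gentle breeze, sunset", set())`.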