E LAB AUDIO GENERATION

This workflow automates the conversion of educational lesson scripts into high-quality audio files using AI text-to-speech technology. It processes structured lesson scripts, generates voice narration with multiple AI voices, handles both single-narrator and dialogue segments, merges audio clips, and delivers the final audio files to Google Drive with comprehensive tracking in Airtable.

Purpose

No business context has been provided yet. Add a context.md to enrich this documentation.

How It Works

  1. Script Upload: Users submit lesson scripts through a web form, providing lesson number, name, and script file
  2. Script Parsing: The workflow extracts and parses the script content, identifying different segment types (single narrator vs. dialogue)
  3. Voice Assignment: Assigns appropriate AI voices - a default voice for single segments, and character-specific voices for dialogue segments
  4. Audio Generation: Converts text to speech through the ElevenLabs API using the eleven_multilingual_v2 model
  5. Audio Processing: Uploads individual audio clips to Cloudinary for temporary storage
  6. Audio Merging: Uses the Shotstack API to merge multiple audio clips, separated by 0.5-second silence gaps, into a single file
  7. Final Delivery: Downloads merged audio and uploads to Google Drive for permanent storage
  8. Progress Tracking: Updates Airtable records throughout the process to track status and provide download links

The workflow routes each segment down one of two branches: one for single-narrator script segments and one for dialogue segments. Both branches return to the loop once their Airtable record is updated, so segments are processed one after another.
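The voice assignment step described above can be sketched as follows. This is a minimal sketch, not the workflow's actual Code node: the segment shape (`type`, `lines`, `speaker`), the speaker names, and the voice IDs are all hypothetical, and falling back to the default voice for an unmapped speaker is an assumption about the desired behavior.

```javascript
// Hypothetical voice IDs; replace with real ElevenLabs voice IDs.
const DEFAULT_VOICE_ID = "default-voice-id";
const CHARACTER_VOICES = new Map([
  ["Anna", "voice-id-anna"],
  ["Ben", "voice-id-ben"],
]);

// Assumed segment shape:
//   single:   { id, type: "single", text }
//   dialogue: { id, type: "dialogue", lines: [{ speaker, text }] }
function assignVoices(segment) {
  if (segment.type === "dialogue") {
    return {
      ...segment,
      lines: segment.lines.map((line) => ({
        ...line,
        // Assumption: unmapped speakers fall back to the default voice.
        voiceId: CHARACTER_VOICES.get(line.speaker) ?? DEFAULT_VOICE_ID,
      })),
    };
  }
  // Single-narrator segments always use the default voice.
  return { ...segment, voiceId: DEFAULT_VOICE_ID };
}
```

A single-narrator segment gets one `voiceId` on the segment itself, while a dialogue segment gets one `voiceId` per line, which matches the per-character narration described in step 3.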

Workflow Diagram

graph TD
    A[Form Submit] --> B[Extract from File]
    B --> C[Parse Script]
    C --> D[Create Airtable Record]
    D --> E[Assign Voice]
    E --> F[Loop Over Items]

    F --> G{Segment Type?}
    G -->|Single| H[Wait2]
    G -->|Dialogue| I[Wait]

    H --> J[Code in JavaScript]
    J --> K[Generate Voice]
    K --> L[Upload Script Audio]
    L --> M[Save Upload to Static Data]
    M --> N[Collect and Group Script Audios]
    N --> O[Send Script Group to Shotstack]
    O --> P[Wait for Shotstack]
    P --> Q[Get Script Render Status]
    Q --> R{Render Complete?}
    R -->|Yes| S[Download Merged Script Audio]
    S --> T[Upload Script Audio to Drive]
    T --> U[Update Script Record]

    I --> V[Parse Dialogue]
    V --> W[Generate Voice for Dialogue]
    W --> X[Upload Asset from File Data]
    X --> Y[Parse Audio Link]
    Y --> Z[Send Audio to Shotstack]
    Z --> AA[Wait1]
    AA --> BB[Get Merge Audio Status]
    BB --> CC{Success?}
    CC -->|Yes| DD[Get Audio]
    DD --> EE[Upload File]
    EE --> FF[Update Record]

    FF --> F
    U --> F

Trigger

Form Trigger: A web form titled "E-Lab Audio Automation" that accepts:

  • lesson_number (required number field)
  • lesson_name (required text field)
  • script_content (required file upload, accepts .txt and .doc files)
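The field requirements above could be checked with a small validator like the one below. It is a sketch under stated assumptions: the payload shape, and in particular the `fileName` property on the uploaded file, is hypothetical, since the actual output of the form trigger is not shown in this document.

```javascript
// Hypothetical validator for the three form fields listed above.
// Assumed payload shape:
//   { lesson_number, lesson_name, script_content: { fileName } }
const ALLOWED_EXTENSIONS = [".txt", ".doc"];

function validateSubmission(payload) {
  const errors = [];

  // lesson_number: required and numeric.
  const raw = payload.lesson_number;
  if (raw == null || raw === "" || Number.isNaN(Number(raw))) {
    errors.push("lesson_number must be a number");
  }

  // lesson_name: required, non-blank text.
  if (typeof payload.lesson_name !== "string" || payload.lesson_name.trim() === "") {
    errors.push("lesson_name must be a non-empty string");
  }

  // script_content: required file with an accepted extension.
  const fileName = payload.script_content && payload.script_content.fileName;
  const hasAllowedExtension =
    typeof fileName === "string" &&
    ALLOWED_EXTENSIONS.some((ext) => fileName.toLowerCase().endsWith(ext));
  if (!hasAllowedExtension) {
    errors.push("script_content must be a .txt or .doc file");
  }

  return { valid: errors.length === 0, errors };
}
```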

Nodes Used

| Node Type | Purpose |
| --- | --- |
| Form Trigger | Accepts lesson script submissions via web form |
| Extract from File | Extracts text content from uploaded script files |
| Code (JavaScript) | Parses scripts, assigns voices, processes audio data |
| Airtable | Creates and updates tracking records for lessons |
| Split in Batches | Processes script segments in batches |
| If | Routes processing based on segment type (single vs dialogue) |
| Wait | Adds delays for API rate limiting |
| HTTP Request | Calls ElevenLabs API for voice generation and Shotstack for audio merging |
| Cloudinary | Temporary storage for individual audio clips |
| Google Drive | Final storage destination for completed audio files |
| No Operation | Placeholder for conditional flow control |

External Services & Credentials Required

ElevenLabs API

  • Purpose: AI text-to-speech voice generation
  • Credential: HTTP Header Authentication (E-lab)
  • Models Used: eleven_multilingual_v2
  • Output Format: MP3 at 44.1 kHz, 128 kbps
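For orientation, a request matching the settings above might be assembled as shown here. This is a sketch, not the configuration of the workflow's HTTP Request node, which is not reproduced in this document. The function name and its parameters are hypothetical; the endpoint path, the `xi-api-key` header, and the `mp3_44100_128` output format follow ElevenLabs' public text-to-speech API as commonly documented, so verify them against the current API reference before relying on them.

```javascript
// Builds (but does not send) a text-to-speech request description.
// buildTtsRequest and its parameter names are hypothetical helpers.
function buildTtsRequest({ voiceId, text, apiKey }) {
  const url = new URL(
    `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`
  );
  // MP3 at 44.1 kHz and 128 kbps, matching the output format listed above.
  url.searchParams.set("output_format", "mp3_44100_128");

  return {
    method: "POST",
    url: url.toString(),
    headers: {
      "xi-api-key": apiKey, // header-based authentication
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({ text, model_id: "eleven_multilingual_v2" }),
  };
}
```

The returned object can be passed to any HTTP client; keeping the request construction separate from the network call makes the voice and model settings easy to inspect.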

Shotstack API

  • Purpose: Audio merging and rendering
  • Authentication: API key in headers
  • Output: MP3 format audio files

Cloudinary

  • Purpose: Temporary audio file storage and processing
  • Credential: Cloudinary API (Cloudinary account)
  • Resource Type: Video (for audio files)

Airtable

  • Purpose: Progress tracking and lesson management
  • Credential: Airtable Token API (EXP Training Bot, E-Lab Script Generator Logs)
  • Tables: Scripts, Audio generation logs

Google Drive

  • Purpose: Final audio file storage
  • Credential: Google Drive OAuth2 API (Google Drive account 2)
  • Folder: E-lab Audio files

Environment Variables

No explicit environment variables are used. All configuration is handled through n8n credentials and hardcoded values within the workflow nodes.

Data Flow

Input

  • Lesson number (integer)
  • Lesson name (string)
  • Script file (.txt or .doc format)

Processing

  • Script content parsed into segments with IDs like EDU_16_1_5
  • Voice assignments: Default voice for narration, character-specific voices for dialogue
  • Audio generated as 128 kbps MP3
  • Temporary Cloudinary storage with public URLs
  • Audio merging with 0.5-second silence gaps
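The 0.5-second gap arithmetic can be illustrated with a sketch that lays clips out on a single Shotstack audio track. It assumes each clip's duration in seconds is already known (for example from the upload response), which this document does not confirm. The `buildMergePayload` helper and the `{ url, duration }` input shape are hypothetical; the payload fields (`timeline.tracks[].clips[]`, an `audio` asset with `src`, and `output.format`) follow Shotstack's Edit API as commonly documented.

```javascript
const GAP_SECONDS = 0.5;

// clips: [{ url, duration }] in playback order, duration in seconds.
function buildMergePayload(clips) {
  let cursor = 0; // running position on the timeline, in seconds
  const audioClips = clips.map((clip, index) => {
    // Insert silence before every clip except the first.
    if (index > 0) cursor += GAP_SECONDS;
    const entry = {
      asset: { type: "audio", src: clip.url },
      start: Number(cursor.toFixed(3)), // avoid floating-point noise
      length: clip.duration,
    };
    cursor += clip.duration;
    return entry;
  });

  return {
    timeline: { tracks: [{ clips: audioClips }] },
    output: { format: "mp3" },
  };
}
```

For three clips lasting 2 s, 3.25 s, and 1 s, the start times are 0 s, 2.5 s, and 6.25 s: each clip starts at the sum of all earlier durations plus 0.5 s for every gap before it.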

Output

  • Merged MP3 audio files stored in Google Drive
  • Airtable records with status tracking and download links
  • File naming convention: {segment_id}.mp3 or {segment_id}_merged.mp3

Error Handling

The workflow includes basic error handling through:

  • Conditional checks for successful API responses
  • Status validation before proceeding to next steps
  • Airtable status updates to track failed processes
  • Wait nodes to handle API rate limits and processing delays

No explicit error recovery or retry mechanisms are implemented.
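The status validation before download can be sketched as a small decision function. The action names (`download`, `wait`, `mark_failed`) are hypothetical labels for the workflow's branches, and the status strings are the ones Shotstack's render status endpoint is commonly documented to return; treating an unrecognized status as a failure is an assumption, not something this workflow is known to do.

```javascript
// Maps a Shotstack render status to the next workflow action.
function nextAction(status) {
  switch (status) {
    case "done":
      return "download"; // merged file is ready to fetch
    case "failed":
      return "mark_failed"; // record the failure in Airtable
    case "queued":
    case "fetching":
    case "rendering":
    case "saving":
      return "wait"; // still processing: wait, then check again
    default:
      // Assumption: treat unknown statuses as failures rather than loop forever.
      return "mark_failed";
  }
}
```

Because no retry mechanism exists, a `mark_failed` result ends processing for that segment; only the `wait` result leads back to another status check.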

Known Limitations

Based on the workflow structure:

  • Processing time depends on script length and external API response times
  • No automatic retry for failed API calls
  • Limited to ElevenLabs voice models and Shotstack processing capabilities
  • File size limitations based on external service constraints
  • Sequential processing may be slow for very large scripts

No related workflows are explicitly referenced in the current workflow configuration.

Setup Instructions

  1. Import Workflow: Import the JSON workflow definition into your n8n instance

  2. Configure Credentials:

    • Set up ElevenLabs API credentials with header authentication
    • Configure Shotstack API key
    • Add Cloudinary API credentials
    • Set up Airtable API token with access to the E-Lab Script Generator Logs base
    • Configure Google Drive OAuth2 credentials
  3. Verify External Services:

    • Ensure Airtable base appeEMxtMrmeiVNW0 exists with required tables
    • Confirm Google Drive folder 1M3uHDGoEn1atGs1lBVYOdJyTmpMWAUhO is accessible
    • Test Cloudinary upload permissions
  4. Voice Configuration:

    • Verify ElevenLabs voice IDs are valid and accessible
    • Update character voice mappings if needed
    • Test voice generation with sample text
  5. Activate Workflow: Enable the workflow to start accepting form submissions

  6. Test: Submit a sample lesson script through the form trigger to verify end-to-end functionality

The workflow will be accessible via the generated form URL once activated.