E-Lab Audio Generation Workflow

This workflow automates the conversion of educational scripts into high-quality audio files for E-Lab training programs. It processes uploaded script files, extracts text segments, generates voice audio using AI text-to-speech, and delivers final merged audio files to Google Drive for distribution.

Purpose

No business context provided yet — add a context.md to enrich this documentation.

How It Works

  1. Script Upload: Users submit lesson scripts via a web form with lesson number, name, and script file
  2. Text Extraction: The workflow extracts text content from uploaded script files (.txt or .doc)
  3. Script Parsing: Content is analyzed to identify individual segments and dialogue sections
  4. Voice Assignment: Each segment is assigned a voice, with a single narrator voice for regular content and character-specific voices for dialogue
  5. Audio Generation: Text segments are converted to speech using ElevenLabs AI voices
  6. Audio Processing: Generated audio files are uploaded to Cloudinary for temporary storage
  7. Audio Merging: The Shotstack API combines the individual audio clips, separated by silence gaps, into one complete lesson audio file
  8. Final Delivery: Merged audio files are uploaded to Google Drive and tracking records are updated
  9. Progress Tracking: PostgreSQL database and Airtable maintain processing status throughout the workflow
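
Steps 3 and 4 can be sketched roughly as follows. The script format is not documented here, so this sketch assumes that blank lines separate segments and that dialogue lines look like `Name: text`. The voice IDs are placeholders, and using Linda as the narrator is an assumption; the workflow's Code nodes may work differently.

```python
import re

# Placeholder voice IDs; the real ElevenLabs IDs are hardcoded in the workflow's nodes.
VOICE_IDS = {
    "Linda": "voice-id-linda",
    "James": "voice-id-james",
    "Mark": "voice-id-mark",
    "Jane": "voice-id-jane",
    "Peter": "voice-id-peter",
}
NARRATOR = "Linda"  # assumption: the docs do not say which voice narrates

# Assumed dialogue line shape: "Name: spoken text"
SPEAKER_LINE = re.compile(r"^(?P<name>[A-Z][a-z]+):\s*(?P<text>.+)$")


def parse_script(script: str) -> list[dict]:
    """Split a script on blank lines and tag each block as single or dialogue."""
    segments = []
    for block in re.split(r"\n\s*\n", script.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        matches = [SPEAKER_LINE.match(ln) for ln in lines]
        # A block counts as dialogue only if every line names a known speaker.
        if lines and all(m and m["name"] in VOICE_IDS for m in matches):
            turns = [
                {"speaker": m["name"], "voice_id": VOICE_IDS[m["name"]], "text": m["text"]}
                for m in matches
            ]
            segments.append({"type": "dialogue", "turns": turns})
        elif lines:
            segments.append(
                {"type": "single", "voice_id": VOICE_IDS[NARRATOR], "text": " ".join(lines)}
            )
    return segments
```

Each returned segment carries its type, which is what the later routing on single versus dialogue segments would key on.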

Workflow Diagram

graph TD
    A[Form Submit] --> B[Extract from File]
    B --> C[Parse Script]
    C --> D[Create Record]
    D --> E[Assign Voice]
    E --> F[Loop Over Items]
    F --> G{Segment Type?}
    G -->|Single| H[Clean Text]
    G -->|Dialogue| I[Parse Dialogue]
    H --> J[Generate Voice]
    I --> K[Generate Voice for Dialogue]
    J --> L[Upload to Cloudinary]
    K --> M[Upload to Cloudinary]
    L --> N[Save to Static Data]
    M --> O[Parse Audio Links]
    N --> P[Collect and Group Audios]
    O --> Q[Send to Shotstack]
    P --> R[Send to Shotstack]
    Q --> S[Wait for Processing]
    R --> T[Wait for Processing]
    S --> U[Check Status]
    T --> V[Check Status]
    U --> W[Download Audio]
    V --> X[Download Audio]
    W --> Y[Upload to Google Drive]
    X --> Z[Upload to Google Drive]
    Y --> AA[Update Records]
    Z --> BB[Update Records]

Trigger

Form Trigger: Web form accepting:

  • lesson_number (number, required)
  • lesson_name (text, required)
  • script_content (file upload, .txt/.doc, required)
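
As an illustration of what validating these fields might look like, here is a minimal sketch. The `script_filename` key is hypothetical (the form itself only defines `script_content`), and the checks are a guess at reasonable rules rather than the workflow's actual validation.

```python
from pathlib import PurePath

ALLOWED_EXTENSIONS = {".txt", ".doc"}


def validate_submission(form: dict) -> list[str]:
    """Return a list of problems; an empty list means the submission looks valid."""
    errors = []
    number = form.get("lesson_number")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(number, bool) or not isinstance(number, (int, float)):
        errors.append("lesson_number must be a number")
    name = form.get("lesson_name")
    if not isinstance(name, str) or not name.strip():
        errors.append("lesson_name is required")
    # Hypothetical key holding the uploaded file's original name.
    filename = form.get("script_filename")
    if not isinstance(filename, str) or PurePath(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("script_content must be a .txt or .doc file")
    return errors
```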

Nodes Used

Node Type            Purpose
Form Trigger         Accepts script uploads via web form
Extract from File    Extracts text content from uploaded files
Code                 Parses scripts, assigns voices, processes audio data
HTTP Request         Calls ElevenLabs API for voice generation and Shotstack for audio merging
Cloudinary           Temporary storage for individual audio files
Google Drive         Final storage destination for completed audio files
Airtable             Tracks processing status and metadata
PostgreSQL           Maintains detailed processing logs
Split in Batches     Processes segments individually with rate limiting
If                   Routes segments based on type (single vs dialogue)
Wait                 Implements delays for API processing

External Services & Credentials Required

  • ElevenLabs API: Text-to-speech generation
    • Credential: E-lab (HTTP Header Auth)
    • Voice IDs for Linda, James, Mark, Jane, Peter
  • Shotstack API: Audio merging and processing
    • API keys for staging and production environments
  • Cloudinary: Temporary audio file storage
    • Credential: Cloudinary account
  • Google Drive: Final audio file storage
    • Credential: Google Drive account 2 (OAuth2)
  • Airtable: Progress tracking
    • Credential: EXP Training Bot (Token API)
    • Base: E-Lab Script Generator Logs
  • PostgreSQL: Detailed logging
    • Credential: elab database connection
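
To make the ElevenLabs header-auth setup concrete, the sketch below builds (but does not send) a text-to-speech request outside n8n. It assumes the public ElevenLabs endpoint `POST /v1/text-to-speech/{voice_id}` with the key passed in an `xi-api-key` header; the function name and arguments are illustrative, and the workflow's HTTP Request node may send additional fields such as a model or voice settings.

```python
import json
import urllib.request

ELEVENLABS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for one segment."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; the response body is expected to be the generated audio bytes.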

Environment Variables

No explicit environment variables are used. All configuration is handled through n8n credentials and hardcoded values within nodes.

Data Flow

Input:

  • Lesson number and name
  • Script file (.txt or .doc format)

Processing:

  • Script segments with assigned voice IDs
  • Individual MP3 audio files
  • Merged audio with silence gaps
  • Processing status updates

Output:

  • Complete lesson audio file in Google Drive
  • Updated tracking records in Airtable and PostgreSQL
  • Drive folder links for access
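
The "merged audio with silence gaps" step can be pictured as placing clips on a timeline with empty time between them. The sketch below assumes each clip's duration is already known, uses an arbitrary 0.5 second gap, and models the payload on Shotstack's edit JSON (`timeline`, `tracks`, `clips`, `asset`, `output`); the exact fields the workflow sends should be checked against the actual node configuration.

```python
def build_merge_payload(clips: list[dict], gap_seconds: float = 0.5) -> dict:
    """Lay audio clips end to end on one track, leaving a silent gap between them."""
    timeline_clips = []
    start = 0.0
    for clip in clips:
        timeline_clips.append(
            {
                "asset": {"type": "audio", "src": clip["url"]},
                "start": round(start, 3),
                "length": clip["duration"],
            }
        )
        # The next clip begins after this one ends plus the silent gap.
        start += clip["duration"] + gap_seconds
    return {
        "timeline": {"tracks": [{"clips": timeline_clips}]},
        "output": {"format": "mp3"},
    }
```

Here silence is simply unoccupied time between clips; the workflow may instead insert explicit silent assets.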

Error Handling

The workflow includes basic error handling:

  • Validation checks for required form data
  • Status verification for Shotstack rendering completion
  • Conditional routing based on segment types
  • Wait nodes to handle API processing delays
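
The wait-then-check pattern used for Shotstack rendering amounts to a polling loop. A minimal sketch, assuming the status check returns a string such as "queued", "rendering", "done", or "failed" (the interval and attempt limit are illustrative, and `get_status` stands in for the real API call):

```python
import time
from typing import Callable


def wait_for_render(
    get_status: Callable[[], str],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the render reports "done"; raise on failure or timeout."""
    for _ in range(max_attempts):
        status = get_status()
        if status == "done":
            return status
        if status == "failed":
            raise RuntimeError("render failed")
        sleep(interval_seconds)  # still in progress: wait before checking again
    raise TimeoutError(f"render not finished after {max_attempts} checks")
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.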

No explicit error recovery or notification mechanisms are implemented.

Known Limitations

Based on the workflow structure:

  • Limited to .txt and .doc file formats
  • Hardcoded voice assignments may not suit all content types
  • No automatic retry mechanism for failed API calls
  • Processing time depends on script length and external API performance
  • Static data storage may not persist across workflow restarts
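
For the missing retry mechanism, one possible approach is to wrap each external call in a retry with exponential backoff. This is a sketch of a technique the workflow does not currently implement; the attempt count and delays are arbitrary, and `call` stands in for any API request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on any exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... the base delay
    raise ValueError("attempts must be at least 1")
```

In practice the `except` clause would likely be narrowed to transient errors such as timeouts or rate limits, so that permanent failures are not retried.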

No related workflows are mentioned in the available context.

Setup Instructions

  1. Import Workflow: Import the JSON workflow definition into your n8n instance

  2. Configure Credentials:

    • Set up ElevenLabs API credentials with voice access
    • Configure Shotstack API keys for both environments
    • Connect Cloudinary account for file storage
    • Set up Google Drive OAuth2 for file uploads
    • Configure Airtable API access to the logging base
    • Set up PostgreSQL database connection
  3. Verify External Services:

    • Test ElevenLabs voice generation with sample text
    • Confirm Shotstack audio merging capabilities
    • Validate Google Drive folder permissions
    • Check Airtable base structure matches expected schema
    • Verify PostgreSQL table schema for logging
  4. Test Workflow:

    • Submit a test script through the form trigger
    • Monitor processing through each stage
    • Verify final audio output in Google Drive
    • Check tracking records in both Airtable and PostgreSQL
  5. Production Deployment:

    • Update any staging API keys to production
    • Configure appropriate Google Drive folder destinations
    • Set up monitoring for workflow execution status