E-Lab Audio Generation Workflow

This workflow automates the conversion of educational scripts into high-quality audio files for E-Lab training programs. It processes uploaded script files, extracts text segments, generates voice audio using AI text-to-speech, and delivers final merged audio files to Google Drive for distribution.

Purpose

No business context has been provided yet. Add a context.md to enrich this documentation.

How It Works

1. Script Upload: Users submit a lesson script through a web form, along with the lesson number and name.
2. Text Extraction: The workflow extracts the text content from the uploaded script file (.txt or .doc).
3. Script Parsing: The content is split into individual segments, and dialogue sections are identified.
4. Voice Assignment: Each segment is assigned a voice: a single narrator voice for regular content, character-specific voices for dialogue.
5. Audio Generation: Text segments are converted to speech using ElevenLabs AI voices.
6. Audio Processing: The generated audio files are uploaded to Cloudinary for temporary storage.
7. Audio Merging: The Shotstack API combines the individual audio clips, separated by silence gaps, into a complete lesson audio file.
8. Final Delivery: The merged audio file is uploaded to Google Drive, and the tracking records are updated.
9. Progress Tracking: A PostgreSQL database and Airtable maintain processing status throughout the workflow.
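
The parsing step can be illustrated with a minimal sketch. The script format is not documented here, so the conventions below are assumptions: blank lines separate segments, and a dialogue segment is one where every line looks like `Name: spoken text`. The names `Segment`, `parse_script`, and `SPEAKER_LINE` are hypothetical and do not come from the workflow's Code nodes.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed convention: a dialogue line looks like "Name: spoken text".
SPEAKER_LINE = re.compile(r"^([A-Z][a-zA-Z]+):\s*(.+)$")


@dataclass
class Segment:
    kind: str  # "single" or "dialogue"
    lines: List[Tuple[Optional[str], str]]  # (speaker, text); speaker is None for narration


def parse_script(script: str) -> List[Segment]:
    """Split a script into narration and dialogue segments."""
    segments: List[Segment] = []
    # Assumed convention: blank lines separate segments.
    for block in re.split(r"\n\s*\n", script.strip()):
        rows = [line.strip() for line in block.splitlines() if line.strip()]
        if not rows:
            continue
        matches = [SPEAKER_LINE.match(row) for row in rows]
        if all(matches):
            # Every line names a speaker, so treat the block as dialogue.
            turns = [(m.group(1), m.group(2)) for m in matches]
            segments.append(Segment("dialogue", turns))
        else:
            # Otherwise join the lines into one narrated passage.
            segments.append(Segment("single", [(None, " ".join(rows))]))
    return segments
```

Each resulting segment carries its type, which is what the later routing between the single-voice and dialogue branches depends on.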

Workflow Diagram

graph TD
    A[Form Submit] --> B[Extract from File]
    B --> C[Parse Script]
    C --> D[Create Record]
    D --> E[Assign Voice]
    E --> F[Loop Over Items]
    F --> G{Segment Type?}
    G -->|Single| H[Clean Text]
    G -->|Dialogue| I[Parse Dialogue]
    H --> J[Generate Voice]
    I --> K[Generate Voice for Dialogue]
    J --> L[Upload to Cloudinary]
    K --> M[Upload to Cloudinary]
    L --> N[Save to Static Data]
    M --> O[Parse Audio Links]
    N --> P[Collect and Group Audios]
    O --> Q[Send to Shotstack]
    P --> R[Send to Shotstack]
    Q --> S[Wait for Processing]
    R --> T[Wait for Processing]
    S --> U[Check Status]
    T --> V[Check Status]
    U --> W[Download Audio]
    V --> X[Download Audio]
    W --> Y[Upload to Google Drive]
    X --> Z[Upload to Google Drive]
    Y --> AA[Update Records]
    Z --> BB[Update Records]

Trigger

Form Trigger: Web form accepting:
- lesson_number (number, required)
- lesson_name (text, required)
- script_content (file upload, .txt/.doc, required)
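
As a rough illustration of the checks these three fields imply, the sketch below validates a submission represented as a plain dictionary. It assumes, for illustration only, that `script_content` holds the uploaded file's name; the function name `validate_submission` and the error messages are hypothetical and are not taken from the workflow.

```python
from pathlib import Path
from typing import List

ALLOWED_EXTENSIONS = {".txt", ".doc"}


def validate_submission(form: dict) -> List[str]:
    """Return a list of validation errors; an empty list means the form is valid."""
    errors: List[str] = []

    number = form.get("lesson_number")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(number, bool) or not isinstance(number, (int, float)):
        errors.append("lesson_number must be a number")

    name = form.get("lesson_name")
    if not isinstance(name, str) or not name.strip():
        errors.append("lesson_name is required")

    # Assumption: script_content carries the uploaded file's name.
    filename = form.get("script_content")
    if not filename:
        errors.append("script_content file is required")
    elif Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("script_content must be a .txt or .doc file")

    return errors
```

For example, `validate_submission({"lesson_number": 3, "lesson_name": "Greetings", "script_content": "lesson3.txt"})` returns an empty list.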

Nodes Used

Node Type	Purpose
Form Trigger	Accepts script uploads via web form
Extract from File	Extracts text content from uploaded files
Code	Parses scripts, assigns voices, processes audio data
HTTP Request	Calls ElevenLabs API for voice generation and Shotstack for audio merging
Cloudinary	Temporary storage for individual audio files
Google Drive	Final storage destination for completed audio files
Airtable	Tracks processing status and metadata
PostgreSQL	Maintains detailed processing logs
Split in Batches	Processes segments individually with rate limiting
If	Routes segments based on type (single vs dialogue)
Wait	Implements delays for API processing

External Services & Credentials Required

ElevenLabs API: Text-to-speech generation
- Credential: E-lab (HTTP Header Auth)
- Voice IDs for Linda, James, Mark, Jane, Peter
Shotstack API: Audio merging and processing
- API keys for staging and production environments
Cloudinary: Temporary audio file storage
- Credential: Cloudinary account
Google Drive: Final audio file storage
- Credential: Google Drive account 2 (OAuth2)
Airtable: Progress tracking
- Credential: EXP Training Bot (Token API)
- Base: E-Lab Script Generator Logs
PostgreSQL: Detailed logging
- Credential: elab database connection
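
To show how a voice ID and the header-based credential fit together, here is a sketch that builds (but does not send) a text-to-speech request. The endpoint path and the `xi-api-key` header follow ElevenLabs' public API. Everything else is an assumption: the voice IDs are placeholders, the choice of Linda as narrator is hypothetical, and the model ID is only an example, since the workflow's actual values live inside its nodes.

```python
import json
import urllib.request
from typing import Optional

ELEVENLABS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

# Placeholder IDs: the real values are hardcoded in the workflow's nodes.
VOICE_IDS = {
    "Linda": "VOICE_ID_LINDA",
    "James": "VOICE_ID_JAMES",
    "Mark": "VOICE_ID_MARK",
    "Jane": "VOICE_ID_JANE",
    "Peter": "VOICE_ID_PETER",
}
NARRATOR = "Linda"  # Assumption: which voice narrates is not documented.


def build_tts_request(
    text: str,
    speaker: Optional[str],
    api_key: str,
    model_id: str = "eleven_multilingual_v2",  # Example model, not confirmed by the workflow.
) -> urllib.request.Request:
    """Build a POST request for one segment; unknown speakers fall back to the narrator."""
    voice_id = VOICE_IDS.get(speaker or NARRATOR, VOICE_IDS[NARRATOR])
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_URL.format(voice_id=voice_id),
        data=body,
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call and return the audio bytes; in n8n the equivalent request is made by the HTTP Request node using the E-lab credential.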

Environment Variables

No explicit environment variables are used. All configuration is handled through n8n credentials and hardcoded values within nodes.

Data Flow

Input:
- Lesson number and name
- Script file (.txt or .doc format)

Processing:
- Script segments with assigned voice IDs
- Individual MP3 audio files
- Merged audio with silence gaps
- Processing status updates

Output:
- Complete lesson audio file in Google Drive
- Updated tracking records in Airtable and PostgreSQL
- Drive folder links for access
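
The "merged audio with silence gaps" step comes down to placing each clip on a timeline so that the next clip starts after the previous one ends plus a pause. The sketch below shows that arithmetic and wraps it in a payload shaped like a Shotstack edit. The 0.5-second gap, the way clip durations are obtained, and the function name `build_timeline` are assumptions; the payload field names reflect my understanding of Shotstack's Edit API and should be checked against its reference before use.

```python
from typing import Dict, List, Tuple


def build_timeline(clips: List[Tuple[str, float]], gap_seconds: float = 0.5) -> Dict:
    """Lay out (url, duration_seconds) clips back to back, separated by silence."""
    timeline_clips = []
    cursor = 0.0  # Start time of the next clip, in seconds.
    for url, duration in clips:
        timeline_clips.append(
            {
                "asset": {"type": "audio", "src": url},
                "start": round(cursor, 3),
                "length": duration,
            }
        )
        # Leaving the timeline empty for gap_seconds produces the silence.
        cursor += duration + gap_seconds
    return {
        "timeline": {"tracks": [{"clips": timeline_clips}]},
        "output": {"format": "mp3"},
    }
```

With clips of 2.0, 3.5, and 1.0 seconds and the default gap, the start times come out as 0.0, 2.5, and 6.5 seconds.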

Error Handling

The workflow includes basic error handling:
- Validation checks for required form data
- Status verification for Shotstack rendering completion
- Conditional routing based on segment types
- Wait nodes to handle API processing delays

No explicit error recovery or notification mechanisms are implemented.
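
The combination of a Wait node and a status check amounts to a polling loop. The following sketch shows one bounded version of that pattern. It is not the workflow's implementation: the attempt limit, the delay, and the function name `wait_for_render` are hypothetical, and the `done` and `failed` status strings are assumed to match what Shotstack reports for a render.

```python
import time
from typing import Callable


def wait_for_render(
    fetch_status: Callable[[], str],
    max_attempts: int = 20,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the render finishes; return the number of status checks made."""
    for attempt in range(1, max_attempts + 1):
        status = fetch_status()
        if status == "done":
            return attempt
        if status == "failed":
            raise RuntimeError("render failed")
        # Any other status means the render is still in progress: wait and retry.
        sleep(delay_seconds)
    raise TimeoutError(f"render not finished after {max_attempts} checks")
```

Bounding the number of attempts keeps a stuck render from holding an execution open indefinitely, and raising on `failed` gives a single place where a notification step could later be attached.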

Known Limitations

Based on the workflow structure:
- Limited to .txt and .doc file formats
- Hardcoded voice assignments may not suit all content types
- No automatic retry mechanism for failed API calls
- Processing time depends on script length and external API performance
- Static data storage may not persist across workflow restarts

No related workflows are mentioned in the available context.

Setup Instructions

1. Import Workflow: Import the JSON workflow definition into your n8n instance
2. Configure Credentials:
- Set up ElevenLabs API credentials with voice access
- Configure Shotstack API keys for both environments
- Connect Cloudinary account for file storage
- Set up Google Drive OAuth2 for file uploads
- Configure Airtable API access to the logging base
- Set up PostgreSQL database connection
3. Verify External Services:
- Test ElevenLabs voice generation with sample text
- Confirm Shotstack audio merging capabilities
- Validate Google Drive folder permissions
- Check that the Airtable base structure matches the expected schema
- Verify PostgreSQL table schema for logging
4. Test Workflow:
- Submit a test script through the form trigger
- Monitor processing through each stage
- Verify final audio output in Google Drive
- Check tracking records in both Airtable and PostgreSQL
5. Production Deployment:
- Replace any staging API keys with production keys
- Configure appropriate Google Drive folder destinations
- Set up monitoring for workflow execution status