E-Lab Audio Production - Ameer

This workflow automates the conversion of E-Lab lesson scripts into AI-generated audio files. It accepts script uploads through a web form, parses the content to identify different segments and speakers, generates audio using ElevenLabs text-to-speech with multiple African English voices, and organizes the final audio files in Google Drive with full tracking in Airtable.

Purpose

No business context has been provided yet. Add a context.md to enrich this documentation.

How It Works

  1. Script Submission: Users submit lesson scripts through a web form, entering the lesson number and lesson name and uploading the script as a text (.txt) or Word (.docx) document
  2. Record Creation: A new record is created in Airtable to track the processing status and metadata
  3. Status Update: The record status is updated to "Processing" to indicate work has begun
  4. Script Parsing: The uploaded script is analyzed to extract narrated segments, identify dialogue vs. single-voice content, and assign appropriate voices based on character roles and rotation
  5. Content Routing: The workflow splits into two paths: dialogue segments go through multi-voice processing, while single-voice segments go directly to audio generation
  6. Audio Generation: Text is converted to speech using ElevenLabs API with assigned voice IDs for African English speakers
  7. Dialogue Processing: Multi-speaker dialogue segments are split into individual parts, each generated separately, then merged with silence gaps using FFmpeg
  8. File Upload: All generated audio files are uploaded to a designated Google Drive folder
  9. Final Processing: Statistics are calculated including processing time and file counts, then the Airtable record is updated with completion status and Drive links
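
The parsing and routing steps above can be illustrated with a minimal Python sketch. It is not the workflow's actual Code node logic. The role-to-voice mapping, the narrator rotation order, and the `Speaker: text` line convention are all assumptions made for illustration.

```python
import re
from itertools import cycle

# Hypothetical role-to-voice mapping; the real assignments live in the workflow's Code node.
ROLE_VOICES = {"Guide": "Linda", "Practitioner": "James"}
# Hypothetical rotation order for narrated (single-voice) segments.
NARRATOR_ROTATION = ["Mark", "Jane", "Peter"]

# Assumed dialogue convention: each spoken line starts with "Speaker: ".
SPEAKER_LINE = re.compile(r"^([A-Za-z ]+):\s*(.+)$")


def split_dialogue(text):
    """Return (speaker, line) pairs for lines that match the assumed convention."""
    parts = []
    for line in text.splitlines():
        match = SPEAKER_LINE.match(line.strip())
        if match:
            parts.append((match.group(1).strip(), match.group(2).strip()))
    return parts


def route_segment(text, narrators):
    """Classify a segment as dialogue or single-voice and attach voice names."""
    parts = split_dialogue(text)
    if len({speaker for speaker, _ in parts}) >= 2:
        voiced = [
            # Known roles get their mapped voice; unknown speakers take the next narrator.
            {"speaker": s, "voice": ROLE_VOICES.get(s) or next(narrators), "text": t}
            for s, t in parts
        ]
        return {"is_dialogue": True, "parts": voiced}
    return {"is_dialogue": False, "voice": next(narrators), "text": text.strip()}
```

A caller would create one shared rotation with `cycle(NARRATOR_ROTATION)` and pass it to every `route_segment` call so that narrator voices alternate across segments.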

Workflow Diagram

graph TD
    A[Script Submission Form] --> B[Create Airtable Record]
    B --> C[Update Status Processing]
    C --> D[Parse Script]
    D --> E{Is Dialogue?}
    E -->|No| F[Generate Single Audio]
    E -->|Yes| G[Split Dialogue Parts]
    F --> H[Upload Single Audio to Drive]
    G --> I[Generate Dialogue Part]
    I --> J[Wait for All Dialogue Parts]
    J --> K[Merge Dialogue with FFmpeg]
    K --> L[Upload Dialogue Audio to Drive]
    H --> M[Merge All Audio Files]
    L --> M
    M --> N[Calculate Final Stats]
    N --> O[Update Airtable Complete]

    %% Error handling path
    D -.-> P[Update Airtable Failed]
    F -.-> P
    G -.-> P
    I -.-> P
    K -.-> P

Trigger

Form Trigger: A web form accessible at a specific webhook URL that accepts:

  • Lesson Number (required number field)
  • Lesson Name (required text field)
  • Script File (required file upload, accepts .txt and .docx files)
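
The form enforces these constraints itself. If the same checks are needed downstream (for example when testing the parser outside n8n), they could be mirrored as in the sketch below. The function name and error messages are hypothetical.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".docx"}


def validate_submission(lesson_number, lesson_name, filename):
    """Return a list of validation errors; an empty list means the input is acceptable."""
    errors = []
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(lesson_number, bool) or not isinstance(lesson_number, (int, float)):
        errors.append("Lesson Number must be a number")
    if not str(lesson_name or "").strip():
        errors.append("Lesson Name is required")
    if Path(filename or "").suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("Script File must be .txt or .docx")
    return errors
```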

Nodes Used

| Node Type | Purpose |
| --- | --- |
| Form Trigger | Provides web interface for script submission |
| Airtable (Create) | Creates initial tracking record |
| Airtable (Update) | Updates processing status and final results |
| Code | Parses script content and extracts segments with voice assignments |
| If | Routes content based on dialogue vs single-voice detection |
| HTTP Request | Calls ElevenLabs API for text-to-speech generation |
| Code | Splits dialogue into individual speaker parts |
| Aggregate | Waits for all dialogue parts to complete before merging |
| Code | Merges dialogue audio files using FFmpeg |
| Google Drive | Uploads generated audio files to cloud storage |
| Merge | Combines single and dialogue audio file results |
| Code | Calculates final statistics and metadata |

External Services & Credentials Required

ElevenLabs API

  • Purpose: Text-to-speech generation with African English voices
  • Credential: API key for authentication
  • Voices Used: Linda, James, Mark, Jane, Peter (configured via environment variables)
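
For orientation, the text-to-speech call made by the HTTP Request node can be approximated in Python as below. The endpoint and `xi-api-key` header follow the public ElevenLabs API. The `model_id` default is an assumption, since the workflow's configured model is not documented here. The sketch only builds the request and sends nothing.

```python
import json
import urllib.request


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request for one segment."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    # model_id is an assumed default; use whichever model the workflow is configured with.
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` and reading the response would yield the audio bytes for that segment.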

Airtable

  • Purpose: Project tracking and status management
  • Credential: Airtable API token
  • Tables: Scripts table for recording processing status and metadata

Google Drive

  • Purpose: Audio file storage and sharing
  • Credential: Google Drive OAuth2 API access
  • Permissions: File upload and folder access

Environment Variables

| Variable | Description |
| --- | --- |
| ELEVENLABS_API_KEY | API key for ElevenLabs text-to-speech service |
| ELEVENLABS_VOICE_LINDA | Voice ID for Linda character |
| ELEVENLABS_VOICE_JAMES | Voice ID for James character |
| ELEVENLABS_VOICE_MARK | Voice ID for Mark character |
| ELEVENLABS_VOICE_JANE | Voice ID for Jane character |
| ELEVENLABS_VOICE_PETER | Voice ID for Peter character |
| AIRTABLE_BASE_ID | Airtable base identifier for the Scripts table |
| GOOGLE_DRIVE_FOLDER_ID | Target folder ID for audio file uploads |
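
A quick way to confirm that every variable is set before running the workflow is a check like the following sketch. The `load_config` helper and the shape of the returned dictionary are illustrative and not part of the workflow.

```python
import os

REQUIRED_VARS = [
    "ELEVENLABS_API_KEY",
    "ELEVENLABS_VOICE_LINDA",
    "ELEVENLABS_VOICE_JAMES",
    "ELEVENLABS_VOICE_MARK",
    "ELEVENLABS_VOICE_JANE",
    "ELEVENLABS_VOICE_PETER",
    "AIRTABLE_BASE_ID",
    "GOOGLE_DRIVE_FOLDER_ID",
]


def load_config(env=None):
    """Read the required variables and fail fast if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    prefix = "ELEVENLABS_VOICE_"
    # Map character names ("Linda", "James", ...) to their configured voice IDs.
    voices = {
        name[len(prefix):].title(): env[name]
        for name in REQUIRED_VARS
        if name.startswith(prefix)
    }
    return {
        "api_key": env["ELEVENLABS_API_KEY"],
        "voices": voices,
        "airtable_base_id": env["AIRTABLE_BASE_ID"],
        "drive_folder_id": env["GOOGLE_DRIVE_FOLDER_ID"],
    }
```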

Data Flow

Input

  • Lesson Number: Integer identifying the lesson sequence
  • Lesson Name: Descriptive text for the lesson
  • Script File: Text or Word document containing structured lesson content with segment markers (EDU_X_Y_Z format)
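
The exact marker layout is not specified beyond the EDU_X_Y_Z pattern. The sketch below assumes that X, Y, and Z are integers, that each marker starts a line (optionally followed by a colon and text), and that the script has already been extracted to plain text. Under those assumptions, segments could be collected like this:

```python
import re

# Assumed marker shape: "EDU_<int>_<int>_<int>" at the start of a line, optional colon after it.
MARKER = re.compile(r"^\s*(EDU_\d+_\d+_\d+)\s*:?\s*(.*)$")


def parse_segments(script):
    """Split plain-text script content into segments keyed by their EDU marker."""
    segments = []
    current = None
    for line in script.splitlines():
        match = MARKER.match(line)
        if match:
            current = {"id": match.group(1), "lines": []}
            segments.append(current)
            rest = match.group(2).strip()
            if rest:
                current["lines"].append(rest)
        elif current is not None and line.strip():
            # Text before the first marker is ignored; blank lines are skipped.
            current["lines"].append(line.strip())
    return [{"id": s["id"], "text": "\n".join(s["lines"])} for s in segments]
```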

Processing

  • Script content is parsed to identify segments marked with EDU patterns
  • Voice assignments are made based on character names (Guide, Practitioner, etc.) or narrator rotation
  • Audio generation creates MP3 files for each segment
  • Dialogue segments are merged with 400ms silence gaps between speakers
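
How the workflow invokes FFmpeg is not shown in this document. One possible way to join dialogue parts with a 400ms gap is to pad every part except the last with silence using the `apad` filter, then join them with the `concat` filter. The sketch below only builds the command line. The function name is hypothetical, and `pad_dur` requires a reasonably recent FFmpeg build.

```python
def build_merge_command(part_paths, output_path, gap_seconds=0.4):
    """Build an ffmpeg command that joins parts with silence between speakers."""
    if len(part_paths) < 2:
        raise ValueError("dialogue merging needs at least two parts")
    cmd = ["ffmpeg", "-y"]
    for path in part_paths:
        cmd += ["-i", path]
    chains = []
    labels = []
    last = len(part_paths) - 1
    for i in range(len(part_paths)):
        if i < last:
            # Append trailing silence to every part except the final one.
            chains.append(f"[{i}:a]apad=pad_dur={gap_seconds}[p{i}]")
            labels.append(f"[p{i}]")
        else:
            labels.append(f"[{i}:a]")
    joined = "".join(labels)
    chains.append(f"{joined}concat=n={len(part_paths)}:v=0:a=1[out]")
    cmd += ["-filter_complex", ";".join(chains), "-map", "[out]", output_path]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where FFmpeg is installed.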

Output

  • Audio Files: MP3 files uploaded to Google Drive, organized by segment ID
  • Airtable Record: Updated with processing statistics, file links, and completion status
  • Metadata: JSON structure containing file details and Drive URLs
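
The metadata structure itself is not documented here. As a rough illustration only, it might resemble the output of the sketch below. All key names and the `<segment_id>.mp3` file naming are assumptions. The two link formats are the standard Google Drive folder and file URL patterns.

```python
def build_metadata(lesson_number, files, folder_id):
    """Assemble a hypothetical metadata structure from uploaded file details."""
    return {
        "lesson_number": lesson_number,
        "total_segments": len(files),
        "drive_folder_link": f"https://drive.google.com/drive/folders/{folder_id}",
        "audio_files": [
            {
                "segment_id": item["segment_id"],
                # Assumed naming scheme: one MP3 per segment, named after its ID.
                "file_name": f"{item['segment_id']}.mp3",
                "drive_url": f"https://drive.google.com/file/d/{item['file_id']}/view",
            }
            for item in files
        ],
    }
```

Serializing the result with `json.dumps` would produce text suitable for a long-text field such as Audio Files.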

Error Handling

The workflow includes error handling for failed processing:

  • Error Capture: If any step fails, the workflow updates the Airtable record with "Failed" status
  • Error Logging: Error messages are captured and stored in the "Error Log" field
  • Graceful Degradation: The error handling node is configured to continue execution rather than stopping the workflow
  • Cleanup: Temporary files created during FFmpeg processing are automatically removed even on failure
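
The behavior described above (record the failure, keep going, always remove temporary files) corresponds to a try/except/finally pattern. The following Python sketch shows the idea. `step` and `update_record` are hypothetical callables standing in for the FFmpeg merge and the Airtable update.

```python
import shutil
import tempfile


def process_with_cleanup(step, update_record):
    """Run one processing step in a temp directory, logging failures to the record."""
    workdir = tempfile.mkdtemp(prefix="dialogue_")
    try:
        result = step(workdir)
        update_record({"Status": "Complete"})
        return result
    except Exception as exc:
        # Record the failure instead of re-raising so the caller can continue.
        update_record({"Status": "Failed", "Error Log": str(exc)})
        return None
    finally:
        # Runs on success and on failure, so temporary files never linger.
        shutil.rmtree(workdir, ignore_errors=True)
```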

Known Limitations

  • Script files must follow the specific EDU_X_Y_Z segment marking format
  • Only supports .txt and .docx file formats for script upload
  • Requires FFmpeg to be available on the execution environment for dialogue merging
  • Voice assignments are hardcoded to specific character names and rotation patterns
  • Processing time depends on script length and ElevenLabs API response times

Related Workflows

No related workflows identified in the current configuration.

Setup Instructions

  1. Import Workflow: Import the JSON configuration into your n8n instance

  2. Configure Credentials:

    • Set up ElevenLabs API credentials with your API key
    • Configure Airtable API token with access to your base
    • Set up Google Drive OAuth2 credentials with file upload permissions
  3. Environment Variables: Configure all required environment variables in your n8n instance

  4. Airtable Setup: Create a "Scripts" table with the following fields:

    • Script ID (text)
    • Script Name (text)
    • Lesson Number (number)
    • Lesson Name (text)
    • Upload Date (date)
    • Status (single select: Pending, Processing, Complete, Failed)
    • Total Segments (number)
    • Processing Time (number)
    • Drive Folder Link (URL)
    • Audio Files (long text)
    • Error Log (long text)
  5. Google Drive Setup: Create a dedicated folder for audio files and note the folder ID

  6. Voice Configuration: Obtain voice IDs from ElevenLabs for each required character voice

  7. Test: Submit a test script through the form to verify the complete workflow

  8. Activate: Enable the workflow to start accepting script submissions
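
To verify the Airtable table setup independently of n8n, a record-creation request against the Scripts table could be built as in the sketch below. The URL shape and Bearer token header follow the Airtable REST API. The Script ID format, the use of the lesson name as Script Name, and the initial "Pending" status are assumptions, and the request is only constructed, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_create_record_request(base_id, token, lesson_number, lesson_name,
                                upload_date, table="Scripts"):
    """Build (but do not send) a request that creates one tracking record."""
    url = f"https://api.airtable.com/v0/{base_id}/{urllib.parse.quote(table)}"
    fields = {
        "Script ID": f"lesson-{lesson_number}",  # hypothetical ID format
        "Script Name": lesson_name,              # assumed to mirror the lesson name
        "Lesson Number": lesson_number,
        "Lesson Name": lesson_name,
        "Upload Date": upload_date,              # ISO date string, e.g. "2024-01-31"
        "Status": "Pending",                     # assumed initial status
    }
    return urllib.request.Request(
        url,
        data=json.dumps({"fields": fields}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```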