E LAB AUDIO GENERATION
This workflow automates the conversion of educational lesson scripts into high-quality audio files using AI text-to-speech technology. It processes structured lesson scripts, generates voice narration with multiple AI voices, handles both single-narrator and dialogue segments, merges audio clips, and delivers the final audio files to Google Drive with comprehensive tracking in Airtable.
Purpose
No business context provided yet — add a context.md to enrich this documentation.
How It Works
- Script Upload: Users submit lesson scripts through a web form, providing lesson number, name, and script file
- Script Parsing: The workflow extracts and parses the script content, identifying different segment types (single narrator vs. dialogue)
- Voice Assignment: Assigns appropriate AI voices - a default voice for single segments, and character-specific voices for dialogue segments
- Audio Generation: Converts text to speech using ElevenLabs API with high-quality voice models
- Audio Processing: Uploads individual audio clips to Cloudinary for temporary storage
- Audio Merging: Uses Shotstack API to merge multiple audio clips with silence gaps into cohesive files
- Final Delivery: Downloads merged audio and uploads to Google Drive for permanent storage
- Progress Tracking: Updates Airtable records throughout the process to track status and provide download links
The workflow handles two processing paths, one for single-narrator script segments and one for dialogue segments. Each path ends by updating its Airtable record and returning to the loop for the next item.
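The voice assignment step can be illustrated with a short sketch in the style of an n8n Code node. The voice IDs, the character names, and the `Speaker: line` dialogue format are assumptions made for illustration; the actual mappings and script format live inside the workflow nodes and are not shown here.

```javascript
// Hypothetical voice IDs: placeholders, not the IDs configured in the workflow.
const DEFAULT_VOICE = "default-voice-id";
const CHARACTER_VOICES = { Anna: "voice-id-anna", Ben: "voice-id-ben" };

// Assumed dialogue format: one "Speaker: line" pair per line of text.
function parseDialogue(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.match(/^\s*([^:]+):\s*(.+)$/))
    .filter(Boolean) // drop lines that do not look like "Speaker: line"
    .map(([, speaker, line]) => ({ speaker: speaker.trim(), text: line.trim() }));
}

// Returns one { text, voiceId } entry per clip that needs to be generated.
function assignVoices(segment) {
  if (segment.type === "single") {
    return [{ text: segment.text, voiceId: DEFAULT_VOICE }];
  }
  return parseDialogue(segment.text).map((turn) => ({
    text: turn.text,
    // Unknown speakers fall back to the default voice.
    voiceId: CHARACTER_VOICES[turn.speaker] ?? DEFAULT_VOICE,
  }));
}
```

A single segment yields one clip with the default voice, while a dialogue segment yields one clip per speaker turn, which is why dialogue segments need a merge step afterwards.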
Workflow Diagram
graph TD
A[Form Submit] --> B[Extract from File]
B --> C[Parse Script]
C --> D[Create Airtable Record]
D --> E[Assign Voice]
E --> F[Loop Over Items]
F --> G{Segment Type?}
G -->|Single| H[Wait2]
G -->|Dialogue| I[Wait]
H --> J[Code in JavaScript]
J --> K[Generate Voice]
K --> L[Upload Script Audio]
L --> M[Save Upload to Static Data]
M --> N[Collect and Group Script Audios]
N --> O[Send Script Group to Shotstack]
O --> P[Wait for Shotstack]
P --> Q[Get Script Render Status]
Q --> R{Render Complete?}
R -->|Yes| S[Download Merged Script Audio]
S --> T[Upload Script Audio to Drive]
T --> U[Update Script Record]
I --> V[Parse Dialogue]
V --> W[Generate Voice for Dialogue]
W --> X[Upload Asset from File Data]
X --> Y[Parse Audio Link]
Y --> Z[Send Audio to Shotstack]
Z --> AA[Wait1]
AA --> BB[Get Merge Audio Status]
BB --> CC{Success?}
CC -->|Yes| DD[Get Audio]
DD --> EE[Upload File]
EE --> FF[Update Record]
FF --> F
U --> F
Trigger
Form Trigger: A web form titled "E-Lab Audio Automation" that accepts:
- lesson_number (required number field)
- lesson_name (required text field)
- script_content (required file upload, accepts .txt and .doc files)
Nodes Used
| Node Type | Purpose |
|---|---|
| Form Trigger | Accepts lesson script submissions via web form |
| Extract from File | Extracts text content from uploaded script files |
| Code (JavaScript) | Parses scripts, assigns voices, processes audio data |
| Airtable | Creates and updates tracking records for lessons |
| Split in Batches | Processes script segments in batches |
| If | Routes processing based on segment type (single vs dialogue) |
| Wait | Adds delays for API rate limiting |
| HTTP Request | Calls ElevenLabs API for voice generation and Shotstack for audio merging |
| Cloudinary | Temporary storage for individual audio clips |
| Google Drive | Final storage destination for completed audio files |
| No Operation | Placeholder for conditional flow control |
External Services & Credentials Required
ElevenLabs API
- Purpose: AI text-to-speech voice generation
- Credential: HTTP Header Authentication (`E-lab`)
- Models Used: `eleven_multilingual_v2`
- Output Format: MP3 at 44.1kHz
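As a rough sketch of what the HTTP Request node sends, the function below builds (but does not send) a text-to-speech request. It assumes the public ElevenLabs endpoint `POST /v1/text-to-speech/{voice_id}`, the `xi-api-key` header, and the `mp3_44100_128` output format, which matches the 44.1kHz MP3 setting listed above. The function name and the exact parameters used by the workflow are assumptions.

```javascript
// Builds the request for one text-to-speech call; it does not send it.
// buildTtsRequest is a hypothetical helper, not a node in the workflow.
function buildTtsRequest({ voiceId, text, apiKey }) {
  const base = "https://api.elevenlabs.io/v1/text-to-speech";
  return {
    method: "POST",
    // output_format selects MP3 at 44.1kHz and 128kbps.
    url: `${base}/${encodeURIComponent(voiceId)}?output_format=mp3_44100_128`,
    headers: {
      "xi-api-key": apiKey, // header-based authentication
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({ text, model_id: "eleven_multilingual_v2" }),
  };
}
```

In a runtime with `fetch`, the result could be passed along as `fetch(req.url, req)`, and the response body would then be the binary MP3 data.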
Shotstack API
- Purpose: Audio merging and rendering
- Authentication: API key in headers
- Output: MP3 format audio files
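To illustrate the merge request, the sketch below lays clips end to end on a single Shotstack track with a fixed silence gap between them. It assumes each clip's duration in seconds is already known (for example from the upload response) and that the edit is posted to Shotstack's render endpoint with the API key in an `x-api-key` header. The helper name and input shape are hypothetical.

```javascript
const GAP_SECONDS = 0.5; // silence inserted between consecutive clips

// clips: [{ url, duration }] in playback order; duration is in seconds.
function buildMergeEdit(clips) {
  let cursor = 0; // start time of the next clip on the timeline
  const timelineClips = clips.map(({ url, duration }) => {
    const clip = {
      asset: { type: "audio", src: url },
      start: cursor,
      length: duration,
    };
    cursor += duration + GAP_SECONDS;
    return clip;
  });
  return {
    timeline: { tracks: [{ clips: timelineClips }] },
    output: { format: "mp3" },
  };
}
```

Because nothing plays during the gap between one clip's end and the next clip's start, the rendered file contains silence there without needing a separate silent asset.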
Cloudinary
- Purpose: Temporary audio file storage and processing
- Credential: Cloudinary API (`Cloudinary account`)
- Resource Type: Video (for audio files)
Airtable
- Purpose: Progress tracking and lesson management
- Credential: Airtable Token API (`EXP Training Bot`, `E-Lab Script Generator Logs`)
- Tables: Scripts, Audio generation logs
Google Drive
- Purpose: Final audio file storage
- Credential: Google Drive OAuth2 API (`Google Drive account 2`)
- Folder: E-lab Audio files
Environment Variables
No explicit environment variables are used. All configuration is handled through n8n credentials and hardcoded values within the workflow nodes.
Data Flow
Input
- Lesson number (integer)
- Lesson name (string)
- Script file (.txt or .doc format)
Processing
- Script content parsed into segments with IDs like `EDU_16_1_5`
- Voice assignments: Default voice for narration, character-specific voices for dialogue
- Audio generation with 128kbps MP3 quality
- Temporary Cloudinary storage with public URLs
- Audio merging with 0.5-second silence gaps
Output
- Merged MP3 audio files stored in Google Drive
- Airtable records with status tracking and download links
- File naming convention: `{segment_id}.mp3` or `{segment_id}_merged.mp3`
Error Handling
The workflow includes basic error handling through:
- Conditional checks for successful API responses
- Status validation before proceeding to next steps
- Airtable status updates to track failed processes
- Wait nodes to handle API rate limits and processing delays
No explicit error recovery or retry mechanisms are implemented.
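The Wait and status-check pattern used around the Shotstack render can be pictured as a bounded polling loop. The sketch below is an illustration under assumptions, not the workflow's node logic: `getStatus` is a hypothetical caller-supplied function returning `{ status, url }`, and the `done` and `failed` values follow Shotstack's documented render statuses.

```javascript
// Polls getStatus until the render finishes, fails, or attempts run out.
async function pollUntilDone(getStatus, { intervalMs = 5000, maxAttempts = 20 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { status, url } = await getStatus();
    if (status === "done") return url; // render complete: hand back the file URL
    if (status === "failed") throw new Error("Render failed");
    // Any other status (queued, rendering, ...) means wait and check again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Render not finished after ${maxAttempts} attempts`);
}
```

Bounding the number of attempts keeps a stuck render from looping indefinitely, and the thrown errors give a single place to record a failed status in the tracking table.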
Known Limitations
Based on the workflow structure:
- Processing time depends on script length and external API response times
- No automatic retry for failed API calls
- Limited to ElevenLabs voice models and Shotstack processing capabilities
- File size limitations based on external service constraints
- Sequential processing may be slow for very large scripts
Related Workflows
No related workflows are explicitly referenced in the current workflow configuration.
Setup Instructions
- Import Workflow: Import the JSON workflow definition into your n8n instance
- Configure Credentials:
  - Set up ElevenLabs API credentials with header authentication
  - Configure Shotstack API key
  - Add Cloudinary API credentials
  - Set up Airtable API token with access to the E-Lab Script Generator Logs base
  - Configure Google Drive OAuth2 credentials
- Verify External Services:
  - Ensure Airtable base `appeEMxtMrmeiVNW0` exists with required tables
  - Confirm Google Drive folder `1M3uHDGoEn1atGs1lBVYOdJyTmpMWAUhO` is accessible
  - Test Cloudinary upload permissions
- Voice Configuration:
  - Verify ElevenLabs voice IDs are valid and accessible
  - Update character voice mappings if needed
  - Test voice generation with sample text
- Activate Workflow: Enable the workflow to start accepting form submissions
- Test: Submit a sample lesson script through the form trigger to verify end-to-end functionality
The workflow will be accessible via the generated form URL once activated.