E-Lab Audio Generation Workflow
This workflow automates the conversion of educational scripts into high-quality audio files for E-Lab training programs. It processes uploaded script files, extracts text segments, generates voice audio using AI text-to-speech, and delivers final merged audio files to Google Drive for distribution.
Purpose
No business context has been provided yet. Add a context.md file to enrich this documentation.
How It Works
- Script Upload: Users submit lesson scripts via a web form with lesson number, name, and script file
- Text Extraction: The workflow extracts text content from uploaded script files (.txt or .doc)
- Script Parsing: Content is analyzed to identify individual segments and dialogue sections
- Voice Assignment: Each segment is assigned a voice: a single narrator voice for regular content, and character-specific voices for dialogue
- Audio Generation: Text segments are converted to speech using ElevenLabs AI voices
- Audio Processing: Generated audio files are uploaded to Cloudinary for temporary storage
- Audio Merging: Shotstack API combines individual audio clips with silence gaps into complete lesson audio
- Final Delivery: Merged audio files are uploaded to Google Drive and tracking records are updated
- Progress Tracking: PostgreSQL database and Airtable maintain processing status throughout the workflow
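The parsing and voice assignment steps above can be sketched roughly as follows. This is a minimal illustration, not the logic of the workflow's actual Code nodes: the `Speaker: text` dialogue format, the choice of Linda as narrator, and the placeholder voice IDs are all assumptions, since the source does not specify them.

```python
import re
from dataclasses import dataclass

# Placeholder voice IDs (hypothetical): substitute the real ElevenLabs voice IDs.
VOICE_IDS = {
    "Linda": "voice-id-linda",
    "James": "voice-id-james",
    "Mark": "voice-id-mark",
    "Jane": "voice-id-jane",
    "Peter": "voice-id-peter",
}
NARRATOR = "Linda"  # assumption: the source does not say which voice narrates

# Assumed dialogue format: "Speaker: spoken text" on a single line.
SPEAKER_LINE = re.compile(r"^(?P<speaker>[A-Z][a-z]+):\s*(?P<text>.+)$")


@dataclass
class Segment:
    kind: str      # "single" or "dialogue"
    speaker: str
    voice_id: str
    text: str


def parse_script(script: str) -> list[Segment]:
    """Split a script into segments and attach a voice ID to each one."""
    segments = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between segments
        match = SPEAKER_LINE.match(line)
        if match and match["speaker"] in VOICE_IDS:
            speaker = match["speaker"]
            segments.append(
                Segment("dialogue", speaker, VOICE_IDS[speaker], match["text"])
            )
        else:
            # Anything without a known speaker prefix is read by the narrator.
            segments.append(Segment("single", NARRATOR, VOICE_IDS[NARRATOR], line))
    return segments
```

Lines whose prefix is not a known character name (for example `Note: ...`) fall through to the narrator voice, so stray colons in regular text are not treated as dialogue.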
Workflow Diagram
graph TD
A[Form Submit] --> B[Extract from File]
B --> C[Parse Script]
C --> D[Create Record]
D --> E[Assign Voice]
E --> F[Loop Over Items]
F --> G{Segment Type?}
G -->|Single| H[Clean Text]
G -->|Dialogue| I[Parse Dialogue]
H --> J[Generate Voice]
I --> K[Generate Voice for Dialogue]
J --> L[Upload to Cloudinary]
K --> M[Upload to Cloudinary]
L --> N[Save to Static Data]
M --> O[Parse Audio Links]
N --> P[Collect and Group Audios]
O --> Q[Send to Shotstack]
P --> R[Send to Shotstack]
Q --> S[Wait for Processing]
R --> T[Wait for Processing]
S --> U[Check Status]
T --> V[Check Status]
U --> W[Download Audio]
V --> X[Download Audio]
W --> Y[Upload to Google Drive]
X --> Z[Upload to Google Drive]
Y --> AA[Update Records]
Z --> BB[Update Records]
Trigger
Form Trigger: Web form accepting:
- lesson_number (number, required)
- lesson_name (text, required)
- script_content (file upload, .txt/.doc, required)
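As a rough illustration of the checks these three fields imply, the sketch below validates a submission before any processing starts. The function name and error messages are hypothetical; the real form trigger enforces required fields on its own, and this only shows the equivalent logic.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".doc"}


def validate_submission(lesson_number, lesson_name, filename):
    """Return a list of problems; an empty list means the submission is valid."""
    errors = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(lesson_number, bool) or not isinstance(lesson_number, (int, float)):
        errors.append("lesson_number must be a number")
    if not isinstance(lesson_name, str) or not lesson_name.strip():
        errors.append("lesson_name is required")
    if not filename or Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("script_content must be a .txt or .doc file")
    return errors
```

Returning a list instead of raising on the first failure lets a caller report every problem with the submission at once.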
Nodes Used
| Node Type | Purpose |
|---|---|
| Form Trigger | Accepts script uploads via web form |
| Extract from File | Extracts text content from uploaded files |
| Code | Parses scripts, assigns voices, processes audio data |
| HTTP Request | Calls ElevenLabs API for voice generation and Shotstack for audio merging |
| Cloudinary | Temporary storage for individual audio files |
| Google Drive | Final storage destination for completed audio files |
| Airtable | Tracks processing status and metadata |
| PostgreSQL | Maintains detailed processing logs |
| Split in Batches | Processes segments individually with rate limiting |
| If | Routes segments based on type (single vs dialogue) |
| Wait | Implements delays for API processing |
External Services & Credentials Required
- ElevenLabs API: Text-to-speech generation
  - Credential: E-lab (HTTP Header Auth)
  - Voice IDs for Linda, James, Mark, Jane, Peter
- Shotstack API: Audio merging and processing
  - API keys for staging and production environments
- Cloudinary: Temporary audio file storage
  - Credential: Cloudinary account
- Google Drive: Final audio file storage
  - Credential: Google Drive account 2 (OAuth2)
- Airtable: Progress tracking
  - Credential: EXP Training Bot (Token API)
  - Base: E-Lab Script Generator Logs
- PostgreSQL: Detailed logging
  - Credential: elabdatabase connection
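To make the ElevenLabs call concrete, here is a sketch of how a text-to-speech request with header authentication could be built. It assumes the public ElevenLabs endpoint and `xi-api-key` header; the `model_id` default is an assumption, because the source does not say which model the workflow uses, and the voice ID and key in the usage line are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Public ElevenLabs text-to-speech endpoint; the voice ID goes in the path.
ELEVENLABS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a POST request for one text segment.

    model_id is an assumed default; the workflow's actual setting is unknown.
    """
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_tts_request("Welcome to the lesson.", "voice-id-linda", "YOUR_API_KEY")
```

Passing `request` to `urllib.request.urlopen` would perform the call and return the audio bytes in the response body.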
Environment Variables
No explicit environment variables are used. All configuration is handled through n8n credentials and hardcoded values within nodes.
Data Flow
Input:
- Lesson number and name
- Script file (.txt or .doc format)

Processing:
- Script segments with assigned voice IDs
- Individual MP3 audio files
- Merged audio with silence gaps
- Processing status updates

Output:
- Complete lesson audio file in Google Drive
- Updated tracking records in Airtable and PostgreSQL
- Drive folder links for access
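The "merged audio with silence gaps" step comes down to placing each clip on a timeline with a pause before the next one. The sketch below computes those start offsets and wraps them in a payload shaped like a Shotstack edit, as far as that API is understood here. The 0.5 second gap, the assumption that clip durations are already known, and the exact payload fields should all be checked against the real workflow.

```python
def build_timeline(clips, gap_seconds=0.5):
    """Lay clips end to end with a fixed silence gap between them.

    clips: list of (url, duration_seconds) tuples in playback order.
    gap_seconds: assumed pause length; the workflow's real value is unknown.
    """
    timeline_clips = []
    start = 0.0
    for url, duration in clips:
        timeline_clips.append({
            "asset": {"type": "audio", "src": url},
            "start": round(start, 3),
            "length": duration,
        })
        # The next clip begins after this one ends plus the silence gap.
        start += duration + gap_seconds
    return {
        "timeline": {"tracks": [{"clips": timeline_clips}]},
        "output": {"format": "mp3"},
    }
```

With two clips of 2.0 and 3.0 seconds and the default gap, the second clip starts at 2.5 seconds, leaving half a second of silence after the first.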
Error Handling
The workflow includes basic error handling:
- Validation checks for required form data
- Status verification for Shotstack rendering completion
- Conditional routing based on segment types
- Wait nodes to handle API processing delays
No explicit error recovery or notification mechanisms are implemented.
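The combination of Wait nodes and status verification amounts to a polling loop. A minimal sketch of that pattern follows, assuming a hypothetical `fetch_status` callable that returns the render status as a string, with `done` and `failed` as the terminal values. The interval and attempt limit are illustrative, not values taken from the workflow.

```python
import time


def wait_for_render(fetch_status, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Poll fetch_status() until the render finishes, fails, or times out.

    fetch_status is a hypothetical callable returning a status string.
    sleep is injectable so the loop can be exercised without real delays.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status == "done":
            return status
        if status == "failed":
            raise RuntimeError("render failed")
        # Still queued or rendering: wait before checking again.
        sleep(interval_seconds)
    raise TimeoutError("render did not finish in time")
```

Capping the number of attempts keeps a stuck render from holding the execution open indefinitely, a safeguard that the Wait-and-check loop needs in some form.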
Known Limitations
Based on the workflow structure:
- Limited to .txt and .doc file formats
- Hardcoded voice assignments may not suit all content types
- No automatic retry mechanism for failed API calls
- Processing time depends on script length and external API performance
- Static data storage may not persist across workflow restarts
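For the missing retry mechanism, one common approach is exponential backoff around each external call. The sketch below is not part of the current workflow; it only illustrates what such a wrapper might look like. The function name, attempt count, and delays are hypothetical.

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponentially growing delays.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The final failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

In practice the `except` clause would usually be narrowed to transient failures such as timeouts or rate-limit responses, so that permanent errors like invalid credentials fail immediately.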
Related Workflows
No related workflows are mentioned in the available context.
Setup Instructions
- Import Workflow: Import the JSON workflow definition into your n8n instance
- Configure Credentials:
  - Set up ElevenLabs API credentials with voice access
  - Configure Shotstack API keys for both environments
  - Connect Cloudinary account for file storage
  - Set up Google Drive OAuth2 for file uploads
  - Configure Airtable API access to the logging base
  - Set up PostgreSQL database connection
- Verify External Services:
  - Test ElevenLabs voice generation with sample text
  - Confirm Shotstack audio merging capabilities
  - Validate Google Drive folder permissions
  - Check Airtable base structure matches expected schema
  - Verify PostgreSQL table schema for logging
- Test Workflow:
  - Submit a test script through the form trigger
  - Monitor processing through each stage
  - Verify final audio output in Google Drive
  - Check tracking records in both Airtable and PostgreSQL
- Production Deployment:
  - Update any staging API keys to production
  - Configure appropriate Google Drive folder destinations
  - Set up monitoring for workflow execution status