MAESTRO: Ankit Sanjyal

MAESTRO: AI Text-to-Video Lecture System

A full-stack platform that automatically generates educational video lectures from text topics, with an interactive RAG-powered QA bot for student engagement.

LangChain, OpenAI, PlayHT, FFmpeg, Flask, MongoDB, Fordham University

What It Does

MAESTRO (Multi-modal AI Educational System for Teaching and Resource Organization) is a research platform built at Fordham University's Educational Data Mining Lab. An instructor provides a topic or document; MAESTRO produces a complete video lecture with synthesized narration, synchronized slides, and an embeddable QA bot that students can query about the lecture content.

The platform is live at erdos.dsm.fordham.edu and is actively used in Fordham courses.

Generation Pipeline

Script Synthesis

OpenAI GPT generates a structured lecture script from the input topic or document.

Neural Voice

PlayHT converts the script to natural-sounding audio via neural text-to-speech.

Slide Generation

Key points are extracted from the script and rendered as slides, each matched to its script segment.

Video Assembly

FFmpeg synchronizes the audio and slides into a final MP4 lecture video.

RAG QA Bot

LangChain and OpenAI embeddings index the transcript for interactive student Q&A.
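
To make the Video Assembly step concrete, here is a minimal Python sketch of one common way to pair still slides with a narration track using FFmpeg's concat demuxer. It is an illustration under stated assumptions, not MAESTRO's actual code: the function names, file names, and the choice of the concat demuxer and the libx264/AAC codecs are all hypothetical.

```python
from pathlib import Path


def build_concat_list(slides, durations):
    """Return concat-demuxer text that shows each slide for its segment's duration."""
    if not slides or len(slides) != len(durations):
        raise ValueError("need at least one slide and one duration per slide")
    lines = []
    for slide, seconds in zip(slides, durations):
        lines.append(f"file '{Path(slide).as_posix()}'")
        lines.append(f"duration {seconds:.3f}")
    # The concat demuxer needs the final file repeated for its duration to apply.
    lines.append(f"file '{Path(slides[-1]).as_posix()}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_path, audio_path, output_path):
    """Build the argument list for muxing the slide sequence with the narration."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", str(list_path),  # slide sequence
        "-i", str(audio_path),                               # narration audio
        "-c:v", "libx264", "-pix_fmt", "yuv420p",            # widely playable H.264
        "-c:a", "aac",
        "-shortest",                                         # stop at the shorter stream
        str(output_path),
    ]
```

In use, the list text would be written to a file and the command passed to subprocess.run(cmd, check=True). The per-slide durations would come from the lengths of the synthesized audio segments, which is what keeps slides and narration aligned.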

Key Features

Retrieval-Augmented QA

Students can ask questions about any lecture through a chatbot that retrieves the relevant transcript segments and answers with GPT, so responses are grounded in the actual lecture content rather than hallucinated.
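
The retrieve-then-answer flow can be sketched in plain Python. This is a simplified illustration, not the platform's implementation: the bag-of-words embed function is a toy stand-in for OpenAI embeddings, the ranking stands in for the LangChain retriever, and all function names and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, segments, k=2):
    """Return the k transcript segments most similar to the question."""
    q = embed(question)
    ranked = sorted(segments, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question, segments, k=2):
    """Assemble a prompt that restricts the model to the retrieved excerpts."""
    context = "\n".join(f"- {s}" for s in retrieve(question, segments, k))
    return (
        "Answer using only the lecture excerpts below. "
        "If they do not contain the answer, say so.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}"
    )
```

The grounding comes from the prompt: the model sees only the retrieved excerpts plus an instruction to decline when they do not cover the question. A real deployment would replace embed with a learned embedding model and the sorted call with a vector-store query.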

Instructor Customization

Instructors can edit the generated script before rendering, adjust voice style, choose slide templates, and set the lecture's depth level (introductory / intermediate / advanced).

MongoDB Storage

All lecture transcripts, embeddings, and metadata are stored in MongoDB for fast retrieval and QA indexing. The data layer supports multiple concurrent courses and student sessions.
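
As a rough illustration of how transcripts, embeddings, and metadata could sit together in one collection, the sketch below builds one document per transcript segment. The schema is an assumption made for this example: every field name, the identifiers, and the function name are hypothetical and may differ from MAESTRO's actual collections.

```python
from datetime import datetime, timezone


def make_segment_docs(lecture_id, course_id, segments, embeddings):
    """Build one MongoDB-ready document per transcript segment (hypothetical schema)."""
    if len(segments) != len(embeddings):
        raise ValueError("need one embedding per transcript segment")
    created = datetime.now(timezone.utc)
    return [
        {
            "lecture_id": lecture_id,    # groups the segments of one lecture
            "course_id": course_id,      # lets several courses share a collection
            "segment_index": i,          # preserves transcript order
            "text": text,
            "embedding": list(vector),   # stored as a plain array of floats
            "created_at": created,
        }
        for i, (text, vector) in enumerate(zip(segments, embeddings))
    ]
```

With PyMongo, the resulting list could be passed to a collection's insert_many method. Keeping course_id on every document is one simple way to scope QA retrieval to a single course.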

Research Context

MAESTRO is part of active research in Educational Data Mining (EDM). The platform generates learning data (click patterns, QA logs) that feeds back into EDM research on student engagement and comprehension.

My Role

As a Graduate Research Assistant at the Fordham EDM Lab (Aug 2024–present), I have led the MAESTRO project. My contributions:

Designed and implemented the end-to-end video generation pipeline (script → voice → slides → video)
Integrated LangChain + OpenAI embeddings for the retrieval-augmented QA system
Built the Flask API backend and MongoDB data layer
Deployed the platform to the live Fordham server

Tech Stack

AI: OpenAI GPT (script generation + QA), OpenAI Embeddings, PlayHT (neural TTS)
RAG: LangChain pipeline with MongoDB vector store
Video: FFmpeg for audio-visual synchronization and MP4 rendering
Backend: Python, Flask REST API
Database: MongoDB (transcripts, embeddings, lecture metadata)