A Workflow for Creating Narration for Voice-Over Presentation Using Commercially Available Artificial Intelligence

Althaf Hussain Kallamadugu
Construction Development and Planning Department, Clemson University SC 29634, USA
althafk@g.clemson.edu

Nurudeen Segun Lawal
Construction Development and Planning Department, Clemson University SC 29634, USA

Joseph Michael Burgett
Construction Development and Planning Department, Clemson University SC 29634, USA

Abstract

This rapid communication presents a multi-step workflow for recreating existing course lectures using artificial intelligence (AI) and natural language processing (NLP). The workflow encompasses audio extraction from original lectures, transcript refinement via ChatGPT and human proofreading, audio regeneration through text-to-speech, closed captioning, presentation recreation with AI-generated content, and the development of supplementary resources like study guides and AI chatbots. The implemented approach leverages AI to enhance educational accessibility and personalization while balancing automation with human oversight. Potential benefits include tailored learning experiences and data-driven decision-making. However, ethical considerations surrounding AI biases, intellectual property, privacy, and misinformation must be carefully addressed before deployment. Overall, the workflow demonstrates AI’s transformative potential in education.

Keywords: Artificial Intelligence, Generated PowerPoint, human-created, online asynchronous, workflow

© 2024 under the terms of the J ATE Open Access Publishing Agreement

Introduction 

When professors are unavailable to teach online courses, educational institutions face a significant challenge. This paper introduces a comprehensive AI-assisted workflow to revise or recreate course materials when the original instructor is not available. The process involves extracting audio from existing lectures, transcribing it with speech recognition, refining the transcripts with natural language processing, and generating new audio through text-to-speech technology. Additionally, closed captions are added for accessibility, and presentations are recreated with updated audiovisual content. Supplementary learning resources are also developed using AI-generated content. This workflow allows institutions to effectively repurpose and refresh course materials, ensuring that students can continue learning seamlessly, both online and offline. The AI-generated materials can be customized to meet the needs of individual students or entire classes, enhancing personalized learning and improving understanding and knowledge retention. By addressing the issue of instructor unavailability, this approach highlights the transformative potential of AI and natural language processing in education. It paves the way for more efficient, accessible, and personalized learning experiences that keep pace with technological advancements in education.

Literature review 

AI in online education has come a long way, incorporating advanced technologies like web-based systems, humanoid robots, and chatbots to make learning more engaging and effective [1]. These advancements in AI and related technologies are transforming not just education, but also how professionals are trained and operate in various fields [2]. The combination of AI and wireless network technology has significantly transformed online engineering courses, showing how impactful AI can be on educational platforms [3]. Various AI applications, including chatbots, robotic assistants, and even holograms, are now helping teachers and students, enhancing the overall effectiveness of the educational system [4]. Such interactive tools not only enhance learning but also prepare students for real-world applications of their knowledge [5]. AI has also proven beneficial in streamlining processes within educational institutions. It has shown potential in optimizing student enrollment, improving retention rates, and providing organizational guidance and support to students [6]. In the K-12 sector, the idea of a “Turing Teacher” has been explored, focusing on the key features necessary for effective AI-powered teaching tools [7]. Most AI technologies in education have centered on developing intelligent tutoring systems, virtual laboratories, and assessment tools, which help create more interactive and efficient learning experiences [8]. Regarding voice synthesis, research suggests that a speaker’s enthusiasm can affect social cues, cognitive load, and learning transfer in multimedia learning environments [9]. Some studies recommend using human-recorded voices over synthetic ones for virtual agents, emphasizing the value of human-like interactions in educational settings [10]. Overall, integrating AI into online education presents numerous opportunities to improve learning outcomes, increase student engagement, and personalize the educational experience. 
By leveraging AI technologies effectively, educational institutions can create dynamic and interactive learning environments that cater to the diverse needs of learners. 

Methods 

A stepwise workflow was created to generate AI voice-over presentations from traditional human voice-over presentations. Figure 1 shows the steps of the workflow.

Fig. 1. Stepwise workflow for creating AI-generated voice-over presentations

This workflow uses a multi-step approach to recreate course lectures with artificial intelligence and natural language processing techniques. The process includes (1) extracting audio from the original course and converting it into text transcripts, (2) refining the transcripts with ChatGPT 4.0 and human proofreading, (3) generating new audio files from the refined transcripts, (4) creating closed captions, (5) recreating the course presentations with AI-generated content, (6) developing a study guide, and (7) creating an AI chatbot based on the study guide. 

1) Audio Extraction: Audio files were extracted from the existing course and converted into text transcripts using an online transcript converter. The researchers found KwiCut to be a reliable converter for this workflow, but many other converters are commercially available. The transcripts were then refined with the assistance of ChatGPT.

2) Transcript Refinement: Transcript accuracy and relevance were improved through two approaches: (a) Refining the transcript with ChatGPT 4.0, using a simple rephrasing prompt for each slide to enhance clarity and coherence while preserving core content and (b) Human proofreading of the transcripts for further refinement. This combined AI-assisted refinement and human oversight process ensured a highly accurate and contextually relevant transcript. 

3) Fine-tuning and Proofreading: The final transcript generated by ChatGPT 4.0 underwent human proofreading to ensure content relevance and accuracy. Repeated words and predefined acronyms were identified and eliminated. A Subject Matter Expert (SME) thoroughly reviewed the transcript for accuracy before proceeding to audio generation. 

4) Audio Generation: The transcript was imported into an online text-to-speech application. While many applications are available, the researchers used ElevenLabs for this workflow. ElevenLabs has over two dozen AI voices and can clone the original instructor’s voice. This application was used to create the narration for the PowerPoint presentations. The researchers found that longer transcripts must be divided into smaller sections to avoid mispronunciation and synthetic slurring.

5) Closed Caption Creation: Closed caption files were generated with online closed-caption software (KwiCut in this study) and exported in the SRT format. The SRT files were uploaded into the presentation software and manually refined to ensure error-free captions.
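The transcript checks in step 3 and the sectioning described in step 4 can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure the researchers used: the function names and the 800-character default limit are hypothetical, and no text-to-speech service is called. It flags immediately repeated words for the human proofreader and splits a long transcript into smaller sections at sentence boundaries.

```python
import re


def find_repeated_words(text):
    """Return the words that appear twice in a row (e.g. 'the the').

    Hypothetical helper: it only flags candidates for a human proofreader;
    it does not edit the transcript.
    """
    pattern = re.compile(r"\b(\w+)\s+\1\b", flags=re.IGNORECASE)
    return [match.group(1).lower() for match in pattern.finditer(text)]


def split_transcript(text, max_chars=800):
    """Split a transcript into sections of at most max_chars characters.

    Sections are only broken at sentence boundaries. The 800-character
    default is an assumed value, not one reported in the paper. A single
    sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    sections = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the section.
            sections.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections


if __name__ == "__main__":
    sample = (
        "The drone must stay below 400 feet. "
        "Pilots check the the weather first. "
        "Always log each flight!"
    )
    print(find_repeated_words(sample))
    for number, section in enumerate(split_transcript(sample, max_chars=50), 1):
        print(number, section)
```

Each section returned by `split_transcript` could then be submitted to the text-to-speech application separately, and the resulting audio files placed on the corresponding slides.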

Results and Discussion 

Presentation Recreation 

The AI-generated audio files and closed captions were seamlessly added to the presentations, replacing the original human voiceovers. Animations were resynchronized with the new AI narrations to create a smooth and engaging audiovisual experience. 
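For readers unfamiliar with the SRT caption format used in this workflow, each caption is a numbered block containing a timestamp line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, the caption text, and a blank line. The sketch below builds such a file from a list of cues. It is a minimal illustration only: the function names and the example cue timings are hypothetical, and in the study the captions were generated by KwiCut and refined by hand.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, caption_text) tuples.

    Cues are numbered from 1 in the order given; blocks are separated by
    a blank line, as the SRT format expects.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Example timings are invented for illustration.
    example_cues = [
        (0.0, 2.5, "Welcome to the course."),
        (2.5, 5.0, "Today we cover airspace."),
    ]
    print(build_srt(example_cues))
```

Because the format is plain text, manual refinement of captions amounts to editing the text line of each block while leaving the index and timestamp lines unchanged.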

Study Guide and Chatbot Development 

A detailed study guide was created using the refined transcripts, providing students with a resource for understanding the course material better. An AI-powered chatbot was also developed based on this study guide and the transcripts. This chatbot helps students get answers to their questions when the professor is not available. The researchers used AskAI to create a chatbot trained with this study’s transcript and other relevant texts. As an example, the original voice-over presentation recorded by the instructor [11] and the recreated presentation generated through AI [12] are available online.

Figure 2. (a) Human Tutor teaching in Zoom class  
Figure 2. (b) AI Tutor teaching in Zoom class  

Benefits of the AI-Assisted Course Recreation Workflow 

The AI-assisted workflow proposed here brings several benefits that can improve the educational experience and support informed decision-making. One key advantage is its ability to create personalized learning resources that match each student’s unique needs, preferences, and learning styles. Using AI-generated content, materials like study guides and chatbots can be adjusted in real time, making learning more engaging and interactive for students. Another important feature is the workflow’s capability to generate closed captions and audio narrations, making course content more accessible to students with different abilities and learning preferences. This ensures inclusivity in the learning process. Furthermore, including human proofreading and subject matter expertise guarantees that the course content is accurate and relevant, maintaining high standards of educational quality. 

Potential Ethical Considerations 

While the AI-assisted course recreation workflow offers many benefits, professors and educational institutions must consider the ethical implications before using this technology. One major concern is that AI-generated content could unintentionally reinforce biases or spread false information, especially if the training data or algorithms are biased. It is crucial to have strict measures to ensure that AI-generated content is accurate, fair, and unbiased. This includes having human oversight and fact-checking procedures to verify the content. Additionally, there are issues regarding intellectual property rights and giving credit to the original creators. As AI technology advances, questions arise about who owns and should be credited for AI-generated content. Institutions need clear policies to protect the rights of content creators and experts while acknowledging AI systems’ role in content creation. Addressing these ethical considerations is essential for responsibly integrating AI into educational practices, ensuring that it enhances learning experiences while upholding fairness, accuracy, and respect for intellectual property rights.

The AI-assisted course recreation workflow offers several advantages in terms of time, cost, and applicability. Time-wise, this approach can significantly speed up updating or recreating course materials when an instructor is unavailable. Instead of starting from scratch, existing lectures can be quickly transformed into new, polished content. Cost-wise, while there may be initial investments in AI tools and software, the long-term savings could be substantial. Universities will not need to hire as many temporary instructors or spend as much time and money on manual content creation. In terms of applicability, this method is versatile. It can be used across various subjects and course types, making it a flexible solution for many educational institutions. The ability to easily generate closed captions and multiple language versions also increases accessibility for diverse student populations. However, there are considerations to keep in mind. The quality of the AI-generated content will depend on the quality of the original material and the effectiveness of the AI tools used. There may also be a learning curve for staff to become proficient with the new technology. Additionally, while AI can handle much of the work, human oversight is still crucial. Time must be allocated for proofreading, fact-checking, and ensuring the content meets educational standards. Overall, this workflow presents an innovative approach to course content creation that could save time and money while maintaining educational quality, though it requires careful implementation and ongoing human involvement to be truly effective.

Conclusion 

This research shows that AI can be a powerful tool for recreating and updating online course materials. By using a step-by-step process that combines AI technology with human oversight, universities can quickly refresh their courses when instructors are not available. The key takeaway is that this method can save time and money while maintaining quality. It allows personalized learning experiences and makes content more accessible through features like closed captions and AI chatbots. However, it is important to remember that AI is not perfect. Human experts still need to check the content for accuracy and fairness. There are also ethical concerns, like protecting intellectual property rights and avoiding bias in AI-generated material. Overall, this approach demonstrates how AI can transform online education, making it more efficient and adaptable, provided the technology is used responsibly and in ways that truly benefit students and educators.

Acknowledgments 

The authors would like to acknowledge Clemson University and the Clemson Applied Technology (CAT) Lab for supporting this research.  It was only through their support that this research was possible. 

The authors would also like to acknowledge the use of the AI large language model ChatGPT 4.0 in the drafting of this rapid communication for grammatical assistance. All facts and statements presented in this paper were independently verified. 

Disclosures. The authors declare no conflicts of interest. 

[1] L. Chen, P. Chen, & Z. Lin, “Artificial intelligence in education: a review”. IEEE Access, 8, 75264-75278, 2020. https://doi.org/10.1109/access.2020.2988510 

[2] J.M. Burgett, “UAS Law Enforcement Technicians in South Carolina: An Exploration of Supply and Demand”. Journal of Advanced Technological Education, 2(1), 2023.

[3] C. Che, Q. Luo, & Y. Mao, “The reform of engineering professional online education courses by artificial intelligence and wireless network technology in the context of engineering certification” Wireless Communications and Mobile Computing, 2022, 1-11. https://doi.org/10.1155/2022/3822931 

[4] Q. Zhao, and S. Nazir, “English multimode production and usage by artificial intelligence and online reading for sustaining effectiveness”, Mobile Information Systems, 2022, 1-16. https://doi.org/10.1155/2022/6780502 

[5] J.M. Burgett, “METAR SMS Text Message Service to Support Part 107 Compliance: A Classroom Lab Exercise”. Journal of Advanced Technological Education, 3(1), 2023.

[6] H. Lukianets, and T. Lukianets, “Promises and perils of AI use on the tertiary educational level” Грааль Науки (Grail of Science), (25), 306-311, 2023, https://doi.org/10.36074/grail-of-science.17.03.2023.053 

[7] A. Pelaez, A. Jacobson, K. Trias, & E. Winston, “The turing teacher: identifying core attributes for AI learning in k-12”. Frontiers in Artificial Intelligence, 5, 2022. https://doi.org/10.3389/frai.2022.1031450

[8] X. Zhai, X. Chu, C. Chai, M. Jong, A. Starčič, M. Spector, … & Y. Li, “A review of artificial intelligence (ai) in education from 2010 to 2020”. Complexity, 2021, 1-18. https://doi.org/10.1155/2021/8812542

[9] T. Liew, S. Tan, T. Tan, & S. Kew, “Does speaker’s voice enthusiasm affect social cue, cognitive load and transfer in multimedia learning?”. Information and Learning Sciences, 121(3/4), 117-135. 2020,  https://doi.org/10.1108/ils-11-2019-0124 

[10] A. Abdulrahman, and D. Richards, “Is natural necessary? human voice versus synthetic voice for intelligent virtual agents”. Multimodal Technologies and Interaction, 6(7), 51. 2022 https://doi.org/10.3390/mti6070051 

[11] “Human Voice Presentation,” Demo Drone Course. [Online]. Available: https://demo-drone-course.s3.amazonaws.com/course1/human-voice-presentation/story.html 

[12] “AI Voice Presentation,” Demo Drone Course. [Online]. Available: https://demo-drone-course.s3.amazonaws.com/course2/ai-voice-presentation/story.html