Загрузка...

Voice Genius AI: Extract, Transcribe & Chat with Any Audio Using Whisper, Ollama, Anthropic & OpenAI

🎙️ Voice Genius AI: Extract, Transcribe & Chat with Your Audio/Video Content 🚀
✨ Discover Voice Genius AI - the ultimate open-source tool for extracting and transcribing audio from videos and audio files! Built using OpenAI's Whisper models, this powerful application lets you seamlessly convert spoken content to text and then interact with it using both local and cloud-based LLMs through a sleek Q&A interface. 💬

⚠️ Note to viewers: Please excuse the audio distortion in portions of this video up until the 28-minute mark due to microphone gain issues. The content is still fully comprehensible, and the audio improves significantly after that point.

🕒 Video Timeline:
• 00:08 Intro
• 00:14 Voice Genius AI App Concept
• 00:39 Voice Genius AI Github Repo
• 01:25 Part 1 - Environment Setup
○ 01:32 💻 Visual Studio Code - Integrated Development Environment
○ 01:57 🤖 Ollama - Chat with and Integrate Local LLMs
○ 03:59 🐍 Python - Your go-to programming Language
• 04:20 Part 2 - App Demo & Walkthrough
○ 05:01 🚀 Running the app through VoiceGenius.sh
○ 06:50 🖥️ Voice Genius AI App User Interface
○ 07:53 🎧 Audio Transcription
○ 09:17 💬 Q&A Functionality using Ollama
○ 11:25 🔄 Loading multiple LLMs with Ollama
○ 12:55 📝 Q&A Session Export
○ 14:04 🎬 Transcribe Video from URL
○ 15:10 🧠 Q&A Functionality using Anthropic
○ 16:01 🔮 Q&A Functionality using OpenAI
• 17:12 Part 3 - Code Explanation
○ 17:51 📚 Code Section 1 - Libraries Import
○ 18:57 🛠️ Code Section 2 - Solving Compatibility Issues
○ 19:39 📊 Code Section 3 - Creating Performance Metrics
○ 20:30 💾 Code Section 4 - Adding Chat Export Functionality
○ 21:59 📋 Code Section 5 - Active Models, Caching & Session State
○ 23:16 🔌 Code Section 6 - Adding Local and Cloud-based LLMs
○ 24:41 🔊 Code Section 7 - Audio Extraction & Processing
○ 26:29 📝 Code Section 8 - Transcribe Extracted Audio
○ 28:40 🤔 Code Section 9 - Q&A System Setup
○ 31:53 📤 Code Section 10 - Export and Save Transcripts
○ 32:42 📱 Code Section 11 - App Sidebar User Interface
○ 36:15 🖼️ Code Section 12 - Main Content Area, Transcription & Media Processing
○ 38:22 💬 Code Section 13 - Q&A Interface, Chat & Transcription
○ 40:27 👋 Code Section 14 & 15 - Welcome Screen User Interface
• 41:04 Outro

Whether you're a developer looking to build on top of this framework or a content creator needing accurate transcription tools, Voice Genius AI offers a flexible solution for all your audio processing needs! 🌟

👍 Like, subscribe, and share if you found this helpful! Drop a comment below with any questions or feature requests for Voice Genius AI. 🙏

🔗 Resources:
⚫ 📂 GitHub Repository: www.github.com/MoAshour93/VoiceGeniusAI
⚫ 🌐 Personal Website: www.apcmasterypath.co.uk - Visit for insights on AI applications in the construction industry and RICS APC candidate teaching/mentoring packages
⚫ My personal Github page: https://github.com/MoAshour93
⚫ Our Website: www.apcmasterypath.co.uk
⚫ All APC Mastery Path Blogposts: https://www.apcmasterypath.co.uk/blog-list
⚫ Personal Linkedin Page: https://www.linkedin.com/in/mohamed-ashour-0727/
⚫ APC Mastery Path Linkedin Page: https://www.linkedin.com/company/apc-mastery-path

📽️Useful videos:
⚫Llama 3.1 Conversational Chat Template for Finetuning using Unsloth & Deployment to Open WebUI: https://www.youtube.com/watch?v=owfxFA_L5g4&t=1s
⚫ Unsloth FineTuning & Comparing LLMs:Mistral, Gemma 2, Llama 3.1 with Chatbot Deployment on OpenWebUI: https://www.youtube.com/watch?v=73kcBMbEgWw&t=1s
⚫Create Training Data for Finetuning LLMs: https://www.youtube.com/watch?v=v2GniOB2D_U&t=1011s&ab_channel=APCMasteryPath
⚫Finetune your LLMs on custom datasets using Unsloth: https://www.youtube.com/watch?v=Y3T4FNRSFlE&t=921s
⚫Deploy Open WebUI with Zero Coding Skills : https://www.youtube.com/watch?v=5uT1rL6DKV8&t=7s
#AudioProcessing #AITranscription #OpenSource #Whisper #LLM #VoiceGeniusAI #PythonDevelopment #Ollama #Anthropic #OpenAI

Видео Voice Genius AI: Extract, Transcribe & Chat with Any Audio Using Whisper, Ollama, Anthropic & OpenAI канала APC Mastery Path
Страницу в закладки Мои закладки
Все заметки Новая заметка Страницу в заметки