
I Replaced Myself with AI: The Scary Future

The opening clip of this video was created with ChatGPT, Descript, Wav2Lip and Capsule's AI Studio. The results were not what I expected AI content to look like...

Affiliate links to YouTube gear I use:
Sony a7siii: https://go.magik.ly/ml/1qb8i/
Sony A7c: https://go.magik.ly/ml/1qb8k/
14in M1 Pro MacBook Pro: https://go.magik.ly/ml/1qb83/
Mac Studio: https://go.magik.ly/ml/1qb8o/

Timestamps:
0:00 AI Generated Clip
0:16 Introduction
1:31 What Goes Into a Video
3:31 ChatGPT
6:18 Wav2Lip and Descript
10:26 Capsule Studio AI, Dall-E, Stable Diffusion and Stock Footage Sites
11:36 AI Banana Review
13:13 Ethics of AI
14:46 AI Accuracy
17:05 Using AI as an Assistant
18:06 AI Assisted Video
20:13 Conclusion

AI Tools mentioned in this video:
ChatGPT: https://openai.com/blog/chatgpt/
Wav2Lip: https://github.com/Rudrabha/Wav2Lip
Descript: https://www.descript.com
Capsule AI Studio: https://ai.capsule.video
Dall-E: https://openai.com/dall-e-2/
Stable Diffusion: https://huggingface.co/spaces/stabilityai/stable-diffusion

Before an AI can replace me, it needs to be able to do what a YouTuber does. Creating a YouTube video can be split into multiple parts. It requires a script, a slightly clickbait title, tags, and thumbnails. And those are just the bare necessities. Then there’s the video itself, which requires A-roll, which is this sitting shot of me talking right now; audio, which is what you’re hearing right now; and B-roll, which is shots of things sprinkled throughout the video to give you further context on what I’m talking about.

We’ll be using ChatGPT. ChatGPT is an AI that’s able to have a conversation with you. People use it to write code and to chat, and it can summarize things in seconds. I asked it to write a tech review of a banana. ChatGPT can do more than just write a script: I also asked it for a list of B-roll shots, clickbait titles, and what my thumbnail should look like.
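
The four requests above can be kept together as a small set of prompts. This is only a sketch: the prompt wording and the `build_prompts` helper are assumptions made for illustration, not the exact text typed into ChatGPT in the video, and nothing here calls any API.

```python
def build_prompts(topic: str) -> dict[str, str]:
    """Return one illustrative prompt per video asset for the given topic.

    The wording is hypothetical; the video only says which four things
    were requested, not how the requests were phrased.
    """
    return {
        "script": f"Write a tech review video script about {topic}.",
        "b_roll": f"List B-roll shots for a tech review video about {topic}.",
        "titles": f"Suggest clickbait titles for a tech review video about {topic}.",
        "thumbnail": f"Describe a thumbnail for a tech review video about {topic}.",
    }


if __name__ == "__main__":
    # Print each prompt so it can be pasted into a chat session by hand.
    for asset, prompt in build_prompts("a banana").items():
        print(f"{asset}: {prompt}")
```

Keeping the prompts in one place makes it easy to reuse the same set for another topic, such as a laptop review.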

I opted to use Descript because it has a voice cloning option that can generate full sentences and be used to fix audio mistakes you make during videos. And since I wanted to recycle my existing library of A-roll instead of having to record myself, I used Wav2Lip, which takes existing video and changes your lip flaps to match any audio clip you combine with it. Full disclosure: Wav2Lip is free to use, but only for personal use, research and development. It should not be used commercially without the author’s consent. The purpose of it appearing in this video is just to show you what AI is capable of now. The goal with these two tools is to make sure the AI can mimic my voice, and that my mouth moves how it’s supposed to with that voice. So, I fed Descript 2 hours’ worth of audio of my voice so that it could generate my Overdub voice.

With Descript, I’m able to type what I want fake Jimmy to say. I copied and pasted the banana script into Descript to generate a fake audio clip, then used Wav2Lip to sync my lips in an existing video to the audio of fake me reviewing a banana.
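
The lip-sync step can be sketched as a command for the `inference.py` script in the Wav2Lip repository linked above, which takes `--checkpoint_path`, `--face`, `--audio` and `--outfile` arguments. The file names below and the `build_wav2lip_command` helper are assumptions for illustration; the video does not show the exact command that was run. The sketch only builds and prints the command, so it runs without Wav2Lip installed.

```python
import shlex


def build_wav2lip_command(checkpoint_path, face_video, audio_clip, outfile=None):
    """Build the argument list for Wav2Lip's inference.py.

    face_video is the existing A-roll clip; audio_clip is the generated
    voice track the lips should be synced to. All paths are hypothetical.
    """
    command = [
        "python", "inference.py",
        "--checkpoint_path", str(checkpoint_path),
        "--face", str(face_video),
        "--audio", str(audio_clip),
    ]
    if outfile is not None:
        # Without --outfile, inference.py writes to its own default path.
        command += ["--outfile", str(outfile)]
    return command


if __name__ == "__main__":
    command = build_wav2lip_command(
        "checkpoints/wav2lip_gan.pth",   # assumed location of the downloaded weights
        "old_a_roll.mp4",                # assumed existing talking-head clip
        "banana_overdub.wav",            # assumed audio exported from Descript
        "results/fake_jimmy_banana.mp4",
    )
    # Print a copy-pasteable shell line; run it from inside the Wav2Lip checkout.
    print(shlex.join(command))
```

To actually run it, the list could be passed to `subprocess.run(command, check=True)` from within a Wav2Lip checkout, subject to the license terms mentioned above.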

Now that we have A-roll video of me talking, we need to sort out our B-roll situation. I’m going to take all the B-roll ideas that we created using ChatGPT as a reference and use them with Capsule’s AI Studio. This tool takes my footage, transcribes the audio, then lets me select parts of the video and generate a B-roll shot for each part from a text prompt. Once I’m satisfied with the results, I can use free or paid stock footage sites like Storyblocks or Shutterstock to fill in the blanks, and if I’m still missing footage, I can generate images with an image-generating AI like Dall-E or Stable Diffusion to create the remaining clips I need.

While ChatGPT was pretty accurate for our example with the banana, I tried other things too. I asked it to write a review of the 14in M1 Pro MacBook Pro, a laptop that I’ve reviewed on this channel. It was mostly accurate about the specs and some of the features, but in some areas it got information plain wrong. After reading through the MacBook script the AI generated, and making more MacBook reviews using ChatGPT, I found that it was accurate about 80% of the time. Of course, that’s just my own observation. That’s pretty impressive. But for the remaining 20% you’ll still have to review the work or make fixes to make sure it’s accurate. So while it cuts down a ton of the time you spend coming up with ideas and writing content, you now spend extra time reviewing it and looking for mistakes. And that’s not just ChatGPT. It was the other tools too. We saw that Wav2Lip couldn’t fully remove some blockiness around the lips, and we saw Descript Overdub replicate my voice, but some words and phrases sounded kind of warped, unnatural, and emotionless. So obviously these AI tools are very good but not perfect, and honestly might never be perfect.

Video: I Replaced Myself with AI: The Scary Future, from the channel Jimmy Tries World
Video information
February 4, 2023, 1:30:01
00:21:22