OpenAI transcribed Google’s YouTube videos to train AI models: Report

OpenAI reportedly transcribed more than one million hours of YouTube videos to collect training data for its advanced GPT-4 model, disregarding the Google-owned platform’s copyright rules. According to a report by The New York Times, Microsoft-backed OpenAI used its in-house speech recognition tool, Whisper, to transcribe audio from YouTube videos into conversational text, which was then used to train the AI model that powers ChatGPT.

According to the report, the makers of ChatGPT internally discussed how using YouTube data for training might violate the platform’s policy.
