Using the yt-whisper Service for YouTube Video Transcription and Indexing

   #----------------------------------------------------------------------------
   PINECONE_TOKEN=your-api-key # Replace your-api-key with your personal API key
   ```

3. Build and run the application. In a terminal, change directory to your `docker-genai` directory and run the following command.

   ```console
   $ docker compose up --build
   ```

   Docker Compose builds and runs the application based on the services defined in the `docker-compose.yaml` file. When the application is running, you'll see the logs of two services in the terminal.

   In the logs, you'll see the services are exposed on ports `8503` and `8504`. The two services are complementary to each other.

   The `yt-whisper` service is running on port `8503`. This service feeds the Pinecone database with videos that you want to archive in your knowledge database. The following section explores this service.

## Using the yt-whisper service

The yt-whisper service is a YouTube video processing service that uses the OpenAI Whisper model to generate transcriptions of videos and stores them in a Pinecone database. The following steps show how to use the service.

1. Open a browser and access the yt-whisper service at [http://localhost:8503](http://localhost:8503).

2. Once the application appears, in the **Youtube URL** field specify a YouTube video URL and select **Submit**. The following example uses [https://www.youtube.com/watch?v=yaQZFhrW0fU](https://www.youtube.com/watch?v=yaQZFhrW0fU).
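If the page at `http://localhost:8503` doesn't load, it can help to confirm that the two ports reported in the Compose logs are actually accepting connections. The following is a minimal sketch using only the Python standard library. The helper names `is_port_open` and `check_services` are hypothetical and not part of the application, and the label used for port `8504` is a placeholder.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_services(ports: dict[str, int], host: str = "localhost") -> dict[str, bool]:
    """Map each service label to whether its port accepts TCP connections."""
    return {name: is_port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # 8503 is the yt-whisper port named in this guide.
    # "second service" is a placeholder label for the service on 8504.
    services = {"yt-whisper": 8503, "second service": 8504}
    for name, reachable in check_services(services).items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```

A port that accepts connections only shows that something is listening on it. It doesn't confirm that the web app has finished starting, so the Compose logs remain the better source of truth.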

The yt-whisper service downloads the audio of the video, uses Whisper to transcribe it into WebVTT (`*.vtt`) format (which you can download), then uses the text-embedding-3-small model to create embeddings, and finally uploads those embeddings into the Pinecone database. After processing the video, a video list appears in the web app that informs you which videos have been indexed in Pinecone. It also provides a button to download the transcript.
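To make the step between transcription and embedding more concrete, the following sketch parses a WebVTT transcript into cues and groups them into text chunks that could each be embedded. This guide doesn't describe how the service splits the transcript, so the chunking strategy, the `max_chars` limit, and the names `Cue`, `parse_vtt`, and `chunk_cues` are all assumptions made for illustration. The embedding and Pinecone upload steps are left out because they require API keys.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: str  # cue start timestamp, for example "00:00:02.000"
    end: str    # cue end timestamp
    text: str   # cue text joined into a single line


def parse_vtt(vtt: str) -> list[Cue]:
    """Extract cues from WebVTT text. Blocks without a timing line are skipped."""
    cues = []
    normalized = vtt.replace("\r\n", "\n").strip()
    # WebVTT blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, _, rest = line.partition("-->")
                parts = rest.split()
                if parts:
                    # parts[0] is the end time; cue settings after it are ignored.
                    text = " ".join(l.strip() for l in lines[i + 1:])
                    cues.append(Cue(start.strip(), parts[0], text))
                break
    return cues


def chunk_cues(cues: list[Cue], max_chars: int = 200) -> list[dict]:
    """Group consecutive cues into chunks of roughly max_chars characters."""
    chunks, current, size = [], [], 0

    def flush() -> None:
        chunks.append({
            "start": current[0].start,
            "end": current[-1].end,
            "text": " ".join(c.text for c in current),
        })

    for cue in cues:
        # Start a new chunk when adding this cue would exceed the limit.
        if current and size + len(cue.text) + 1 > max_chars:
            flush()
            current, size = [], 0
        current.append(cue)
        size += len(cue.text) + 1
    if current:
        flush()
    return chunks
```

Each chunk keeps the start and end timestamps of the cues it covers, so a search result could point back to the matching part of the video. Whether the service stores such metadata isn't stated in this guide.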

This section details how to use the yt-whisper service, a YouTube video processing tool that uses OpenAI's Whisper to transcribe videos and store them in a Pinecone database. Users can access the service through a web interface, input a YouTube video URL, and submit it for processing. The service downloads the video's audio, transcribes it into WebVTT format, generates embeddings using the text-embedding-3-small model, and uploads these embeddings to Pinecone. Once processed, the web app displays a list of indexed videos and provides a button to download the video's transcript.