Building a GenAI Video Transcription and Chat Application with Docker, OpenAI, and Pinecone

---
title: GenAI video transcription and chat
linkTitle: Video transcription and chat
description: Explore a generative AI video analysis app that uses Docker, OpenAI, and Pinecone.
keywords: python, generative ai, genai, llm, whisper, pinecone, openai
summary: |
  Learn how to build and deploy a generative AI video analysis and transcription bot using Docker.
tags: [ai]
aliases:
  - /guides/use-case/genai-video-bot/
params:
  time: 20 minutes
---

## Overview

This guide presents a project on video transcription and analysis using a set of technologies related to the [GenAI Stack](https://www.docker.com/blog/introducing-a-new-genai-stack/).

The project showcases the following technologies:

- [Docker and Docker Compose](#docker-and-docker-compose)
- [OpenAI](#openai-api)
  - [Whisper](#whisper)
  - [Embeddings](#embeddings)
  - [Chat completions](#chat-completions)
- [Pinecone](#pinecone)
- [Retrieval-Augmented Generation](#retrieval-augmented-generation)

> **Acknowledgment**
>
> This guide is a community contribution. Docker would like to thank
> [David Cardozo](https://www.davidcardozo.com/) for his contribution
> to this guide.

## Prerequisites

- You have an [OpenAI API Key](https://platform.openai.com/api-keys).

  > [!NOTE]
  >
  > OpenAI is a third-party hosted service and [charges](https://openai.com/pricing) may apply.

- You have a [Pinecone API Key](https://app.pinecone.io/).
- You have installed the latest version of [Docker Desktop](/get-started/get-docker.md). Docker adds new features regularly and some parts of this guide may work only with the latest version of Docker Desktop.
- You have a [Git client](https://git-scm.com/downloads). The examples in this section use a command-line based Git client, but you can use any client.

## About the application

The application is a chatbot that can answer questions from a video. In addition, it provides timestamps from the video that can help you find the sources used to answer your question.

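To make the idea of answering from a video with timestamps more concrete, the following is a minimal sketch of the retrieval step, not the application's actual code. It assumes the transcript is already split into timestamped segments, and it replaces the OpenAI embeddings and the Pinecone index with a toy bag-of-words vector and an in-memory list. The names `embed`, `retrieve`, `format_timestamp`, and the sample segments are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical transcript segments; "start" is the offset in seconds.
segments = [
    {"start": 0.0, "text": "welcome to this talk about docker compose"},
    {"start": 42.5, "text": "pinecone stores the embeddings of each transcript chunk"},
    {"start": 97.0, "text": "whisper turns the audio track into text"},
]


def retrieve(question, segments, top_k=1):
    """Return the top_k segments most similar to the question."""
    query = embed(question)
    ranked = sorted(
        segments,
        key=lambda seg: cosine(query, embed(seg["text"])),
        reverse=True,
    )
    return ranked[:top_k]


def format_timestamp(seconds):
    """Render an offset in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


best = retrieve("where are the embeddings stored", segments)[0]
print(format_timestamp(best["start"]), best["text"])
```

In a retrieval-augmented setup, the retrieved segment text would then be passed to a chat model as context, and the segment's timestamp would be shown alongside the answer as its source.
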
## Get and run the application

1. Clone the sample application's repository. In a terminal, run the following command.

   ```console
   $ git clone https://github.com/Davidnet/docker-genai.git
   ```

   The project contains the following directories and files:

   ```text
   ├── docker-genai/
   │   ├── docker-bot/
   │   ├── yt-whisper/
   │   ├── .env.example
   │   ├── .gitignore
   │   ├── LICENSE
   │   ├── README.md
   │   └── docker-compose.yaml
   ```

2. Specify your API keys. In the `docker-genai` directory, create a text file called `.env` and specify your API keys inside. The following is the contents of the `.env.example` file that you can refer to as an example.

   ```text
   #----------------------------------------------------------------------------
   # OpenAI
   #----------------------------------------------------------------------------
   OPENAI_TOKEN=your-api-key # Replace your-api-key with your personal API key

   #----------------------------------------------------------------------------
   # Pinecone
   #----------------------------------------------------------------------------
   PINECONE_TOKEN=your-api-key # Replace your-api-key with your personal API key
   ```

3. Build and run the application. In a terminal, change directory to your `docker-genai` directory and run the following command.

   ```console
   $ docker compose up --build
   ```

   Docker Compose builds and runs the application based on the services defined in the `docker-compose.yaml` file. When the application is running, you'll see the logs of two services in the terminal.

   In the logs, you'll see the services are exposed on ports `8503` and `8504`. The two services are complementary to each other.

   The `yt-whisper` service is running on port `8503`. This service feeds the Pinecone database with videos that you want to archive in your knowledge database. The following section explores this service.

## Using the yt-whisper service

The yt-whisper service is a YouTube video processing service that uses the OpenAI

This guide explains how to build a video transcription and analysis chatbot using Docker, OpenAI, Whisper, embeddings, Pinecone, and retrieval-augmented generation. It provides instructions on setting up the application, configuring API keys, and running the services using Docker Compose. The application allows users to ask questions about videos and provides timestamps to locate the relevant information within the video. The yt-whisper service processes YouTube videos using OpenAI.
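
Before running `docker compose up --build`, it can help to confirm that the `.env` file no longer contains the placeholder values from `.env.example`. The following is a minimal sketch of such a check and is not part of the sample repository. The helper names `parse_env` and `missing_keys` are hypothetical, and the parser only handles the simple `KEY=value # comment` form shown in `.env.example`, not the full Docker Compose env file syntax.

```python
REQUIRED = ("OPENAI_TOKEN", "PINECONE_TOKEN")
PLACEHOLDER = "your-api-key"


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment that is preceded by a space.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(values):
    """Return required keys that are unset or still hold the placeholder."""
    return [key for key in REQUIRED if values.get(key, "") in ("", PLACEHOLDER)]


# Hypothetical file contents; in practice, read the text of your .env file.
sample = """\
# OpenAI
OPENAI_TOKEN=sk-example   # hypothetical value
PINECONE_TOKEN=your-api-key # Replace your-api-key with your personal API key
"""

print(missing_keys(parse_env(sample)))
```

To check a real file, pass its contents instead of `sample`, for example `parse_env(open(".env").read())` from inside the `docker-genai` directory.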