---
title: Use containers for RAG development
linkTitle: Develop your app
weight: 10
keywords: python, local, development, generative ai, genai, llm, rag, ollama
description: Learn how to develop your generative RAG application locally.
aliases:
  - /guides/use-case/rag-ollama/develop/
---

## Prerequisites

Complete [Containerize a RAG application](containerize.md).

## Overview

In this section, you'll learn how to set up a development environment to access all the services that your generative RAG application needs. This includes:

- Adding a local database
- Adding a local or remote LLM service

> [!NOTE]
> You can see more samples of containerized GenAI applications in the [GenAI Stack](https://github.com/docker/genai-stack) demo applications.

## Add a local database

You can use containers to set up local services, like a database. In this section, you'll explore the database service in the `docker-compose.yaml` file.

To run the database service:

1. In the cloned repository's directory, open the `docker-compose.yaml` file in an IDE or text editor.

2. In the `docker-compose.yaml` file, you'll see the following (see the sketch after these steps for how the application service can reach this database):

   ```yaml
   services:
     qdrant:
       image: qdrant/qdrant
       container_name: qdrant
       ports:
         - "6333:6333"
       volumes:
         - qdrant_data:/qdrant/storage
   ```

   > [!NOTE]
   > To learn more about Qdrant, see the [Qdrant Official Docker Image](https://hub.docker.com/r/qdrant/qdrant).

3. Start the application. Inside the `winy` directory, run the following command in a terminal.

   ```console
   $ docker compose up --build
   ```

4. Access the application. Open a browser and view the application at [http://localhost:8501](http://localhost:8501). You should see a simple Streamlit application.

5. Stop the application. In the terminal, press `ctrl`+`c` to stop the application.
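
Inside the Compose network, other services reach the database by its service name rather than `localhost`. The following is a minimal sketch of how the rest of the Compose file might wire the application to Qdrant. The `server` service name and the `QDRANT_URL` variable are illustrative placeholders; the actual `docker-compose.yaml` in the sample repository may differ.

```yaml
services:
  server:                # illustrative name; use the sample's actual service name
    build: .
    ports:
      - "8501:8501"      # Streamlit UI, as used in the step above
    environment:
      # Hypothetical variable; check the application code for the setting it reads.
      - QDRANT_URL=http://qdrant:6333
    depends_on:
      - qdrant

volumes:
  qdrant_data:           # named volume referenced by the qdrant service
```

Declaring `qdrant_data` under the top-level `volumes` key is what lets the `qdrant` service persist its data across container restarts.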

## Add a local or remote LLM service

The sample application uses [Ollama](https://ollama.ai/) as its LLM service. This guide provides instructions for the following scenarios:

- Run Ollama in a container
- Run Ollama outside of a container

While all platforms can use any of the previous scenarios, the performance and
GPU support may vary. You can use the following guidelines to help you choose the appropriate option:

- Run Ollama in a container if you're on Linux and using a native installation of Docker Engine, or on Windows 10/11 and using Docker Desktop, and you have a CUDA-supported GPU and at least 8 GB of RAM.
- Run Ollama outside of a container if you're running Docker Desktop on a Linux machine.

Choose one of the following options for your LLM service.

{{< tabs >}}
{{< tab name="Run Ollama in a container" >}}

When running Ollama in a container, you should have a CUDA-supported GPU. While you can run Ollama in a container without a supported GPU, the performance may not be acceptable. Only Linux and Windows 11 support GPU access to containers.
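
If the Compose file doesn't already define an Ollama service, the following is a minimal sketch of what one can look like with a GPU reservation. The `ollama` service name and the `ollama_data` volume are assumptions for illustration; check the repository's `docker-compose.yaml` for the actual definition.

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"               # Ollama's default API port
    volumes:
      - ollama_data:/root/.ollama   # persist downloaded models between runs
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama_data:
```

The `deploy.resources.reservations.devices` block is what exposes the NVIDIA GPU to the container; without it, Ollama falls back to the CPU and responses are much slower.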
