The sample application supports both [Ollama](https://ollama.ai/) and [OpenAI](https://openai.com/). This guide provides instructions for the following scenarios:
- Run Ollama in a container
- Run Ollama outside of a container
- Use OpenAI
While all platforms can use any of the previous scenarios, the performance and
GPU support may vary. You can use the following guidelines to help you choose the appropriate option:
- Run Ollama in a container if you're on Linux with a native installation of Docker Engine, or on Windows 10/11 with Docker Desktop, and you have a CUDA-supported GPU and your system has at least 8 GB of RAM.
- Run Ollama outside of a container if you're on an Apple silicon Mac.
- Use OpenAI if the previous two scenarios don't apply to you.
Choose one of the following options for your LLM service.
{{< tabs >}}
{{< tab name="Run Ollama in a container" >}}
When running Ollama in a container, you should have a CUDA-supported GPU. While you can run Ollama in a container without a supported GPU, the performance may not be acceptable. Only Linux and Windows 11 support GPU access to containers.
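Once you've installed the prerequisites in step 1 below, you can optionally confirm that containers on your host can access the GPU before continuing. The following is only a quick sanity check on Linux, using a throwaway container with the `--gpus all` flag:

```console
$ docker run --rm --gpus all ubuntu nvidia-smi
```

If the command prints a table describing your GPU, GPU access from containers is working. If it fails, revisit the prerequisites before continuing.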
To run Ollama in a container and provide GPU access:
1. Install the prerequisites.
- For Docker Engine on Linux, install the [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-container-toolkit).
- For Docker Desktop on Windows 10/11, install the latest [NVIDIA driver](https://www.nvidia.com/Download/index.aspx) and make sure you are using the [WSL2 backend](/manuals/desktop/features/wsl/_index.md#turn-on-docker-desktop-wsl-2).
2. Add the Ollama service and a volume in your `compose.yaml`. The following is
the updated `compose.yaml`:
```yaml {hl_lines=["24-38"]}
services:
  server:
    build:
      context: .
    ports:
      - 8000:8000
    env_file:
      - .env
    depends_on:
      database:
        condition: service_healthy
  database:
    image: neo4j:5.11
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      - NEO4J_AUTH=${NEO4J_USERNAME}/${NEO4J_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider localhost:7474 || exit 1"]
      interval: 5s
      timeout: 3s
      retries: 5
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_volume:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
volumes:
  ollama_volume:
```
> [!NOTE]
>
> For more details about the Compose instructions, see [Turn on GPU access with Docker Compose](/manuals/compose/how-tos/gpu-support.md).
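Optionally, after you start the stack later in this guide, you can verify that the Ollama container is reachable on the published port (`11434`, as mapped in the Compose file above). Ollama's root endpoint returns a short status message, so a plain request from the host should look similar to the following:

```console
$ curl http://localhost:11434
Ollama is running
```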
3. Add the `ollama-pull` service to your `compose.yaml` file. This service uses
the `docker/genai:ollama-pull` image, based on the GenAI Stack's
[pull_model.Dockerfile](https://github.com/docker/genai-stack/blob/main/pull_model.Dockerfile).
The service will automatically pull the model for your Ollama
container. The following is the updated section of the `compose.yaml` file: