"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Please write 500 words about the fall of Rome."
}
]
}'
```
#### From the host using TCP
To call the `chat/completions` OpenAI endpoint from the host via TCP:
1. Enable host-side TCP support from the Docker Desktop GUI, or via the [Docker Desktop CLI](/manuals/desktop/features/desktop-cli.md).
For example: `docker desktop enable model-runner --tcp <port>`.
If you are running on Windows, also enable GPU-backed inference.
See [Enable Docker Model Runner](#enable-dmr-in-docker-desktop).
2. Interact with it as documented in the previous section, using `localhost` and the port you chose (`12434` by default). For example:
```bash
#!/bin/sh
curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ai/smollm2",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Please write 500 words about the fall of Rome."
}
]
}'
```
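Before sending a full completion request, you can check that TCP access is working by listing the models the engine currently serves. A minimal sketch, assuming the same port as above:
```bash
#!/bin/sh
# Lists the models available through the llama.cpp engine.
# A JSON response confirms host-side TCP access is enabled on port 12434.
curl http://localhost:12434/engines/llama.cpp/v1/models
```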
#### From the host using a Unix socket
To call the `chat/completions` OpenAI endpoint through the Docker socket from the host using `curl`:
```bash
#!/bin/sh
curl --unix-socket $HOME/.docker/run/docker.sock \
localhost/exp/vDD4.40/engines/llama.cpp/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ai/smollm2",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Please write 500 words about the fall of Rome."
}
]
}'
```
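The endpoint returns a standard OpenAI-style chat completion object, with the generated text at `choices[0].message.content`. A sketch that prints only that text, assuming `jq` is installed:
```bash
#!/bin/sh
# Same request as above; jq extracts just the assistant's reply
# from the OpenAI-style response body.
curl -s --unix-socket $HOME/.docker/run/docker.sock \
    localhost/exp/vDD4.40/engines/llama.cpp/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ai/smollm2",
        "messages": [
            {"role": "user", "content": "Please write 500 words about the fall of Rome."}
        ]
    }' | jq -r '.choices[0].message.content'
```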
## Known issues
### `docker model` is not recognized
If you run a Docker Model Runner command and see:
```text
docker: 'model' is not a docker command
```
It means Docker can't find the plugin because it's not in the expected CLI plugins directory.
To fix this, create a symlink so Docker can detect it (the path below assumes Docker Desktop on macOS):
```console
$ ln -s /Applications/Docker.app/Contents/Resources/cli-plugins/docker-model ~/.docker/cli-plugins/docker-model
```
Once linked, rerun the command.
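To confirm the plugin is now detected, run a simple subcommand such as `docker model status`, which reports whether Docker Model Runner is running:
```console
$ docker model status
```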
### No safeguard for running oversized models
Currently, Docker Model Runner doesn't include safeguards to prevent you from
launching models that exceed your system's available resources. Attempting to
run a model that is too large for the host machine may result in severe
slowdowns or may render the system temporarily unusable. This issue is
particularly common when running LLMs without sufficient GPU memory or system
RAM.
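As a rough guard, compare a model's size with your free memory before running it. `docker model list` shows each pulled model's on-disk size, and loading a model typically requires at least that much free RAM or VRAM:
```console
$ docker model list
```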
### No consistent digest support in Model CLI
The Docker Model CLI currently lacks consistent support for specifying models by image digest. As a temporary workaround, you should refer to models by name instead of digest.
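For example, pull and run a model by name rather than pinning it to a digest:
```console
$ docker model pull ai/smollm2
$ docker model run ai/smollm2 "Give me a fact about whales."
```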
## Share feedback
Thanks for trying out Docker Model Runner. Give feedback or report any bugs you may find through the **Give feedback** link next to the **Enable Docker Model Runner** setting.