Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and ensure it's running with `ollama --version`.
1. Download one of the [available models](https://ollama.com/models), for example, `mistral`:
```sh
ollama pull mistral
```
2. Make sure that the Ollama server is running. You can start it either by running Ollama.app (macOS) or by launching:
```sh
ollama serve
```
3. In the Agent Panel, select one of the Ollama models using the model dropdown.
#### Ollama Context Length {#ollama-context}
Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models.
Zed API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM are able to use most models out of the box.
See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.
> **Note**: Token counts displayed in the Agent Panel are only estimates and will differ from the model's native tokenizer.
Depending on your hardware or use case, you may wish to limit or increase the context length for a specific model via `settings.json`:
```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder 32K",
          "max_tokens": 32768,
          "supports_tools": true,
          "supports_thinking": true,
          "supports_images": true
        }
      ]
    }
  }
}
```
If you specify a context length that is too large for your hardware, Ollama will log an error.
You can watch these logs by running: `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux).
Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.
You may also optionally specify a value for `keep_alive` for each available model.
This can be an integer (seconds) or a string duration like "5m", "10m", "1h", "1d", etc.
For example, `"keep_alive": "120s"` will allow the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.
The `supports_tools` option controls whether the model can use additional tools.
If the model is tagged with `tools` in the Ollama catalog, this option should be supplied, and the built-in profiles `Ask` and `Write` can be used.
If the model is not tagged with `tools` in the Ollama catalog, this option can still be supplied with the value `true`; however, be aware that only the `Minimal` built-in profile will work.
The `supports_thinking` option controls whether the model performs an explicit "thinking" (reasoning) pass before producing its final answer.
If the model is tagged with `thinking` in the Ollama catalog, set this option to use it in Zed.
The `supports_images` option enables the model's vision capabilities, allowing it to process images included in the conversation context.
If the model is tagged with `vision` in the Ollama catalog, set this option to use it in Zed.
### OpenAI {#openai}
> ✅ Supports tool use
1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
2. Make sure that your OpenAI account has credits
3. Open the settings view (`agent: open configuration`) and go to the OpenAI section
4. Enter your OpenAI API key
The OpenAI API key will be saved in your keychain.
Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.
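For example, you can export the variable in the shell you use to launch Zed (the key value is a placeholder, and this assumes you start Zed from that shell, e.g. via the `zed` CLI):

```sh
export OPENAI_API_KEY="your-api-key"
zed
```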
#### Custom Models {#openai-custom-models}
The Zed Assistant comes pre-configured to use the latest versions of common models (GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o mini).
To use alternate models, perhaps a preview release or a dated model release, or if you wish to control the request parameters, you can do so by adding them to your Zed `settings.json`.
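A minimal sketch, following the same `language_models` structure as the Ollama example above (the model name, display name, and token limit here are illustrative placeholders):

```json
{
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-4o-2024-08-06",
          "display_name": "GPT 4o Summer 2024",
          "max_tokens": 128000
        }
      ]
    }
  }
}
```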