By default, Zed uses `stable` versions of models, but you can pin specific versions, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models). You can configure a model to use [thinking mode](https://ai.google.dev/gemini-api/docs/thinking) (if it supports it) by adding a `mode` configuration to your model. This is useful for controlling reasoning token usage and response speed. If not specified, Gemini will automatically choose the thinking budget.
Here is an example of a custom Google AI model you could add to your Zed `settings.json`:
```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-2.5-flash-preview-05-20",
          "display_name": "Gemini 2.5 Flash (Thinking)",
          "max_tokens": 1000000,
          "mode": {
            "type": "thinking",
            "budget_tokens": 24000
          }
        }
      ]
    }
  }
}
```
Custom models will be listed in the model dropdown in the Agent Panel.
### LM Studio {#lmstudio}
> ✅ Supports tool use
1. Download and install the latest version of LM Studio from [lmstudio.ai/download](https://lmstudio.ai/download)
2. In the app, press ⌘/Ctrl + Shift + M and download at least one model, e.g. `qwen2.5-coder-7b`.
You can also get models via the LM Studio CLI:
```sh
lms get qwen2.5-coder-7b
```
3. Make sure the LM Studio API server is running by executing:
```sh
lms server start
```
Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.
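If your LM Studio server is listening somewhere other than the default address, you can point Zed at it in your `settings.json`. A minimal sketch, assuming the `lm_studio` provider accepts an `api_url` key like the other providers in this document; port `1234` is LM Studio's default, and the `/api/v0` path may differ in your setup:

```json
{
  "language_models": {
    "lm_studio": {
      "api_url": "http://localhost:1234/api/v0"
    }
  }
}
```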
### Mistral {#mistral}
> ✅ Supports tool use
1. Visit the Mistral platform and [create an API key](https://console.mistral.ai/api-keys/)
2. Open the configuration view (`assistant: show configuration`) and navigate to the Mistral section
3. Enter your Mistral API key
The Mistral API key will be saved in your keychain.
Zed will also use the `MISTRAL_API_KEY` environment variable if it's defined.
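For example, you can export the key from your shell profile before launching Zed (a minimal sketch for POSIX shells; replace the placeholder with your actual key):

```sh
# Make the key available to Zed without entering it in the configuration view
export MISTRAL_API_KEY=your-api-key-here
```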
#### Custom Models {#mistral-custom-models}
The Zed Assistant comes pre-configured with several Mistral models (`codestral-latest`, `mistral-large-latest`, `mistral-medium-latest`, `mistral-small-latest`, `open-mistral-nemo`, and `open-codestral-mamba`). All of the default models support tool use. If you wish to use alternate models or customize their parameters, you can do so by adding the following to your Zed `settings.json`:
```json
{
  "language_models": {
    "mistral": {
      "api_url": "https://api.mistral.ai/v1",
      "available_models": [
        {
          "name": "mistral-tiny-latest",
          "display_name": "Mistral Tiny",
          "max_tokens": 32000,
          "max_output_tokens": 4096,
          "max_completion_tokens": 1024,
          "supports_tools": true
        }
      ]
    }
  }
}
```
Custom models will be listed in the model dropdown in the Agent Panel.
### Ollama {#ollama}
> ✅ Supports tool use
Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and verify it's installed by running `ollama --version`.
1. Download one of the [available models](https://ollama.com/models), for example `mistral`:
```sh
ollama pull mistral
```
2. Make sure that the Ollama server is running. You can start it either by running the Ollama app (macOS) or by launching:
```sh
ollama serve
```
3. In the Agent Panel, select one of the Ollama models using the model dropdown.
#### Ollama Context Length {#ollama-context}
Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models.
Zed's API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM can use most models out of the box.
See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.
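If you need a larger context window than the default, you can override it per model in your Zed `settings.json`. A sketch, assuming the same `available_models` shape used by the other providers above; the model name and the 32K `max_tokens` value are illustrative, and `http://localhost:11434` is Ollama's default address:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "Qwen 2.5 Coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

Keep in mind that larger context windows increase memory usage, so size `max_tokens` to your available RAM.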
> **Note**: Token counts displayed in the Agent Panel are only estimates and will differ from the model's native tokenizer.