By default, Zed uses `stable` versions of models, but you can pin specific versions, including [experimental models](https://ai.google.dev/gemini-api/docs/models/experimental-models). You can configure a model to use [thinking mode](https://ai.google.dev/gemini-api/docs/thinking) (if it supports it) by adding a `mode` configuration to your model. This is useful for controlling reasoning token usage and response speed. If not specified, Gemini will automatically choose the thinking budget.
Here is an example of a custom Google AI model you could add to your Zed `settings.json`:
```json
{
  "language_models": {
    "google": {
      "available_models": [
        {
          "name": "gemini-2.5-flash-preview-05-20",
          "display_name": "Gemini 2.5 Flash (Thinking)",
          "max_tokens": 1000000,
          "mode": {
            "type": "thinking",
            "budget_tokens": 24000
          }
        }
      ]
    }
  }
}
```
Custom models will be listed in the model dropdown in the Agent Panel.
### LM Studio {#lmstudio}
> ✅ Supports tool use
1. Download and install the latest version of LM Studio from [lmstudio.ai/download](https://lmstudio.ai/download)
2. In the app, press ⌘/Ctrl + Shift + M and download at least one model, e.g. `qwen2.5-coder-7b`.
You can also get models via the LM Studio CLI:
```sh
lms get qwen2.5-coder-7b
```
3. Make sure the LM Studio API server is running by executing:
```sh
lms server start
```
Tip: Set [LM Studio as a login item](https://lmstudio.ai/docs/advanced/headless#run-the-llm-service-on-machine-login) to automate running the LM Studio server.
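If your LM Studio server is listening somewhere other than the default address, you can point Zed at it in your `settings.json`. A minimal sketch, assuming the `lm_studio` provider accepts an `api_url` key like the other providers in this document; port `1234` is LM Studio's default, and the `/api/v0` path may differ in your setup:

```json
{
  "language_models": {
    "lm_studio": {
      "api_url": "http://localhost:1234/api/v0"
    }
  }
}
```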
### Mistral {#mistral}
> ✅ Supports tool use
1. Visit the Mistral platform and [create an API key](https://console.mistral.ai/api-keys/)
2. Open the configuration view (`assistant: show configuration`) and navigate to the Mistral section
3. Enter your Mistral API key
The Mistral API key will be saved in your keychain.
Zed will also use the `MISTRAL_API_KEY` environment variable if it's defined.
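For example, you can export the key from your shell profile before launching Zed (a minimal sketch for POSIX shells; replace the placeholder with your actual key):

```sh
# Make the key available to Zed without entering it in the configuration view
export MISTRAL_API_KEY=your-api-key-here
```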
#### Custom Models {#mistral-custom-models}
The Zed Assistant comes pre-configured with several Mistral models (`codestral-latest`, `mistral-large-latest`, `mistral-medium-latest`, `mistral-small-latest`, `open-mistral-nemo`, and `open-codestral-mamba`). All of the default models support tool use. If you wish to use alternate models or customize their parameters, you can do so by adding the following to your Zed `settings.json`:
```json
{
  "language_models": {
    "mistral": {
      "api_url": "https://api.mistral.ai/v1",
      "available_models": [
        {
          "name": "mistral-tiny-latest",
          "display_name": "Mistral Tiny",
          "max_tokens": 32000,
          "max_output_tokens": 4096,
          "max_completion_tokens": 1024,
          "supports_tools": true
        }
      ]
    }
  }
}
```
Custom models will be listed in the model dropdown in the Agent Panel.
### Ollama {#ollama}
> ✅ Supports tool use
Download and install Ollama from [ollama.com/download](https://ollama.com/download) (Linux or macOS) and verify it's installed by running `ollama --version`.
1. Download one of the [available models](https://ollama.com/models), for example `mistral`:
```sh
ollama pull mistral
```
2. Make sure that the Ollama server is running. You can start it either by running the Ollama app (macOS) or by launching:
```sh
ollama serve
```
3. In the Agent Panel, select one of the Ollama models using the model dropdown.
#### Ollama Context Length {#ollama-context}
Zed has pre-configured maximum context lengths (`max_tokens`) to match the capabilities of common models.
Zed's API requests to Ollama include this as the `num_ctx` parameter, but the default values do not exceed `16384`, so users with ~16GB of RAM can use most models out of the box.
See [get_max_tokens in ollama.rs](https://github.com/zed-industries/zed/blob/main/crates/ollama/src/ollama.rs) for a complete set of defaults.
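If you need a larger context window than the default, you can override it per model in your Zed `settings.json`. A sketch, assuming the same `available_models` shape used by the other providers above; the model name and the 32K `max_tokens` value are illustrative, and `http://localhost:11434` is Ollama's default address:

```json
{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "Qwen 2.5 Coder 32K",
          "max_tokens": 32768
        }
      ]
    }
  }
}
```

Keep in mind that larger context windows increase memory usage, so size `max_tokens` to your available RAM.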
> **Note**: Token counts displayed in the Agent Panel are only estimates and will differ from the model's native tokenizer.