ollama tweaks (#1448)
* ollama tweaks

* suggested by @pcuenca

* move TOC

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
julien-c and ngxson authored Oct 16, 2024
1 parent 481c63a commit 6a102ac
Showing 2 changed files with 9 additions and 7 deletions.
4 changes: 2 additions & 2 deletions docs/hub/_toctree.yml
@@ -144,8 +144,8 @@
     title: GGUF usage with llama.cpp
   - local: gguf-gpt4all
     title: GGUF usage with GPT4All
-  - local: ollama
-    title: Use Ollama with GGUF Model
+  - local: ollama
+    title: GGUF usage with Ollama
   - title: Datasets
     local: datasets
     isExpanded: true
12 changes: 7 additions & 5 deletions docs/hub/ollama.md
@@ -12,7 +12,7 @@ ollama run hf.co/{username}/{repository}
 
 Please note that you can use both `hf.co` and `huggingface.co` as the domain name.
 
-Here are some other models that you can try:
+Here are some models you can try:
 
 ```sh
 ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
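As a small illustration of the domain note in the hunk above (a sketch only; the repository is the one already used in the docs' examples), the same model can be pulled through either hostname:

```sh
# hf.co and huggingface.co are interchangeable in the model reference
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
ollama run huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```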
@@ -23,7 +23,9 @@ ollama run hf.co/bartowski/Humanish-LLama3-8B-Instruct-GGUF
 
 ## Custom Quantization
 
-By default, the `Q4_K_M` quantization scheme is used. To select a different scheme, simply add a tag:
+By default, the `Q4_K_M` quantization scheme is used, when it's present inside the model repo. If not, we default to picking one reasonable quant type present inside the repo.
+
+To select a different scheme, simply add a tag:
 
 ```sh
 ollama run hf.co/{username}/{repository}:{quantization}
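To see which quantization variants (and therefore which tags) a repository offers, one hedged option is to list its files through the public Hub API; this sketch assumes `curl`, plus `jq` purely for readability, and borrows the repository name from the examples below:

```sh
# each *.gguf filename ends in its quantization type (e.g. Q8_0, IQ3_M),
# which is what you pass as the tag after the colon
curl -s https://huggingface.co/api/models/bartowski/Llama-3.2-3B-Instruct-GGUF/tree/main | jq -r '.[].path'
```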
@@ -40,15 +42,15 @@ ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0
 # the quantization name is case-insensitive, this will also work
 ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:iq3_m
 
-# you can also select a specific file
+# you can also directly use the full filename as a tag
 ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Llama-3.2-3B-Instruct-IQ3_M.gguf
 ```
 
 ## Custom Chat Template and Parameters
 
 By default, a template will be selected automatically from a list of commonly used templates. It will be selected based on the built-in `tokenizer.chat_template` metadata stored inside the GGUF file.
 
-If your GGUF file doesn't have a built-in template or uses a custom chat template, you can create a new file called `template` in the repository. The template must be a Go template, not a Jinja template. Here's an example:
+If your GGUF file doesn't have a built-in template or if you want to customize your chat template, you can create a new file called `template` in the repository. The template must be a Go template, not a Jinja template. Here's an example:
 
 ```
 {{ if .System }}<|system|>
@@ -59,7 +61,7 @@ If your GGUF file doesn't have a built-in template or uses a custom chat templat
 {{ .Response }}<|end|>
 ```
 
-To know more about Go template format, please refer to [this documentation](https://github.com/ollama/ollama/blob/main/docs/template.md)
+To know more about the Go template format, please refer to [this documentation](https://github.com/ollama/ollama/blob/main/docs/template.md)
 
 You can optionally configure a system prompt by putting it into a new file named `system` in the repository.
 
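A minimal sketch of putting those two files in place, assuming the `huggingface_hub` CLI is installed, you are logged in, and you have write access to the repo (`{username}/{repository}` is the usual placeholder, and the system prompt text here is made up for illustration):

```sh
# assumes ./template already contains the Go-format chat template shown above
echo "You are a helpful assistant." > system

# upload both files to the root of the model repository
huggingface-cli upload {username}/{repository} ./template template
huggingface-cli upload {username}/{repository} ./system system
```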
