How to Use Ollama

Accessing Ollama from Python

// Set up the environment and install the Ollama Python client
conda activate llm_test
pip install ollama

ollama_test.py

import ollama

response = ollama.chat(model="llama3.1", messages=[
    {"role": "user",
     "content": "What is data governance, and how does it differ from data management?"},
])

print(response)
$ python ollama_test.py 
model='llama3.1' created_at='2025-05-21T07:43:26.260998065Z' done=True done_reason='stop' total_duration=3600133750 load_duration=1005090191 prompt_eval_count=23 prompt_eval_duration=104226959 eval_count=96 eval_duration=2489433343 message=Message(role='assistant', content='Answer: Data governance is a way of directing, coordinating, and controlling data so that it can be produced and used effectively. It covers guarantees of data quality, security, and availability, as well as the management of data access, computation, and storage.\n\nUnlike traditional data management, data governance puts more emphasis on managing and controlling the full lifecycle of data, including its creation, use, updating, and removal.', images=None, tool_calls=None)
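
The script above blocks until the whole answer is ready; the client can also stream tokens as they are generated. A minimal sketch, assuming the stream=True option of the ollama Python package and the same model as above (prompt shortened for brevity):

import ollama

# stream=True turns the call into a generator of incremental chunks
stream = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "What is data governance?"}],
    stream=True,
)

for chunk in stream:
    # each chunk carries the next piece of text in message.content
    print(chunk["message"]["content"], end="", flush=True)
print()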

Running containers with GPU access

Install the NVIDIA Container Toolkit

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

sudo apt-get install -y nvidia-container-toolkit

// Configure Docker to use the NVIDIA runtime, then restart the Docker daemon so the change takes effect
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

// Verify that the nvidia runtime was registered
$ cat /etc/docker/daemon.json
{
    "registry-mirrors": [
        "https://docker.m.daocloud.io"
    ],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

Running Ollama on its own in a container

// CPU
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

// GPU
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

// GPU, with models stored in a host directory instead of a named volume
docker run --gpus all -d -v /opt/ai/ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

// Pull and chat with a model inside the container
docker exec -it ollama ollama run llama3.2

// Or open a shell in the container
docker exec -it ollama sh
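
With port 11434 published, the container's HTTP API is also reachable from the host. A minimal sketch against the /api/generate endpoint (model name and prompt are placeholders; requests must be installed separately):

import requests

# non-streaming completion request against the containerized Ollama
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])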

Making a host-installed Ollama reachable from containers

If Ollama runs directly on the host under systemd, edit its unit so the service listens on all interfaces:

sudo vi /etc/systemd/system/ollama.service

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/home/ps/anaconda3/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_MODELS=/jppeng/app/models/ollama"
Environment="CUDA_VISIBLE_DEVICES=0,1"

[Install]
WantedBy=default.target

// OLLAMA_HOST=0.0.0.0:11434 binds the service to all interfaces instead of only loopback; reload and restart to apply
sudo systemctl daemon-reload
sudo systemctl restart ollama

Using the Ollama service from inside a container

docker exec -it fastgpt /bin/sh
$ curl http://10.0.0.11:11434
Ollama is running

// Note: use the host's LAN IP here; localhost inside the container refers to the container itself, not the host
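
Application code inside the container can point the ollama Python client at the host the same way. A minimal sketch using the package's Client class (host IP as in the curl test above; the model is assumed to be pulled already):

from ollama import Client

# talk to the host-side Ollama instead of the default localhost:11434
client = Client(host="http://10.0.0.11:11434")

response = client.chat(model="llama3.1", messages=[
    {"role": "user", "content": "What is data governance?"},
])
print(response["message"]["content"])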

Using Open WebUI

docker run -d -p 3001:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
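
// This image bundles Ollama together with the UI; given the 3001:8080 port mapping above, the web interface should then be reachable at http://localhost:3001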