Sunday, July 7, 2024

Got Ollama installed and running.

Ollama is a project that lets you run models locally, but it can also expose an API that other machines on your local network can connect to. This is how I got it running on a Debian box.

Follow the directions on the site to install Ollama. They are clear.
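For reference, at the time of writing the Linux install was a one-line script, but check the site for the current command before pasting it:

curl -fsSL https://ollama.com/install.sh | sh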

But then you have to update a few things. Open a terminal and type the following:

sudo su  

vi /etc/systemd/system/ollama.service

Use the editor of your choice if you don't like vi. Change the text to match the following, replacing 'yourusername' with your own username in both User and Group. I had to do this so it would save models in my own home folder; my root partition was tiny but my home folder is huge. Then add the 'Environment="OLLAMA_HOST=0.0.0.0"' line so that the server will listen for connections from the local network.

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=yourusername
Group=yourusername
Restart=always
RestartSec=3
Environment="OLLAMA_HOST=0.0.0.0"

[Install]
WantedBy=default.target

Then save the file. Once that is done you have to reload systemd and restart the service. Type the following:
systemctl daemon-reload
systemctl enable ollama
systemctl restart ollama
systemctl status ollama
You should not see an error. If the service doesn't restart, check the ollama.service file for typos.
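If the service doesn't come up cleanly, the systemd journal usually says why, and systemctl cat shows the unit file systemd is actually using. These are plain systemd commands, nothing Ollama-specific:

journalctl -u ollama -e
systemctl cat ollama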

After that you can type
exit
to get out of the root shell. Then follow the directions to pull a model:
ollama pull model_name
Model names to pull are listed in the model library on the Ollama site.
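For example, the model I use in the rest of this post:

ollama pull zephyr:latest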
You can see what models are installed with  
ollama list
Mine says: 

NAME           ID            SIZE    MODIFIED
zephyr:latest  bbe38b81adec  4.1 GB  23 hours ago


So for me to run Ollama with a model I have to say:
ollama run zephyr:latest
This gives me: 

ryzen7mini:~$ ollama run zephyr:latest

>>> hello
Hello, how may I assist you today? Please let me know if you have any questions or requests. Thank you for choosing our service! If you're just saying hello, I'm glad to hear that you're here and welcome to our community! Let us know if we can help you with anything else. Have a great day ahead!

>>> /bye


And you can test your API connection in a web browser:
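By default the server listens on port 11434, so hitting the root URL should return a short 'Ollama is running' message. The address below is my server's; use your own:

http://192.168.1.179:11434/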


 
You can see the models installed through the API:
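That list comes from the /api/tags endpoint, which returns JSON describing each installed model. Again, swap in your own server's address:

http://192.168.1.179:11434/api/tags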


What is interesting about this is that you could populate a drop-down in a UI from this list of models.
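As a rough sketch, assuming jq is installed, the model names alone can be pulled out of that JSON like this:

curl -s http://192.168.1.179:11434/api/tags | jq -r '.models[].name'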


And you can access the web API with shell scripts:


Here is the code that I got off the internet:

#!/bin/sh

curl http://192.168.1.179:11434/api/generate -d '{
  "model": "zephyr:latest",
  "prompt": "Why is the blue sky blue?",
  "stream": false,
  "options": {
    "num_thread": 8,
    "num_ctx": 2024
  }
}'
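Because "stream" is false, the reply comes back as one JSON object with the generated text in its "response" field. If jq is installed you can print just the text, something like:

curl -s http://192.168.1.179:11434/api/generate -d '{"model": "zephyr:latest", "prompt": "Why is the blue sky blue?", "stream": false}' | jq -r '.response'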



And you can tie Visual Studio Code to the web API:



That is using the codellm extension. I just searched for ollama and it was halfway down the first page of the list.


And this is how I configured it:
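(In general these extensions just need the API base URL, http://192.168.1.179:11434 in my case, and a model name from ollama list; the exact setting names depend on the extension.)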



To Do:

I want to add another layer on top of Ollama for RAG and agents, to improve the models and add functions. I also want to learn how to script these models with the Python ollama library.

I am very happy. :D 

I tried to do this with GPT4All and could not for the life of me get the program to accept connections from the local network. If anyone knows how to make this happen, I would love to know, because it has a cleaner interface that makes it easy to upload text blocks and has RAG built in.

It would be great if I could get a web interface to Ollama that gives me an interface similar to GPT4All: just selecting a model, entering questions, pasting blocks of text, and getting the responses back.
