Monday, May 22, 2023

Got a chat AI running on an old Intel MacBook Air with 8GB of RAM.

It is not a speed demon, but it types faster than a person. I'll give details about what I used at the end.


I tried using the small model with the restrictions, but it was nearly unusable. It gave me three paragraphs about why it wouldn't answer a knock-knock joke. Seriously, why would you treat jokes as somehow harmful? The uncensored version just works, without huge disclaimers on every answer.

This demonstrates that it knows a lot of math.



This demonstrates that it can tell fairy tales in many different languages.


I was having a good conversation about moon bases and how to protect the Earth from killer asteroids, when it confabulated that we could hide from asteroids in the dust, ash, and smoke from volcanoes.



I had it write an essay on potatoes, and it gave a really funny line:

"The introduction of the potato led to the end of the Little Ice Age, which had caused widespread famine across Europe."

That is a bad cause-and-effect fallacy and a huge jump in topic, all in the same sentence.


How I got it working

I used llama.cpp from this link:

https://github.com/ggerganov/llama.cpp

I just followed the directions at the bottom of the page. 
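For the record, the whole build boils down to roughly this (paraphrasing from memory; check the README in case it has changed since I did this):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make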



I used the q5_0 model from this website:


If you have 16GB of RAM, you can probably run the 13B-parameter file, which is roughly twice as big.

Put this model in the llama.cpp/models/ directory.
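If you grabbed it with a browser it probably landed in Downloads, so something like this (adjust the path to wherever your copy actually ended up):

# run from inside the llama.cpp directory
mv ~/Downloads/WizardLM-7B-uncensored.ggml.q5_0.bin models/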

I put the following command in a script file to make it easy to run:

#!/bin/bash
#
# Temporary script - will be removed in the future
#
# cd to the script's own directory, then up one level into the llama.cpp root
cd "$(dirname "$0")"
cd ..

# 4 threads, 2048-token context, temperature 0.7, mild repeat penalty,
# unlimited generation (-n -1), interactive instruction mode (-i -ins)
./main -t 4 -m models/WizardLM-7B-uncensored.ggml.q5_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -i -ins


I put this file in the examples directory and made it executable. 
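I called mine chat.sh (the name is just my own choice, not anything llama.cpp requires), so making it executable looks like:

chmod +x examples/chat.sh    # "chat.sh" is just what I named my script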


Then I run the script and it works. 
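Since the script cd's to its own directory first, you can launch it from anywhere, for example (assuming you cloned into your home directory):

~/llama.cpp/examples/chat.sh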
