Ollama is a free tool that allows you to run LLMs (Large Language Models) directly on your local machine. This is convenient for AI developers, researchers, or anyone just experimenting with and learning about AI.
There is a lot to grasp at first, so the best thing to do is just jump in, set it up, and use it, asking questions along the way.
Download Ollama
First, download Ollama from https://ollama.com/download for your specific OS and install it. Once installed, you won’t see anything: it just runs in the background. To interact with it, you’ll use the command line.
Interacting with Ollama
Once Ollama is installed, open your terminal and type ollama --version. Here you can see I’m running 0.5.11.
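If everything installed correctly, it prints a single version line, something like:

seandotau@aseandotaus-MBP ~ % ollama --version
ollama version is 0.5.11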

Let’s get familiar with some basic commands. Type ollama --help to see what’s available:
seandotau@aseandotaus-MBP ~ % ollama --help
Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information
These commands are self-explanatory once you’ve run them a few times, but for first-timers, let’s go through some examples.
ollama serve
This starts the Ollama server from the command line. If you have the “GUI” application though, you can run that instead and skip the serve command. Also, if you run other commands such as ollama list, the Ollama “GUI” application will start automatically. All this is to say that you probably won’t need to run ollama serve.
seandotau@aseandotaus-MBP ~ % ollama serve
2025/02/15 21:32:18 routes.go:1186: INFO server config env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/seandotau/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-02-15T21:32:18.285+11:00 level=INFO source=images.go:432 msg="total blobs: 6"
time=2025-02-15T21:32:18.285+11:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-02-15T21:32:18.286+11:00 level=INFO source=routes.go:1237 msg="Listening on 127.0.0.1:11434 (version 0.5.11)"
time=2025-02-15T21:32:18.335+11:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="21.3 GiB" available="21.3 GiB"
[GIN] 2025/02/15 - 21:32:43 | 200 | 66.833µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/02/15 - 21:32:43 | 200 | 1.078416ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2025/02/15 - 21:32:57 | 200 | 28.167µs | 127.0.0.1 | HEAD "/"
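A side note on that log: the line “Listening on 127.0.0.1:11434” means the server exposes a REST API on that port, and the [GIN] lines below it are requests hitting that API (you can see a GET "/api/tags" request). If you’re curious, you can poke the same endpoint yourself; this lists your downloaded models as JSON:

# Query the Ollama REST API directly (same endpoint shown in the log above)
curl http://127.0.0.1:11434/api/tags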
ollama list
This will list the models you have downloaded. It will be empty for now, so let’s download a model.
seandotau@aseandotaus-MBP ~ % ollama list
NAME ID SIZE MODIFIED
ollama pull
To download a model we need to run ollama pull, but first we need to know which model to pull. This is where https://ollama.com/search comes into play. It lists all the available models, but how do you choose?
Model summary
Llama: Llama is an NLP (Natural Language Processing) model for tasks like text generation, summarization, and machine translation. It is ideal for general-purpose chat, conversational AI, and answering questions.
Mistral: Mistral, similar to Llama, handles code generation and large-scale data analysis, making it ideal for developers working on AI-driven coding platforms.
Phi-4: An AI language model developed by Microsoft. It is a 14B-parameter state-of-the-art small language model (SLM) that excels at complex reasoning in areas such as math, in addition to conventional language processing.
LLaVa: LLaVA is a multimodal model capable of processing text and images. (A multimodal model is an AI system that can process and understand multiple types of information or “modes” – such as text, images, audio, and video – often simultaneously.)
Code Llama: A large language model that can use text prompts to generate and discuss code.
These are just a small sample of what is available. What you’ll also notice are model size options, e.g. for llama3.2 there is a 1-billion or 3-billion parameter option. The more parameters, the more capable the model tends to be, but the larger the download. E.g. pulling llama3.2:1b takes 1.3 GB while llama3.2:3b takes 2 GB. Compare this to llama3.1:405b, which takes a whopping 243 GB of disk space.
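As an aside, if you want a specific size rather than the default, append the size tag after a colon; the tag names come straight from the model’s page on ollama.com/search:

# Pull the 1-billion-parameter variant explicitly
ollama pull llama3.2:1b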
So now that we know about model types and sizes, let’s pull llama3.2, which is small and lightweight. Run ollama pull llama3.2. Notice I didn’t add a size tag? If it is left out, it defaults to ollama pull llama3.2:latest, which is currently the equivalent of ollama pull llama3.2:3b.
seandotau@aseandotaus-MBP ~ % ollama pull llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100% ▕████████████████████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕████████████████████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████████████████████▏ 96 B
pulling 34bb5ab01051... 100% ▕████████████████████████████████▏ 561 B
verifying sha256 digest
writing manifest
success
ollama show
Now if you run ollama list, you should see the model you just pulled.
seandotau@aseandotaus-MBP ~ % ollama list
NAME ID SIZE MODIFIED
llama3.2:3b a80c4f17acd5 2.0 GB 28 seconds ago
What is neat is that you can also show the details of this model by running ollama show llama3.2:
seandotau@aseandotaus-MBP ~ % ollama show llama3.2
Model
  architecture         llama
  parameters           3.2B
  context length       131072
  embedding length     3072
  quantization         Q4_K_M

Parameters
  stop    "<|start_header_id|>"
  stop    "<|end_header_id|>"
  stop    "<|eot_id|>"

License
  LLAMA 3.2 COMMUNITY LICENSE AGREEMENT
  Llama 3.2 Version Release Date: September 25, 2024
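A handy extra: ollama show can also print the Modelfile a model was built from (its base weights, template, and parameters) via the --modelfile flag:

# Dump the Modelfile behind the model
ollama show llama3.2 --modelfile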
ollama rm
To remove a model, run ollama rm llama3.2, for instance. Tip: if you interrupted a download, partial files will be left in /Users/seandotau/.ollama/models/blobs, so you’ll have to go there and delete them manually.
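Note that removing a model from disk is different from unloading it from memory. Models stay loaded for a few minutes after use (the OLLAMA_KEEP_ALIVE:5m0s setting in the serve log above), so if you want to free memory without deleting anything:

# See which models are currently loaded in memory
ollama ps
# Unload a model without removing it from disk
ollama stop llama3.2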
ollama push, create, cp
You probably won’t use these commands much unless you are creating your own models, copying models, or pushing them to the model registry for others to use.
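For the curious, here is a minimal sketch of what create involves. A Modelfile derives a customised model from one you already have; the mario name and system prompt below are just illustrative examples:

# Modelfile — builds a customised variant of llama3.2
FROM llama3.2
PARAMETER temperature 0.8
SYSTEM "You are Mario from Super Mario Bros. Answer as Mario."

Then build and run it with:

ollama create mario -f ./Modelfile
ollama run mario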
ollama run
Now comes the moment of truth: running the model and giving it a test drive. When you run the model, you drop into an interactive session and can start chatting with it.
seandotau@aseandotaus-MBP ~ % ollama run llama3.2
>>> what is the capital of Germany?
The capital of Germany is Berlin.
>>> Send a message (/? for help)
Cool, right? To exit, type /bye.
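You’re not limited to the interactive prompt either. Since the server is listening on port 11434, you can ask the same question over the REST API, which is handy for scripting. Setting stream to false returns the whole reply as one JSON object:

# One-shot question via the REST API
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "What is the capital of Germany?",
  "stream": false
}'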
Extra cool features
Asking AI to summarise some text
seandotau@aseandotaus-MBP ~ % ollama run llama3.2 "summarise this file in 100 words: $(cat hobbit.text )"
Here is a 100-word summary of the file:
The Hobbit, written by J.R.R. Tolkien in 1937, is a classic children's fantasy novel that has sold over 100
million copies worldwide. The story follows Bilbo Baggins, a hobbit who joins Gandalf and dwarves on a quest to
reclaim their treasure from the dragon Smaug. The book features themes of personal growth, heroism, and warfare,
drawing from Tolkien's experiences in World War I and his scholarly knowledge of Germanic philology and
mythology. Adaptations for stage, screen, radio, board games, and video games have received critical
recognition, cementing its legacy as a beloved children's fantasy novel.
Make sure the text file is in your current directory.
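One caveat: $(cat …) inlines the entire file into the command line, so a very large file can exceed the shell’s argument-length limit. Ollama also accepts piped input, which avoids the problem (worth double-checking that your version combines piped text with the prompt argument, as recent ones do):

# Pipe the file to the model instead of inlining it
cat hobbit.text | ollama run llama3.2 "summarise this file in 100 words:"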
Asking AI to interpret a picture
What you’ll notice here is that Ollama first pulls the specified model (as I didn’t already have it) and then runs it.
seandotau@aseandotaus-MBP ~ % ollama run llava "What's in this image? /Users/seandotau/image.png"
pulling manifest
pulling 170370233dd5... 100% ▕██████████████████████████████████████████████████████▏ 4.1 GB
pulling 72d6f08a42f6... 100% ▕██████████████████████████████████████████████████████▏ 624 MB
pulling 43070e2d4e53... 100% ▕██████████████████████████████████████████████████████▏ 11 KB
pulling c43332387573... 100% ▕██████████████████████████████████████████████████████▏ 67 B
pulling ed11eda7790d... 100% ▕██████████████████████████████████████████████████████▏ 30 B
pulling 7c658f9561e5... 100% ▕██████████████████████████████████████████████████████▏ 564 B
verifying sha256 digest
writing manifest
success
Added image '/Users/seandotau/image.png'
The image features Pikachu, a popular Pokémon character from the franchise. It is a small electric mouse with
yellow fur, large ears, and big round eyes. Pikachu is standing upright on its hind legs, and it appears to be
looking directly at the camera.
FYI: this was the image (a picture of Pikachu).
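As an aside, multimodal models work over the REST API too: the generate endpoint accepts base64-encoded images alongside the prompt. A rough sketch (base64 -i is the macOS syntax for encoding a file; the image path is the same one used above):

# Encode the image, then send it to llava via the API
IMG=$(base64 -i /Users/seandotau/image.png)
curl http://127.0.0.1:11434/api/generate -d '{"model": "llava", "prompt": "What is in this image?", "stream": false, "images": ["'"$IMG"'"]}'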
