Commit
draft faq and fix ingest (#620)
sammcj authored Nov 28, 2024
1 parent 886196c commit 0e5668a
Showing 3 changed files with 86 additions and 12 deletions.
66 changes: 66 additions & 0 deletions content/llm-faq.md
@@ -0,0 +1,66 @@
---
title: "LLM FAQ"
description: "Frequently Asked Questions about LLMs and AI"
aliases: ["llm", "faq", "frequently-asked-questions","llm-faq","ollama-faq"]
tags: ["ai", "tools", "llm", "tech", "llms", "ollama","llama","faq","ollama faq","llm faq"]
author: "Sam McLeod"
norss: false
comments: false
showDate: true
subtitle: "Frequently Asked Questions about LLMs and AI"
hidemeta: false
readingTime: true
ShowReadingTime: true
ShowWordCount: false
ShowBreadCrumbs: true
ShowPostNavLinks: true
mermaid: true
disableShare: false
disableHLJS: false
UseHugoToc: false
hideSummary: false
ShowRssButtonInSectionTermList: true
# cover:
# image: "diagram-gen.png"
# alt: "DiagramGen"
# hidden: false
toc:
  enable: true
  auto: true
draft: false
---

<!-- markdownlint-disable MD025 -->

## Ollama

### "Is Ollama just a wrapper for Llama.cpp?"

No.
Ollama uses llama.cpp as its inference engine, but provides a different set of features.

Someone made a claim the other day that Ollama was "just a llama.cpp wrapper". My comment was as follows:

With llama.cpp running on a machine, how do you connect your LLM clients to it and request that a model be loaded with a given set of parameters and templates?

... you can't, because llama.cpp is the inference engine - and its bundled llama-cpp-server binary only provides very basic server functionality (really more of a demo/example), which is all configured at the time you run the binary and manually provide it with command line args for the one specific model and configuration you start it with.

Ollama provides a server and client for interfacing with and packaging models, with features such as:

- Hot loading models (e.g. when you request a model from your client, Ollama will load it on demand - see the sketch just below this list).
- Automatic model parallelisation.
- Automatic model concurrency.
- Automatic memory calculations for layer and GPU/CPU placement.
- Layered model configuration (basically Docker images for models).
- Templating and distribution of model parameters and templates in a container image.
- Near feature-complete OpenAI compatible API as well as a native API (there's a client example at the end of this answer).
- Native libraries for common languages.
- Official container images for hosting.
- A client/server model for running remote or local inference servers with either Ollama or OpenAI compatible clients.
- Support for both official and self-hosted model and template repositories.
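
As a rough sketch of the hot-loading point above - this isn't Ollama's internals, just a client-side illustration assuming a local Ollama instance on its default port and an example model tag you've already pulled:

```python
import requests

# Ask Ollama's native API for a completion. If the requested model isn't
# already resident in memory, Ollama loads it on demand (and later unloads it)
# - there's no separate server process to start per model as with llama.cpp's
# bundled example server.
response = requests.post(
    "http://localhost:11434/api/generate",    # Ollama's default native API endpoint
    json={
        "model": "llama3.1:8b-instruct-q6_K", # example model tag - any pulled model works
        "prompt": "Why is the sky blue?",
        "stream": False,
    },
    timeout=300,
)
print(response.json()["response"])
```

Swapping the `model` field for a different tag is all a client needs to do - Ollama takes care of loading, unloading and memory placement.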

Ollama currently supports serving llama.cpp's GGUF format, vision LLMs that llama.cpp's example server does not, and HF safetensors models, and they are adding additional model backends which will be coming soon (e.g. exl2, awq, etc.).

Ollama is not "better" or "worse" than llama.cpp because it's completely different.
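
And to make the "OpenAI compatible API" point concrete, a minimal sketch using the `openai` Python client pointed at a local Ollama instance (the base URL is Ollama's default; the model tag is just an example):

```python
from openai import OpenAI

# Any OpenAI-compatible client can talk to Ollama by overriding the base URL.
# Ollama doesn't check the API key, but the client library requires one to be set.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

completion = client.chat.completions.create(
    model="llama3.1:8b-instruct-q6_K",  # example model tag - any pulled model works
    messages=[{"role": "user", "content": "In one sentence, what is GGUF?"}],
)
print(completion.choices[0].message.content)
```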

---
1 change: 0 additions & 1 deletion content/posts/2024-07-28-ingest/index.md
@@ -275,4 +275,3 @@ As always, if you find any bugs or have ideas for improvements, don't hesitate t

Happy ingesting!

<script src="http://api.html5media.info/1.1.8/html5media.min.js"></script>
31 changes: 20 additions & 11 deletions hugo.yaml
@@ -4,7 +4,7 @@ title: smcleod.net
theme: ["PaperMod"]
gitRepo: "https://github.com/sammcj/smcleod"

paginate: 150
pagination.pagerSize: 150
enableEmoji: true
enableGitInfo: true
enableRobotsTXT: true
@@ -48,6 +48,23 @@ params:
showtoc: true
tocopen: true

social:
twitter: "s_mcleod"
github: "sammcj"
linkedin: "sammcj"
mastodon: "s_mcleod"

socialIcons:
- name: github
url: "https://github.com/sammcj/"
icon: github
- name: linkedin
url: "https://www.linkedin.com/in/sammcj/"
icon: linkedin
- name: mastodon
url: "https://aus.social/@s_mcleod"
icon: mastodon

assets:
# disableHLJS: true # to disable highlight.js
disableFingerprinting: true
@@ -82,16 +99,6 @@ params:
Title: "smcleod.net \U0001F44B"
Content: "The personal blog of Sam McLeod. I write about AI, DevOps, Platform Engineering, and other geeky topics."

socialIcons:
- name: github
url: "https://github.com/sammcj/"
icon: github
- name: linkedin
url: "https://www.linkedin.com/in/sammcj/"
icon: linkedin
- name: mastodon
url: "https://aus.social/@s_mcleod"
icon: mastodon

cover:
hidden: true # hide everywhere but not in structured data
@@ -114,6 +121,7 @@ params:
minMatchCharLength: 0
limit: 10 # refer: https://www.fusejs.io/api/methods.html#search
keys: ["title", "permalink", "summary", "content"]

menu:
main:
- identifier: "Posts"
@@ -227,3 +235,4 @@ taxonomies:
category: "categories"
tag: "tags"
series: "series"
social: "socials"
