Eval bug: How to load clip_model_load to CUDA #11250

zzc98 · 2025-01-15T09:36:42Z

Name and Version

version: 4393 (d79d8f3)
built with x86_64-conda-linux-gnu-cc (conda-forge gcc 14.2.0-1) 14.2.0 for x86_64-conda-linux-gnu

Operating systems

Linux

GGML backends

CUDA

Hardware

NVIDIA GeForce RTX 4090

Models

Qwen2-VL-7B-Instruct-Q5_K_M.gguf

Problem description & steps to reproduce

I run the following command:

llama-qwen2vl-cli -m Qwen2-VL-7B-Instruct-Q5_K_M.gguf --mmproj mmproj-Qwen2-VL-7B-Instruct-f16.gguf -p "Describe this picture" --image demo.jpeg

and observe this line in the output:

clip_model_load: CLIP using CPU backend

How can I make clip_model_load load the CLIP model onto CUDA instead of the CPU?

First Bad Commit

No response

Relevant log output

llm_load_print_meta: max token length = 256
llm_load_tensors: offloading 28 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 29/29 layers to GPU

clip_model_load: - type  f32:  325 tensors
clip_model_load: - type  f16:  196 tensors
clip_model_load: CLIP using CPU backend
clip_model_load: text_encoder:   0
clip_model_load: vision_encoder: 1
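
The log above reports two separate things: how many LLM layers were offloaded to the GPU, and which backend the CLIP (vision) model ended up on. As a rough sketch for checking both at once, the snippet below scans log text for those two lines. The function name summarize_backends is hypothetical, and the patterns assume the exact log wording shown in this issue; other builds may print something different.

```python
import re

# Sample lines copied from the log output in this issue.
LOG = """\
llm_load_tensors: offloading 28 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 29/29 layers to GPU
clip_model_load: CLIP using CPU backend
"""


def summarize_backends(log_text):
    """Return (offloaded, total, clip_backend) parsed from the log text.

    Hypothetical helper: any field not found in the log is returned as None.
    The patterns assume the log format shown in this issue.
    """
    offloaded = total = clip_backend = None

    # LLM side: "offloaded N/M layers to GPU"
    m = re.search(r"llm_load_tensors: offloaded (\d+)/(\d+) layers to GPU", log_text)
    if m:
        offloaded, total = int(m.group(1)), int(m.group(2))

    # Vision side: "CLIP using <name> backend"
    m = re.search(r"clip_model_load: CLIP using (\w+) backend", log_text)
    if m:
        clip_backend = m.group(1)

    return offloaded, total, clip_backend


if __name__ == "__main__":
    print(summarize_backends(LOG))  # (29, 29, 'CPU')
```

For the log in this report, the LLM layers are all on the GPU while the CLIP model is on the CPU, which is the mismatch being asked about.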


danbev · 2025-01-15T09:46:37Z

I think you are missing the --n-gpu-layers option so that model layers are offloaded to the GPU:

-ngl N, --n-gpu-layers N: When compiled with GPU support, this option allows offloading some layers to the GPU for computation. Generally results in increased performance.

ngxson · 2025-01-15T15:03:33Z

GPU backend support is intentionally disabled for clip. I'm not sure why, but it is probably because of a missing kernel (so it would crash if you forced it to load onto the GPU).

zzc98 added the bug-unconfirmed label Jan 15, 2025
