# gghfez/command-a-03-2025-AWQ

Tested with `vllm==0.10.1`.

## Usage

```shell
vllm serve gghfez/command-a-03-2025-AWQ --port 8080 --host 0.0.0.0 --dtype bfloat16 --max-model-len 32768 -tp 4 --gpu-memory-utilization 0.9
```
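Once the serve command above is running, vLLM exposes an OpenAI-compatible HTTP API under `/v1`. Below is a minimal sketch of a chat-completion request using only the Python standard library; the host, port, and model name are assumed to match the serve command (adjust them if you changed the flags). The actual network call is left commented out so the snippet can be read offline.

```python
import json
import urllib.request

# OpenAI-compatible chat completion payload for the vLLM server.
# The model name matches the one passed to `vllm serve` above.
payload = {
    "model": "gghfez/command-a-03-2025-AWQ",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",  # port from the serve command
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])

print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client (for example the official `openai` Python package pointed at `base_url="http://localhost:8080/v1"`) can be used the same way.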
Safetensors checkpoint, 21B params; tensor types: BF16, I64, I32.
