Active filters: verl
junnyu/Qwen2.5-7B-Instruct-1M-GRPO_logic_KK_5PPL • Text Generation • 8B • Updated • 5
sonyashijin/qwen3-32b-verilog-lora
LichengLiu03/Qwen2.5-3B-UFO • Text Generation • 3B • Updated • 37 • 2
LichengLiu03/Qwen2.5-3B-UFO-1turn • Text Generation • 3B • Updated • 2 • 2
mradermacher/Qwen2.5-3B-UFO-GGUF • 3B • Updated • 277 • 1
mradermacher/Qwen2.5-3B-UFO-1turn-GGUF • 3B • Updated • 147 • 1
Text Generation • 0.6B • Updated • 67 • 2
Jasaxion/MathSmith-HC-Problem-Synthesizer-Qwen3-8B • 8B • Updated • 3 • 1
Jasaxion/MathSmith-Hard-Problem-Synthesizer-Qwen3-8B • 8B • Updated • 37 • 1
thejaminator/grpo-feature-vector-step-1 • Text Generation • 8B • Updated • 396 • 9
Text Generation • Updated
karthik/verl-qwen2.5-0.5b-gsm8k-ppo-step360 • Text Generation • 0.5B • Updated • 3
samhitha2601/llama3.2-3b-ppo • Reinforcement Learning • Updated • 4
samhitha2601/llama3.2-3b-ppo-critic • Reinforcement Learning • Updated • 3
mradermacher/MathSmith-HC-Problem-Synthesizer-Qwen3-8B-GGUF • 8B • Updated • 1.26k • 1
mradermacher/MathSmith-HC-Problem-Synthesizer-Qwen3-8B-i1-GGUF • 8B • Updated • 1.95k • 1
mradermacher/MathSmith-Hard-Problem-Synthesizer-Qwen3-8B-GGUF • 8B • Updated • 114 • 1
mradermacher/MathSmith-Hard-Problem-Synthesizer-Qwen3-8B-i1-GGUF • 8B • Updated • 136 • 1
asatheesh/deepmath-qwen3-4b-instruct-drgrpo-lora • Reinforcement Learning • Updated
asatheesh/deepmath-qwen3-4b-instruct-rloo-lora • Reinforcement Learning • Updated
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec2 • Reinforcement Learning • Updated
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec4 • Reinforcement Learning • Updated
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-ngram-spec4 • Reinforcement Learning • Updated
asatheesh/deepmath-qwen3-4b-instruct-rloo-lora-eagle3-spec5 • Reinforcement Learning • Updated
Time-HD-Anonymous/STReasoner-8B • Feature Extraction • 8B • Updated • 182