GPT-OSS-20B.LOCAL #88
opened by recod4160

- README.md +3 -17
- chat_template.jinja +1 -1
- generation_config.json +1 -2

README.md CHANGED
@@ -13,7 +13,7 @@ tags:
 <p align="center">
 <a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
 <a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
-<a href="https://
+<a href="https://openai.com/index/gpt-oss-model-card"><strong>Model card</strong></a> ·
 <a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
 </p>

@@ -22,7 +22,7 @@ tags:
 Welcome to the gpt-oss series, [OpenAI’s open-weight models](https://openai.com/open-models) designed for powerful reasoning, agentic tasks, and versatile developer use cases.

 We’re releasing two flavors of these open models:
-- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fit into a single
+- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fit into a single H100 GPU (117B parameters with 5.1B active parameters)
 - `gpt-oss-20b` — for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)

 Both models were trained on our [harmony response format](https://github.com/openai/harmony) and should only be used with the harmony format as it will not work correctly otherwise.
@@ -38,7 +38,7 @@ Both models were trained on our [harmony response format](https://github.com/ope
 * **Full chain-of-thought:** Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. It’s not intended to be shown to end users.
 * **Fine-tunable:** Fully customize models to your specific use case through parameter fine-tuning.
 * **Agentic capabilities:** Use the models’ native capabilities for function calling, [web browsing](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#browser), [Python code execution](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#python), and Structured Outputs.
-* **MXFP4 quantization:** The models
+* **Native MXFP4 quantization:** The models are trained with native MXFP4 precision for the MoE layer, making `gpt-oss-120b` run on a single H100 GPU and the `gpt-oss-20b` model run within 16GB of memory.

 ---

@@ -166,17 +166,3 @@ The gpt-oss models are excellent for:
 Both gpt-oss models can be fine-tuned for a variety of specialized use cases.

 This smaller model `gpt-oss-20b` can be fine-tuned on consumer hardware, whereas the larger [`gpt-oss-120b`](https://huggingface.co/openai/gpt-oss-120b) can be fine-tuned on a single H100 node.
-
-# Citation
-
-```bibtex
-@misc{openai2025gptoss120bgptoss20bmodel,
-title={gpt-oss-120b & gpt-oss-20b Model Card},
-author={OpenAI},
-year={2025},
-eprint={2508.10925},
-archivePrefix={arXiv},
-primaryClass={cs.CL},
-url={https://arxiv.org/abs/2508.10925},
-}
-```

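The README line added above says `gpt-oss-20b` runs within 16GB of memory thanks to MXFP4. A rough sanity check of that figure can be sketched as follows. The block layout (32 four-bit elements sharing one 8-bit scale) follows the OCP Microscaling format; the `mxfp4_fraction` parameter and the BF16 assumption for the remaining weights are illustrative assumptions, not numbers taken from this PR.

```python
# MXFP4 layout per the OCP Microscaling (MX) format: 32 four-bit elements
# share one 8-bit scale, so each weight costs slightly more than 4 bits.
BLOCK_SIZE = 32
ELEMENT_BITS = 4
SCALE_BITS = 8


def mxfp4_bits_per_weight() -> float:
    """Effective storage cost of one MXFP4 weight, including its share of the scale."""
    return ELEMENT_BITS + SCALE_BITS / BLOCK_SIZE


def weight_footprint_gb(total_params: float, mxfp4_fraction: float, other_bits: int = 16) -> float:
    """Approximate weight storage in decimal GB.

    mxfp4_fraction is a hypothetical share of parameters stored in MXFP4;
    the remaining parameters are assumed to use other_bits each (16 for BF16).
    """
    quantized_bits = total_params * mxfp4_fraction * mxfp4_bits_per_weight()
    unquantized_bits = total_params * (1.0 - mxfp4_fraction) * other_bits
    return (quantized_bits + unquantized_bits) / 8 / 1e9


if __name__ == "__main__":
    print(mxfp4_bits_per_weight())
    # 21B parameters, as stated in the README; the fractions are assumptions.
    print(round(weight_footprint_gb(21e9, 1.0), 2))
    print(round(weight_footprint_gb(21e9, 0.9), 2))
```

The estimate covers weights only. Activations, the KV cache, and runtime overhead come on top, so it is a lower bound on actual memory use rather than a prediction.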
chat_template.jinja CHANGED
@@ -245,9 +245,9 @@
 {%- if developer_message %}
 {{- "# Instructions\n\n" }}
 {{- developer_message }}
-{{- "\n\n" }}
 {%- endif %}
 {%- if tools -%}
+{{- "\n\n" }}
 {{- "# Tools\n\n" }}
 {{- render_tool_namespace("functions", tools) }}
 {%- endif -%}
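To see what moving the `"\n\n"` separator does to the rendered prompt, the fragment above can be emulated with plain string concatenation. This is a minimal sketch, not the real template: `tools_block` is a hypothetical stand-in for the output of `render_tool_namespace("functions", tools)`, and the surrounding template is ignored.

```python
def render_before(developer_message: str, tools_block: str) -> str:
    """Emulates the fragment before this PR: separator follows the instructions."""
    out = ""
    if developer_message:
        out += "# Instructions\n\n" + developer_message + "\n\n"
    if tools_block:
        out += "# Tools\n\n" + tools_block
    return out


def render_after(developer_message: str, tools_block: str) -> str:
    """Emulates the fragment after this PR: separator precedes the tools section."""
    out = ""
    if developer_message:
        out += "# Instructions\n\n" + developer_message
    if tools_block:
        out += "\n\n" + "# Tools\n\n" + tools_block
    return out


if __name__ == "__main__":
    cases = [("Be brief.", "TOOLS"), ("Be brief.", ""), ("", "TOOLS")]
    for dev, tools in cases:
        print(repr(render_before(dev, tools)))
        print(repr(render_after(dev, tools)))
```

When both a developer message and tools are present, the two versions produce the same text. They differ only at the edges: with instructions alone the trailing blank line disappears, and with tools alone the section gains a leading blank line.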
generation_config.json CHANGED
@@ -3,8 +3,7 @@
 "do_sample": true,
 "eos_token_id": [
 200002,
-199999
-200012
+199999
 ],
 "pad_token_id": 199999,
 "transformers_version": "4.55.0.dev0"
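The practical effect of dropping `200012` from `eos_token_id` is that generation no longer halts when that token is sampled. The sketch below illustrates this with a simplified stop rule (cut the sequence at the first token found in the EOS set). The function name and the sample token sequence are made up for illustration, and the reading that `200012` is the harmony tool-call token is an assumption rather than something this diff states.

```python
from typing import Iterable, List

EOS_BEFORE = [200002, 199999, 200012]  # eos_token_id before this PR
EOS_AFTER = [200002, 199999]           # eos_token_id after this PR


def truncate_at_eos(token_ids: List[int], eos_token_ids: Iterable[int]) -> List[int]:
    """Return the tokens up to and including the first EOS token, if any."""
    eos = set(eos_token_ids)
    for index, token in enumerate(token_ids):
        if token in eos:
            return token_ids[: index + 1]
    return token_ids


if __name__ == "__main__":
    # Hypothetical sampled sequence: ordinary tokens, then 200012, then more
    # tokens, then 200002.
    sampled = [10, 11, 200012, 12, 13, 200002]
    print(truncate_at_eos(sampled, EOS_BEFORE))
    print(truncate_at_eos(sampled, EOS_AFTER))
```

With the old list the sequence ends at `200012`; with the new list it runs on until `200002`. Callers that relied on the old stop behaviour would need to pass the extra stop token explicitly at generation time.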