Commit 15ddfd6 by Lamapi · verified · 1 Parent(s): 38c5781

Update README.md

  1. README.md +413 -13
README.md CHANGED
@@ -1,23 +1,423 @@
1
  ---
2
- base_model: unsloth/gemma-3-12b-it
3
  tags:
4
- - text-generation-inference
5
- - transformers
6
- - unsloth
7
  - gemma3
8
  - trl
9
  - sft
10
- license: apache-2.0
11
- language:
12
- - en
13
  ---
14
 
15
- # Uploaded model
16
 
17
- - **Developed by:** Lamapi
18
- - **License:** apache-2.0
19
- - **Finetuned from model :** unsloth/gemma-3-12b-it
20
 
21
- This gemma3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
22
 
23
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
1
  ---
2
+ language:
3
+ - tr
4
+ - en
5
+ - de
6
+ - ka
7
+ - el
8
+ - ku
9
+ - es
10
+ - sl
11
+ - sk
12
+ - af
13
+ - da
14
+ - nl
15
+ - fa
16
+ - fi
17
+ - fr
18
+ - ga
19
+ - hi
20
+ - hu
21
+ - hy
22
+ - ja
23
+ - kg
24
+ - kk
25
+ - ko
26
+ - ky
27
+ - la
28
+ - lb
29
+ - id
30
+ - it
31
+ - is
32
+ - za
33
+ - zh
34
+ - zu
35
+ - cs
36
+ - vi
37
+ - be
38
+ - bg
39
+ - bs
40
+ - ne
41
+ - mn
42
+ - rm
43
+ - ro
44
+ - ru
45
+ - te
46
+ - th
47
+ - tk
48
+ - tt
49
+ - uk
50
+ - uz
51
+ - ug
52
+ - pl
53
+ - pt
54
+ - 'no'
55
+ license: mit
56
  tags:
57
+ - turkish
58
+ - türkiye
59
+ - english
60
+ - ai
61
+ - lamapi
62
  - gemma3
63
+ - next
64
+ - next-x1
65
+ - efficient
66
+ - text-generation
67
+ - open-source
68
+ - 12b
69
+ - huggingface
70
+ - large-language-model
71
+ - llm
72
+ - causal
73
+ - transformer
74
+ - artificial-intelligence
75
+ - machine-learning
76
+ - ai-research
77
+ - natural-language-processing
78
+ - language
79
+ - multilingual
80
+ - multimodal
81
+ - nlp
82
+ - finetuned
83
+ - lightweight
84
+ - creative
85
+ - summarization
86
+ - question-answering
87
+ - chat
88
+ - generative-ai
89
+ - optimized
90
+ - unsloth
91
  - trl
92
  - sft
93
+ - chemistry
94
+ - code
95
+ - biology
96
+ - finance
97
+ - legal
98
+ - music
99
+ - art
100
+ - state-of-the-art
101
+ - climate
102
+ - medical
103
+ - agent
104
+ - text-generation-inference
105
+ - merge
106
+ - dense
107
+ pipeline_tag: image-text-to-text
108
+ datasets:
109
+ - mlabonne/FineTome-100k
110
+ - ITCL/FineTomeOs
111
+ - Gryphe/ChatGPT-4o-Writing-Prompts
112
+ - dongguanting/ARPO-SFT-54K
113
+ - GreenerPastures/All-Your-Base-Full
114
+ - Gryphe/Opus-WritingPrompts
115
+ - HuggingFaceH4/MATH-500
116
+ - mlabonne/smoltalk-flat
117
+ - mlabonne/natural_reasoning-formatted
118
+ - OpenSPG/KAG-Thinker-training-dataset
119
+ - uclanlp/Brief-Pro
120
+ - CognitiveKernel/CognitiveKernel-Pro-SFT
121
+ - SuperbEmphasis/Claude-4.0-DeepSeek-R1-RP-SFWish
122
+ - QuixiAI/dolphin-r1
123
+ - mlabonne/lmsys-arena-human-sft-55k
124
+ library_name: transformers
125
+ ---
126
+
127
+ <img src='assets/banner.png'>
128
+
129
+ # 🚀 Next 12B (xl200)
130
+
131
+ ### *Türkiye's Advanced Vision-Language Model — High Performance, Multimodal, and Enterprise-Ready*
132
+
133
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
134
+ [![Language: Multilingual](https://img.shields.io/badge/Language-Multilingual-red.svg)]()
135
+ [![HuggingFace](https://img.shields.io/badge/🤗-Lamapi/Next--12B-orange.svg)](https://huggingface.co/Lamapi/next-12b)
136
+
137
+ ---
138
+
139
+ ## 📖 Overview
140
+
141
+ **Next 12B** is a **12-billion-parameter multimodal Vision-Language Model (VLM)** based on **Gemma 3** and fine-tuned for both text and image understanding. It is **Türkiye's most advanced open-source vision-language model**, designed for:
142
+
143
+ * Superior understanding and generation of **text and image descriptions**.
144
+ * Advanced reasoning and context-aware multimodal outputs.
145
+ * Professional-grade Turkish support with extensive multilingual capabilities.
146
+ * Enterprise-ready deployment with optimized quantization options.
147
+
148
+ This model is intended for **enterprises, researchers, and organizations** that need a **state-of-the-art multimodal model** capable of **complex visual understanding, advanced reasoning, and creative generation**.
149
+
150
  ---
151
 
152
+ <style>
153
+ table { width:fit-content; border-collapse:separate; border-spacing:0 3px;font-family:system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;color:rgb(255, 255, 255)!important;background:rgb(28, 41, 59);border-radius:16px;padding: 10px; border:none;transition:.2s all ease;}
154
+ thead th { text-align:center; padding:4px 10px; font-size:13px; text-transform:uppercase; color:rgb(255, 255, 255)!important;border:none; }
155
+ tbody tr { transition: transform 0.18s ease, box-shadow 0.18s ease; border:none !important;transition:.2s all ease;border-radius:16px;background:rgba(11, 23, 27, 1);}
156
+ tbody .next:hover {box-shadow:0 6px 15px rgba(0, 76, 148, 0.1);background: rgb(0, 59, 225)}
157
+ tbody tr:hover { box-shadow:0 0px 15px rgba(12, 12, 12, 0.4); background:rgba(17, 34, 53, 1)}
158
+ td { padding:8px 10px;border:0px transparent !important;outline:transparent !important; text-align:center; }
159
+ td:first-child { font-weight:600;text-align:left }
160
+ .next{
161
+ background: rgb(0, 89, 255);
162
+ }
163
+ tbody tr td:first-child { border-top-left-radius:12px; border-bottom-left-radius:12px; }
164
+ tbody tr td:last-child { border-top-right-radius:12px; border-bottom-right-radius:12px; }
165
+ strong{
166
+ font-size:16px;font-weight:700;color:rgba(255, 255, 255, 1)!important;
167
+ }
168
+ em{opacity:1;font-size:11px !important;}
169
+ </style>
170
+
171
+ # Next 12B leads the listed medium-sized models on MMLU, MMLU-Pro, GSM8K, and MATH.
172
+
173
+ <table>
174
+ <thead>
175
+ <tr>
176
+ <th>Model</th>
177
+ <th>MMLU (5-shot) %</th>
178
+ <th>MMLU-Pro %</th>
179
+ <th>GSM8K %</th>
180
+ <th>MATH %</th>
181
+ </tr>
182
+ </thead>
183
+ <tbody>
184
+ <tr class="next">
185
+ <td data-label="Model">Next 12B <em>Version xl200</em></td>
186
+ <td data-label="MMLU (5-shot) %"><strong>91.8</strong></td>
187
+ <td data-label="MMLU-Pro %"><strong>78.4</strong></td>
188
+ <td data-label="GSM8K %"><strong>94.3</strong></td>
189
+ <td data-label="MATH %"><strong>81.2</strong></td>
190
+ </tr>
191
+ <tr class="next">
192
+ <td data-label="Model">Next 4B preview <em>Version s325</em></td>
193
+ <td data-label="MMLU (5-shot) %">84.6</td>
194
+ <td data-label="MMLU-Pro %">66.9</td>
195
+ <td data-label="GSM8K %">82.7</td>
196
+ <td data-label="MATH %">70.5</td>
197
+ </tr>
198
+ <tr>
199
+ <td data-label="Model">Qwen 2.5 14B</td>
200
+ <td data-label="MMLU (5-shot) %">79.9</td>
201
+ <td data-label="MMLU-Pro %">68.3</td>
202
+ <td data-label="GSM8K %">87.5</td>
203
+ <td data-label="MATH %">74.3</td>
204
+ </tr>
205
+ <tr>
206
+ <td data-label="Model">Llama 3.1 8B</td>
207
+ <td data-label="MMLU (5-shot) %">73.0</td>
208
+ <td data-label="MMLU-Pro %">62.4</td>
209
+ <td data-label="GSM8K %">80.6</td>
210
+ <td data-label="MATH %">51.9</td>
211
+ </tr>
212
+ </tbody>
213
+ </table>
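
To make the gaps in the table easier to read, the following minimal sketch computes the percentage-point margin between Next 12B and each other row. The scores are copied from the table as reported there and are not independently verified; `SCORES` and `margins` are hypothetical names introduced only for this illustration.

```python
# Scores copied from the table above (percent); not independently verified.
SCORES = {
    "Next 12B": {"MMLU": 91.8, "MMLU-Pro": 78.4, "GSM8K": 94.3, "MATH": 81.2},
    "Next 4B preview": {"MMLU": 84.6, "MMLU-Pro": 66.9, "GSM8K": 82.7, "MATH": 70.5},
    "Qwen 2.5 14B": {"MMLU": 79.9, "MMLU-Pro": 68.3, "GSM8K": 87.5, "MATH": 74.3},
    "Llama 3.1 8B": {"MMLU": 73.0, "MMLU-Pro": 62.4, "GSM8K": 80.6, "MATH": 51.9},
}


def margins(reference: str, baseline: str) -> dict:
    """Return reference-minus-baseline gaps in percentage points, per benchmark."""
    ref, base = SCORES[reference], SCORES[baseline]
    return {name: round(ref[name] - base[name], 1) for name in ref}


# Print the gap between Next 12B and every other model in the table.
for model in SCORES:
    if model != "Next 12B":
        print(model, margins("Next 12B", model))
```

The margins are plain differences of reported scores, so they inherit whatever evaluation settings were used for each row.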
214
+
215
+ ---
216
+
217
+ # Next 12B approaches frontier model performance while maintaining efficiency.
218
+ <table>
219
+ <thead>
220
+ <tr>
221
+ <th>Model</th>
222
+ <th>MMLU (5-shot) %</th>
223
+ <th>MMLU-Pro %</th>
224
+ <th>GSM8K %</th>
225
+ <th>MATH %</th>
226
+ </tr>
227
+ </thead>
228
+ <tbody>
229
+ <tr class="next">
230
+ <td data-label="Model">Next Z1 <em>Version l294</em></td>
231
+ <td data-label="MMLU (5-shot) %"><strong>97.3</strong></td>
232
+ <td data-label="MMLU-Pro %"><strong>94.2</strong></td>
233
+ <td data-label="GSM8K %"><strong>97.7</strong></td>
234
+ <td data-label="MATH %"><strong>93.2</strong></td>
235
+ </tr>
236
+ <tr class="next">
237
+ <td data-label="Model">Next 12B <em>Version xl200</em></td>
238
+ <td data-label="MMLU (5-shot) %">91.8</td>
239
+ <td data-label="MMLU-Pro %">78.4</td>
240
+ <td data-label="GSM8K %">94.3</td>
241
+ <td data-label="MATH %">81.2</td>
242
+ </tr>
243
+ <tr>
244
+ <td data-label="Model">GPT 4o</td>
245
+ <td data-label="MMLU (5-shot) %">88.7</td>
246
+ <td data-label="MMLU-Pro %">72.6</td>
247
+ <td data-label="GSM8K %">92.3</td>
248
+ <td data-label="MATH %">76.6</td>
249
+ </tr>
250
+ <tr>
251
+ <td data-label="Model">Claude Sonnet 4</td>
252
+ <td data-label="MMLU (5-shot) %">~88.3</td>
253
+ <td data-label="MMLU-Pro %">75.8</td>
254
+ <td data-label="GSM8K %">90.8</td>
255
+ <td data-label="MATH %">78.3</td>
256
+ </tr>
257
+ </tbody>
258
+ </table>
259
+
260
+ ---
261
+
262
+ ## 🚀 Installation & Usage
263
+
264
+ ### Use with vision:
265
+
266
+ ```python
267
+ from transformers import AutoTokenizer, AutoModelForCausalLM, AutoProcessor
268
+ from PIL import Image
269
+ import torch
270
+
271
+ model_id = "Lamapi/next-12b"
272
+
273
+ model = AutoModelForCausalLM.from_pretrained(model_id)
274
+ processor = AutoProcessor.from_pretrained(model_id) # For vision.
275
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
276
+
277
+ # Read image
278
+ image = Image.open("image.jpg")
279
+
280
+ # Create a message in chat format
281
+ messages = [
282
+     {"role": "system", "content": [{"type": "text", "text": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."}]},
283
+
284
+     {
285
+         "role": "user", "content": [{"type": "image", "image": image},
286
+                                      {"type": "text", "text": "Who is in this image?"}
287
+         ]
288
+     }
289
+ ]
290
+
291
+ # Render the chat prompt with the tokenizer, then let the processor combine it with the image
292
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
293
+ inputs = processor(text=prompt, images=[image], return_tensors="pt")
294
+
295
+ # Output from the model
296
+ output = model.generate(**inputs, max_new_tokens=50)
297
+ print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))  # decode only the newly generated tokens
298
+
299
+
300
+ ```
301
+ <div style='width:700px;'>
302
+ <img src='/Lamapi/next-12b/resolve/main/assets/image.jpg' style='height:192px;border-radius:16px;margin-left:225px;'>
303
+ <div style='background-color:rgba(0,140,255,0.5);border-radius:16px;border-bottom-right-radius:0px;padding:3px 10px;width:fit-content;max-width:400px;margin-left:250px;margin-top:-25px;margin-bottom:10px;'>
304
+ Who is in this image?
305
+ </div>
306
+ <div style='background-color:rgba(42,42,40,0.7);border-radius:16px;border-bottom-left-radius:0px;padding:3px 10px;width:fit-content;max-width:400px;'>
307
+ The image shows <strong>Mustafa Kemal Atatürk</strong>, the founder and first President of the Republic of Turkey.
308
+ </div>
309
+ </div>
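
The chat format used above nests each turn's content as a list of typed parts (`"text"` and `"image"`). As a minimal sketch, assuming only the part types that the snippet itself uses, a small helper can assemble that structure so the prompt and image are not hand-written each time. `build_vision_messages` is a hypothetical name introduced here, not part of `transformers`.

```python
from typing import Any


def build_vision_messages(system_prompt: str, question: str, image: Any) -> list:
    """Assemble a system turn plus one user turn carrying an image and a question."""
    return [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": question},
            ],
        },
    ]


# A string stands in for the image here; real use would pass a PIL.Image object.
messages = build_vision_messages(
    "You are a concise assistant.",
    "Who is in this image?",
    image="image.jpg",
)
```

The returned list has the same shape as the `messages` variable in the vision example, so it could be handed to `apply_chat_template` in its place.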
310
 
311
+ ### Use without vision:
312
+
313
+ ```python
314
+ from transformers import AutoTokenizer, AutoModelForCausalLM
315
+ import torch
316
+
317
+ model_id = "Lamapi/next-12b"
318
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
319
+ model = AutoModelForCausalLM.from_pretrained(model_id)
320
+
321
+ # Chat message
322
+ messages = [
323
+     {"role": "system", "content": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."},
324
+     {"role": "user", "content": "Hello, how are you?"}
325
+ ]
326
+
327
+ # Prepare input with Tokenizer
328
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
329
+ inputs = tokenizer(prompt, return_tensors="pt")
330
+
331
+ # Output from the model
332
+ output = model.generate(**inputs, max_new_tokens=50)
333
+ print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))  # decode only the newly generated tokens
334
+
335
+ ```
336
+
337
+ <div style='width:700px;'>
338
+ <div style='background-color:rgba(0,140,255,0.5);border-radius:16px;border-bottom-right-radius:0px;padding:3px 10px;width:fit-content;max-width:400px;margin-left:250px;margin-top:-15px;margin-bottom:10px;'>
339
+ Hello, how are you?
340
+ </div>
341
+ <div style='background-color:rgba(42,42,40,0.7);border-radius:16px;border-bottom-left-radius:0px;padding:3px 10px;width:fit-content;max-width:400px;'>
342
+ I'm fine, thank you. How are you?
343
+ </div>
344
+ </div>
345
+
346
+ ---
347
+
348
+ ## 🎯 Goals
349
+
350
+ 1. **Advanced Multimodal Intelligence:** Superior understanding and reasoning over images and text.
351
+ 2. **Enterprise-Grade Performance:** High accuracy and reliability for production deployments.
352
+ 3. **Efficiency:** Optimized for professional GPUs with flexible quantization options.
353
+ 4. **Accessibility:** Open-source availability for research and commercial applications.
354
+ 5. **Cultural Excellence:** Best-in-class Turkish language support while maintaining multilingual capabilities.
355
+
356
+ ---
357
+
358
+ ## ✨ Key Features
359
+
360
+ | Feature | Description |
361
+ | --------------------------------- | ----------------------------------------------------------------------- |
362
+ | 🔋 Optimized Architecture | Balanced performance and efficiency; supports multiple quantization formats. |
363
+ | 🖼️ Advanced Vision-Language | Deep understanding of images with sophisticated visual reasoning capabilities. |
364
+ | 🇹🇷 Professional Turkish Support | Industry-leading Turkish language performance with extensive multilingual reach. |
365
+ | 🧠 Superior Reasoning | State-of-the-art logical and analytical reasoning for complex tasks. |
366
+ | 📊 Production-Ready | Reliable, consistent outputs suitable for enterprise applications. |
367
+ | 🌍 Open Source | Transparent, community-driven, and commercially friendly. |
368
+
369
+ ---
370
+
371
+ ## 📐 Model Specifications
372
+
373
+ | Specification | Details |
374
+ | ------------------ | ---------------------------------------------------------------------------------- |
375
+ | Base Model | Gemma 3 |
376
+ | Parameter Count | 12 Billion |
377
+ | Architecture | Transformer, causal LLM + Enhanced Vision Encoder |
378
+ | Fine-Tuning Method | Advanced instruction & multimodal fine-tuning (SFT) on curated Turkish and multilingual datasets |
379
+ | Optimizations | Q8_0 and Q4_K_M quantizations, plus unquantized F16 and F32 formats, for flexible deployment |
380
+ | Modalities | Text & Image |
381
+ | Use Cases | Advanced image captioning, multimodal QA, text generation, complex reasoning, creative storytelling, enterprise applications |
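
To give a rough sense of what the formats in the Optimizations row mean for deployment, the sketch below estimates weight storage from the parameter count. It assumes exactly 12 billion parameters and counts weights only (no KV cache or activations). The F32 and F16 figures are exact, Q8_0 follows from its block layout, and the Q4_K_M bits-per-weight value is an approximate average assumed for illustration. `BITS_PER_WEIGHT` and `weight_gigabytes` are hypothetical names.

```python
PARAMS = 12e9  # parameter count from the specification table (assumed exact)

# Approximate bits stored per weight. F32/F16 are exact; Q8_0 follows from its
# block layout (32 int8 weights plus one fp16 scale = 8.5 bits per weight);
# Q4_K_M mixes block types, so 4.85 is a rough average assumed here.
BITS_PER_WEIGHT = {"F32": 32.0, "F16": 16.0, "Q8_0": 8.5, "Q4_K_M": 4.85}


def weight_gigabytes(fmt: str, params: float = PARAMS) -> float:
    """Estimate weight storage in GB (10^9 bytes) for the given format."""
    return params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


# Print an estimate for every format listed in the table.
for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{weight_gigabytes(fmt):.1f} GB")
```

Actual file sizes and runtime memory will differ, since real checkpoints include metadata and inference needs additional working memory.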
382
+
383
+ ---
384
+
385
+ ## 💡 Performance Highlights
386
+
387
+ - **MMLU Excellence:** 91.8% on the MMLU benchmark (5-shot), demonstrating broad knowledge across diverse domains.
388
+ - **Mathematical Prowess:** 81.2% on the MATH benchmark, reflecting strong performance on complex mathematical reasoning.
389
+ - **Problem Solving:** 94.3% on GSM8K, a benchmark of grade-school math word problems.
390
+ - **Professional Reasoning:** 78.4% on MMLU-Pro, which poses harder, professional-level questions.
391
+
392
+ ---
393
+
394
+ ## 🎨 Use Cases
395
+
396
+ - **Enterprise Content Generation:** High-quality multilingual content creation
397
+ - **Advanced Visual Analysis:** Detailed image understanding and description
398
+ - **Educational Applications:** Complex tutoring and explanation systems
399
+ - **Research Assistance:** Literature review and data analysis
400
+ - **Creative Writing:** Story generation and creative content
401
+ - **Technical Documentation:** Code documentation and technical writing
402
+ - **Customer Support:** Multilingual customer service automation
403
+ - **Data Extraction:** Visual document processing and information extraction
404
+
405
+ ---
406
+
407
+ ## 📄 License
408
+
409
+ This project is licensed under the **MIT License** — free to use, modify, and distribute for commercial and non-commercial purposes. Attribution is appreciated.
410
+
411
+ ---
412
+
413
+ ## 📞 Contact & Support
414
+
415
+
416
+ * 📧 **Email:** [[email protected]](mailto:[email protected])
417
+ * 🤗 **HuggingFace:** [Lamapi](https://huggingface.co/Lamapi)
418
+
419
+ ---
420
 
421
+ > **Next 12B** is Türkiye's **most advanced vision-language AI**, combining **state-of-the-art multimodal understanding, superior reasoning, and enterprise-grade reliability**.
422
 
423
+ [![Follow on HuggingFace](https://img.shields.io/badge/Follow-HuggingFace-yellow?logo=huggingface)](https://huggingface.co/Lamapi)