Rasmus Rasmussen
theprint
AI & ML interests
Small model experiments and homespun datasets.
Recent Activity
Updated a collection 2 days ago: Expert Series Data Sets
Databird
I created a tool that generates synthetic data based on a short list of related topics. This collection is where I keep my Databird-related uploads.
VanRossum
Named after the creator of Python, Guido van Rossum, this collection is based on the VanRossum data set of 80k Python-related entries.
TextSynth
These models are great at brainstorming, rewriting, summarizing and general conversation.
Merged Models
These are models created by merging existing models that are already fine-tuned or are even merges themselves.
TiTan
Smaller models, fine-tuned to generate titles and tags.
ReWiz
The ReWiz series is based on a subset of data drawn from 3 different data sets and used for fine-tuning.
Hemispheres
A for-fun experiment: fine-tuning and using 3 models to simulate 3 parts of a brain: a practical one, an emotional one, and one that balances the two.
CleverBoi
CleverBoi is a curated collection of data that emphasizes logic, inference, science, code, math and empathy, along with the language models fine-tuned on it.
The Cthulhu Mythos Collection
Expert Series Data Sets
These are the data sets created specifically for training expert models in a mixture-of-experts (MoE) context.