pranaydeeps/Ancient-Greek-BERT

@@ -1,7 +1,6 @@
 # Ancient Greek BERT
-![](https://media.nationalgeographic.org/assets/photos/141/591/6a85829d-16e1-4392-be49-74e0461e77ec_c0-713-4188-2078_r230x75.JPG?55192c09e6fa5ad5cce49f0b30f7fc05b6c6cb9e)
 The first and only available Ancient Greek sub-word BERT model!
@@ -15,13 +14,27 @@ Please refer to our paper titled: "A Pilot Study for BERT Language Modelling and
 ## How to use
 Can be directly used from the HuggingFace Model Hub with:
 ```python
 from transformers import AutoTokenizer, AutoModel
 tokeniser = AutoTokenizer.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
 model = AutoModel.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
 ```
 ## Training data
 The model was initialised from [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)
@@ -31,7 +44,18 @@ Gorman's Treebank
 ## Training and Eval details
 Standard de-accentuating and lower-casing for Greek as suggested in [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)
-The model was trained on 4 NVIDIA Tesla V100 16GB GPUs for 80 epochs, with a max-seq-len of 512 and results in a perplexity of 4.8 on the held out test set.
-It also gives state-of-the-art results when fine-tuned for PoS Tagging and Morphological Analysis on all 3 treebanks averaging >90% accuracy. Please consult our paper or contact [me](mailto:[email protected]) for further questions!

 # Ancient Greek BERT
+<img src="https://ichef.bbci.co.uk/images/ic/832xn/p02m4gzb.jpg"/>
 The first and only available Ancient Greek sub-word BERT model!
 ## How to use
+Requirements:
+```bash
+pip install transformers
+pip install flair
+```
+`unicodedata` is part of the Python standard library and does not need to be installed separately.
 Can be directly used from the HuggingFace Model Hub with:
 ```python
 from transformers import AutoTokenizer, AutoModel
 tokeniser = AutoTokenizer.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
 model = AutoModel.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
 ```
+## Fine-tuning for POS/Morphological Analysis
+Please refer to the GitHub repository for the code and details regarding fine-tuning.
 ## Training data
 The model was initialised from [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)
 ## Training and Eval details
 Standard de-accentuating and lower-casing for Greek as suggested in [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)
+The model was trained on 4 NVIDIA Tesla V100 16GB GPUs for 80 epochs with a max-seq-len of 512, and reaches a perplexity of 4.8 on the held-out test set.
+It also gives state-of-the-art results when fine-tuned for PoS Tagging and Morphological Analysis on all 3 treebanks, averaging >90% accuracy. Please consult our paper or contact [me](mailto:[email protected]) for further questions!
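Because the model expects de-accented, lower-cased input, raw polytonic text most likely needs the same normalisation before tokenisation. The following is a minimal sketch of one common way to do this with the standard-library `unicodedata` module. The function name `strip_accents_and_lowercase` is chosen here for illustration, and the exact preprocessing used in training may differ in details, so treat this as an assumption rather than the authors' reference implementation.

```python
import unicodedata


def strip_accents_and_lowercase(text: str) -> str:
    """Remove combining diacritics and lower-case a Greek string.

    Illustrative helper (hypothetical name); assumes that dropping all
    Unicode combining marks matches the model's training-time preprocessing.
    """
    # NFD splits each precomposed character into a base letter
    # followed by its combining marks (accents, breathings, etc.).
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the combining diacritics.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return stripped.lower()


if __name__ == "__main__":
    print(strip_accents_and_lowercase("Μῆνιν ἄειδε θεά"))  # μηνιν αειδε θεα
```

The normalised string can then be passed to the `tokeniser` loaded above in place of the raw text.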
+## Cite
+If you end up using Ancient-Greek-BERT in your research, please cite the paper:
+```
+@inproceedings{ancient-greek-bert,
+author = {Singh, Pranaydeep and Rutten, Gorik and Lefever, Els},
+title = {A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek},
+year = {2021},
+booktitle = {The 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2021)}
+}
+```