This is a simple implementation of the Byte Pair Encoding (BPE) algorithm, which is used to tokenize text, and of a bigram word model.
Both were written as part of my research work on language models and are my own original implementations!
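
For readers new to the two techniques, here is a minimal sketch of how they are commonly implemented. It is not the code from this project: the function names (`merge_pair`, `learn_bpe_merges`, `bigram_counts`, `next_word_probs`) are hypothetical, and the sketch assumes that BPE starts from single characters rather than raw bytes and that ties between equally frequent pairs go to the pair seen first.

```python
from collections import Counter, defaultdict


def merge_pair(tokens, pair):
    """Replace each left-to-right occurrence of `pair` with one fused token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe_merges(text, num_merges):
    """Learn up to `num_merges` merges; return (merges, final tokens)."""
    tokens = list(text)  # assumption: characters, not raw bytes
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter(zip(tokens, tokens[1:]))
        if not pair_counts:
            break  # fewer than two tokens left, nothing to merge
        # Most frequent adjacent pair; ties go to the pair seen first.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        tokens = merge_pair(tokens, best)
    return merges, tokens


def bigram_counts(words):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, word):
    """Maximum-likelihood estimate of P(next | word); empty if word is unseen."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}
```

Calling `learn_bpe_merges(text, num_merges)` on a training string yields the ordered merge list that a tokenizer would later replay on new text, and `next_word_probs(bigram_counts(words), word)` gives the unsmoothed next-word distribution for a whitespace-split corpus.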