Releases: mhardalov/bulstem-py
Updated the project's description
PyPI package
Uploaded the package to the public PyPI repository. Removed dependencies on external libraries.
Stable packaging
Fixed the wheel to contain the rules.
NLTK Removal and Python >= 3.6 only
- Removed NLTK as a package dependency (it was only used in the tests)
- Fixed package dependencies
- Improved the code style of the project
- Dropped Python 2 support
Added pre-defined rule sets to package
Reading the rules from an external file
from bulstem.stem import BulStemmer

# Pre-defined names of rule sets
PRE_DEFINED_RULES = ['stem-context-1',
                     'stem-context-2',
                     'stem-context-3']

# Expected output:
# 1 втор
# 2 втори
# 3 вторият
for i, rules_name in enumerate(PRE_DEFINED_RULES, start=1):
    # Load each pre-defined rule set with a matching left context size
    stemmer = BulStemmer.from_file(rules_name, min_freq=2, left_context=i)
    print(i, stemmer.stem('вторият'))
stemmer = BulStemmer.from_file('stem_rules_context_2_utf8.txt', min_freq=2, left_context=i)
stemmer.stem('вторият')  # Expected output: 1. 'втори'
stemmer.stem('вероятен')  # Expected output: 1. 'вероят'
BulStemmer.from_file
params:
- path: Path (or pre-defined name) to the rules file, formatted as follows: word ==> stem ==> freq.
- min_freq: The minimum frequency of a rule to be used when stemming.
- left_context: Size of the prefix which will not be stemmed.
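To make the rules file format and the role of min_freq more concrete, here is a minimal sketch of how lines in the `word ==> stem ==> freq` format might be parsed and filtered. It is not the library's implementation: `parse_rules` is a hypothetical helper, the sample lines and their frequencies are invented for illustration, and treating min_freq as an inclusive lower bound is an assumption.

```python
from typing import Dict, Iterable


def parse_rules(lines: Iterable[str], min_freq: int = 2) -> Dict[str, str]:
    """Hypothetical helper: parse 'word ==> stem ==> freq' lines into a dict."""
    rules: Dict[str, str] = {}
    for line in lines:
        parts = [part.strip() for part in line.split("==>")]
        if len(parts) != 3:
            continue  # skip lines that do not match the three-field format
        word, stem, freq_text = parts
        try:
            freq = int(freq_text)
        except ValueError:
            continue  # skip lines whose frequency is not an integer
        # Assumption: min_freq is an inclusive lower bound.
        if freq >= min_freq:
            rules[word] = stem
    return rules


# Sample lines invented for illustration; the frequencies are not real data.
sample = [
    "вторият ==> втори ==> 3",
    "вероятен ==> вероят ==> 1",
    "not a rule line",
]

print(parse_rules(sample, min_freq=2))
```

Raising min_freq keeps only the rules seen more often, which in this sketch drops the second sample line; lowering it to 1 keeps both well-formed lines.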
Initial Release
This is the initial release of bulstem_py