Corpus and notebook used in the MA thesis "Lexical Simplification in Spanish Texts for Patients: the Complex Word Identification Task". The corpus consists of 225 texts: 75 clinical trials (CTs), 75 consent forms (CFs) and 75 patient information documents (PIDs).
The notebook code is adapted from the Hugging Face token classification example: https://github.com/huggingface/notebooks/blob/main/examples/token_classification.ipynb
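For readers unfamiliar with the token classification setup, the sketch below illustrates the label alignment step that this kind of notebook typically performs: word-level labels are spread over subword tokens, and positions that should not contribute to the loss receive -100. It is a minimal illustration and not the notebook's actual code. The function name `align_labels`, the flag `label_all_subwords`, the binary label scheme (0 = simple, 1 = complex) and the example sentence are all assumptions made for illustration. The `word_ids` list is written by hand here; with a Hugging Face fast tokenizer it would normally come from the `word_ids()` method of the tokenizer output.

```python
from typing import List, Optional

# -100 is the default ignore_index of PyTorch's cross-entropy loss,
# so positions with this label do not contribute to training.
IGNORE_INDEX = -100


def align_labels(
    word_ids: List[Optional[int]],
    word_labels: List[int],
    label_all_subwords: bool = False,  # hypothetical flag, for illustration only
) -> List[int]:
    """Map word-level labels onto subword tokens.

    word_ids holds, for each token, the index of the word it belongs to,
    or None for special tokens such as [CLS] and [SEP].
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special token: never scored.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a word carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later subwords either repeat the label or are ignored.
            aligned.append(word_labels[word_id] if label_all_subwords else IGNORE_INDEX)
        previous = word_id
    return aligned


# Assumed example: "El paciente presenta hipertensión", where "hipertensión"
# is marked complex (1) and is split into three subword pieces.
word_ids = [None, 0, 1, 2, 3, 3, 3, None]
word_labels = [0, 0, 0, 1]
print(align_labels(word_ids, word_labels))
# prints [-100, 0, 0, 0, 1, -100, -100, -100]
```

Labelling only the first subword of each word keeps evaluation at the word level, which matches how complex words are annotated; setting the flag to `True` instead repeats the label on every subword.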
If you use this data, please cite as follows:
@article{2024CWI,
title={Complex Word Identification for Lexical Simplification in Spanish Texts for Patients},
volume={Under review},
year={2024}
}