Skip to content

Twitter Dataset was used to classify the tweets as Offensive, Hate Text, and Neither of them

License

Notifications You must be signed in to change notification settings

priyanshkedia04/Tweets-Classification-among-Hate-Speech-Offensive-Neither

Repository files navigation

Tweets-Classification-among-Hate-Speech-Offensive-Neither

Twitter Dataset was used to classify the tweets as Offensive, Hate Text, and Neither of them. After Sampling and Balancing the T-Davidson Dataset taken from GITHUB 35,929 tweets are used to train Machine Learning and Deep Learning Models. The models were built using Scikit-Learn and Keras API in Python.

TF-IDF Vectors were formed with 8,000 most frequent words and were used as features for building the Machine Learning Models.

Their Accuracy and F1 Scores: SVM: 93% F1-Score: 93

Logistic Regression: 91% F1-Score: 91.3

SGD Classifier: 86% F1-Score: 85.65

Word2Vec Embeddings of 32 Dimensions were trained and used as the input features to build the Sequence Model consisting of Simple RNN and Dense Layer.

Training Accuracy: 98.39 %

Validation Accuracy: 96.25%

Test Accuracy: 95.99%

F1-Score: 96

Releases

No releases published

Packages

No packages published