Tasks #2

Open
matthewr6 opened this issue Nov 24, 2019 · 1 comment

Comments

@matthewr6
Owner

matthewr6 commented Nov 24, 2019

  • Biterm model finalizing + experiments (Chris)
  • TKM experiments (Matthew)
  • K-means experiments (different # clusters) (Sasankh)
  • LDA experiments (different featurizers/clusters) (Matthew)
  • Pseudo tf-idf topic extractor (based on clusters rather than documents: tf within cluster, idf against other clusters) (Chris)
  • Centroid vector extractor (Matthew)
  • MultinomialNB max probability-based extractor (Sasankh)
  • Brief descriptions/writeups of each featurizer, model (+ extractor), and experiment (if experiment is interesting; from results and visualizations)
  • Poster (Sasankh + Chris)
  • Start on paper (Chris + Sasankh)
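
For reference, a minimal sketch of how the pseudo tf-idf extractor could work, assuming documents are already tokenized and cluster labels come from one of the models above. The function name `cluster_tfidf_topics` and the exact weighting (relative term frequency times log of inverse cluster frequency) are assumptions for illustration, not the project's actual implementation.

```python
import math
from collections import Counter


def cluster_tfidf_topics(docs, labels, top_n=3):
    """docs: list of token lists; labels: one cluster id per document."""
    # Pool every cluster's tokens into a single term-count table,
    # so each cluster plays the role a document plays in ordinary tf-idf.
    cluster_counts = {}
    for tokens, label in zip(docs, labels):
        cluster_counts.setdefault(label, Counter()).update(tokens)

    n_clusters = len(cluster_counts)

    # Cluster frequency: the number of clusters in which each term occurs.
    cluster_freq = Counter()
    for counts in cluster_counts.values():
        cluster_freq.update(counts.keys())

    topics = {}
    for label, counts in cluster_counts.items():
        total = sum(counts.values())
        # tf within the cluster, idf against all clusters (assumed weighting).
        scores = {
            term: (count / total) * math.log(n_clusters / cluster_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        topics[label] = ranked[:top_n]
    return topics
```

A term that appears in every cluster gets an idf of zero under this weighting, so corpus-wide filler words drop to the bottom of every cluster's ranking.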

Graphs (Matthew - dependent on data from experiments)

  • accuracy differences
  • for most successful model - plot of clusterings
    • cluster topics
    • examples of articles
@matthewr6
Owner Author

matthewr6 commented Nov 27, 2019

update on TKM model - very poor clustering (see PCA graphs), decided to drop it

update on centroid vector extractor - done
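
A rough sketch of the idea, assuming each document is a bag-of-words style feature row whose columns map to vocabulary terms, and that a cluster's topic words are the highest-weighted dimensions of its mean vector. The name `centroid_topics` and its parameters are hypothetical; the extractor in the repo may differ.

```python
def centroid_topics(vectors, labels, vocabulary, top_n=3):
    """vectors: equal-length feature rows; vocabulary[i] names column i."""
    sums, sizes = {}, {}
    for row, label in zip(vectors, labels):
        # Accumulate a per-cluster column sum and a member count.
        acc = sums.setdefault(label, [0.0] * len(vocabulary))
        for i, value in enumerate(row):
            acc[i] += value
        sizes[label] = sizes.get(label, 0) + 1

    topics = {}
    for label, acc in sums.items():
        # The centroid is the mean feature vector of the cluster's members.
        centroid = [value / sizes[label] for value in acc]
        # Rank columns by centroid weight; ties broken alphabetically.
        order = sorted(
            range(len(vocabulary)),
            key=lambda i: (-centroid[i], vocabulary[i]),
        )
        topics[label] = [vocabulary[i] for i in order[:top_n]]
    return topics
```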

  • write up the featurized-text vs. keyword-text analysis method (Matthew)
  • clustering analysis per Chuma's suggestion (Sasankh or Chris?)
  • qualitative analysis on graphs and cluster.json files (Sasankh or Chris?)
