should we integrate with scikit learn #991
Comments
I agree. IMO scikit-learn is a fairly default choice, and I sometimes see it used just for model evaluation / visualization, so it ends up being a very common dependency in ML projects.
I think this move would be well worth it. Presumably after doing this it would be easy to swap out different classifiers, so a to-do is figuring out how to configure which classifier you want to use.
This is actually very easy, because I've already made it so the classifier has the sklearn API.
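To make the "swap out different classifiers" point concrete, here is a minimal sketch of what "has the sklearn API" buys you, assuming the calling code only needs `fit` and `predict_proba`. Every name below (`ProbabilisticClassifier`, `MeanThresholdClassifier`, `PriorClassifier`, `match_probabilities`) is hypothetical and made up for illustration; none of it is dedupe's actual code, and the toy classifiers only stand in for real scikit-learn estimators such as `LogisticRegression`, which expose the same two methods.

```python
from typing import Protocol, Sequence


class ProbabilisticClassifier(Protocol):
    """The two scikit-learn-style methods this sketch relies on (hypothetical name)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]): ...

    def predict_proba(self, X: Sequence[Sequence[float]]) -> Sequence[Sequence[float]]: ...


class MeanThresholdClassifier:
    """Toy stand-in: a pair is a match if its mean feature value exceeds a learned cutoff."""

    def fit(self, X, y):
        # Cutoff is the midpoint between the average score of matches and non-matches.
        scores = [sum(row) / len(row) for row in X]
        pos = [s for s, label in zip(scores, y) if label == 1]
        neg = [s for s, label in zip(scores, y) if label == 0]
        self.cutoff_ = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict_proba(self, X):
        # Hard 0/1 probabilities, columns ordered [non-match, match] as in scikit-learn.
        out = []
        for row in X:
            p = 1.0 if sum(row) / len(row) > self.cutoff_ else 0.0
            out.append([1.0 - p, p])
        return out


class PriorClassifier:
    """Toy stand-in: ignores the features and returns the training match rate."""

    def fit(self, X, y):
        self.rate_ = sum(y) / len(y)
        return self

    def predict_proba(self, X):
        return [[1.0 - self.rate_, self.rate_] for _ in X]


def match_probabilities(classifier: ProbabilisticClassifier, X_train, y_train, X_new):
    """Fit any classifier with the sklearn-style interface and return P(match) per row."""
    classifier.fit(X_train, y_train)
    return [row[1] for row in classifier.predict_proba(X_new)]


# Made-up pairwise similarity features: label 1 = match, 0 = distinct.
X_train = [[0.9, 0.8], [0.1, 0.2], [0.7, 0.9], [0.2, 0.1]]
y_train = [1, 0, 1, 0]
X_new = [[0.95, 0.9], [0.05, 0.1]]

# Swapping the classifier is a one-line change; the calling code stays the same.
for clf in (MeanThresholdClassifier(), PriorClassifier()):
    print(type(clf).__name__, match_probabilities(clf, X_train, y_train, X_new))
```

Because `match_probabilities` only touches `fit` and `predict_proba`, any object with those two methods can be passed in, which is why configuring the classifier reduces to choosing which object to hand over.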
Here I adapted the code via inheritance to support sklearn classifiers: https://github.com/vintasoftware/deduplication-slides/blob/pycon-2020/rf_dedupe.py But it's worth checking the conditionals on
closed by #992
Nice job @fgregg, this looks awesome!
Thinking about #990, I'm wondering if we should make scikit-learn a dependency.
When we started dedupe 10 years ago, it was really hard to get scipy and scikit-learn set up on users' machines, but it isn't anymore.
It would be nice to get out of the game of implementing some algos ourselves.
Thoughts, @fjsj @NickCrews?