Add blip-2 to bettertransformer #1125
Conversation
The documentation is not available anymore as the PR was closed or merged.
LGTM, thank you! Could you just resolve the conflict?
We should definitely have a nice table in the docs with the speedups (if there are any; I suspect that for some architectures/settings the speedup is not necessarily large).
@fxmarty yes, I agree such a comparison would be nice! If there are already benchmarks, I would be interested in working on this. I also thought about adding tests that assert the speedups, but if the speeds fluctuate, that could make the CI flaky.
There are some scripts that we used for blog posts, but we did not put the results in the documentation itself: https://github.com/huggingface/optimum/tree/main/tests/benchmark The encoder implementation may need to be revamped soon though, as we currently error out when encoders are used for training, while there is no longer any real reason to.
@baskrahmer thank you very much for contributing BLIP-2 support! Also, does it show any improvement in inference speed on a T4 GPU? I could not get any during my experiments; maybe I've done something wrong.
Hey @kirillsemenov1314 :)
I believe this statement no longer holds. The BetterTransformer implementation of the T5 layer is found in this file, so I suggest going through it if you're interested in the details.
This is an interesting topic. AFAIK, active work is being done on this tool, which can also be used to run benchmarks on BetterTransformer architectures. Inference speed is influenced by a variety of factors, such as the model, dataset, and hardware. It can thus very well be that there is no significant speedup from BetterTransformer in your case, and that does not necessarily imply you are doing something wrong.
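For reference, here is a minimal timing sketch for checking whether the conversion helps in your own setting; the helper name and iteration counts are illustrative and this is not one of the benchmark scripts linked above:

```python
import time
import torch

def time_forward(model, inputs, n_warmup=3, n_iters=10):
    # Warm up first to avoid measuring one-time kernel / cache costs.
    with torch.inference_mode():
        for _ in range(n_warmup):
            model(**inputs)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(n_iters):
            model(**inputs)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
    # Average latency per forward pass, in seconds.
    return (time.perf_counter() - start) / n_iters

# Run this on the same inputs with the original model and the converted
# model, and compare the two averages.
```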
What does this PR do?
Add BLIP-2 to the BetterTransformer API.
Part of #1056
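As a rough usage sketch of what this PR enables (the checkpoint name is illustrative, and loading in float16 is just one reasonable choice for GPU inference):

```python
import torch
from transformers import Blip2ForConditionalGeneration
from optimum.bettertransformer import BetterTransformer

# Load a BLIP-2 checkpoint from the Hub.
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
)

# Swap supported attention layers for their BetterTransformer equivalents.
model = BetterTransformer.transform(model)
```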
Before submitting