v0.1.2

UnicornChan released this 15 Aug 17:39


77a34c2

Support native Windows. #4
Support multiple GPUs. #8
Support llamafile as a linear backend.
Support new models: Mixtral 8x7B and 8x22B.
Support Q2_K, Q3_K, and Q5_K dequantization on GPU. #16
Support GitHub Actions for building precompiled packages.
Support shared memory across different operators.
Fix some bugs when building from source. #23
