Add support of GGUF as a input format for LLM #1885

Open
wants to merge 52 commits into
base: master
ef89e2a
Added gguf-tools as submodule
AlexKoff88 Feb 3, 2025
8b86481
Merge remote-tracking branch 'origin/master' into ak/gguf_support
AlexKoff88 Mar 3, 2025
f671a44
Copied some pieces of functionality from 3rd parties. Rewriting is WIP.
AlexKoff88 Mar 5, 2025
4991ca0
Added GGUF reading and conversion to ov::Tensor. Fixed cmake
AlexKoff88 Mar 6, 2025
ecde304
Start adding model creation from GGUF
AlexKoff88 Mar 7, 2025
7a2f1ea
Added RoPE
AlexKoff88 Mar 10, 2025
84e9788
Added MHA
AlexKoff88 Mar 10, 2025
9221be7
Added RMSNorm, MVN, etc.
AlexKoff88 Mar 10, 2025
fbba6f8
Removed submodules
AlexKoff88 Mar 11, 2025
8fe17fa
Removed WWB changes
AlexKoff88 Mar 11, 2025
3cce078
Merge branch 'master' into ak/gguf_support
AlexKoff88 Mar 11, 2025
b91ce6f
Merged with master
AlexKoff88 Mar 11, 2025
64cb5d6
Added implementation of Transformer block
AlexKoff88 Mar 11, 2025
d891ec8
Added RoPE initialization
AlexKoff88 Mar 11, 2025
6ef3418
Added code for Llama based models
AlexKoff88 Mar 11, 2025
4b0c61e
Finished pipeline for models creation
AlexKoff88 Mar 12, 2025
be72ce8
Reshuffled the code. Extended configs to other data types.
AlexKoff88 Mar 12, 2025
779878f
Changed headers extension to .hpp
AlexKoff88 Mar 13, 2025
24b3475
Fixed model creation issues. Added sample to test GGUF.
AlexKoff88 Mar 13, 2025
3f0f12e
Fixed inference issues. Still have problems with generation output qu…
AlexKoff88 Mar 13, 2025
a9f95e3
Fixes in the causal mask subgraph
AlexKoff88 Mar 14, 2025
6cd777f
Changed rotate_half
AlexKoff88 Mar 14, 2025
9a5ac48
Fixed MHA. Got good results for FP16
AlexKoff88 Mar 14, 2025
4025d00
Fixed Q8_0 models
AlexKoff88 Mar 14, 2025
5a35124
Fixes for Q4_0 and Q4_1 unpacking
AlexKoff88 Mar 14, 2025
8206ad9
Added Q4_0/1 support. Result does not converge.
AlexKoff88 Mar 17, 2025
68dce03
Fixed result convergence for Q4_0 and Q4_1
AlexKoff88 Mar 17, 2025
2890d06
Made model conversion more generic. Qwen results does not converge st…
AlexKoff88 Mar 18, 2025
1538f91
Added bias Add operation after MatMuls where it is applicable. Still …
AlexKoff88 Mar 19, 2025
02b2120
Merged with master
AlexKoff88 Mar 25, 2025
89a6554
Add GGUF reading into Stateful LLMPipeline
AlexKoff88 Mar 25, 2025
c0d7fb1
Added test for GGUF reader
AlexKoff88 Mar 26, 2025
71c9d2f
Merge remote-tracking branch 'origin/master' into ak/gguf_support
AlexKoff88 Mar 26, 2025
9521fb1
Removed submodule
AlexKoff88 Mar 26, 2025
279bcd8
Updated gguf-lib download process
AlexKoff88 Mar 26, 2025
d6a0dc3
Update src/cpp/CMakeLists.txt
AlexKoff88 Mar 26, 2025
c2995c4
Update src/cpp/CMakeLists.txt
AlexKoff88 Mar 26, 2025
5535e49
Update src/cpp/src/utils.cpp
AlexKoff88 Mar 26, 2025
d2e67ab
Fixed issue with FP16 KV-cache hint for quantized models
AlexKoff88 Mar 26, 2025
e2503a8
Tried to fix error
AlexKoff88 Mar 26, 2025
700faad
Tried to fix error2
AlexKoff88 Mar 26, 2025
e2809aa
GGUF sample compilation fix
AlexKoff88 Mar 26, 2025
ed0e577
GGUF sample compilation fix2
AlexKoff88 Mar 26, 2025
707f26e
GGUF sample compilation fix3
AlexKoff88 Mar 26, 2025
2d119f1
Merge remote-tracking branch 'origin/master' into ak/gguf_support
AlexKoff88 Mar 26, 2025
ba57473
Removed unused function
AlexKoff88 Mar 26, 2025
e686064
Try to fix issue
AlexKoff88 Mar 26, 2025
43a6b16
Try to fix issue2
AlexKoff88 Mar 27, 2025
765e00f
Fixed build issues on MAC
AlexKoff88 Mar 27, 2025
f8cac09
Merge remote-tracking branch 'origin/master' into ak/gguf_support
AlexKoff88 Mar 27, 2025
bcb827a
Fixed build issues on Windows
AlexKoff88 Mar 27, 2025
f5ce8a4
Windows build
AlexKoff88 Mar 27, 2025
Updated gguf-lib download process
AlexKoff88 committed Mar 26, 2025
commit 279bcd8b96f08a997cbc008ac00fd6bf98ff10a2
4 changes: 2 additions & 2 deletions src/cpp/CMakeLists.txt
@@ -80,8 +80,8 @@ if(ENABLE_GGUF)
     message(STATUS "Downloading gguflib")
     FetchContent_Declare(
         gguflib
-        GIT_REPOSITORY https://github.com/antirez/gguf-tools/
-        GIT_TAG af7d88d808a7608a33723fba067036202910acb3)
+        URL https://github.com/antirez/gguf-tools/archive/af7d88d808a7608a33723fba067036202910acb3.zip
+        URL_HASH SHA256=d613559c7a398eb4a0919982e6a370055f8466497f0f866d331dc92b735927e7)
     FetchContent_MakeAvailable(gguflib)
     target_include_directories(${TARGET_NAME_OBJ}
         PRIVATE "${gguflib_SOURCE_DIR}")
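
For reference, the declaration as it would read after this commit can be sketched as a standalone fragment. This is a minimal sketch, not the file's actual contents: the `include(FetchContent)` call and the closing `endif()` are assumptions, since both fall outside the hunk shown in the diff. The URL, hash, and `TARGET_NAME_OBJ` variable are copied from the diff above.

```cmake
# Sketch only: include() and endif() are assumed, as they are not visible in the hunk.
include(FetchContent)

if(ENABLE_GGUF)
    message(STATUS "Downloading gguflib")
    # Fetch a pinned source archive instead of cloning the Git repository.
    # CMake checks the downloaded file against URL_HASH and stops with an
    # error if the digest does not match.
    FetchContent_Declare(
        gguflib
        URL https://github.com/antirez/gguf-tools/archive/af7d88d808a7608a33723fba067036202910acb3.zip
        URL_HASH SHA256=d613559c7a398eb4a0919982e6a370055f8466497f0f866d331dc92b735927e7)
    FetchContent_MakeAvailable(gguflib)
    # Expose the fetched sources as private include paths for the object target.
    target_include_directories(${TARGET_NAME_OBJ}
        PRIVATE "${gguflib_SOURCE_DIR}")
endif()
```

The archive URL pins the same commit (`af7d88d808a7608a33723fba067036202910acb3`) that the removed `GIT_TAG` line referenced, so the fetched sources should be unchanged; only the download mechanism differs.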