Commit afbdf0e

Take n_ctx from model

1 parent: abc2364

File tree

1 file changed: +1 −0 lines changed

modules/llama_cpp_plugin/src/compiled_model.cpp (+1)

@@ -50,6 +50,7 @@ LlamaCppModel::LlamaCppModel(const std::string& gguf_fname, const std::shared_pt
     llama_context_params cparams = llama_context_default_params();
     cparams.n_threads =
         std::thread::hardware_concurrency();  // TODO (vshampor): reuse equivalent setting defined by OV API
+    cparams.n_ctx = 0;  // this means that the actual n_ctx will be taken equal to the model's train-time value
     m_llama_ctx = llama_new_context_with_model(m_llama_model_ptr, cparams);
     OPENVINO_DEBUG << "llama_cpp_plugin: llama model loaded successfully from GGUF..." << std::endl;

0 commit comments