Implement reverse indexing cython #13

Merged: 3 commits, Dec 17, 2024
17 changes: 17 additions & 0 deletions Examples/inverted_index_example.py
@@ -0,0 +1,17 @@
from _inverted_index import add_document, find_documents
import pprint

inverted_index = {}

# Add documents
inverted_index = add_document(inverted_index, "doc1", "Este e o texto do primeiro documento.")
inverted_index = add_document(inverted_index, "doc2", "Este e o texto do segundo documento.")

# Display the inverted index
print("Inverted index:")
pprint.pprint(inverted_index)

# Find documents that contain the word "primeiro"
result = find_documents(inverted_index, "primeiro")
print("\nDocuments found:")
print(result)
23 changes: 20 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
-# Search Engine
-[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](code_of_conduct.md)
-[![CMake Build and Test](https://github.com/pedrobiqua/Search_Engine/actions/workflows/cmake-multi-platform.yml/badge.svg?branch=main)](https://github.com/pedrobiqua/Search_Engine/actions/workflows/cmake-multi-platform.yml)
+# Search Engine
+[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](code_of_conduct.md)
+[![CMake Build and Test](https://github.com/pedrobiqua/Search_Engine/actions/workflows/cmake-multi-platform.yml/badge.svg?branch=main)](https://github.com/pedrobiqua/Search_Engine/actions/workflows/cmake-multi-platform.yml)

---

@@ -62,6 +62,23 @@ This will run the tests covering search engine functionality, reverse indexing,

---

## 🚀 Running Examples

First, build the project by running:

```bash
poetry install
poetry build
```

After building it, run this command to see the library working:

```bash
poetry run python Examples/graph_example.py
```

---

## ⚙️ How It Works

- **Reverse Indexing**: Maps keywords to the documents where they appear.
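
The keyword-to-document mapping described above can be sketched in plain Python. This is only an illustration, not the project's C++ implementation: `build_index` and `lookup` are hypothetical names, and the tokenization (lowercasing, whitespace splitting, stripping punctuation) is an assumption.

```python
from collections import defaultdict
import string


def build_index(documents):
    """Map each keyword to {document name: occurrence count}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        for raw in text.lower().split():
            # Assumed tokenization: drop leading/trailing punctuation.
            word = raw.strip(string.punctuation)
            if word:
                index[word][name] = index[word].get(name, 0) + 1
    return dict(index)


def lookup(index, word):
    """Return the names of documents containing `word`, sorted alphabetically."""
    return sorted(index.get(word.lower(), {}))


docs = {
    "doc1": "Este e o texto do primeiro documento.",
    "doc2": "Este e o texto do segundo documento.",
}
index = build_index(docs)
print(lookup(index, "primeiro"))  # ['doc1']
print(lookup(index, "texto"))     # ['doc1', 'doc2']
```

A lookup is then a single dictionary access per query word, which is the point of building the index ahead of time.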
3 changes: 2 additions & 1 deletion build.py
@@ -37,7 +37,8 @@ def build(setup_kwargs):
    # Modules that will be used from Python
    libs_names = [
        "_hello",
-       "_page_rank"
+       "_page_rank",
+       "_inverted_index"
    ]

    extensions = []
47 changes: 47 additions & 0 deletions lib/src/_inverted_index.pyx
@@ -0,0 +1,47 @@
# inverted_index.pyx
from libcpp.map cimport map as cpp_map
from libcpp.list cimport list as cpp_list
from libcpp.string cimport string
from libcpp.pair cimport pair

cdef extern from "inverted_index.h" namespace "inverted_index":
    ctypedef string str

    # docs struct
    cdef struct docs:
        str name_doc
        int freq
        cpp_list[str] links_docs

    # Data types
    ctypedef cpp_list[docs] list_docs
    ctypedef cpp_map[str, list_docs] map_str_docs

    # Exposed functions
    map_str_docs add_doc(map_str_docs& mp, const str& doc_name, str& text)
    list_docs find_answer(map_str_docs& mp, str& input)

# Conversions between C++ and Python strings
def cpp_to_py_str(string cpp_str):
    return cpp_str.decode("utf-8")

def py_to_cpp_str(unicode py_str):
    return py_str.encode("utf-8")


# Wrapper for add_doc
def add_document(dict mp, unicode doc_name, unicode text):
    cdef string cpp_doc_name = py_to_cpp_str(doc_name)
    cdef string cpp_text = py_to_cpp_str(text)

    cpp_result = add_doc(mp, cpp_doc_name, cpp_text)  # Call the C++ function

    # Return the result converted to Python
    return cpp_result

# Wrapper for find_answer
def find_documents(dict mp, unicode input_word):
    cdef string cpp_input = py_to_cpp_str(input_word)
    cpp_result = find_answer(mp, cpp_input)  # C++ function
    # Return the list of documents as Python objects
    return cpp_result
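
Note on the return values: the wrappers above return `cpp_result` without calling `cpp_to_py_str`, so the strings probably reach Python as `bytes`. Assuming Cython's automatic coercion (C++ `std::string` to `bytes`, `std::map` to `dict`, `std::list` to `list`, struct to `dict`), a caller could decode the result with a small helper like the one below. `decode_index` is a hypothetical name, and the sample data is hand-written to match the `docs` struct fields, not actual output from the extension.

```python
def decode_index(raw_index):
    """Convert bytes keys and string fields to str, leaving counts untouched."""
    decoded = {}
    for word, postings in raw_index.items():
        decoded[word.decode("utf-8")] = [
            {
                "name_doc": p["name_doc"].decode("utf-8"),
                "freq": p["freq"],
                "links_docs": [link.decode("utf-8") for link in p["links_docs"]],
            }
            for p in postings
        ]
    return decoded


# Hand-written sample in the shape the wrappers are assumed to return.
sample = {
    b"primeiro": [{"name_doc": b"doc1", "freq": 1, "links_docs": []}],
}
print(decode_index(sample))
```

Decoding inside `add_document` is not a drop-in alternative, because the example script passes the returned dict straight back into the next `add_document` call, which needs the `bytes` form.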