Implement Reverse Indexing in C++ #1
pedrobiqua added a commit that referenced this issue, Oct 11, 2024: Add a comment where the reverse index implementation will be called
pedrobiqua added a commit that referenced this issue, Oct 11, 2024
pedrobiqua added a commit that referenced this issue, Oct 11, 2024
pedrobiqua added a commit that referenced this issue, Dec 10, 2024
The following tasks still need to be completed:
To do list:
We need to implement a reverse (inverted) index in C++ to speed up document retrieval in our search engine. The index associates each word with the list of documents or pages in which it appears, which enables keyword-based search.
Tasks:
Define the Data Structure:
Use an efficient structure to store the index (e.g., std::unordered_map or std::map), where the key is a word and the value is a list of documents/pages.
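One possible shape for this structure is sketched below. The names `DocId`, `PostingList`, `ReverseIndex`, and `add_posting` are illustrative assumptions, not identifiers from the project, and documents are assumed to be identified by an integer.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical names for illustration; documents are assumed to have integer ids.
using DocId = int;
using PostingList = std::unordered_set<DocId>;  // set, so a document is listed once per word
using ReverseIndex = std::unordered_map<std::string, PostingList>;

// Records that `word` occurs in document `doc`.
// operator[] creates an empty posting list the first time a word is seen.
inline void add_posting(ReverseIndex& index, const std::string& word, DocId doc) {
    index[word].insert(doc);
}
```

`std::unordered_map` gives average constant-time lookup per word. `std::map` costs O(log n) per lookup but keeps words sorted, which may matter if ordered or prefix traversal is needed later.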
Document Parsing:
Implement a function to process documents or web pages, tokenizing them into words and populating the reverse index.
Remove punctuation and normalize the text to lowercase.
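A minimal sketch of the tokenizing and normalizing step follows. The function name `tokenize` is hypothetical, and the sketch assumes ASCII text and treats every non-alphanumeric character as a word separator, which is only one of several reasonable policies.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits `text` into lowercase words. Any character that is not a letter or
// digit (punctuation, whitespace) ends the current word and is discarded.
// Assumption: ASCII input; multi-byte UTF-8 text would need a different approach.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {  // unsigned char keeps <cctype> calls well-defined
        if (std::isalnum(c)) {
            current.push_back(static_cast<char>(std::tolower(c)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        tokens.push_back(current);  // flush the final word
    }
    return tokens;
}
```

Under this policy a word such as "don't" is split into "don" and "t". Whether that is acceptable depends on the documents being indexed.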
Update the Index:
Implement logic to update the index as documents are added or removed.
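The update logic could look roughly like the sketch below. The class `UpdatableIndex` and its members are hypothetical. It assumes integer document ids and keeps a second map from each document to its words, so that removal does not have to scan every posting list. This costs extra memory and is a design choice, not a requirement of the issue.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical class for illustration only.
class UpdatableIndex {
public:
    // Indexes `words` under `doc`. Re-adding an existing document replaces it.
    void add_document(int doc, const std::vector<std::string>& words) {
        remove_document(doc);
        for (const auto& w : words) {
            postings_[w].insert(doc);
            words_of_[doc].insert(w);
        }
    }

    // Removes `doc` from every posting list it appears in and drops
    // words that no longer occur in any document.
    void remove_document(int doc) {
        auto it = words_of_.find(doc);
        if (it == words_of_.end()) return;  // unknown document: nothing to do
        for (const auto& w : it->second) {
            auto p = postings_.find(w);
            if (p == postings_.end()) continue;
            p->second.erase(doc);
            if (p->second.empty()) postings_.erase(p);
        }
        words_of_.erase(it);
    }

    bool contains(const std::string& word, int doc) const {
        auto it = postings_.find(word);
        return it != postings_.end() && it->second.count(doc) > 0;
    }

    std::size_t word_count() const { return postings_.size(); }

private:
    std::unordered_map<std::string, std::unordered_set<int>> postings_;  // word -> documents
    std::unordered_map<int, std::unordered_set<std::string>> words_of_;  // document -> words
};
```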
Search Query:
Implement a function that, given a word, returns the corresponding documents/pages using the reverse index.
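A single-word lookup might be sketched as follows. The alias `ReverseIndex` and the function `search` are assumed names, and returning the document ids sorted is a choice made here only to make results deterministic.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical alias: word -> set of integer document ids.
using ReverseIndex = std::unordered_map<std::string, std::unordered_set<int>>;

// Returns the ids of all documents containing `word`, in ascending order.
// Returns an empty vector when the word is not indexed.
std::vector<int> search(const ReverseIndex& index, const std::string& word) {
    auto it = index.find(word);  // find() never inserts, unlike operator[]
    if (it == index.end()) {
        return {};
    }
    std::vector<int> docs(it->second.begin(), it->second.end());
    std::sort(docs.begin(), docs.end());
    return docs;
}
```

The query word should be normalized the same way as the indexed text (for example, lowercased), otherwise "Cat" will not match an entry stored as "cat".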
Testing:
Create unit tests to ensure that the index works correctly and that queries return the expected results.
Test with different dataset sizes to evaluate performance.
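As a rough starting point, the sketch below pairs a correctness check with a simple build-time measurement. All names are hypothetical, the synthetic documents are an assumption made only to keep the example self-contained, and plain `assert` stands in for a test framework such as GoogleTest or Catch2.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical alias: word -> set of integer document ids.
using ReverseIndex = std::unordered_map<std::string, std::unordered_set<int>>;

// Builds an index over `num_docs` synthetic documents. Document d contains
// exactly two words: "common" and "w<d % vocab>".
ReverseIndex build_synthetic(int num_docs, int vocab) {
    ReverseIndex index;
    for (int d = 0; d < num_docs; ++d) {
        index["w" + std::to_string(d % vocab)].insert(d);
        index["common"].insert(d);
    }
    return index;
}

// Correctness: queries on a small, known dataset return the expected documents.
void test_expected_results() {
    ReverseIndex index = build_synthetic(10, 5);
    assert(index.at("common").size() == 10);
    assert(index.at("w0") == (std::unordered_set<int>{0, 5}));
    assert(index.count("missing") == 0);
}

// Performance: returns the wall-clock time in milliseconds needed to build
// an index of `num_docs` documents (num_docs must be at least 1).
double time_build_ms(int num_docs, int vocab) {
    auto start = std::chrono::steady_clock::now();
    ReverseIndex index = build_synthetic(num_docs, vocab);
    auto stop = std::chrono::steady_clock::now();
    assert(index.at("common").size() == static_cast<std::size_t>(num_docs));
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_build_ms` with, say, 1,000, 10,000, and 100,000 documents gives a first impression of how build time scales. Real measurements should use representative documents rather than synthetic ones.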
Requirements:
References: