update from Master #655

Merged
merged 130 commits into from
Feb 5, 2025

@ltclm ltclm commented Feb 5, 2025

No description provided.

faselm and others added 30 commits November 2, 2023 16:47
service container in the infra vhost environment. after a scan for
orphaned indexes, the container (running in service mode) tried to remove
orphaned indexes on the EFS, which was mounted read-only on the host.

in service mode (as backend for search wsgi)
the service-sphinxsearch container needs read-only access to the EFS

in maintenance mode (triggered by data deploy on geodatasync)
the service-sphinxsearch container needs read-write access to the EFS

with this change the clean-up of orphaned indexes on the EFS will be
done only in maintenance mode, during the sphinx index creation.

this will allow us to strictly mount the geodata EFS in read-only mode on all
our kubernetes and infra vhost instances.
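
The mode split described above can be sketched as follows. This is only a minimal illustration, not the repository's actual clean-up script: the `Mode` enum and the `orphaned_indexes` and `indexes_to_remove` helpers are hypothetical names, and the index names in the usage lines are made up.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical run modes of the service-sphinxsearch container."""
    SERVICE = "service"          # backend for search wsgi, EFS mounted read-only
    MAINTENANCE = "maintenance"  # index creation, EFS mounted read-write


def orphaned_indexes(on_efs, configured):
    """Index names that exist on the EFS but are no longer configured."""
    return sorted(set(on_efs) - set(configured))


def indexes_to_remove(mode, on_efs, configured):
    """Return the indexes to delete; nothing is ever deleted in service mode."""
    if mode is not Mode.MAINTENANCE:
        # Read-only mount: skip the clean-up instead of failing on delete.
        return []
    return orphaned_indexes(on_efs, configured)


# Hypothetical index names, for illustration only.
print(indexes_to_remove(Mode.SERVICE, ["a", "b", "old"], ["a", "b"]))
print(indexes_to_remove(Mode.MAINTENANCE, ["a", "b", "old"], ["a", "b"]))
```

Keeping the decision in one function means the read-only service path never reaches a delete call.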
this will fix an issue that has occurred during the initial start of the
fix/suppress warnings during the scan of the EFS with find
swissboundaries3d-gemeinden: even though a large index is going to be created, this PR will
needed during the parallel operation of service-search-sphinx on
infra-vhost and k8s.

on k8s we are forced to mount the EFS indexes with the full path,
so this will be the folder with the EFS index files:
* /var/local/geodata/service-sphinxsearch/${DBSTAGING}/index
this is due to a limitation of the AWS EFS CSI driver, which does not
support subpaths in the volume mount config. we would have to create new
persistent volumes with terraform, which is unreasonable for this special
use case.
rebert and others added 26 commits October 28, 2024 11:42
characters, that's why we have to automatically expand keywords on index
generation
the fuzzy string search text will be built without wildcards by the wsgi
application. we have to enable automated wildcard expansion for infix
matches in the index configuration
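
As a rough illustration of what automated wildcard expansion means for a query without wildcards: with Sphinx's `expand_keywords` option enabled on an infix-indexed index, each keyword is internally replaced by a disjunction of the keyword itself and its starred form. The helper names below are hypothetical and only mimic that rewriting for plain keywords; this is a sketch, not the search engine's own code, and it ignores the exact-form variant that Sphinx adds when stemming and `index_exact_words` are both enabled.

```python
def expand_keyword(keyword):
    """Mimic infix expansion: the keyword itself OR the keyword with stars."""
    return f"( {keyword} | *{keyword}* )"


def expand_query(query):
    """Expand every whitespace-separated keyword of a plain-text query."""
    return " ".join(expand_keyword(word) for word in query.split())


# The wsgi application sends the search text without wildcards;
# the expansion then makes infix matches possible.
print(expand_query("bern bahnhof"))
```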
PB-1167: improve fuzzy search index configuration
the limit will be set via the docker run --memory parameter; by default 50% of the
currently available memory can be used for the index creation.
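
The percentage rule can be sketched like this. It is a minimal example under assumptions: the function names are hypothetical, the amount of available memory is passed in rather than read from the host, and the repository's real scripts may compute the value differently. The only external fact relied on is that `docker run --memory` accepts an integer with a `b`, `k`, `m` or `g` suffix.

```python
def index_memory_limit_bytes(available_bytes, percent=50):
    """Share of the currently available memory granted to the index creation."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    # Integer arithmetic keeps the result a whole number of bytes.
    return available_bytes * percent // 100


def docker_memory_flag(available_bytes, percent=50):
    """Build the --memory argument for docker run, expressed in bytes."""
    return f"--memory={index_memory_limit_bytes(available_bytes, percent)}b"


# Example: 16 GiB currently available, default share of 50%.
print(docker_memory_flag(16 * 1024**3))
```

Raising the share to 70, as a later commit in this list does, only changes the `percent` argument.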
the table lebensraumkarte.lebensraukarte_schweiz: 64 GB

compared with:
database solarkataster: 160 GB
table solarenergie_daecher: 4 GB

table size versus sphinx index size
table bfs.gwr_chsdi: 4.71 GB -> sphinx index size: 10 GB
table bfs.landschaftswandel: 1.5 MB -> Index size: ~ 5 MB

So the index of the lebensraumkarte_schweiz will most probably be > 130 GB

That is the reason why we remove the index.
This layer seems too big to calculate an index on it
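
The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes that the index size scales linearly with the table size and uses only the figures quoted in the commit message; the function name is hypothetical.

```python
def index_to_table_ratio(table_size, index_size):
    """Observed growth factor from table size to sphinx index size (same unit)."""
    return index_size / table_size


# Figures from the commit message.
ratio_gwr = index_to_table_ratio(4.71, 10.0)           # bfs.gwr_chsdi, in GB
ratio_landschaft = index_to_table_ratio(1.5, 5.0)      # bfs.landschaftswandel, in MB

lebensraumkarte_table_gb = 64
estimate_low = lebensraumkarte_table_gb * ratio_gwr
estimate_high = lebensraumkarte_table_gb * ratio_landschaft

print(f"ratio gwr_chsdi:        {ratio_gwr:.2f}")
print(f"ratio landschaftswandel: {ratio_landschaft:.2f}")
print(f"estimated index size:    {estimate_low:.1f} GB to {estimate_high:.1f} GB")
```

Even the smaller of the two observed ratios puts the index above 130 GB, which matches the conclusion in the commit message.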
increase percentage of accessible memory to 70
limit the available memory for the sphinx index creation
as decided with @ltrea we will only enable global cgroup limitations as a first
step. if this works fine, a docker run limit is not necessary.
disable docker run memory limitation
some dummy changes for the creation of a new base branch
@github-actions github-actions bot added this to the 2025-03-12 milestone Feb 5, 2025
@ltclm ltclm requested a review from faselm February 5, 2025 12:11
@faselm faselm left a comment

👍

@ltclm ltclm merged commit b5ccea6 into develop-2025-03-12 Feb 5, 2025
6 checks passed
6 participants