feat: filemanager crawl #859

Merged
merged 16 commits into main from feat/filemanager-crawl
Feb 13, 2025

Conversation

mmalenic
Member

Closes #852

Changes

  • Adds crawl API for sync/async crawls that tracks status in a new s3_crawl table in the database.
  • Crawls will list all objects in the bucket and ingest records based on that, updating record metadata and tags.
  • This PR also contains some cleanup of mockall #[double] usage, which was conflicting with aws-smithy-mocks-experimental
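As a rough illustration of the crawl flow described in the list above, here is a minimal std-only sketch: it pages through a bucket listing, ingests one record per key, and records the crawl status before and after. Every name in it (`CrawlStatus`, `ListObjects`, `Store`, `crawl`, `FixedPages`) is hypothetical and is not taken from the filemanager code; the in-memory `Store` merely stands in for the database and its `s3_crawl` table, and the status variants are assumptions.

```rust
use std::collections::HashMap;

/// Status of a crawl, as it might be stored in the `s3_crawl` table.
/// The variant names are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CrawlStatus {
    InProgress,
    Completed,
    Failed,
}

/// One page of a bucket listing, plus the token for the next page, if any.
pub struct Page {
    pub keys: Vec<String>,
    pub next_token: Option<String>,
}

/// Hypothetical abstraction over a paginated bucket listing.
pub trait ListObjects {
    fn list(&self, bucket: &str, token: Option<&str>) -> Result<Page, String>;
}

/// In-memory stand-in for the database: crawl status per bucket and
/// the (bucket, key) records ingested so far.
#[derive(Default)]
pub struct Store {
    pub crawls: HashMap<String, CrawlStatus>,
    pub records: Vec<(String, String)>,
}

/// Lists every page of the bucket, ingests each key, and tracks the status.
pub fn crawl(lister: &dyn ListObjects, store: &mut Store, bucket: &str) -> CrawlStatus {
    store.crawls.insert(bucket.to_string(), CrawlStatus::InProgress);

    let mut token: Option<String> = None;
    let status = loop {
        match lister.list(bucket, token.as_deref()) {
            Ok(page) => {
                for key in page.keys {
                    store.records.push((bucket.to_string(), key));
                }
                match page.next_token {
                    // More pages remain: continue from the returned token.
                    Some(next) => token = Some(next),
                    None => break CrawlStatus::Completed,
                }
            }
            // Any listing error ends the crawl and is recorded as a failure.
            Err(_) => break CrawlStatus::Failed,
        }
    };

    store.crawls.insert(bucket.to_string(), status);
    status
}

/// Test double serving fixed pages; the token is the index of the next page.
pub struct FixedPages(pub Vec<Vec<String>>);

impl ListObjects for FixedPages {
    fn list(&self, _bucket: &str, token: Option<&str>) -> Result<Page, String> {
        let index: usize = match token {
            Some(t) => t.parse().map_err(|_| format!("bad token: {}", t))?,
            None => 0,
        };
        let keys = self.0.get(index).cloned().ok_or("page out of range")?;
        let next_token = (index + 1 < self.0.len()).then(|| (index + 1).to_string());
        Ok(Page { keys, next_token })
    }
}
```

Putting the listing behind a trait keeps the status bookkeeping testable without a real bucket; a real implementation would presumably also update record metadata and tags, which this sketch leaves out.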

The async API isn't working yet because HttpApi doesn't support async lambda invocations. I'm happy with the PR for now and I'll add that in another PR.

I can either change the API to a RestApi or re-invoke the Lambda function from within the existing HttpApi endpoint and detach the execution. I'm not sure if there is a preference to use RestApi with any of the other microservices?
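To make the second option concrete, here is a minimal std-only sketch of the "re-invoke and detach" idea: the HTTP handler asks Lambda to invoke the same function with the `Event` invocation type, which queues the request and returns without waiting for it to finish, and then answers the caller with 202 straight away. The `Invoker` trait, `start_async_crawl`, the payload shape, and the response bodies are all hypothetical; only the `RequestResponse` and `Event` invocation type names come from the Lambda Invoke API.

```rust
/// The two Lambda invocation types relevant here: `RequestResponse` waits
/// for the result, `Event` queues the invocation and returns immediately.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InvocationType {
    RequestResponse,
    Event,
}

/// Hypothetical abstraction over "invoke a Lambda function", so that the
/// handler logic can be exercised without an AWS client.
pub trait Invoker {
    fn invoke(
        &mut self,
        function: &str,
        kind: InvocationType,
        payload: &str,
    ) -> Result<(), String>;
}

/// Minimal HTTP response used by this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct HttpResponse {
    pub status: u16,
    pub body: String,
}

/// Handles an async crawl request by re-invoking the function itself with
/// `Event`, then responding without waiting for the crawl to complete.
pub fn start_async_crawl(
    invoker: &mut dyn Invoker,
    own_function_name: &str,
    bucket: &str,
) -> HttpResponse {
    // Assumed payload shape; bucket names cannot contain quotes, so the
    // value is not escaped in this sketch.
    let payload = format!("{{\"crawl\":{{\"bucket\":\"{}\"}}}}", bucket);

    match invoker.invoke(own_function_name, InvocationType::Event, &payload) {
        Ok(()) => HttpResponse {
            status: 202,
            body: "crawl accepted".to_string(),
        },
        Err(err) => HttpResponse {
            status: 500,
            body: format!("failed to start crawl: {}", err),
        },
    }
}

/// Test double that records every invocation it receives.
#[derive(Default)]
pub struct RecordingInvoker {
    pub calls: Vec<(String, InvocationType, String)>,
}

impl Invoker for RecordingInvoker {
    fn invoke(
        &mut self,
        function: &str,
        kind: InvocationType,
        payload: &str,
    ) -> Result<(), String> {
        self.calls.push((function.to_string(), kind, payload.to_string()));
        Ok(())
    }
}
```

With this shape the existing HttpApi endpoint stays in place, and the detached invocation would report progress through the crawl status stored in the database rather than through the HTTP response.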

…emanager-crawl

# Conflicts:
#	lib/workload/stateless/stacks/filemanager/filemanager/src/database/entities/sea_orm_active_enums.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/database/mod.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/events/aws/collecter.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/events/aws/mod.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/handlers/aws.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/queries/list.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/queries/mod.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/routes/filter/mod.rs
#	lib/workload/stateless/stacks/filemanager/filemanager/src/routes/openapi.rs
@mmalenic mmalenic added feature New feature filemanager an issue relating to the filemanager labels Feb 12, 2025
@mmalenic mmalenic requested a review from victorskl February 12, 2025 23:55
@mmalenic mmalenic self-assigned this Feb 12, 2025
@mmalenic mmalenic requested a review from brainstorm February 12, 2025 23:57
@victorskl
Member

.. will review soon in arvo!

@victorskl victorskl left a comment

LGTM

@victorskl
Member

> I'm not sure if there is a preference to use RestApi with any of the other microservices?

Give RestApi a try if you really need to. Do watch out for the Cognito authorizer, though this might work nowadays. You might lose the built-in JWT auth capability, but could work around that with a Lambda authorizer (though, tbh, I'm a bit less in favour of that because of the overhead it adds).

@victorskl
Member

Or, check with @raylrui on WebSocket - ditto #710 Fancy!

@mmalenic mmalenic added this pull request to the merge queue Feb 13, 2025
Merged via the queue into main with commit 709c07d Feb 13, 2025
6 checks passed
@mmalenic mmalenic deleted the feat/filemanager-crawl branch February 13, 2025 03:53
@alexiswl alexiswl mentioned this pull request Feb 21, 2025
Labels
feature New feature filemanager an issue relating to the filemanager
Development

Successfully merging this pull request may close these issues.

filemanager: add crawl function
3 participants