caching of invalid profiles #938
Comments
You bring up a valid case. I would suggest that the correct thing to do is not to disable caching of the instances, but to catch the "token expired" exceptions and in that case remake the instances. You might find it useful that there is already some machinery for this. The cache itself could also be made smarter, such that instances only last a certain while - that would be relatively easy to code.
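A minimal sketch of that catch-and-remake pattern, along the lines of what ends up working further down this thread (the helper name, `regenerate_keys` hook, profile name, path, and the choice of `PermissionError` are all illustrative, not part of s3fs):

```python
import s3fs

# Hypothetical helper (not part of s3fs): retry a listing after credentials for
# the profile have been (re)generated, remaking the cached instance rather than
# disabling instance caching altogether.
def ls_with_retry(path, profile="my-profile", regenerate_keys=None):
    fs = s3fs.S3FileSystem(profile=profile)
    try:
        return fs.ls(path)
    except PermissionError:
        if regenerate_keys is not None:
            regenerate_keys()                        # e.g. write fresh keys to ~/.aws/credentials
        s3fs.S3FileSystem.clear_instance_cache()     # drop the stale cached instances
        fs = s3fs.S3FileSystem(profile=profile)      # build a fresh one with the new credentials
        return fs.ls(path)
```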
Thanks for this information! Two follow-up questions I have: is remaking the session something I can do on an existing instance — oh, maybe that is what `connect(refresh=True)` is for? And is there a way to kick just one instance out of the cache, rather than clearing all of them?
Yes! I had forgotten you can call it like this; it's exactly what I meant by remaking the session. Note that you might call _set_session if in async mode, and I don't suppose there's any thread-safety guarantee here.
Yes. There is no documented way to kick out one specific instance.
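Put together, the knobs discussed so far look roughly like this (a sketch based only on the calls mentioned in this thread; the profile name is a placeholder):

```python
import s3fs

fs = s3fs.S3FileSystem(profile="my-profile")

# Refresh one live instance in place: rebuilds its client from the original kwargs.
fs.connect(refresh=True)

# Or drop every cached instance, for all kwargs combinations, so that the next
# S3FileSystem(...) call constructs a fresh object.
s3fs.S3FileSystem.clear_instance_cache()

# There is no public call to evict only the instance for one specific profile.
```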
I tried messing around with this a bit; still not sure it's exactly what I was hoping for.

```python
import s3fs

# i started out by connecting to a profile that did not yet exist
# my plan was to try "reconnecting" after creating the profile
s3 = s3fs.S3FileSystem(profile="my-profile")
s3.ls("s3://my-bucket/my-prefix")
# fails because the profile does not exist; traceback points into s3fs's credential setup

# add the profile to ~/.aws/credentials and ~/.aws/config
s3.ls("s3://my-bucket/my-prefix")
# of course returns same error due to cached values

# so let's try reconnecting
s3.connect(refresh=True)
s3.ls("s3://my-bucket/my-prefix")
# unfortunately still same error

s3.clear_instance_cache()
s3.ls("s3://my-bucket/my-prefix")
# still same error

# restarting kernel
# now the profile exists (but intentionally gave it bad keys)
s3 = s3fs.S3FileSystem(profile="my-profile")
s3.ls("s3://my-bucket/my-prefix")
# PermissionError: The AWS access key Id you provided does not exist in our records.

s3.clear_instance_cache()
s3.connect(refresh=True)
# did not fix it

s3.clear_instance_cache()
s3 = s3fs.S3FileSystem(profile="my-profile")
s3.ls("s3://my-bucket/my-prefix")
# did fix it, but seems less ideal because I break all my instances instead of
# reconnecting just one, and I need to reinitialize the instance
```
I should clarify: the instance cache only makes a difference when making new instances, i.e., calling S3FileSystem() directly or via fsspec.open or fsspec.filesystem. It won't affect instances you have as local variables. Within a session, calling `connect(refresh=True)` should really fix it. It remakes all of the kwargs from initial values and makes new client objects, just like a new instance would, so we should figure out why it's not doing what you require.
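A small illustration of that distinction, assuming the default (cachable) behaviour; the profile name is a placeholder:

```python
import s3fs

a = s3fs.S3FileSystem(profile="my-profile")
b = s3fs.S3FileSystem(profile="my-profile")
assert a is b  # identical kwargs -> same cached instance

s3fs.S3FileSystem.clear_instance_cache()
c = s3fs.S3FileSystem(profile="my-profile")
assert c is not a  # only *new* constructions see the cleared cache

# `a` still exists as a local variable and still holds its old clients; to
# refresh it in place you would call a.connect(refresh=True) rather than
# touching the cache (as described above; the rest of the thread shows a
# wrinkle with the cached session).
```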
Thanks for the tips! I looked at lines 491 to 502 (at commit 023aecf). I am only instantiating my object with a profile. Then I saw lines 525 to 526 (at 023aecf). In my code, after creating an instance that gets the error, I tried:

```python
s3.session = None
s3.connect(refresh=True)
```

Then it worked! Would it make sense for `refresh=True` to also reset `self.session`?
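Wrapped up, that workaround looks like this (the helper name is made up for illustration; `session` and `connect(refresh=True)` are used exactly as in the snippet above):

```python
# Hypothetical helper, not part of s3fs: reset the stale session on an existing
# instance and rebuild its clients from the instance's original init kwargs.
def refresh_credentials(fs):
    fs.session = None           # drop the old session so it gets recreated
    fs.connect(refresh=True)    # remake clients from the initial kwargs

# usage, after fixing ~/.aws/credentials for the profile:
# refresh_credentials(s3)
# s3.ls("s3://my-bucket/my-prefix")
```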
You are quite right, the session should also be reset when `refresh=True` is passed.
Awesome, may I submit a PR with that change? I might add a bit to the docstring as well, if that is okay.
Yes please!
When I am writing functions that use s3fs, I find myself using `skip_instance_cache` a lot. Doing so seems to noticeably worsen performance, which makes sense. The reason I am skipping caching of the instances is that, with the cache, I have found that if I hit certain issues (like `ProfileNotFound` in `~/.aws/credentials`, or `PermissionError`), then even if I subsequently generate the credentials, s3fs will continue to throw the same errors. Ultimately I have to restart my kernel (I think this is the solution) in order to reset things.

It would be nice if s3fs automatically removed profiles from the cache if they get some kind of access / permission error like the above. Alternatively, or in addition, it would be cool if `s3fs.S3FileSystem()` had a `check` argument that could (optionally) test the credentials during instantiation, for example by running `aws sts get-caller-identity`, per https://stackoverflow.com/a/66874437/9244371 (see the sketch below).

More context on the problem I'm trying to solve: in my environment, I have to generate new keys on a regular basis that provide access to a particular bucket with a known profile name. I will write functions that other users will run to interact with certain buckets (so I know the profile). I skip instance caching because I don't yet know whether they have generated valid keys; I can then catch some exception, generate keys, and try again.
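A sketch of what such a check could do, using boto3's STS client directly (the same call as `aws sts get-caller-identity`); the function name and the exact exceptions caught are illustrative, not an s3fs API:

```python
import boto3
import botocore.exceptions

def credentials_work(profile):
    """Return True if the profile exists and its keys are accepted by AWS STS."""
    try:
        session = boto3.Session(profile_name=profile)
        session.client("sts").get_caller_identity()
        return True
    except (botocore.exceptions.ProfileNotFound,
            botocore.exceptions.NoCredentialsError,
            botocore.exceptions.ClientError):
        return False

# e.g. regenerate keys first if not credentials_work("my-profile"),
# then create the S3FileSystem.
```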