Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-75056] Upgrade pac4j to version 6.1.0 #491

Conversation

pankajy-dev
Copy link

Testing done

image

The plugin is compiling and passing all the tests.

Jenkins issue 75056

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@pankajy-dev pankajy-dev requested a review from a team as a code owner December 23, 2024 10:12
}

/**
* This method is needed as there seems to be a bug in pac4j and hasChanged is not able to return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious: Is there an official bug report link? Maybe helpful to add this here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is needed

Is this statement true? It was not needed in #455.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pankajy-dev, @basil is expecting a more detailed explanation.

If we configure the plugin to issue the ID Token with a nonce, the refresh token shouldn't have nonce, and if present it should be the same as in the ID token. However pac4j expects that the refresh token contains the nonce and considers it should be new. That's the bug reported to pac4j.
When the plugin is calling pac4j to refresh the token and the nonce is in use from the doCommenceLogin, this method was always returning false (sincerely, I don't recall why as I checked close to 3 months ago), making the process halt. Once the bug is fixed the parent class method should work again, so this override can be removed.

It was not needed in #455.

it might be for 3 potential reasons:

  1. Either the pac4j bug is fixed in pac4j-6.1.0, what I don't think it's the case
  2. Or the differences between pac4j-6.0.10 and 6.1.0 makes this method not needed anymore (this override comes from one of my initial commits, that I tested using keycloak and AWS Cognito)
  3. Or it's because a workaround we had to introduce (and that's my suspicious)

The workaround is

/**
* Flag when set to true will cause enforce nonce checking in the refresh flow.
* https://openid.net/specs/openid-connect-core-1_0.html#RefreshTokenResponse states the nonce claim should not be present
* and when faced with a provider that adheres to this if using a nonce, the library attempts to validate the "missing" nonce and fails.
* So this is disabled by default, but if the provider does send the nonce in the claim then we do need to verify it.
* But there is no way to know ahead of time if the server is going to send this or not.
*/
private static boolean checkNonceInRefreshFlow =
SystemProperties.getBoolean(OicSecurityRealm.class.getName() + ".checkNonceInRefreshFlow", false);
and the tests (including the ATH as it's using keycloack through test containers) are executed with the workaround active.

Copy link
Member

@basil basil Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be for 3 potential reasons:

  1. Upgrade to pac4j 6.x and migrate to EE 9 #455 is also using 6.1.0, so this reasoning is invalid.
  2. Upgrade to pac4j 6.x and migrate to EE 9 #455 is also using 6.1.0, so this reasoning is invalid.
  3. After removing checkNonceInRefreshFlow workaround you mentioned and retesting Upgrade to pac4j 6.x and migrate to EE 9 #455 in ATH, everything still works fine, so this reasoning is invalid.

sincerely, I don't recall why as I checked close to 3 months ago

I suggest running #455 in a debugger and stepping through the call to opMetadataResolver.init() before you continue this conversation.

Is there a valid reason why this hasChanged workaround is needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the ATH doesn't configure the use of nonce (because of the bug mainly), so the test you did won't catch the issue

I suggest running #455 in a debugger and stepping through the call to opMetadataResolver.init() before you continue this conversation.

I have a counteroffer: I suggest running #455 configuring the nonce in a debugger and stepping through the call before you continue this conversation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the ATH doesn't configure the use of nonce (because of the bug mainly), so the test you did won't catch the issue

I forced the use of a nonce with basil@7a79db4 against #455 and ATH is still passing, so as far as I can tell #455 is passing this scenario even without the hasChanged workaround (and therefore the hasChanged workaround is unnecessary). As far as I can tell, #455 is passing all test scenarios (including manual nonce testing) without the hasChanged workaround. If there is some legitimate test scenario where #455 is failing, then I would like to know how to reproduce the problem so that I can better understand the hasChanged workaround in this PR. As of now I am still not fully following the reasoning you have provided above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran an exhaustive set of 12 tests with the Keycloak Docker image, testing token refresh in a variety of nonce scenarios:

Test 1

Trunk: checkNonceInRefreshFlow = false
Result: Refresh succeeded

Test 2

Trunk: checkNonceInRefreshFlow = true
Actual: Refresh failed

Test 3

PR 491: checkNonceInRefreshFlow = false, hasChanged workaround present, init fix absent
Result: Refresh succeeded

Test 4

PR 491: checkNonceInRefreshFlow = true, hasChanged workaround present, init fix absent
Result: Refresh failed

Test 5

PR 491: checkNonceInRefreshFlow = false, hasChanged workaround absent, init fix absent
Result: Initial login (not refresh!) fails with NullPointerException: Cannot invoke com.nimbusds.openid.connect.sdk.op.OIDCProviderMetadata.getAuthorizationEndpointURI() because the return value of org.pac4j.oidc.metadata.OidcOpMetadataResolver.load() is null

Test 6

PR 491: checkNonceInRefreshFlow = true, hasChanged workaround absent, init fix absent
Result: Initial login (not refresh!) fails with NullPointerException: Cannot invoke com.nimbusds.openid.connect.sdk.op.OIDCProviderMetadata.getAuthorizationEndpointURI() because the return value of org.pac4j.oidc.metadata.OidcOpMetadataResolver.load() is null

Test 7

PR 491: checkNonceInRefreshFlow = false, hasChanged workaround present, init fix present
Result: Refresh succeeded

Test 8

PR 491: checkNonceInRefreshFlow = true, hasChanged workaround present, init fix present
Result: Refresh failed

Test 9

PR 491: checkNonceInRefreshFlow = false, hasChanged workaround absent, init fix present
Result: Refresh succeeded

Test 10

PR 491: checkNonceInRefreshFlow = true, hasChanged workaround absent, init fix present
Result: Refresh failed

Test 11

PR 455, checkNonceInRefreshFlow = false
Result: Refresh succeeded

Test 12

PR 455, checkNonceInRefreshFlow = true
Result: Refresh failed

Analysis

Tests 1 and 2 establish a baseline of current behavior on trunk. Refresh succeeds, but only when checkNonceInRefreshFlow = false. On trunk and in this test environment, refresh fails as expected when checkNonceInRefreshFlow = true.

It has been claimed in #491 (comment) that the hasChanged workaround is needed to allow refresh to succeed when checkNonceInRefreshFlow = true. Were this claim true, refresh would have succeeded in tests 4, 6, and 8. Refresh failed in tests 4 and 8, so the claim is false. Initial login failed in tests 5 and 6, consistent with the claim being false.

As I have been saying, the problem has been misdiagnosed. The hasChanged code is a workaround for something, but the reasoning given in #491 (comment) is invalid, as demonstrated by tests 3 and 4. Rather, the hasChanged code is working around a more basic issue, revealed in tests 5 and 6. A cleaner solution for that issue is my init fix, which restores the status quo (without regression!) both when the hasChanged workaround is present (tests 7 and 8) as well as when the hasChanged workaround is absent (tests 9 and 10). Tests 11 and 12 are a duplicate of tests 9 and 10 against PR 455, demonstrating that my original solution from over two months ago in PR 455 remains bug-free relative to the status quo on trunk.

My recommendations remain the same:

  • Remove the unnecessary hasChanged workaround in favor of my simpler and more maintainable init fix. Since this is the primary difference between PR 491 and PR 455, this suggestion is tantamount to closing PR 491 in favor of PR 455.
  • In lieu of coming up with a second counteroffer, run PR 455 in a debugger and step through the call to opMetadataResolver.init() to understand the root cause of the issue.
  • Review PR 455, either demonstrating a reproducible problem with my init approach or otherwise approving PR 455.

}

/**
* This method is needed as there seems to be a bug in pac4j and hasChanged is not able to return true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fcojfernandez fcojfernandez changed the title Jenkins 75056 pac4j upgrade 6.1.0 [JENKINS-75056] Upgrade pack4j to version 6.1.0 Dec 23, 2024
… like jackson lib coming from jackson2 plugin
@pankajy-dev pankajy-dev changed the title [JENKINS-75056] Upgrade pack4j to version 6.1.0 [JENKINS-75056] Upgrade pac4j to version 6.1.0 Dec 23, 2024
…solver.java

Co-authored-by: Francisco Javier Fernandez <31063239+fcojfernandez@users.noreply.github.com>
@eva-mueller-coremedia
Copy link
Contributor

I wonder if this PR super-seeds #455?

@pankajy-dev
Copy link
Author

CI is failing as pom uses 5.3 version of plugin-pom which needs min maven 3.9.6

I think CI needs an update to use the updated Maven version.

@michael-doubez
Copy link
Contributor

Depends on #485

@pankajy-dev
Copy link
Author

There is another PR for same changes with passing CI, therefore this PR will be closed in favor of #455

@pankajy-dev pankajy-dev marked this pull request as draft January 3, 2025 12:11
@fcojfernandez fcojfernandez mentioned this pull request Jan 3, 2025
6 tasks
@fcojfernandez
Copy link
Member

fcojfernandez commented Jan 3, 2025

@pankajy-dev this PR should unblock you #496

Copy link
Member

@basil basil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

codecov bot commented Jan 8, 2025

Codecov Report

Attention: Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 72.29%. Comparing base (41f98d9) to head (7076e2f).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
...va/org/jenkinsci/plugins/oic/OicSecurityRealm.java 97.05% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #491      +/-   ##
============================================
+ Coverage     71.73%   72.29%   +0.56%     
- Complexity      222      228       +6     
============================================
  Files            17       18       +1     
  Lines          1033     1054      +21     
  Branches        148      148              
============================================
+ Hits            741      762      +21     
  Misses          201      201              
  Partials         91       91              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pankajy-dev pankajy-dev marked this pull request as ready for review January 8, 2025 10:07
@pankajy-dev pankajy-dev requested a review from basil January 8, 2025 10:33
@pankajy-dev
Copy link
Author

pankajy-dev commented Jan 8, 2025

There is another PR for same changes with passing CI, therefore this PR will be closed in favour of #455

Since the security assessment is already done here we will be using this PR.

@@ -533,6 +541,8 @@ private OidcConfiguration buildOidcConfiguration() {
conf.setResourceRetriever(getResourceRetriever());
if (this.isPkceEnabled()) {
conf.setPkceMethod(CodeChallengeMethod.S256);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Set PKCE state in any case?

if (this.isPkceEnabled()) {
    conf.setPkceMethod(CodeChallengeMethod.S256);
    conf.setDisablePkce(false);
} else {
     conf.setPkceMethod(ull);
     conf.setDisablePkce(true);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might make sense

@@ -0,0 +1,37 @@
package org.jenkinsci.plugins.oic;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file seem to need to be formatted: With mvn spotless:apply?

@@ -1375,8 +1399,9 @@ public boolean handleTokenExpiration(HttpServletRequest httpRequest, HttpServlet
return true;
}

private void redirectToLoginUrl(HttpServletRequest req, HttpServletResponse res) throws IOException {
if (req != null && (req.getSession(false) != null || Strings.isNullOrEmpty(req.getHeader("Authorization")))) {
private void redirectToLoginUrl(@NonNull HttpServletRequest req, @NonNull HttpServletResponse res)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the annotations? they contradict the method body. For example, if res is NonNull then

if (res != null) {
            res.sendRedirect(Jenkins.get().getSecurityRealm().getLoginUrl());
        }

will happen always

Copy link
Contributor

@eva-mueller-coremedia eva-mueller-coremedia Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be based on a not so glorious comment by myself: #491 (comment)

I wanted to say: either tighten the contract by adding the @NonNull annotation (then there is no need for the null check) or leave the null check...

Sorry for not be more clear in my previous comment.

Copy link
Member

@basil basil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fcojfernandez
Copy link
Member

I've repeated the tests in #491 (comment) and now we can agree. I'm closing this PR in favor of #455 , that I'll approve after compare with this same PR.

Thanks @pankajy-dev for this PR!
Thanks @eva-mueller-coremedia @jtnord @jglick for participating. If you want to jump to the other PR you are more than welcome, the only difference is:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants