Faulty RemoteJwks first try will never be retried #7803

Closed
1 task
Ati59 opened this issue Feb 7, 2023 · 15 comments
Labels
Area: extauth Committed: 1.18 Prioritized Indicating issue prioritized to be worked on in RFE stream Type: Bug Something isn't working

Comments

@Ati59
Contributor

Ati59 commented Feb 7, 2023

Gloo Edge Version

1.13.x (latest stable)

Kubernetes Version

None

Describe the bug

If the RemoteJwks URL is not reachable when the extauth service launches, it logs a "failed to fetch JWKS" error but never retries fetching the JWKS, even if refreshInterval is set.

Then every call through a route using the AuthConfig will get a 403 error.

In the ext-auth logs:

{"level":"error","ts":"2023-02-07T13:30:10.094Z","caller":"jwks/utils.go:24","msg":"failed to fetch JWKS","version":"1.12.40","error":"Get \"http://172.18.2.2:8080/realms/master/protocol/openid-connect/certs\": dial tcp 172.18.2.2:8080: connect: connection refused","stacktrace":"github.com/solo-io/ext-auth-service/pkg/config/utils/jwks.FetchJwksWithClient\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/utils/jwks/utils.go:24\ngithub.com/solo-io/ext-auth-service/pkg/config/utils/jwks.FetchJwks\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/utils/jwks/utils.go:16\ngithub.com/solo-io/ext-auth-service/pkg/config/oauth/token_validation/jwt/jwks.NewRemoteJwksSource\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/oauth/token_validation/jwt/jwks/remote.go:71\ngithub.com/solo-io/ext-auth-service/pkg/config.(*authServiceFactory).NewOAuth2JwtAccessToken\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/factory.go:218\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*extAuthConfigTranslator).authConfigToService\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/translator.go:273\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*extAuthConfigTranslator).getConfigs\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/translator.go:98\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*extAuthConfigTranslator).Translate\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/translator.go:82\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*configGenerator).GenerateConfig\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/generator.go:86\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1.1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:116\ngithub.com/solo-io/gloo/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1.applyExtAuthConfig.func1\n\t/go/pkg/mod/github.com/solo-io/[email protected]/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1/ext_auth_discovery_service_xds.sk.go:111\ngithub.com/solo-io/solo-kit/pkg/api/v1/control-plane/client.(*client).Start\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/api/v1/control-plane/client/client.go:137\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:145\ngithub.com/solo-io/go-utils/contextutils.(*exponentialBackoff).Backoff\n\t/go/pkg/mod/github.com/solo-io/[email protected]/contextutils/backoff.go:70\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:154\ngithub.com/solo-io/ext-auth-service/pkg/server.Server.Run.func2\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/server/server.go:149"}
{"level":"error","ts":"2023-02-07T13:30:10.098Z","caller":"config/generator.go:114","msg":"Errors encountered while processing new server configuration","version":"1.12.40","error":"1 error occurred:\n\t* failed to get auth service for auth config with id [gloo-system.accesstoken-auth]; this configuration will be ignored: failed to fetch JWKS: Get \"http://172.18.2.2:8080/realms/master/protocol/openid-connect/certs\": dial tcp 172.18.2.2:8080: connect: connection refused\n\n","stacktrace":"github.com/solo-io/solo-projects/projects/extauth/pkg/config.(*configGenerator).GenerateConfig\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/generator.go:114\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1.1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:116\ngithub.com/solo-io/gloo/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1.applyExtAuthConfig.func1\n\t/go/pkg/mod/github.com/solo-io/[email protected]/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1/ext_auth_discovery_service_xds.sk.go:111\ngithub.com/solo-io/solo-kit/pkg/api/v1/control-plane/client.(*client).Start\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/api/v1/control-plane/client/client.go:137\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:145\ngithub.com/solo-io/go-utils/contextutils.(*exponentialBackoff).Backoff\n\t/go/pkg/mod/github.com/solo-io/[email protected]/contextutils/backoff.go:70\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:154\ngithub.com/solo-io/ext-auth-service/pkg/server.Server.Run.func2\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/server/server.go:149"}
{"level":"error","ts":1675776616.4879637,"logger":"ext-auth.ext-auth-service","msg":"Auth Server does not contain auth configuration with the given ID","version":"undefined","x-request-id":"b9568388-3cb7-4e1b-8b0d-135454e91eb4","requestContext":{"AuthConfigId":"gloo-system.accesstoken-auth","SourceType":"virtual_host","SourceName":"gloo-system.gateway-proxy-listener-::-8080-gloo-system_vs"},"stacktrace":"github.com/envoyproxy/go-control-plane/envoy/service/auth/v3._Authorization_Check_Handler.func1\n\t/go/pkg/mod/github.com/envoyproxy/[email protected]/envoy/service/auth/v3/external_auth.pb.go:699\ngithub.com/solo-io/go-utils/healthchecker.GrpcUnaryServerHealthCheckerInterceptor.func1\n\t/go/pkg/mod/github.com/solo-io/[email protected]/healthchecker/grpc.go:69\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1135\ngithub.com/solo-io/ext-auth-service/pkg/server.requestIdInterceptor.func1\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/server/logging.go:86\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1138\ngithub.com/grpc-ecosystem/go-grpc-middleware/logging/zap.UnaryServerInterceptor.func1\n\t/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/logging/zap/server_interceptors.go:31\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1138\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1140\ngithub.com/envoyproxy/go-control-plane/envoy/service/auth/v3._Authorization_Check_Handler\n\t/go/pkg/mod/github.com/envoyproxy/[email protected]/envoy/service/auth/v3/external_auth.pb.go:701\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1301\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1642\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:938"}

In the gateway-proxy logs:

[2023-02-07T13:30:59.213Z] "HEAD /get HTTP/1.1" 403 UAEX 0 0 52 - "-" "curl/7.85.0" "22916f1f-ce5e-4385-ad0d-55fcfea928f3" "httpbin.domain.local" "-"
[2023-02-07T13:31:00.414Z] "HEAD /get HTTP/1.1" 403 UAEX 0 0 11 - "-" "curl/7.85.0" "0e781a03-9441-491f-a4a7-685dd7475c38" "httpbin.domain.local" "-"
[2023-02-07T13:31:02.367Z] "HEAD /get HTTP/1.1" 403 UAEX 0 0 6 - "-" "curl/7.85.0" "c4c7eb49-473b-45a3-b601-7983f04d0d8a" "httpbin.domain.local" "-"

Steps to reproduce the bug

  1. Define an AuthConfig as follows:
apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
  name: accesstoken-auth
  namespace: gloo-system
spec:
  configs:
    - oauth2:
        accessTokenValidation:
          jwt:
            remoteJwks:
              url: ${KEYCLOAK_URL}/realms/master/protocol/openid-connect/certs
              refreshInterval: "10"

And a VirtualService that uses it:

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
      - '*'
    routes:
      - matchers:
          - prefix: /
        routeAction:
          single:
            upstream:
              name: httpbin-httpbin-8000
              namespace: gloo-system
    options:
      extauth:
        configRef:
          name: accesstoken-auth
          namespace: gloo-system
  2. Make Keycloak unavailable (I changed the Service selector so that all endpoints drop out of the Service)
  3. Restart the ext-auth service; it should fail to get the JWKS, and you should see a "failed to fetch JWKS" error in the ext-auth logs
  4. Wait for refreshInterval
  5. Curl your exposed app; you should get UAEX errors on the gateway-proxy pod and "Auth Server does not contain auth configuration with the given ID" errors on the ext-auth one
    (6. If you make Keycloak available again and then restart the ext-auth pod, it will fix the issue)

Expected Behavior

The ext-auth pod should retry fetching the JWKS based on the refreshInterval value, so that authentication can succeed and requests end up with a 200 without having to restart the ext-auth pod.
When the first try is faulty, it looks like the "refresh loop" is not launched at all.

Additional Context

No response

Related Issues

┆Issue is synchronized with this Asana task by Unito

@Ati59 Ati59 added the Type: Bug Something isn't working label Feb 7, 2023
@SantoDE
Contributor

SantoDE commented Feb 9, 2023

This might be a dupe

@Ati59
Contributor Author

Ati59 commented Feb 10, 2023

We have a KCS article on zendesk for this. It is probably the "dupe" ;)

@nfuden nfuden self-assigned this Jul 10, 2023
@sam-heilbron
Contributor

Similar issue: #7528

@sam-heilbron
Contributor

sam-heilbron commented Sep 13, 2023

@DuncanDoyle
Contributor

DuncanDoyle commented Mar 1, 2024

Can still reproduce this on GE 1.15.14. The problem seems to be that when the ExtAuth server loads with an AuthConfig that points to a JWKS endpoint that is not available/reachable, the config does not get loaded. And when the config does not get loaded, the refreshInterval does not get triggered at all. So the only way to get out of that situation is to reload the AuthConfig or restart the ExtAuth server.

{"level":"info","ts":"2024-03-01T10:22:17.510Z","caller":"runner/xds.go:115","msg":"{\"auth_config_ref_name\":\"gloo-system.oauth-auth\",\"configs\":[{\"AuthConfig\":{\"Oauth2\":{\"OauthType\":{\"AccessTokenValidationConfig\":{\"ValidationType\":{\"Jwt\":{\"JwksSourceSpecifier\":{\"RemoteJwks\":{\"url\":\"http://keycloak.example.com/realms/master/protocol/openid-connect/certs\",\"refresh_interval\":{\"seconds\":10}}}}},\"ScopeValidation\":null}}}}}]}","version":"1.15.14"}
{"level":"error","ts":"2024-03-01T10:22:17.539Z","caller":"jwks/utils.go:24","msg":"failed to fetch JWKS","version":"1.15.14","error":"request failed with code 503 Service Unavailable","stacktrace":"github.com/solo-io/ext-auth-service/pkg/config/utils/jwks.FetchJwksWithClient\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/utils/jwks/utils.go:24\ngithub.com/solo-io/ext-auth-service/pkg/config/utils/jwks.FetchJwks\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/utils/jwks/utils.go:16\ngithub.com/solo-io/ext-auth-service/pkg/config/oauth/token_validation/jwt/jwks.NewRemoteJwksSource\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/oauth/token_validation/jwt/jwks/remote.go:71\ngithub.com/solo-io/ext-auth-service/pkg/config.(*authServiceFactory).NewOAuth2JwtAccessTokenAuthService\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/config/factory.go:318\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*extAuthConfigTranslator).authConfigToService\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/translator.go:310\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*extAuthConfigTranslator).getConfigs\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/translator.go:103\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*extAuthConfigTranslator).Translate\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/translator.go:87\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/config.(*configGenerator).GenerateConfig\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/generator.go:86\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1.1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:121\ngithub.com/solo-io/gloo/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1.applyExtAuthConfig.func1\n\t/go/pkg/mod/github.com/solo-io/[email protected]/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1/ext_auth_discovery_service_xds.sk.go:111\ngithub.com/solo-io/solo-kit/pkg/api/v1/control-plane/client.(*client).Start\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/api/v1/control-plane/client/client.go:137\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:148\ngithub.com/solo-io/go-utils/contextutils.(*exponentialBackoff).Backoff\n\t/go/pkg/mod/github.com/solo-io/[email protected]/contextutils/backoff.go:70\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:157\ngithub.com/solo-io/ext-auth-service/pkg/server.Server.Run.func3\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/server/server.go:160"}
{"level":"error","ts":"2024-03-01T10:22:17.541Z","caller":"config/generator.go:114","msg":"Errors encountered while processing new server configuration","version":"1.15.14","error":"1 error occurred:\n\t* failed to get auth service for auth config with id [gloo-system.oauth-auth]; this configuration will be ignored: failed to fetch JWKS: request failed with code 503 Service Unavailable\n\n","stacktrace":"github.com/solo-io/solo-projects/projects/extauth/pkg/config.(*configGenerator).GenerateConfig\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/config/generator.go:114\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1.1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:121\ngithub.com/solo-io/gloo/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1.applyExtAuthConfig.func1\n\t/go/pkg/mod/github.com/solo-io/[email protected]/projects/gloo/pkg/api/v1/enterprise/options/extauth/v1/ext_auth_discovery_service_xds.sk.go:111\ngithub.com/solo-io/solo-kit/pkg/api/v1/control-plane/client.(*client).Start\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/api/v1/control-plane/client/client.go:137\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run.func1\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:148\ngithub.com/solo-io/go-utils/contextutils.(*exponentialBackoff).Backoff\n\t/go/pkg/mod/github.com/solo-io/[email protected]/contextutils/backoff.go:70\ngithub.com/solo-io/solo-projects/projects/extauth/pkg/runner.(*configSource).Run\n\t/go/src/github.com/solo-io/solo-projects/projects/extauth/pkg/runner/xds.go:157\ngithub.com/solo-io/ext-auth-service/pkg/server.Server.Run.func3\n\t/go/pkg/mod/github.com/solo-io/[email protected]/pkg/server/server.go:160"}

Note that when you bring down Keycloak after the AuthConfig has already been loaded, the ExtAuth server will start giving errors that it can't refresh, but when you bring Keycloak back up again, it's able to refresh again.

The main question seems to be if we want to accept AuthConfigs that point to non-reachable endpoints. We can't really determine whether the AuthConfig is incorrect, or whether there is an issue with the target endpoint.

@DuncanDoyle
Contributor

@nfuden
Contributor

nfuden commented Mar 5, 2024

We should have a first-time startup version of extauth that forces AuthConfigs to keep retrying rather than fail as they normally would when we are applying new configuration.

@htpvu htpvu added the Prioritized Indicating issue prioritized to be worked on in RFE stream label Mar 6, 2024
@sync-by-unito sync-by-unito bot added the Spike label Mar 8, 2024

sync-by-unito bot commented Apr 11, 2024

➤ Hanh Vu commented:

ETA of 4/12 for design review.
Implementation ETA is unknown.

@sheidkamp
Contributor

sheidkamp commented Apr 16, 2024

Outcome of design review was that the ideal approach for handling the situation where the auth service is updated with a non-responding JWKs URL is to keep the new AuthService in a pending state until it can retrieve the URLs.

This requires changes in how we generate/translate/communicate the new AuthConfigs. The plan is to implement these structural changes in a separate PR and then add the JWKs specific changes on top of that.

This will require 3 rounds of PRs in the main branches:

  • solo-projects (SP1) - refactor event loop and introduce "pending" state
  • ext-auth-service (EXT1) - introduce mechanism to allow AuthServices to communicate readiness and implement it for Remote JWKs/ refactor Remote JWKs to (conditionally?) not fail on startup when the fetch fails
  • solo-projects (SP2) - changes to use the new ext-auth-service readiness functionality
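
The "pending" state and readiness mechanism described above could look roughly like the Go sketch below. All names (authService, configStore, jwksService) are hypothetical and the real Generator/Translator interfaces are not shown; locking is omitted for brevity. The sketch also assumes that a previously active service keeps serving while its replacement is pending, which the design notes do not spell out.

```go
package main

import "fmt"

// authService is a hypothetical, reduced view of an auth service that can
// report whether it has what it needs (for example, a fetched JWKS).
type authService interface {
	Ready() bool
}

// configStore keeps ready services in active and not-yet-ready ones in pending.
type configStore struct {
	active  map[string]authService
	pending map[string]authService
}

func newConfigStore() *configStore {
	return &configStore{
		active:  map[string]authService{},
		pending: map[string]authService{},
	}
}

// Apply registers svc under id. A service that is not ready is parked in
// pending instead of being dropped; any already active entry is left alone.
func (c *configStore) Apply(id string, svc authService) {
	if svc.Ready() {
		c.active[id] = svc
		delete(c.pending, id)
		return
	}
	c.pending[id] = svc
}

// PromoteReady moves every pending service that has become ready to active.
func (c *configStore) PromoteReady() {
	for id, svc := range c.pending {
		if svc.Ready() {
			c.active[id] = svc
			delete(c.pending, id)
		}
	}
}

// Lookup returns the service currently allowed to serve requests for id.
func (c *configStore) Lookup(id string) (authService, bool) {
	svc, ok := c.active[id]
	return svc, ok
}

// jwksService is a toy service that becomes ready once its JWKS is fetched.
type jwksService struct{ fetched bool }

func (j *jwksService) Ready() bool { return j.fetched }

func main() {
	store := newConfigStore()
	svc := &jwksService{}
	store.Apply("gloo-system.accesstoken-auth", svc)
	_, ok := store.Lookup("gloo-system.accesstoken-auth")
	fmt.Println("served before JWKS fetch:", ok)

	svc.fetched = true // simulate a later successful fetch
	store.PromoteReady()
	_, ok = store.Lookup("gloo-system.accesstoken-auth")
	fmt.Println("served after JWKS fetch:", ok)
}
```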

@sheidkamp
Contributor

sheidkamp commented Apr 22, 2024

@DuncanDoyle - The changes needed to implement this are the type of structural changes that we usually don't like to implement in backports.

In this case we are making non-trivial modifications to the ExtAuth pod's xds event loop, and the alternative would involve breaking changes to the exported Generator or Translator interfaces that would normally only accompany a major version update. How big of an ask would it be to make these changes 1.17 only?

@kcbabo - tagging you too while Duncan is on vacation.


sync-by-unito bot commented Apr 25, 2024

➤ Nathan F Solo commented:

As there are some interstitial PRs, we are pushing out the final due date, Keith Babo.

@sheidkamp
Contributor

First solo-projects PR (SP1 from #7803 (comment)) merged, EXT1 in review.

@sheidkamp
Contributor

The Ext Auth changes have been merged, functional changes for the last PR are in place, spiffing up the e2e tests.

@sheidkamp
Contributor

This has been merged to solo-projects main, all PRs are complete, and the scenario in the reproducer is succeeding.

@nfuden
Contributor

nfuden commented May 6, 2024

Since this materially changes our extauth service's behavior, it has been merged to main and will not be backported to 1.15.
Therefore I have removed the 1.15 tag for now.


7 participants