-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(authz): Authorino for Service Mesh #784
feat(authz): Authorino for Service Mesh #784
Conversation
are we gonna have 605 for UI+workbench and this for kserve? for the overlapping part, should we have 605 in first then rebase for this one? |
The plan was to merge this first, and rebase 605. However, we are reviewing the direction. Hold-on for now. |
74a6f8a
to
1aa08d2
Compare
51c11a9
to
506c1c4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for skillfully refining #605 and contributing those parts to incubation, @israel-hdez! 😊
I've reflected on the (mostly mine) code from a broader time perspective and shared a few insights. It would be fantastic if you could help tidy up the areas where I've left a bit of a mess behind.
pkg/feature/templates/servicemesh/authorino/base/operator-cluster-wide-no-tls.tmpl
Outdated
Show resolved
Hide resolved
Added @aslakknutsen as a reviewer so he can double check if service mesh resources for kserve are in the right state. |
@zdtsw this PR depends on opendatahub-io/odh-model-controller#130 |
5ed909f
to
6d4e014
Compare
6d4e014
to
c1ef3dc
Compare
@bartoszmajsak Ready for another review round. |
pkg/feature/templates/servicemesh/kserve/kserve-predictor-authorizationpolicy.tmpl
Show resolved
Hide resolved
@@ -0,0 +1,18 @@ | |||
apiVersion: security.istio.io/v1beta1 | |||
kind: AuthorizationPolicy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ReToCode @skonto So, this AuthorizationPolicy
will allow to protect InferenceServices
behind auth (per user desire). The pod(s) of the ISVC
will be protected, with the exception of the two paths mentioned below.
Would this be OK for KNative? I saw the queue-proxy container in the pod and was wondering if there would be any issues with it, or something else. On my naive testings, everything worked fine, but I wanted to check with you about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only going to run on the main containerPorts traffic (default 8080) or also on the other ones?
For example we also have traffic on:
- admin port 8022
- http2 on 8013
- metrics ports 9090/9091
- /debug/* (if you enable profiling)
Anything I've forgotten @skonto?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be intercepting everything.
So, I would infer that the default port of 8080 would be used only if the main container does not specify a port, right? From KNative upstream docs, I see that you can set a containerPort
in the PodSpec... May that user-defined port be used transparently?? However, I see you mention 8013 for http2. Are apps required to use that for http2??? Just asking for my knowledge.
In any case, if the user can choose the port, that would be the motivation for the catch-all.
AFAIK, we can whitelist ports -- @aslakknutsen may confirm. That said, this is cluster-wide config. So, to minimize blackholes, we want to protect everything but ports/paths that would always work for infrastructure means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ports I mentioned are on the Queue-Proxy sidecar. Knative will call the ports depending on the traffic type. So the actual user container will run as 8080 (or as defined by the user), but we'll send traffic to Queue-Proxy ports first. But if the rules are not port-related, we need to focus on the list in #784 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we'll send traffic to Queue-Proxy ports first.
You mean that traffic will arrive first to queue-proxy's port 8080 for HTTP1, or 8013 for HTTP2; and later is forwarded to the user container to whatever port it is defined in its spec? Asking just for my knowledge.
I just updated the AuthorizationPolicy
(as noted in my other comment). Please, check if that would be OK for you :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean that traffic will arrive first to queue-proxy's port 8080 for HTTP1, or 8013 for HTTP2; and later is forwarded to the user container to whatever port it is defined in its spec? Asking just for my knowledge.
Exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything I've forgotten @skonto?
That is fine. To add some details.
// BackendHTTPPort is the backend, i.e. `targetPort` that we setup for HTTP/1 services.
BackendHTTPPort = 8012
// BackendHTTP2Port is the backend, i.e. `targetPort` that we setup for HTTP/2 services.
BackendHTTP2Port = 8013
// BackendHTTPSPort is the backend. i.e. `targetPort` that we setup for HTTPS services.
BackendHTTPSPort = 8112
// QueueAdminPort specifies the port number for
// health check and lifecycle hooks for queue-proxy.
QueueAdminPort = 8022
// AutoscalingQueueMetricsPort specifies the port number for metrics emitted
// by queue-proxy for autoscaler.
AutoscalingQueueMetricsPort = 9090
// UserQueueMetricsPort specifies the port number for metrics emitted
// by queue-proxy for end user.
UserQueueMetricsPort = 9091
BackendHTTPSPort is not being used with Istio, it is meant for internal encryption with Kourier.
For profiling if enabled:
// ProfilingPort specifies the default port where profiling data is available when profiling is enabled
ProfilingPort = 8008
BTW @zdtsw, this is alive again. AFAIK, there are still no plans to merge #605 within these timelines -- although @bartoszmajsak would correct me if I'm wrong. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few smaller things, but other than that /lgtm. Sorry for round-tripping with the naming, but it just occurred to me.
I will think of what kind of tests we should write, but we can work on this together as a separate PR.
pkg/feature/templates/servicemesh/authorino/base/operator-cluster-wide-no-tls.tmpl
Outdated
Show resolved
Hide resolved
@israel-hdez we keep it as an organ donor and a rebase torture for now ;) |
@zdtsw I just added it. |
config/crd/bases/dscinitialization.opendatahub.io_dscinitializations.yaml
Show resolved
Hide resolved
return k.removeServiceMeshConfigurations(dscispec) | ||
} | ||
|
||
func (k *Kserve) removeServiceMeshConfigurations(dscispec *dsciv1.DSCInitializationSpec) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i did a test: to set servicemesh to Removed, but the FT "configure-kserve-for-external-authz" did not get deleted. is this correct?
till i set kserve to Removed, FT are gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is correct nor incorrect, but the opposite to that 😄
i did a test: to set servicemesh to Removed
If there is no mesh, my expectation would be that the FT and its associated resources must be removed. I think we've been OK with dealing with the DSCI and the DSC separately. However, this is introducing some cohesion and, now, changes to the DSCI should trigger a reconcile to the DSC.
This is declared in the setup of the manager. I haven't reviewed that code.
You may do the following test: set ServiceMesh to Removed
and, instead of setting KServe to Removed, simply add a dummy annotation to the DSC (to force its reconciliation). Most likely, you'll see that the FT will vanish.
Analyzing deeper, I think we shouldn't let the user to freely flip settings. E.g. setting the Mesh to Removed
while setting Serverless to Managed
should not be a valid state. With Vedant's PR #864 coming, the number of invalid states (and needed guards) increases.
I think we should be more proactive on helping the user and start thinking on adding a validating webhook, so that the user gets feedback on invalid configs when doing kubectl apply
, rather than waiting for a reconciliation to happen and looking at the events/status of the DSCI/DSC. If you think this is worth to do, I can volunteer on doing the coding, since most (or all?) the things that require validation are on serving side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Vedant's PR #864 coming, the number of invalid states (and needed guards) increases.
We really need to bring validating webhook to handle such cases there instead. We could have a set of rules there and not even allow to save invalid DSC/DSCI on the cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Vedant's PR #864 coming, the number of invalid states (and needed guards) increases.
We really need to bring validating webhook to handle such cases there instead. We could have a set of rules there and not even allow to save invalid DSC/DSCI on the cluster.
webhook with one simple check has been in review for a while and you even looked in it (#711)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webhook with one simple check has been in review for a while and you even looked in it (#711)
Ah, this is good.
So, once it is merged, I just need to add serving-specific validations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webhook with one simple check has been in review for a while and you even looked in it (#711)
Ah, this is good. So, once it is merged, I just need to add serving-specific validations.
Join the review ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#711 is now merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, too late to join the review 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Vedant's PR #864 coming, the number of invalid states (and needed guards) increases.
We really need to bring validating webhook to handle such cases there instead. We could have a set of rules there and not even allow to save invalid DSC/DSCI on the cluster.
webhook with one simple check has been in review for a while and you even looked in it (#711)
I know I did :) it's my subtle attempt to make it shipped to incubation faster. It seems it worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few small nits :) Thanks for your patience @israel-hdez.
if dscispec.ServiceMesh.ManagementState == operatorv1.Unmanaged && k.GetManagementState() == operatorv1.Managed { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be an unsupported state worth reporting somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My rationale was for the case when the user started with a Managed
Mesh. Then, the user wanted to flip from Managed
to Unmanaged
so that odh-operator steps down and the user takes over on the setup.
Although, thinking more, probably this should also be applicable for the case when the DSCI is created with an unmanaged mesh from the beginning (i.e. they bring their own mesh). If this is the case, I think the user would be responsible for doing the setup of Mesh+Authorino+the filters involved here.
name: activator-host-header | ||
namespace: {{ .ControlPlane.Namespace }} | ||
labels: | ||
app.opendatahub.io/kserve: "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required? IIRC this is to mark shared resources managed by the operator, but in this case I do not think there is more than one owner for this Fitler. cc @VaishnaviHire
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I simply looked at KServe installation, and saw it has this label in the resources.
It seems to be injected at deployment here:
opendatahub-operator/pkg/deploy/deploy.go
Line 174 in 45638fd
if err := plugins.ApplyAddLabelsPlugin(componentName, resMap); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this template is only used for servicemesh/kserve, if it is not gonna expanded for multiple component, we could skip this label here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that label is added using ApplyAddLabelsPlugin
to the manifests and as bartosz pointed out is useful when a resource is a shared resource.
@@ -21,6 +22,15 @@ const ( | |||
duration = 5 * time.Minute | |||
) | |||
|
|||
func EnsureAuthNamespaceExists(f *feature.Feature) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@israel-hdez there is .TargetNamespace
coming to the builder in #867 which will work for you, so depending which PR gets merged first, we should remember to clean it up
/cc @cam-garrison
type AuthSpec struct { | ||
// Namespace where it is deployed. If not provided, the default is to | ||
// use '-auth-provider' suffix on the ApplicationsNamespace of the DSCI (e.g. opendatahub-auth-provider). | ||
// The '-applications' suffix is removed from ApplicationsNamespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards the view that the additional lines of code introduced by ResolveAuthNamespace
function might not justify the benefits, especially so if it comes down to personal preference. So if there's no requirement nor any actual problem we are solving with this maybe it's better to just add a suffix? Less code, less reasons to worry :)
pkg/feature/servicemesh/cleanup.go
Outdated
"github.com/opendatahub-io/opendatahub-operator/v2/pkg/gvr" | ||
) | ||
|
||
var log = ctrlLog.Log.WithName("features") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have log in the feature itself now.
var log = ctrlLog.Log.WithName("features") |
pkg/feature/servicemesh/cleanup.go
Outdated
return err | ||
} | ||
if !found { | ||
log.Info("no extension providers found", "feature", f.Name, "control-plane", mesh.Name, "namespace", mesh.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Info("no extension providers found", "feature", f.Name, "control-plane", mesh.Name, "namespace", mesh.Namespace) | |
f.Log.Info("no extension providers found", "control-plane", mesh.Name, "namespace", mesh.Namespace) |
pkg/feature/servicemesh/cleanup.go
Outdated
for i, v := range extensionProviders { | ||
extensionProvider, ok := v.(map[string]interface{}) | ||
if !ok { | ||
log.Info("WARN: Unexpected type for extensionProvider will not be removed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Info("WARN: Unexpected type for extensionProvider will not be removed") | |
f.Log.Info("WARN: Unexpected type for extensionProvider will not be removed") |
pkg/feature/servicemesh/cleanup.go
Outdated
if err != nil { | ||
if k8serrors.IsNotFound(err) { | ||
// Since the configuration of the extension provider is a patch, it could happen that | ||
// the SMCP is already gone, and there will be nothing to unpatch. | ||
return nil | ||
} | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! So I will pay back for fixing my mistake with a little simplification :)
if err != nil { | |
if k8serrors.IsNotFound(err) { | |
// Since the configuration of the extension provider is a patch, it could happen that | |
// the SMCP is already gone, and there will be nothing to unpatch. | |
return nil | |
} | |
return err | |
} | |
if err != nil { | |
// Since the configuration of the extension provider is a patch, it could happen that | |
// the SMCP is already gone, and there will be nothing to unpatch. | |
return client.IgnoreNotFound(err) | |
} |
pkg/feature/servicemesh/cleanup.go
Outdated
_, err = f.DynamicClient.Resource(gvr.SMCP). | ||
Namespace(mesh.Namespace). | ||
Update(context.TODO(), smcp, metav1.UpdateOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For transparency, this will have to change #868, but I've got your back here.
@zdtsw @VaishnaviHire there's one thing which is holding this PR from merging downstream. We haven't figured out yet how we want to install dependent Authorino Operator. This needs to be resolved in tandem. If we are fine with putting it in incubation and document how to use Community Operator somewhere in README here, then it's good to go. |
Yes that works. Maybe we can add a README on installation of dependent operators. |
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
* Use feature logger * Don't trim -applications suffix on ResolveAuthNamespace Signed-off-by: Edgar Hernández <[email protected]>
@zdtsw @bartoszmajsak I fixed your feedback.
@bartoszmajsak OK, I removed the trim to the suffix. |
bundle/manifests/opendatahub-operator.clusterserviceversion.yaml
Outdated
Show resolved
Hide resolved
Signed-off-by: Edgar Hernández <[email protected]>
/retest |
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: zdtsw The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
New changes are detected. LGTM label has been removed. |
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
b7d2219
to
abe7daa
Compare
e32a7c2
into
opendatahub-io:incubation
* feat(authz): Authorino for Service Mesh This first iteration is to cover authentication needs for KServe * Add templates to install Authorino * Add templates to configure Service Mesh to use Authorino to delegate Authorization * Add KServe-specific templates add ability to secure KServe Inference Services * Add relevant fields to DSCInitialization resource * Code for proper cleanup, in case of uninstalling Most (if not all) of this code comes from pull request opendatahub-io#605. Attribution to original authors: @bartoszmajsak, @aslakknutsen, @cam-garrison, et. al. Related opendatahub-io/kserve#128 Signed-off-by: Edgar Hernández <[email protected]> * Fix linter issues Signed-off-by: Edgar Hernández <[email protected]> * Resolve feedback: Bartosz Signed-off-by: Edgar Hernández <[email protected]> * fix: Remove port from the authorization policy Also, add `/metrics` to the ignored paths for auth. Signed-off-by: Edgar Hernández <[email protected]> * Fix feedback: Bartosz Signed-off-by: Edgar Hernández <[email protected]> * More feedback: Bartosz Co-authored-by: Bartosz Majsak <[email protected]> * Fix feedback: Reto - Adjust AuthorizationPolicy Signed-off-by: Edgar Hernández <[email protected]> * Fix more feedback: Bartosz - Remove Authorino namespace field from DSCI. - Move around some code in kserve.go to servicemesh_setup.go Signed-off-by: Edgar Hernández <[email protected]> * chore: adds sec. prefix to authorino label selector * fix: adds base dir to manifest sources * chore: uses security instead of sec as a prefix in authorino label * fix: /healthz is called by _something_, skipp * fix: adopt ODH-ADR-0006 for clean up label * fix: uses correct CRD name for authconfigs Co-authored-by: Cameron Garrison <[email protected]> * Remove left-over file Signed-off-by: Edgar Hernández <[email protected]> * Feedback: remove auth-refs ConfigMap Signed-off-by: Edgar Hernández <[email protected]> * Add missing role.yaml changes Signed-off-by: Edgar Hernández <[email protected]> * Go back to installing Authorino on its own namespace Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Add clean-up for KServe/OSSM-auth Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Simplify namings Signed-off-by: Edgar Hernández <[email protected]> * fix: add auth-refs cm * Feedback: adjust labels and a log message Signed-off-by: Edgar Hernández <[email protected]> * Bugfix: Extension provider terminating with error when SMCP is gone Signed-off-by: Edgar Hernández <[email protected]> * Fix: add missing RBAC for ConfigMaps func Signed-off-by: Edgar Hernández <[email protected]> * Fix: Run `make bundle` and commit resulting changes Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Wen - Better feature namings Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Bartosz * Use feature logger * Don't trim -applications suffix on ResolveAuthNamespace Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Wen - revert image placeholder was replaced Signed-off-by: Edgar Hernández <[email protected]> --------- Signed-off-by: Edgar Hernández <[email protected]> Co-authored-by: Bartosz Majsak <[email protected]> Co-authored-by: Aslak Knutsen <[email protected]> Co-authored-by: Cameron Garrison <[email protected]> (cherry picked from commit e32a7c2)
* Update bundle * feat(authz): Authorino for Service Mesh (opendatahub-io#784) * feat(authz): Authorino for Service Mesh This first iteration is to cover authentication needs for KServe * Add templates to install Authorino * Add templates to configure Service Mesh to use Authorino to delegate Authorization * Add KServe-specific templates add ability to secure KServe Inference Services * Add relevant fields to DSCInitialization resource * Code for proper cleanup, in case of uninstalling Most (if not all) of this code comes from pull request opendatahub-io#605. Attribution to original authors: @bartoszmajsak, @aslakknutsen, @cam-garrison, et. al. Related opendatahub-io/kserve#128 Signed-off-by: Edgar Hernández <[email protected]> * Fix linter issues Signed-off-by: Edgar Hernández <[email protected]> * Resolve feedback: Bartosz Signed-off-by: Edgar Hernández <[email protected]> * fix: Remove port from the authorization policy Also, add `/metrics` to the ignored paths for auth. Signed-off-by: Edgar Hernández <[email protected]> * Fix feedback: Bartosz Signed-off-by: Edgar Hernández <[email protected]> * More feedback: Bartosz Co-authored-by: Bartosz Majsak <[email protected]> * Fix feedback: Reto - Adjust AuthorizationPolicy Signed-off-by: Edgar Hernández <[email protected]> * Fix more feedback: Bartosz - Remove Authorino namespace field from DSCI. - Move around some code in kserve.go to servicemesh_setup.go Signed-off-by: Edgar Hernández <[email protected]> * chore: adds sec. prefix to authorino label selector * fix: adds base dir to manifest sources * chore: uses security instead of sec as a prefix in authorino label * fix: /healthz is called by _something_, skipp * fix: adopt ODH-ADR-0006 for clean up label * fix: uses correct CRD name for authconfigs Co-authored-by: Cameron Garrison <[email protected]> * Remove left-over file Signed-off-by: Edgar Hernández <[email protected]> * Feedback: remove auth-refs ConfigMap Signed-off-by: Edgar Hernández <[email protected]> * Add missing role.yaml changes Signed-off-by: Edgar Hernández <[email protected]> * Go back to installing Authorino on its own namespace Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Add clean-up for KServe/OSSM-auth Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Simplify namings Signed-off-by: Edgar Hernández <[email protected]> * fix: add auth-refs cm * Feedback: adjust labels and a log message Signed-off-by: Edgar Hernández <[email protected]> * Bugfix: Extension provider terminating with error when SMCP is gone Signed-off-by: Edgar Hernández <[email protected]> * Fix: add missing RBAC for ConfigMaps func Signed-off-by: Edgar Hernández <[email protected]> * Fix: Run `make bundle` and commit resulting changes Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Wen - Better feature namings Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Bartosz * Use feature logger * Don't trim -applications suffix on ResolveAuthNamespace Signed-off-by: Edgar Hernández <[email protected]> * Feedback: Wen - revert image placeholder was replaced Signed-off-by: Edgar Hernández <[email protected]> --------- Signed-off-by: Edgar Hernández <[email protected]> Co-authored-by: Bartosz Majsak <[email protected]> Co-authored-by: Aslak Knutsen <[email protected]> Co-authored-by: Cameron Garrison <[email protected]> (cherry picked from commit e32a7c2) * fix(authz): Fix broken external auth configuration There are two misconfigurations being fixed: * In the SMCP, the service hostname of Authorino was coded with `-authorization` suffix, but the right suffix is `-authorino-authorization`. * In the `kserve-predictor` AuthorizationPolicy, the hardcoded `opendatahub-odh-auth-provider` provider name was used, but it should have been the template `{{ .AppNamespace }}-auth-provider`. In `pkg/feature/feature.go` the patch manifests (i.e. the ones containing `.patch` in the filename) are always applied. Thus, the first bullet is solved by fixing the patch file that adds the `extensionProvider` to the SMCP. For the second bullet, the faulty AuthorizationPolicy is created with a regular manifest template which is only applied if the resource does not exist. Thus, a patch manifest is added to properly fix the faulty policy (including operator upgrades). Signed-off-by: Edgar Hernández <[email protected]> (cherry picked from commit e4252a0) * fix: Rework operator precondition checks (opendatahub-io#899) * init commit * tmp: switch to subsciption * tmp * fix up testing * linter on import * minor self nits * add bracket, make * use found,err for checking subscription Co-authored-by: Bartosz Majsak <[email protected]> * fix import + test error expected outputs * directly return errs rather than log and ret Co-authored-by: Bartosz Majsak <[email protected]> * remove unused log var from condiitons * move const fixtures to separate package * move creating op subscription to function * rename noop features in testing * remove redundant comments Co-authored-by: Bartosz Majsak <[email protected]> * move CreateSubscription to fixtures --------- Co-authored-by: Bartosz Majsak <[email protected]> (cherry picked from commit f44528e) * chore: follow up review comments from previous PR (opendatahub-io#858) * update: follow up comments - cleanup commented out code - rename function - cleanup unnecessary sleep Signed-off-by: Wen Zhou <[email protected]> * update: add check on return err + remove apierrs.IsNotFound check Signed-off-by: Wen Zhou <[email protected]> * Update pkg/deploy/deploy.go Co-authored-by: Bartosz Majsak <[email protected]> * update(review): create new function DeleteSubscription Signed-off-by: Wen Zhou <[email protected]> * update: return for get and delete subscription - get: return 'sub, nil' or 'nil, err' here error can be real one or notfound Signed-off-by: Wen Zhou <[email protected]> * Update pkg/deploy/deploy.go Co-authored-by: Bartosz Majsak <[email protected]> * fix(linter) Signed-off-by: Wen Zhou <[email protected]> --------- Signed-off-by: Wen Zhou <[email protected]> Co-authored-by: Bartosz Majsak <[email protected]> (cherry picked from commit a81a3da) * fix(authz): ensures extauthz provider is removed from control plane during cleanup (opendatahub-io#905) ### Renames migration folder The reason for this is to have a simple naming convention instead of suggesting storing migration patches in dedicated folders named after tickets. Additionally, the feature explicitly orders files instead of assuming that the underlying fsys implementation fulfills such a contract. ### Ports opendatahub-io#605 test for extension provider This test ensures the addition of an extension provider for external authorization and that it is removed from the control plane properly using a custom cleanup function. We have missed it in the original work. ### Fix: aligns provider name between template and cleanup logic This is short-term fix for the existing codebase. In the long term (which is actively worked on) we need to improve the way of how we are storing config information to limit cases where we rely on pre/suffixes. Cases like this should be kept as its own thing instead, as it represents the concept in the infrastructure/authz setup. * chore: indentation Signed-off-by: Wen Zhou <[email protected]> * fix: use old package path till we cherry-pick refactor commit Signed-off-by: Wen Zhou <[email protected]> --------- Signed-off-by: Wen Zhou <[email protected]> Co-authored-by: Edgar Hernández <[email protected]> Co-authored-by: Edgar Hernández <[email protected]> Co-authored-by: Cameron Garrison <[email protected]> Co-authored-by: Wen Zhou <[email protected]> Co-authored-by: Bartosz Majsak <[email protected]>
Description
This first iteration is to cover authentication needs for KServe
Most (if not all) of this code comes from pull request #605. Attribution to original authors: @bartoszmajsak, @aslakknutsen, @cam-garrison, et. al.
Related opendatahub-io/kserve#128
Fixes https://issues.redhat.com/browse/RHOAIENG-1856
How Has This Been Tested?
TODO
Merge criteria: