-
Notifications
You must be signed in to change notification settings - Fork 525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Api availability measurement #1162
Api availability measurement #1162
Conversation
Hi @vamossagar12. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test Thanks, @vamossagar12! I'll try to take a look tomorrow. Adding Wojtek in case he has capacity to take a look today |
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
/uncc |
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
c29e802
to
4cb2627
Compare
beacee0
to
c54da97
Compare
@vamossagar12 - sorry for delay, I'm pretty overloaded. I will try to look into it today or tomorrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vamossagar12 - made a pass on it, sorry for long delay
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
Thanks @wojtek-t I have more or less implemented the changes you suggested. Just had a couple of questions post which I can push the changes. |
@wojtek-t i made the changes but not able to push them. Keep getting this error after rebasing with the latest from master: .git/hooks/pre-push: line 11: ./verify/verify-boilerplate.sh: No such file or directory I see some changes have been made recently in the /verify folder but not able to figure out what the problem is. |
@vamossagar12 - the verify-boilerplace.sh script still exists and wasn't touched: BTW - the hooks must be something in your local environment - I don't have them set up. |
Well, actually I set the hooks up since they were mentioned in the README of perf-tests. If it isn't needed, I can remove it. Would it have any implications though? |
We have presubmits anyway - so these should be caught by them anyway. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please squash commits
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_metrics.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_metrics.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_metrics.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_metrics.go
Outdated
Show resolved
Hide resolved
eb3cbd1
to
d26909c
Compare
Done. Had a couple of questions. |
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
d26909c
to
1f07817
Compare
1f07817
to
7f8fb7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for delay - I was on vacation for last 2.5 weeks.
I added some comments, but i'm mostly fine with what is now. Please change the PR out of Draft and let's proceed with it.
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
7f8fb7d
to
02bea73
Compare
No worries :) I have made the changes and also removed the PR out of draft. |
content, err := util.PrettyPrintJSON(hostAvailabilityMetricsOutput) | ||
if err != nil { | ||
createMasterAvailabilitySummaryErrors.Append(err) | ||
} else{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix gofmt:
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/perf-tests/1162/pull-perf-tests-verify-lint/1285548209413820419/build-log.txt
[both here and in the second file]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran gofmt -s for these 2 files but this error still got thrown. Is that the right command?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gofmt -w -s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you.. ran the command and pushed again.
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
a.wg.Wait() | ||
|
||
errList := errors.NewErrorList() | ||
err := a.createClusterAvailabilitySummary(probeFrequency) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I should have caught that earlier.
Actually I would really prefer to have a single summary rather than N summaries for every master and cluster separately (see my comment below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't yet addressed.
consecutiveFailedProbes int | ||
} | ||
|
||
type apiAvailabilityOutput struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I would prefer the output to be:
type apiAvailabilitySummary struct {
IP string
AvailabilityPercentage float32
LongestUnavailableDuration time.Duration
}
type apiAvailabilityOutput struct {
ClusterMetrics apiAvailabilitySummary
MasterMetrics []apiAvailabilitySummary
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @wojtek-t , slight confusion here. The final output for MasterMetrics is an array here. If you see the output of the measurement that would be returned, all outputs would be consolidated into 1.
Do you think this isn't similar to what you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I care about 'availability' summary as a whole. Yes - I want a single summary, instead of N separate ones.
Alternatively, we may consider two measurements: one for the cluster, second for all masters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I might want to make some tweaks to the apiAvailabilityMeasurement struct in that case. Currently it looks like this:
type apiAvailabilityMeasurement struct {
isRunning bool
stopCh chan struct{}
hosts []string
summaries []measurement.Summary
clusterLevelMetrics *apiAvailabilityMetrics
hostLevelMetrics map[string]*apiAvailabilityMetrics
wg sync.WaitGroup
}
So, if I understand correctly, instead of an array of summaries, you want 1 cluster level summary and array of MasterLevel summaries in a single object. I would also need to remove the clusterLevelMetrics and hostLevelMetrics from the above struct.
Regarding 2 separate measurements, do we expose them as separate measurements- which seems slightly confusing to me. With a unified measurement, a user would be able to get all availability metrics in one summary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I meant "two summaries" - one for the cluster, second for all masters.
But, I think having a single summary is cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. so in that case, i will make the code changes to have one summary in the format you explained above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wojtek-t I have made the code changes to expose only 1 summary for both ClusterLevel and MasterMetrics. Plz review when you get the chance..
clusterloader2/pkg/measurement/common/api_availability_metrics.go
Outdated
Show resolved
Hide resolved
55c58b3
to
1aca309
Compare
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
a.wg.Wait() | ||
|
||
errList := errors.NewErrorList() | ||
err := a.createClusterAvailabilitySummary(probeFrequency) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't yet addressed.
consecutiveFailedProbes int | ||
} | ||
|
||
type apiAvailabilityOutput struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I care about 'availability' summary as a whole. Yes - I want a single summary, instead of N separate ones.
Alternatively, we may consider two measurements: one for the cluster, second for all masters.
1aca309
to
d394c53
Compare
d394c53
to
768f563
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vamossagar12 - thanks, it's pretty close now; added two more comments
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
clusterloader2/pkg/measurement/common/api_availability_measurement.go
Outdated
Show resolved
Hide resolved
768f563
to
847e106
Compare
Thanks @wojtek-t , I made those changes as well. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vamossagar12, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This PR is for #1096 which created a new measurement to track the availability of api-server by probing the /healthz end point