Migrated metrics to prometheus #434
base: master
-sample.stop(meterRegistry.timer("snapshot-factory.new-snapshot.time"))
+sample.stop(
+    meterRegistry.timer(
+        "snapshot-factory.seconds",
maybe snapshot_factory to be consistent with other metrics?
Yeah, I was looking at some of our other metrics, but it looks like this way would be better. I changed to dots everywhere.
-groupSample.stop(meterRegistry.timer("snapshot-factory.get-snapshot-for-group.time"))
+groupSample.stop(
+    meterRegistry.timer(
+        "snapshot-factory.seconds",
same here - snapshot_factory for consistency
 .checkpoint("snapshot-updater-groups-published")
-.name("snapshot-updater-groups-published").metrics()
+.name("snapshot-updater.count.total")
`snapshot_updater` / `snapshot.updater`: `.` is mapped to `_`, right?
`.` is, `-` isn't.
 .record(
     stopTimer - startTimer,
-    TimeUnit.MILLISECONDS
+    TimeUnit.SECONDS
why change the unit here?
Prometheus has naming conventions and recommended base units for consistency, and according to them seconds is the better choice.
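A minimal sketch of the unit question discussed above, assuming (as in the diff) that the elapsed time comes from two `System.currentTimeMillis()` readings. The helper `toSeconds` is a hypothetical name. It illustrates that the `TimeUnit` passed alongside a value describes the value itself, so a millisecond delta needs converting before it can be reported as seconds. Whether the registry in use already converts timers to seconds on export is an assumption to verify against its configuration.

```kotlin
import java.util.concurrent.TimeUnit

// Hypothetical helper: converts a millisecond delta to fractional seconds.
fun toSeconds(elapsedMillis: Long): Double = elapsedMillis / 1000.0

fun main() {
    val startTimer = System.currentTimeMillis()
    Thread.sleep(50) // stand-in for the measured work
    val stopTimer = System.currentTimeMillis()
    val elapsedMillis = stopTimer - startTimer

    // Reading the raw millisecond delta as seconds overstates it by a factor of 1000.
    val mislabelledAsMillis = TimeUnit.SECONDS.toMillis(elapsedMillis)

    println("raw delta: $elapsedMillis ms = ${toSeconds(elapsedMillis)} s")
    println("same number read as seconds would mean $mislabelledAsMillis ms")
}
```

If the value handed to the timer is still a millisecond delta, either keep `TimeUnit.MILLISECONDS` and let the exporter convert, or convert the value before labelling it as seconds.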
...pl/allegro/tech/servicemesh/envoycontrol/server/callbacks/MetricsDiscoveryServerCallbacks.kt
-.name("snapshot-updater-merged").metrics()
+.name("snapshot.updater.count.total")
+.tag("status", "merged")
+.tag("type", "global")
What does this `type` mean?
There were separate metrics for the groups and global snapshot-updater, so I introduced it as a `type` label. This type is like a "snapshot-type".
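To illustrate the idea of folding two separate metrics into one name with a `type` label, here is a small self-contained sketch. It does not use the real metrics registry: `MetricId` and `CounterStore` are hypothetical stand-ins, and the label value `"groups"` is assumed (only `"global"` appears in the diff).

```kotlin
// Hypothetical stand-in for a metrics registry: one counter per (name, labels) pair.
data class MetricId(val name: String, val labels: Map<String, String>)

class CounterStore {
    private val counts = mutableMapOf<MetricId, Long>()

    fun increment(name: String, vararg labels: Pair<String, String>) {
        val id = MetricId(name, labels.toMap())
        counts[id] = (counts[id] ?: 0L) + 1
    }

    fun get(name: String, vararg labels: Pair<String, String>): Long =
        counts[MetricId(name, labels.toMap())] ?: 0L

    // Sum over every label combination that shares the metric name.
    fun total(name: String): Long =
        counts.filterKeys { it.name == name }.values.sum()
}

fun main() {
    val name = "snapshot.updater.count.total"
    val store = CounterStore()
    store.increment(name, "status" to "merged", "type" to "global")
    store.increment(name, "status" to "merged", "type" to "groups") // "groups" is an assumed value
    store.increment(name, "status" to "merged", "type" to "groups")

    println(store.get(name, "status" to "merged", "type" to "global"))
    println(store.get(name, "status" to "merged", "type" to "groups"))
    println(store.total(name))
}
```

With a single name, the per-type series stay separate while an aggregate across types remains a simple sum over the label.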
Co-authored-by: kozjan <[email protected]>
meterRegistry.gauge(metricName, Tags.of("status", "instance-changed"), it.instanceChanges)
meterRegistry.gauge(metricName, Tags.of("status", "snapshot-changed"), it.snapshotChanges)
meterRegistry.gauge("cache.groups.count", it.cacheGroupsCount)
it.meterRegistry.more().counter("services.watch.errors.total", listOf(), it.errorWatchingServices)
I think it would be better to have a consistent name. So if we have `watched-services`, then it should be `watched-services.errors.total`.
 }, 0, interval, TimeUnit.SECONDS)
 }, FluxSink.OverflowStrategy.LATEST)
 return aclFlux.doOnCancel {
-    meterRegistry.counter("cross-dc-synchronization.cancelled").increment()
+    meterRegistry.counter("cross.dc.synchronization.cancelled").increment()
Why is there no `total` at the end of the metric name?
@@ -226,10 +226,10 @@ class ConsulServiceChanges(
 if (ready) {
     val stopTimer = System.currentTimeMillis()
     readinessStateHandler.ready()
-    metrics.meterRegistry.timer("envoy-control.warmup.time")
+    metrics.meterRegistry.timer("envoy-control.warmup.seconds")
Both `-` and `.` will be replaced by `_` in the final metric name, so I suggest using `_` as the separator in all metric names instead of `-` and `.`.
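The separator point above can be sketched with a small helper. This is not the exporter's actual implementation: `toPrometheusName` is a hypothetical function that applies the classic Prometheus metric name rule (`[a-zA-Z_:][a-zA-Z0-9_:]*`) by replacing every other character with `_`. Whether the registry used in this project sanitizes names in exactly this way is an assumption.

```kotlin
// Hypothetical helper, not the exporter's implementation: replaces every character
// that is not valid in a classic Prometheus metric name with an underscore.
fun toPrometheusName(name: String): String {
    val sanitized = name.replace(Regex("[^a-zA-Z0-9_:]"), "_")
    // A metric name must not start with a digit, so prefix an underscore in that case.
    return if (sanitized.firstOrNull()?.isDigit() == true) "_$sanitized" else sanitized
}

fun main() {
    listOf(
        "envoy-control.warmup.seconds",
        "cross.dc.synchronization.cancelled",
        "snapshot_factory.seconds"
    ).forEach { println("$it -> ${toPrometheusName(it)}") }
}
```

Under this rule, names that differ only in `-` versus `.` collapse to the same exported name, which is the argument for writing `_` directly.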