Description

Looking at adding some Metrics instrumentation to the dynamic metadata resolvers.

Proposed initially:

  • timer for fetch from origin source

  • counter for # of fetches from origin source

  • counter for # of resolve requests

I guess the ratio of the last two is essentially the "cache hit" percentage. I think the library has a fancy RatioGauge for that, but I'm not clear on how to use it yet.
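
A rough sketch of what that could look like with the Metrics library (the registry handling and metric names here are placeholders, not a proposal for the actual wiring):

    import com.codahale.metrics.Counter;
    import com.codahale.metrics.MetricRegistry;
    import com.codahale.metrics.RatioGauge;
    import com.codahale.metrics.Timer;

    public class DynamicResolverMetricsSketch {

        // Placeholder registry; the real one would presumably be shared/injected.
        private final MetricRegistry registry = new MetricRegistry();

        private final Timer fetchFromOriginTimer =
                registry.timer("fetchFromOriginSourceTimer");
        private final Counter fetchFromOriginCounter =
                registry.counter("fetchFromOriginSourceCounter");
        private final Counter resolveRequestCounter =
                registry.counter("resolveRequestCounter");

        public DynamicResolverMetricsSketch() {
            // RatioGauge only needs a Ratio of two doubles. This one reports origin
            // fetches per resolve request, i.e. roughly the cache *miss* fraction,
            // so the "hit" percentage would be 1 minus this value.
            registry.register("originFetchRatioGauge", new RatioGauge() {
                @Override
                protected Ratio getRatio() {
                    return Ratio.of(fetchFromOriginCounter.getCount(),
                            resolveRequestCounter.getCount());
                }
            });
        }
    }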

2 questions:

  1. Is there an existing standard for the metric names? I know we were probably going to base them on the package/class name. Would something like "class name + component ID + specific metric" work? For the class name, unlike with our logger usage, I was thinking it might be best to use the concrete name of the class rather than the potentially abstract declaring class's name. Example (see also the sketch after question 2):

    org.opensaml.saml.metadata.resolver.impl.FunctionDrivenDynamicHTTPMetadataResolver.myMDQResolver.fetchFromOriginSourceTimer
  2. Is it correct that Timer should always be used in a try/finally idiom, to ensure that the timer is always stopped?
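
On both questions, here's a minimal sketch of what I had in mind (class, component ID, and method names are hypothetical, just to illustrate the idiom):

    import com.codahale.metrics.MetricRegistry;
    import com.codahale.metrics.Timer;

    public class FetchTimingSketch {

        // Placeholder registry and component ID; the real values would come from
        // wherever the resolver's wiring provides them.
        private final MetricRegistry registry = new MetricRegistry();
        private final String componentId = "myMDQResolver";

        // MetricRegistry.name(...) builds the dotted name from a class plus extra
        // parts; using getClass() means the concrete runtime class name is used,
        // e.g. ...FetchTimingSketch.myMDQResolver.fetchFromOriginSourceTimer
        private final Timer fetchTimer = registry.timer(
                MetricRegistry.name(getClass(), componentId, "fetchFromOriginSourceTimer"));

        protected byte[] fetchFromOriginSource() throws Exception {
            final Timer.Context context = fetchTimer.time();
            try {
                return doFetch();
            } finally {
                // try/finally guarantees the sample is recorded even if doFetch() throws.
                context.stop();
            }
        }

        // Stand-in for the actual HTTP fetch logic.
        private byte[] doFetch() throws Exception {
            return new byte[0];
        }
    }

(Timer.Context also implements Closeable, so try-with-resources should work as an alternative to the explicit try/finally.)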

Environment

None

Activity


Scott Cantor November 8, 2016 at 10:52 PM

The main issue I want to work on later is that these don't currently show up when you hit /metrics/metadata; I'd like to see how we can address that, but it's not important for the release.

Brent Putman November 8, 2016 at 10:50 PM

I should have said: I'm not planning anything else for 3.3.0. There may be other metrics we add later, and/or metrics for metadata components other than just the dynamic resolvers.

Brent Putman November 8, 2016 at 10:46 PM

I'm not planning anything else here, so I'm going to close this out.

I am seeing the dynamic resolver metrics data exposed in the admin/metrics endpoint, which is pretty cool.

For interested parties other than Scott: it's easy to get at from localhost, with pretty-printing optional:

curl -k -s https://localhost/idp/profile/admin/metrics | python26 -m json.tool | less

Brent Putman November 8, 2016 at 12:51 AM

Ok, I had forgotten about the "metrics" prefix thing, so that's not a concern.

Yes, I think for the next release we should look more at the component ID stuff, and at whether/how to do something similar to what you described earlier with a map indexed by component ID, etc. I still don't have my head wrapped around how that would work (who/what manages the presumably global map, and so on). And based on what I discovered later, some careful implementation is definitely needed there to ensure that gauges holding references to the containing component are cleaned up at the appropriate time, to prevent GC failures and leaks.
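
To illustrate the cleanup concern (the names and wiring here are hypothetical; only the standard MetricRegistry calls are real): a gauge registered by a component typically captures a reference to that component, so the shared registry keeps the component alive until the gauge is explicitly removed, e.g. from the component's destroy/teardown:

    import com.codahale.metrics.Gauge;
    import com.codahale.metrics.MetricRegistry;

    public class GaugeCleanupSketch {

        private final MetricRegistry registry;   // assumed shared/global registry
        private final String gaugeName;

        public GaugeCleanupSketch(final MetricRegistry sharedRegistry, final String componentId) {
            registry = sharedRegistry;
            gaugeName = MetricRegistry.name(getClass(), componentId, "someGauge");

            // The gauge lambda captures 'this', so the registry now holds a strong
            // reference to this component.
            registry.register(gaugeName, (Gauge<Long>) () -> someValue());
        }

        private long someValue() {
            return 0L;
        }

        // Called from the component's destroy/teardown: removing the gauge drops
        // the registry's reference so the component can be garbage collected.
        public void destroy() {
            registry.remove(gaugeName);
        }
    }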

I'll open a new issue, referencing this one.

Brent Putman November 8, 2016 at 12:45 AM

Fixed a minor issue in r4573: cache loads that failed because the metadata was completely filtered out were being misreported as "loaded".

Details

Assignee

Brent Putman

Reporter

Brent Putman
