Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exporter panics on invalid/unexpected metric names #80

Open
xkilian opened this issue Mar 15, 2019 · 4 comments
Open

Exporter panics on invalid/unexpected metric names #80

xkilian opened this issue Mar 15, 2019 · 4 comments

Comments

@xkilian
Copy link

xkilian commented Mar 15, 2019

The results from this issue were broken up into several issues. The part that remains here is:

  • graphite_exporter should implement a counter of rejected metrics and expose it as parts of its own metrics.
  • graphite_exporter should not panic and exit when an invalid metric is received.
  • Graphite_exporter in debug mode should log the offending metrics in a log file (or in the debug output of the web page for simplicity sake)

Original investigation:

==========

The majority of our metrics contain the field _ in the various parts of the metric.

labels.go has a validation function, validateLabelValues, that checks if the expected number of labels is consistent with the original number of dot delimited fields.

with a metric like:
hostname_function.source.description.metric_blah.count 2.0 timestamp

The exporter will get confused between hostname_function and hostname function. Due to the splitting the underscores.
_ is a valid Graphite AND Prometheus value and should be handled.
UPDATE:When using matching this problem can be bypassed by assigning parts of the graphite metric name to labels and extracting a good metric name. See comment below. Issue is still valid as it should not panic when encountering an unexpected/unsupported/supported_but_problematic metric name.

Suggested fix: A temporary token should be used to treat initial underscores detected and restoring them when creating the label names to be exposed to Prometheus.

0.5 Panics
0.4.2 Drops the offending metrics and only exposes the "valid" metrics. But it considers the graphite_Exporter as Down and throws errors in the syslog metric after HELP is INVALID.

func validateLabelValues(vals []string, expectedNumberOfValues int) error {
	if len(vals) != expectedNumberOfValues {
		return fmt.Errorf(
			"%s: expected %d label values but got %d in %#v",
			errInconsistentCardinality, expectedNumberOfValues,
			len(vals), vals,
		)
	}

	for _, val := range vals {
		if !utf8.ValidString(val) {
			return fmt.Errorf("label value %q is not valid UTF-8", val)
		}
	}

	return nil
}
@xkilian
Copy link
Author

xkilian commented Apr 4, 2019

After further testing here are my conclusions and suggestions:

  • graphite_exporter should implement a counter of rejected metrics and expose it as parts of its own metrics.
  • graphite_exporter should not panic and exit when an invalid metric is received.
  • Graphite_exporter in debug mode should log the offending metrics in a log file (or in the debug output of the web page for simplicity sake)
  • Documentation should explain that any underscores in the received graphite formatted metric will be rejected UNLESS there is a regex or glob match to extract the various fields and build metric names that are valid and will not be rejected due to various internet validations.
  • Provide examples using regex not just glob in the documentation
  • Provide an example using a catch-all regex for metrics not matching an existing regex.
  • Explain the difference between using the configuration for dropping all non match metrics versus using a catch all regex to provide an indication that metrics are not matching and would have been dropped.

@matthiasr
Copy link
Contributor

I agree with all of these points! In particular,

Documentation should explain that any underscores in the received graphite formatted metric will be rejected

would be a stop-gap measure to document a known issue – if underscores are valid in Graphite, then we need to handle them.

@matthiasr matthiasr changed the title Metrics with _ in names cause the exporter to panic in 0.5 and dropped with errors in 0.4.2 Exporter panics on invalid/unexpected metric names Apr 8, 2019
@matthiasr
Copy link
Contributor

Thank you for the thorough investigation and write-up. I broke out your list in 3 separate issues:

Help with any of these is highly appreciated!

@xkilian
Copy link
Author

xkilian commented Apr 9, 2019

My pleasure. I will try and improve the docs with my suggestions. (When I get a bit of free time)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants