decreasing counters in deployed environments #305

Answered by dmagliola
coldnebo asked this question in Q&A
Mar 5, 2024 · 1 comment · 3 replies

coldnebo
Mar 5, 2024

We're seeing decreasing counters on our deployed servers. We're running Kubernetes pods with Puma set to 2 workers, and it appears that Prometheus::Client.registry is returning two different registries (one per worker process). Our Rack exporter requests get routed to whichever worker Puma has available, so the reported count varies. Here's how that looks when monitoring locally under controlled conditions:

user@user2-l:../~$ curl -sL http://user2-l.dhcp.company.com:38080/app/metrics
# TYPE http_request_count counter
# HELP http_request_count The count of HTTP requests handled by the Rack application.
http_request_count 11.0
user@user2-l:../~$ curl -sL http://user2-l.dhcp.company.com:38080/app/metrics
# TYPE http_request_count counter
# HELP http_request_count The count of HTTP requests handled by the Rack application.
http_request_count 12.0
user@user2-l:../~$ curl -sL http://user2-l.dhcp.company.com:38080/app/metrics
# TYPE http_request_count counter
# HELP http_request_count The count of HTTP requests handled by the Rack application.
http_request_count 9.0

Notice that the counter drops back to 9. If we keep watching, we see it bounce up to 13, then down to 10. The total number of requests is always higher than either of these counters, which suggests there are two registry instances (each worker process has its own singleton).
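This hypothesis is easy to reproduce with plain Ruby and no gems. A minimal sketch (the Counter class below is a hypothetical stand-in for a registry counter, not the prometheus-client API): after fork, each child keeps its own copy of any in-process counter, so scrapes routed to different children return different totals.

```ruby
# A stand-in for an in-memory registry counter: one instance per process.
class Counter
  attr_reader :value

  def initialize
    @value = 0.0
  end

  def increment
    @value += 1
  end
end

readers = []
2.times do |worker_id|
  reader, writer = IO.pipe
  readers << reader
  fork do                                  # simulate a Puma worker
    reader.close
    counter = Counter.new                  # fresh "registry" in this child
    (worker_id + 1).times { counter.increment }
    writer.puts counter.value              # what a scrape of this worker sees
    writer.close
    exit!
  end
  writer.close
end

values = readers.map { |r| Float(r.read) }
Process.waitall
puts values.inspect                        # prints [1.0, 2.0]
```

Two scrapes of "the same" counter return different totals depending on which child answers, which is exactly the bouncing seen in the curl output above.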

I tried to move the registry creation out of the worker processes using Puma's before_fork hook, but that didn't change the behavior, which raises the question of whether the client is fork-safe.

I'm not sure about my diagnosis, so this could have another cause. Are there any recommended guides for setting up the Prometheus client correctly with multiple processes? (We haven't moved to multithreaded Rails yet, but if someone has hit this there, any ideas are welcome.)

AFAIK there is no way to scrape specific workers in Puma, but if there were, we could work around this by scraping each worker separately instead of just the pod.



dmagliola
Mar 5, 2024
Maintainer

Are you using the DirectFileStore to store your metrics?

Read more here. Make sure to read the caveats!
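For reference, enabling DirectFileStore looks roughly like this (a sketch based on the gem's README; the directory path is an arbitrary choice, must be writable by all workers, and the store must be configured before any metric is registered):

```ruby
# config.ru or an initializer, loaded before any metrics are registered.
require 'prometheus/client'
require 'prometheus/client/data_stores/direct_file_store'

# Each process writes its observations to files in this directory; the
# exporter aggregates across them at scrape time, so all Puma workers
# report one consistent counter value instead of per-process totals.
Prometheus::Client.config.data_store =
  Prometheus::Client::DataStores::DirectFileStore.new(dir: '/tmp/prometheus_metrics')
```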

3 replies

coldnebo Mar 5, 2024
Author

We avoided DirectFileStore because one of our apps exploded into thousands of files. Indeed, I've read this part of the docs before:

Large numbers of files: Because there is an individual file per metric and per process (which is done to optimize for observation performance), you may end up with a large number of files. We don't currently have a solution for this problem, but we're working on it.

So I assume this, combined with the other open issue about prefork servers, means the Prometheus client doesn't support multiple processes/multithreading in Ruby without DirectFileStore. Good to know; I'll take a deeper look at DirectFileStore. In pods there shouldn't be much of a difference, AFAIK, since the filesystem is a memory filesystem I think?


SuperQ Mar 5, 2024
Maintainer

The typical solution for puma is to ID the workers by their worker ID number, instead of the OS PID. This way the number of files is limited to the number of Puma workers. The files are re-used between worker process restarts.
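A sketch of that approach (assuming a Puma version that yields the worker index to on_worker_boot, and using the configurable pid_provider that DirectFileStore uses to key its per-process files; the label string is an arbitrary choice):

```ruby
# puma.rb
workers 2

on_worker_boot do |worker_index|
  require 'prometheus/client'
  # Key DirectFileStore files by worker number instead of OS PID, so the
  # file count stays bounded by `workers` and files are reused when a
  # worker process is restarted under a new PID.
  Prometheus::Client.config.pid_provider = -> { "puma_worker_#{worker_index}" }
end
```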


dmagliola Mar 5, 2024
Maintainer

Prometheus client doesn't support multiple-processes/multithreading in Ruby without DirectFileStore

Right. The only reason DirectFileStore exists is precisely to support multiprocess. Multithreading you can totally do with the default store, but not multiprocess.

one of our apps exploded into thousands of files

Yeah, this, together with the performance of exports, is the #1 thing we want to fix. We're struggling a bit to find time to dedicate to it, but it's at the top of our list.

Answer selected by coldnebo