[GIP] dropping ogc-server-statistics and analytics #5
Comments
I agree on removing it, but we should consider a replacement solution. More generally, I feel we should ease, or at least document, the integration of analytics tools, since it is quite a common need. Maybe this could be an interesting workshop during the geocom / following codesprint. |
agreed |
Concerning this point, what about using TimescaleDB: https://github.com/timescale/timescaledb ? From what I could see, ES is a sinkhole for resources and not easy to use. A TimescaleDB database with well-configured data retention, combined with a solution like Grafana and its dashboards, could be an alternative to the current analytics tools. |
Love it ! |
@jusabatier Can you remind us how you use it / feed the logs into it ? |
fwiw i use influxdb for similar needs, but it's on the same level as timescaledb. For log "ingestion", promtail & https://github.com/grafana/loki are used to send metrics to influxdb, but you can also use telegraf or fluentd for the logs. https://linuxfr.org/news/loki-centralisation-de-logs-a-la-sauce-prometheus |
Here is an example config (commented) that feeds the database via log4j2 using JDBC appenders: https://github.com/georchestra/cadastrapp/blob/master/cadastrapp/src/main/resources/log4j2.properties It is fed the same way as PostgreSQL, since TimescaleDB is a PostgreSQL extension. You can find how to configure retention in the TimescaleDB docs: https://docs.timescale.com/timescaledb/latest/how-to-guides/data-retention/ |
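As a rough illustration of the TimescaleDB setup discussed above, here is a minimal Python sketch that only builds the SQL statements for a hypertable with a retention policy. The table name `ogc_requests` and its columns are hypothetical and not taken from geOrchestra; `create_hypertable()` and `add_retention_policy()` are TimescaleDB functions, and the retention period shown is an arbitrary assumption.

```python
from datetime import timedelta


def timescale_setup_sql(table: str = "ogc_requests",
                        keep: timedelta = timedelta(days=365)) -> list[str]:
    """Return SQL statements preparing a hypertable with a retention policy.

    The table name and columns are hypothetical placeholders.
    """
    return [
        # Plain PostgreSQL table; TimescaleDB needs a time column to partition on.
        f"CREATE TABLE IF NOT EXISTS {table} ("
        " time TIMESTAMPTZ NOT NULL,"
        " username TEXT, org TEXT, service TEXT, request TEXT, layer TEXT)",
        # Turn the table into a hypertable partitioned on the time column.
        f"SELECT create_hypertable('{table}', 'time', if_not_exists => TRUE)",
        # Automatically drop chunks older than the retention period.
        f"SELECT add_retention_policy('{table}', INTERVAL '{keep.days} days')",
    ]


if __name__ == "__main__":
    for statement in timescale_setup_sql():
        print(statement + ";")
```

The statements could then be run with any PostgreSQL client, after which the log4j2 JDBC appender (or any other writer) inserts rows as it would into a regular PostgreSQL table.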
@jusabatier, @landryb and @jeanpommier: another idea. Without changing the architecture or adding more frameworks, and since Elasticsearch and Kibana are already installed, we could test some simple log insertion via Logstash and create a route that makes Kibana accessible to admin users, with a specific dashboard.
This point will probably need more explanation. I know we already spoke about it, but it deserves a dedicated discussion when the time comes. Do you have an idea of when you'd like to replace security-proxy with georchestra-gateway ? |
True, but already installed for Geonetwork4 so could be shared |
that's more or less discouraged, as kibana is configured for GN indexes only, and there's some hairy url rewriting being done too... |
And the default usage made by GN4 (indexing metadata only) is quite moderate, which allows a "relatively light" ES setup. Logs are known to quickly become massive, especially if we want some retention time, which we will need for analytics. I'd rather have some experiments first with lighter tools like loki. How far did you go with Loki, @landryb ? |
i have a promtail/loki/grafana dashboard with nginx metrics for mapserver/mapproxy logs. This was a poc done by students in 2021. It's been running in production for 2 years.
i've never got around to digging further into it to expand it for other needs and fine-tune it, but the logic is sound, and it's lightweight.
the loki datadir takes 14 GB with only those nginx metrics from 2 years. i've other truenas/proxmox dashboards in grafana but those are not related to log parsing. |
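To make the log-parsing idea more concrete, here is a small Python sketch of the kind of extraction such a pipeline performs on each access-log line. It is not the promtail configuration used above: it assumes nginx's default "combined" log format and OGC key-value requests, and the function and field names are hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed nginx "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent ...
LINE_RE = re.compile(
    r'(?P<addr>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_ogc_request(line: str):
    """Extract OGC parameters from one access-log line, or return None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # OGC KVP parameter names are case-insensitive, so normalise keys to lower case.
    raw = parse_qs(urlsplit(m["url"]).query)
    query = {key.lower(): values[0] for key, values in raw.items()}
    if "service" not in query:
        return None  # not an OGC key-value request
    return {
        "time": m["time"],
        "status": int(m["status"]),
        "service": query["service"].upper(),
        "request": query.get("request"),
        # WMS uses LAYERS, WFS uses TYPENAME or TYPENAMES.
        "layers": query.get("layers") or query.get("typename") or query.get("typenames"),
    }


if __name__ == "__main__":
    sample = ('192.0.2.1 - - [10/Oct/2023:13:55:36 +0200] '
              '"GET /geoserver/wms?SERVICE=WMS&REQUEST=GetMap&LAYERS=ns:roads HTTP/1.1" '
              '200 2326 "-" "QGIS"')
    print(parse_ogc_request(sample))
```

Records like these are what a dashboard would then count per service, request type or layer, whichever storage backend ends up holding them.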
OK, so, to sum up, we have two technical solutions here:
Do we do POCs, or is there one emerging from a technical / strategic POV ? Naturally, I would favor timescaledb since it requires fewer additional stacks on top of the already existing components, and offers the potential to rewrite the analytics backend if we want to provide key metrics in the console or anywhere else. |
I apologize in advance for inserting noise in this discussion:
|
there were some tests during the community sprint with loki and ES as alternatives, @jeanpommier can give more details
iirc influxdb is more used for metrics coming from telegraf, loki stores its data in its own database |
Florent Berault from MEL says that his analytics needs go beyond just OGC WMS/WFS/etc. |
Hi @fvanderbiest |
Who ?
Camptocamp, with funding from MEL
Target Module
The ogc-server-statistics logger will be removed, which also implies the removal of the analytics webapp and a rework of the front console application.
What ?
As said above, we plan to remove the OGC logging feature from geOrchestra core.
Why ?
We're preparing the replacement of the older security-proxy by the georchestra-gateway.
ogc-server-statistics
How ?
Essentially git rm -rf analytics and git rm -rf ogc-server-statistics.
Any potential pitfalls and ways to circumvent them ?
There's no plan yet to provide an equivalent feature by <<insert here any fancy tech like ELK or ... >>
Maybe we should ?
When ?
One should expect geOrchestra 24.0 to be free from ogc-server-statistics, which means funding will be required to get the equivalent feature by then.
State of the vote: