SYSOPS METRICS / HEALTH CHECKS

As of May 7th 2018, here are the remaining outstanding issues for metrics:

TODO OUTSTANDING ISSUES

1) Need to make my own Grafana dashboard for non-endpoint stats. Actually, a dashboard that covers all of AyaNova would be good.
   https://www.influxdata.com/blog/how-to-use-grafana-with-influxdb-to-monitor-time-series-data/
2) Save the dashboard as JSON text for the manual
3) See about making my own Grafana / InfluxDB container and including it in compose.yml for AyaNova server so it can be deployed easily (with my own panels pre-built)
4) DOCUMENT
5) Skim below and see if I have covered it all.

Need to revise my metrics to bring in mvccore instead of mvc?:
- https://github.com/AppMetrics/AppMetrics/issues/261#issuecomment-404051808

Might be time to order all this for best effectiveness if it isn't already.

OLD OLD OLD OLD
This is old stuff I was using during research and initial implementation; some of it may still be relevant later
=-=-=-=-=-=-=-=-=-

APRIL 26 2018 - DID SOME RESEARCH, THIS IS ACTUALLY A VERY COMPLEX TOPIC AND BEST HANDLED WITH A 3RD PARTY TOOL
- There is an open source metrics tool and an open source db it can work with that is a time series data store (influxdb, elasticsearch) designed for exactly this scenario
- Influxdb has a docker container available
- Downside is I would need some of this information built in for support purposes, not requiring fancy 3rd party tools, which are very cool for a large setup but more than a small one-man show needs.
- Perhaps RAVEN can have a big corporate edition that is all intended to be containerized and comes with influxdb and preconfigured with metrics on.
- It handles both metrics and "HEALTH CHECK" issues in one package
- I'm not sure if this is a v1.0 feature, though it would help in development to see what's what route-wise
- If it can be an optional thing that can be turned on then that would be ideal
- https://al-hardy.blog/2017/04/28/asp-net-core-monitoring-with-influxdb-grafana/
- https://www.app-metrics.io/

Ops Metrics
- CASE 3502 Add metrics
- CASE 3502 Metric: record count in each table, or at least the major ones, as a snapshot metric so can compare month to month.
- CASE 3497 ACTIVE user count
- Log user login, last login and logins per X period
- CASE 3499 "Slow": I want to know if anything is slow, not what the user says but what the code determines
    - some kind of internal metrics to track changes over time in operations, with thresholds to trigger logs maybe?
    - Has to be super fast, maybe an internal counter / cache in memory and a periodic job that writes it out to DB, i.e. don't write metrics to the db on every get operation etc
- Average response time?
- Busyness / unique logins or tokens in use? A way to see how many distinct users are connecting over a period of time so we know how utilized it is?
- Utilization?
- Areas / routes used in AyaNova and how frequently they are used (we could use this for feature utilization)
- CPU peak usage snapshot
- Disk space change over time snapshots

HEALTH CHECKS
- Comes with appmetrics:
- https://al-hardy.blog/2017/04/17/asp-net-core-health-checking/
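
SKETCH: IN-MEMORY COUNTER + PERIODIC WRITE (re CASE 3499 above)
Rough sketch only, to pin down the "internal counter / cache in memory and a periodic job that writes it out to DB" idea with a threshold check for "slow". Python is used just for illustration; this is not AyaNova code and not the App Metrics API. All names here (RouteMetrics, find_slow, start_flush_job, write_batch) are made up for the example, and write_batch stands in for whatever would actually write to the DB or log.

```python
import threading
import time
from collections import defaultdict


class RouteMetrics:
    """In-memory per-route counters; nothing touches the DB on the request path."""

    def __init__(self):
        self._lock = threading.Lock()
        # route -> [request count, total elapsed seconds]
        self._stats = defaultdict(lambda: [0, 0.0])

    def record(self, route, seconds):
        """Called once per request: one lock, two additions."""
        with self._lock:
            entry = self._stats[route]
            entry[0] += 1
            entry[1] += seconds

    def drain(self):
        """Return the accumulated stats and reset the counters."""
        with self._lock:
            snapshot, self._stats = self._stats, defaultdict(lambda: [0, 0.0])
        return {
            route: {"count": count, "avg_seconds": total / count}
            for route, (count, total) in snapshot.items()
        }


def find_slow(snapshot, threshold_seconds):
    """Routes whose average response time exceeds the threshold (candidates to log)."""
    return sorted(
        route for route, s in snapshot.items() if s["avg_seconds"] > threshold_seconds
    )


def start_flush_job(metrics, write_batch, interval_seconds, stop_event):
    """Background job: every interval, drain the counters and hand them to write_batch.

    write_batch is a hypothetical callable standing in for the real DB write.
    """

    def loop():
        # Event.wait returns False on timeout, True once stop_event is set.
        while not stop_event.wait(interval_seconds):
            snapshot = metrics.drain()
            if snapshot:
                write_batch(snapshot)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Usage would be something like: call metrics.record(route, elapsed) at the end of each request, start one flush job at startup with an interval of a minute or so, and run find_slow on each drained snapshot to decide whether to write a log entry. Draining resets the counters, so each write covers one interval only; month-to-month comparison would come from the stored rows.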