Metric collector v2 #286
cpuCurrent := podMetric.Containers[0].Usage.Cpu().ToDec().AsApproximateFloat64() * 1000
memoryCurrent := podMetric.Containers[0].Usage.Memory().ToDec().AsApproximateFloat64() / 1000 / 1000
Is Containers[0] guaranteed to be the right container?
Good point. For apps and jobs, yes!
For builds, it seems that the larger container appears first in the list.
I will double-check.
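One way to avoid depending on container order is to look the container up by name and fall back to summing all containers when the name is not found. The following is only a sketch: the `containerUsage` type is a hypothetical stand-in for the entries of `podMetric.Containers`, and the container names are made up for illustration.

```go
package main

import "fmt"

// containerUsage is a hypothetical, simplified stand-in for one entry of
// podMetric.Containers; values are already converted to plain integers.
type containerUsage struct {
	Name          string
	CPUMillicores int64
	MemoryBytes   int64
}

// usageFor returns the usage of the container called name. If no container
// has that name, it returns the sum over all containers and matched=false,
// so the result never depends on the order of the list.
func usageFor(containers []containerUsage, name string) (cpu, mem int64, matched bool) {
	for _, c := range containers {
		if c.Name == name {
			return c.CPUMillicores, c.MemoryBytes, true
		}
	}
	for _, c := range containers {
		cpu += c.CPUMillicores
		mem += c.MemoryBytes
	}
	return cpu, mem, false
}

func main() {
	// Illustrative data only; "user-container" is an assumed name.
	containers := []containerUsage{
		{Name: "sidecar", CPUMillicores: 5, MemoryBytes: 20_000_000},
		{Name: "user-container", CPUMillicores: 120, MemoryBytes: 256_000_000},
	}
	cpu, mem, matched := usageFor(containers, "user-container")
	fmt.Println(cpu, mem, matched)
}
```

Whether summing is the right fallback depends on what the dashboard should show for builds; reporting only the largest container would be an equally plausible choice.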
Co-authored-by: Norman Böwing <9320860+norman465@users.noreply.github.com>
1. The metrics collector exposes Prometheus metrics on `localhost:9100/metrics`
2. The embedded Prometheus agent scrapes these metrics every 30 seconds
3. The agent also discovers and scrapes pods with the `codeengine.cloud.ibm.com/userMetricsScrape: 'true'` annotation
I think the agent would discover those pods and scrape additional custom metrics. Default CPU/memory metrics would be collected for all workloads irrespective of that annotation, right?
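The distinction raised here can be sketched as a simple opt-in check: default resource metrics apply to every pod, while custom scraping applies only to annotated pods. This is an illustration under assumptions, not the agent's actual discovery logic; the `pod` type and `wantsCustomScrape` helper are hypothetical, and only the annotation key comes from the README text.

```go
package main

import "fmt"

// Annotation key taken from the README; everything else here is hypothetical.
const userMetricsScrapeAnnotation = "codeengine.cloud.ibm.com/userMetricsScrape"

// pod is a minimal stand-in for the pod metadata the agent would inspect.
type pod struct {
	Name        string
	Annotations map[string]string
}

// wantsCustomScrape reports whether the pod opted in to custom metric scraping.
// Reading from a nil map is safe in Go and yields the empty string.
func wantsCustomScrape(p pod) bool {
	return p.Annotations[userMetricsScrapeAnnotation] == "true"
}

func main() {
	pods := []pod{
		{Name: "app-with-custom-metrics", Annotations: map[string]string{userMetricsScrapeAnnotation: "true"}},
		{Name: "plain-app"},
	}
	for _, p := range pods {
		// Default CPU/memory metrics would be collected for both pods;
		// only the annotated one is additionally scraped.
		fmt.Printf("%s: custom scrape = %t\n", p.Name, wantsCustomScrape(p))
	}
}
```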
- `eu-de` - EU Central (Frankfurt)
- `eu-es` - EU Spain (Madrid)
- `eu-gb` - EU GB (London)
- `jp-tok` - Japan (Tokyo)
Suggested change:
- `jp-tok` - Japan (Tokyo)
- `in-che` - India (Chennai)
["eu-de"]="https://eu-de.monitoring.cloud.ibm.com"
["eu-es"]="https://eu-es.monitoring.cloud.ibm.com"
["eu-gb"]="https://eu-gb.monitoring.cloud.ibm.com"
["jp-tok"]="https://jp-tok.monitoring.cloud.ibm.com"
Suggested change:
["jp-tok"]="https://jp-tok.monitoring.cloud.ibm.com"
["in-che"]="https://in-che.monitoring.cloud.ibm.com"
#!/bin/bash
set -euxo pipefail

docker build --platform linux/amd64 .
What is the purpose of this compile & build verification step? It's currently not used.
sb.WriteString("# HELP ibm_codeengine_instance_cpu_usage_millicores Current CPU usage in millicores\n")
sb.WriteString("# TYPE ibm_codeengine_instance_cpu_usage_millicores gauge\n")
for _, m := range metrics {
	labels := fmt.Sprintf("ibm_codeengine_instance_name=\"%s\",ibm_codeengine_component_type=\"%s\",ibm_codeengine_component_name=\"%s\"",
		escapeLabelValue(m.Name),
		escapeLabelValue(m.ComponentType),
		escapeLabelValue(m.ComponentName))
	sb.WriteString(fmt.Sprintf("ibm_codeengine_instance_cpu_usage_millicores{%s} %d\n", labels, m.Cpu.Current))
}
sb.WriteString("\n")
I would prefer to move these out so that every metric type has its own method, leaving a clean format method that calls them one after another.
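A rough sketch of what that split could look like, with one writer function per metric and a short `formatMetrics` that calls them in order. The `instanceMetric` type, the helper names, the `escapeLabelValue` implementation, and the memory metric name are assumptions made for illustration; only the CPU metric name and the label keys are taken from the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// instanceMetric is a hypothetical, flattened version of the collector's metric type.
type instanceMetric struct {
	Name, ComponentType, ComponentName string
	CPUCurrent                         int64
	MemoryCurrent                      int64
}

// labelEscaper escapes backslash, double quote, and newline in label values.
// This is an assumed implementation; the PR's own escapeLabelValue is not shown.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

func escapeLabelValue(v string) string { return labelEscaper.Replace(v) }

// labelsFor builds the label set shared by all instance metrics.
func labelsFor(m instanceMetric) string {
	return fmt.Sprintf(`ibm_codeengine_instance_name="%s",ibm_codeengine_component_type="%s",ibm_codeengine_component_name="%s"`,
		escapeLabelValue(m.Name), escapeLabelValue(m.ComponentType), escapeLabelValue(m.ComponentName))
}

// writeGauge writes the HELP/TYPE header, one sample per metric, and a blank line.
func writeGauge(sb *strings.Builder, name, help string, metrics []instanceMetric, value func(instanceMetric) int64) {
	fmt.Fprintf(sb, "# HELP %s %s\n", name, help)
	fmt.Fprintf(sb, "# TYPE %s gauge\n", name)
	for _, m := range metrics {
		fmt.Fprintf(sb, "%s{%s} %d\n", name, labelsFor(m), value(m))
	}
	sb.WriteString("\n")
}

func writeCPUUsage(sb *strings.Builder, metrics []instanceMetric) {
	writeGauge(sb, "ibm_codeengine_instance_cpu_usage_millicores", "Current CPU usage in millicores",
		metrics, func(m instanceMetric) int64 { return m.CPUCurrent })
}

// The memory metric name below is hypothetical.
func writeMemoryUsage(sb *strings.Builder, metrics []instanceMetric) {
	writeGauge(sb, "ibm_codeengine_instance_memory_usage_bytes", "Current memory usage in bytes",
		metrics, func(m instanceMetric) int64 { return m.MemoryCurrent })
}

// formatMetrics stays short: it only calls the per-metric writers in order.
func formatMetrics(metrics []instanceMetric) string {
	var sb strings.Builder
	writeCPUUsage(&sb, metrics)
	writeMemoryUsage(&sb, metrics)
	return sb.String()
}

func main() {
	fmt.Print(formatMetrics([]instanceMetric{
		{Name: "my-app-00001", ComponentType: "app", ComponentName: "my-app", CPUCurrent: 42, MemoryCurrent: 64_000_000},
	}))
}
```

Adding another metric type would then mean adding one small writer and one line in `formatMetrics`, rather than extending a single long function.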
This PR enables resource metric integration within Code Engine by running a metrics-collector that emits CPU and memory usage metrics to IBM Cloud Logs.
Furthermore, this PR contains a dashboard that can be imported into IBM Cloud Monitoring:

On top of that, this PR demonstrates a way for Code Engine jobs and apps to emit custom metrics, which are sent to Sysdig.
See the README for further details: https://github.com/IBM/CodeEngine/blob/metric-collector-v2/metrics-collector/README.md
To demonstrate custom metric collection, this PR provides an enrichment of the network-test-app (see the README: https://github.com/IBM/CodeEngine/blob/metric-collector-v2/network-test-app/README.md).
