Performance and System Monitoring

The Testing Automation Engine ships with a pre-installed instance of Prometheus, a third-party tool for monitoring and performance analytics. It collects system-level and test-execution metrics, which are described below.

Table of Contents

Performance Metrics Table
Meaning of Parameters
Key Features of Prometheus in the Testing Automation Engine
Data Centralization with Remote Write
KPI Export and Grafana Integration
Benefits of the Prometheus-Grafana Setup

Performance Metrics Table

Metric	Parameters	Description
device_battery_level	Device test_node instance job tenant	Monitors the battery level of devices during testing.
device_network_operator	Device test_node instance job tenant	Tracks the network operator used by devices during testing.
device_radio_access_type	Device test_node instance job tenant	Indicates the radio access type used by devices during testing.
device_signal_strength	Device test_node instance job tenant	Measures the signal strength of devices during testing.
device_signal_rsrp	Device test_node instance job tenant	Reference Signal Received Power (RSRP): Measures average LTE reference signal strength.
device_signal_rsrq	Device test_node instance job tenant	Reference Signal Received Quality (RSRQ): Indicates LTE signal quality using RSRP and RSSI.
device_signal_rssnr	Device test_node instance job tenant	Reference Signal Signal-to-Noise Ratio (RSSNR): Reflects LTE signal clarity relative to noise and interference.
device_signal_rssi	Device test_node instance job tenant	Received Signal Strength Indicator (RSSI): Represents total received signal power including noise.
data_speed	Device Test_Case instance job tenant	Aggregates Download/Upload Speed, Jitter, Latency, Packet Loss, Round-Trip Time (RTT), Data Roaming Success Rate and Data Session Setup Time.
device_temperature	Device test_node instance job tenant	Monitors the temperature of devices during testing for stability analysis.
call_duration	Device Test_Case instance job tenant	Call Duration = Time (BYE) − Time (ACK) from device; calculated for every call (VoLTE and non-VoLTE) and exported to Prometheus.
call_setup_time	Device Test_Case instance job tenant	Call Setup Time = Time (180 Ringing) − Time (SIP INVITE); computed for every VoLTE call regardless of SIP logcat state and exported to Prometheus.
test_case_results_total	instance job tenant result testcase	Tracks the total number of results for each test case in the system.
testnode_command_results_total	command instance job tenant result testnode	Tracks the total results of commands executed on test nodes.
testnode_up_status	instance job tenant name	Tracks the status of test nodes to determine if they are up and running.
total_mobiledevices	instance job tenant	Tracks the total number of mobile devices involved in testing.
total_testcases	instance job tenant	Tracks the total number of test cases available in the system.
total_testexecutions	instance job tenant	Tracks the total number of test executions performed in the system.
total_testnodes	instance job tenant	Tracks the total number of test nodes available in the system.
total_testsuites	instance job tenant	Tracks the total number of test suites executed in the system.
ts_disabled_rate	Test_Suite instance job tenant	Calculates the disabled rate of test suites.
ts_success_rate	Test_Suite instance job tenant	Calculates the success rate of test suites.
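
The two call timing formulas in the table (`call_setup_time` and `call_duration`) can be illustrated with a short worked example. This is a minimal sketch, not the engine's implementation: the SIP timestamps are hypothetical, and the result is assumed to be expressed in seconds, which the table does not state.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> float:
    """Return (end - start) in seconds for two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds()


# Hypothetical SIP message timestamps observed on the device for one VoLTE call.
sip_events = {
    "INVITE": "2024-01-01T10:00:00.000",
    "180 Ringing": "2024-01-01T10:00:01.250",
    "ACK": "2024-01-01T10:00:04.000",
    "BYE": "2024-01-01T10:00:34.500",
}

# Call Setup Time = Time (180 Ringing) - Time (SIP INVITE)
call_setup_time = seconds_between(sip_events["INVITE"], sip_events["180 Ringing"])

# Call Duration = Time (BYE) - Time (ACK)
call_duration = seconds_between(sip_events["ACK"], sip_events["BYE"])

print(f"call_setup_time={call_setup_time}s call_duration={call_duration}s")
```

With these sample timestamps, the setup time is 1.25 seconds and the call duration is 30.5 seconds.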

Meaning of Parameters

Parameter	Description	Explanation
command	Name of the command executed on a test node.	Identifies the function or operation run on a test node, such as SPEED_TEST or DATA_ROAMING; `testnode_command_results_total` counts results per command.
Device	Refers to the physical or virtual device being tested.	This could include smartphones, IoT devices, or emulated devices used during testing.
instance	Identifies the endpoint that reported the metric.	In Prometheus this label is normally the `host:port` of the scraped target, so it distinguishes which engine or environment (e.g., production, staging) produced the results.
job	Groups targets that serve the same purpose.	In Prometheus this label is normally the name of the scrape job configured for a set of targets.
name	Identifier for the test node.	Provides unique naming or labels for nodes being used during execution.
result	Outcome of a test case or command (pass/fail).	Tracks success, failure, or exceptions generated during testing.
tenant	Indicates the client or organization for whom testing is performed.	Represents multi-tenancy structures where tests align with a particular client’s environment or needs.
testcase	Specific test scenario or script.	Defines individual test scripts that validate particular features or use cases.
test_node	Physical or virtual node where tests are executed.	Nodes are computing resources (e.g., virtual machines, Docker containers) used to run tests. Appears as `testnode` on `testnode_command_results_total`.
Test_Suite	A group of test cases executed together.	Collections of related tests for a specific functionality or module to validate grouped execution.
Test_Case	Label of the executed test case.	Matches a single run in the test matrix; used by `data_speed`, `call_duration` and `call_setup_time` metrics.
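
The parameters above are Prometheus labels, so they can be used to filter any metric in a query. The sketch below builds a PromQL selector from a set of labels and wraps it in a URL for the Prometheus instant-query endpoint (`/api/v1/query`). The label values (`acme`, `pixel-7`) and the base URL are hypothetical placeholders; substitute the address of your own Prometheus instance.

```python
from urllib.parse import urlencode


def build_selector(metric: str, labels: dict) -> str:
    """Build a PromQL selector such as metric{a="x",b="y"} (labels sorted by name)."""

    def escape(value: str) -> str:
        # Backslashes and double quotes must be escaped inside PromQL string literals.
        return value.replace("\\", "\\\\").replace('"', '\\"')

    matchers = ",".join(
        f'{name}="{escape(value)}"' for name, value in sorted(labels.items())
    )
    return f"{metric}{{{matchers}}}" if matchers else metric


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical label values; the label names come from the tables above.
selector = build_selector(
    "device_battery_level", {"tenant": "acme", "Device": "pixel-7"}
)
url = build_query_url("http://localhost:9090", selector)

print(selector)
print(url)
```

Sorting the labels keeps the generated selector stable, which makes queries easier to compare and cache.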

Key Features of Prometheus in the Testing Automation Engine

Comprehensive Metrics: CPU utilization, memory allocation, disk usage, input/output (I/O) operations, and system processes.
Detailed Test Execution Insights: Prometheus delivers metrics that evaluate the performance and health of test execution processes.
Key Performance Indicators (KPIs): The engine includes built-in KPIs for various test execution environments, enabling granular monitoring.

Data Centralization with Remote Write

Prometheus's remote write feature lets each node in the testing engine cluster forward its metrics to an external Prometheus server. This enables:

Centralized Data Collection: All metrics from distributed nodes are aggregated into a single Prometheus instance.
Custom Alerting: Users can define alerts based on custom KPI thresholds, ensuring timely notifications for anomalies.

KPI Export and Grafana Integration

Metrics collected by Prometheus can be exported to an external Prometheus instance using the KPI shipping capability. This enables:

Data Aggregation: Metrics from all test environment virtual machines (VMs) are consolidated.
Visualization with Grafana: The external Prometheus instance acts as the primary data source for Grafana, which builds dashboards for real-time monitoring and performance analytics.
Robust Alerting: Alerts can be defined using user-configured thresholds or pre-exported KPIs, supporting efficient issue detection.
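
A threshold check of the kind described above can be prototyped against the JSON that the Prometheus instant-query API returns. The sketch below is illustrative only: the response is a hand-written sample in the documented API shape, the suite names and the threshold of 90 are hypothetical, and `ts_success_rate` is assumed to be reported as a percentage. Production alerts would normally be defined as Prometheus alerting rules or Grafana alerts rather than in a script.

```python
import json


def below_threshold(response_text: str, threshold: float) -> list:
    """Return (labels, value) pairs for series whose sample is below threshold."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    if payload["data"]["resultType"] != "vector":
        raise ValueError("expected an instant vector result")

    breaches = []
    for series in payload["data"]["result"]:
        # Each instant-vector sample is [unix_timestamp, "value as string"].
        _timestamp, raw_value = series["value"]
        value = float(raw_value)
        if value < threshold:
            breaches.append((series["metric"], value))
    return breaches


# Hand-written sample response (hypothetical suites and values).
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "ts_success_rate", "Test_Suite": "smoke"},
             "value": [1700000000.0, "97.5"]},
            {"metric": {"__name__": "ts_success_rate", "Test_Suite": "regression"},
             "value": [1700000000.0, "82.0"]},
        ],
    },
})

breaches = below_threshold(sample_response, threshold=90.0)
for labels, value in breaches:
    print(f"ALERT: {labels['Test_Suite']} success rate {value} is below 90.0")
```

With the sample data, only the `regression` suite is flagged, because its value of 82.0 falls below the threshold.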

Benefits of the Prometheus-Grafana Setup

Real-time visualization and monitoring for test execution processes.
Improved system health and performance analytics.
Customizable dashboards and alerts tailored to testing requirements.