Our step-by-step guide to evaluating runtime security tools

Choosing the right runtime security tool is critical for protecting modern cloud-native environments. We recently undertook a rigorous evaluation process using real-world attack simulations on our Kubernetes clusters and Linux servers. Why? Because traditional cloud audit logs do not provide enough detail, leaving critical gaps in threat detection, incident response, and forensic analysis. Our evaluation examined each critical stage of an attack, from initial access to lateral movement and data exfiltration.

While we won't be naming the specific vendor in this post, we want to share our detailed methodology and key learnings, providing a blueprint you can adapt for your own security tool evaluations.

Why are runtime security tools necessary?

Without runtime security tools, detecting “suspicious activities” and understanding “what actually happened” during an attack can become extremely challenging.

Limitations of cloud audit logs

Lack of runtime details
Cloud audit logs primarily record operations and data access within the cloud. However, they do not capture runtime-level activities on systems such as Kubernetes servers, so they miss fine-grained command executions, process behaviors, and transient network activities.
Gaps in investigation and forensics
In Kubernetes environments, the absence of continuous, real-time logging can lead to the loss of critical activity records once a container terminates.

Although well-known open-source runtime security tools are available, we decided to evaluate a commercial product to assess additional capabilities and enterprise-level support through attack simulation testing.

The role and purpose of runtime security tools

Runtime security tools address these cloud audit log limitations by continuously monitoring systems in real time, offering the following functionalities:

Threat detection
They monitor command executions, system calls, and network events in real time to detect abnormal behaviors as they occur, which enables the security team to respond rapidly. While some public cloud providers now offer limited runtime monitoring capabilities, these native solutions typically lack the depth and comprehensive coverage of dedicated security tools.
Incident response
By maintaining detailed chronological records of system activities, these tools provide security teams with the evidence needed to reconstruct attack timelines, determine the full scope of compromise, and conduct thorough forensic investigations after an incident occurs.
Scalability in investigations
Unlike traditional endpoint-by-endpoint forensic analysis, runtime security tools allow teams to collect, store, and analyze data centrally across the entire environment. This enables the efficient investigation of incidents without manually correlating disparate data sources.

(Note: Products that also offer container information or server vulnerability monitoring are outside the scope of this discussion.)

Key evaluation points

Our primary objective in evaluating a runtime security tool was to determine its effectiveness in real-world security investigations. While evaluations often focus on the volume of detections or overall coverage, in actual operations, an overload of false positives – or tens of alerts for a single attack chain – can paralyze incident response teams. Therefore, our in-depth investigation centered on whether the tool could help security operations understand and respond to actual attacks.

Detection capability
- Built-in rules
  We assessed whether the built-in rule sets could effectively detect a variety of attack techniques and provide the necessary detail for accurate detection.
- Custom detection capabilities
  We evaluated how easily additional rules could be integrated, and considered the quality of the telemetry data delivered by the product, since that data is what lets us build our own monitoring solutions based on our understanding of our environment.
- Alert quality
  We also verified the rate of false positives, confirming that the tool focuses on genuine security threats requiring action while minimizing noise that could cause alert fatigue.
Incident response
- Richness of logs
  We evaluated whether the logs capture sufficient details – including executed commands, network connections, DNS queries, and process information – to fully reconstruct the incident. The ability to piece together the entire attack scenario and determine the full impact is crucial during incident response.
- Log searchability
  We assessed how effectively the tool allowed us to search, filter, and correlate events across multiple systems. The ability to quickly query massive volumes of data is essential for timely investigations during security incidents.
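
The alert quality criterion above can be made measurable once each alert from a simulation run has been labeled by hand. The following is a minimal sketch under stated assumptions, not the method of any particular product: the `Alert` type, its fields, and the rule names are hypothetical, and `chain_id` is assumed to be assigned by the evaluator (`None` meaning the alert was a false positive).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    rule: str                 # name of the rule that fired (hypothetical)
    chain_id: Optional[str]   # attack chain the alert belongs to; None = false positive


def alert_quality(alerts):
    """Summarize noise: share of false positives and alert count per attack chain."""
    total = len(alerts)
    false_positives = sum(1 for a in alerts if a.chain_id is None)
    per_chain = Counter(a.chain_id for a in alerts if a.chain_id is not None)
    return {
        # Fraction of all alerts that did not correspond to a simulated attack.
        "false_positive_share": false_positives / total if total else 0.0,
        # Many alerts for one chain can overwhelm responders as much as false positives.
        "alerts_per_chain": dict(per_chain),
    }
```

Tracking both numbers side by side reflects the concern raised earlier: a tool can have few false positives and still produce tens of alerts for a single attack chain.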

Evaluation process

We divided our evaluation process into four major phases:

Development of attack scenarios
We designed scenarios that mimicked real-world attack flows. These scenarios, developed in collaboration with our Red Team, included the following elements:
- attacks exploiting GitLab-specific vulnerabilities (e.g., CVE-2021-22205)
- attacks leveraging the compromise of developer laptops
- detailed step-by-step attack procedures
Infrastructure setup
We deployed two parallel environments:
- Kubernetes environment
- Virtual machine (VM) environment
We installed an older version of GitLab to test known vulnerabilities and carried out similar evaluation flows in both the Kubernetes and VM environments.
Execution of attacks
We executed the attack flow for each scenario and meticulously recorded the timeline – from initial access to lateral movement and data exfiltration.
Analysis of results
We conducted a comprehensive evaluation of detection capabilities, log richness, and areas for improvement, clearly outlining the strengths and weaknesses of the tool.

Attack scenarios

Scenario 1: Exploitation of a known GitLab vulnerability

Attack flow
1. Initial access
  We simulated an attack by exploiting CVE-2021-22205, a known GitLab vulnerability that allows remote code execution. This granted us unauthorized access to the target system.
2. Command execution
  After gaining access, we executed a reverse shell to interact remotely with the compromised machine and take control.
3. Deployment of a C2 agent
  We installed a Command and Control (C2) agent to evaluate persistence techniques, enabling us to execute further commands and manage the system remotely.
4. Lateral movement
  We then moved laterally within the environment, accessing Kubernetes API secrets and PostgreSQL databases.
5. Data exfiltration
  We exfiltrated sensitive data via a dedicated C2 channel.

The following table summarizes the attack techniques used at each phase:

Initial access	Command and control	Enumeration	Credential access	Lateral movement	Collection	Exfiltration
Exploit GitLab application using known RCE vulnerability	Execute known reverse shell command	Harvesting info on the box	Get environment variables	Get secret from Kubernetes API	Get data from Cloud Storage	Exfiltration over C2 channel
	Install post-exploitation C2 agent		Get K8s token	Access to database		DNS exfiltration
	SOCKS proxy		Get cloud token via Cloud metadata server

Scenario 2: Compromise of a developer’s laptop

Attack flow
1. Initial compromise
  We simulated an attacker compromising a developer’s laptop and abusing legitimate credentials to gain unauthorized access to internal resources.
2. Privilege escalation
  Using the compromised credentials, we escalated privileges within the Kubernetes environment.
3. Container manipulation
  We deployed a privileged container to extract sensitive information.
4. Data exfiltration and persistence
  We exfiltrated sensitive data while maintaining persistent access.
  
The following table summarizes the attack techniques used at each phase:

Initial access	Execution	Privilege escalation	Credential access	Lateral movement	Exfiltration
Valid account (kubectl)	Create a new container	Create a privileged container	Get K8s secrets using the node's privileges	Enter a container on the same node	Upload credential data to the attacker’s server
			Get environment variables from containers via the `crictl` command on the node

Execution of the attacks

During the execution of the attack scenarios, we followed these steps to obtain detailed records:

Verification of detections: We confirmed whether each attack command was detected and whether the key points of each scenario were properly flagged.
Timeline recording: Every event was logged in sequence to assess how well command executions and network communications were captured.
Scoring and analysis: We scored each event based on detection effectiveness to quantitatively evaluate the tool’s performance.
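
One way to turn the three steps above into numbers is to record an outcome for every attack step and aggregate per phase. This is a minimal sketch, not the scoring scheme actually used in the evaluation: the three outcome levels, their weights, and the `Step` type are assumptions made for illustration, and the sample outcomes are invented rather than real results.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ALERTED = 2   # a detection rule fired for the step
    LOGGED = 1    # no alert, but the step is visible in the telemetry
    MISSED = 0    # neither alerted nor logged


@dataclass(frozen=True)
class Step:
    phase: str       # e.g. a column of the technique tables above
    action: str      # what the attacker did
    outcome: Outcome


def score_by_phase(steps):
    """Return {phase: fraction of the maximum possible score achieved}."""
    earned = {}
    possible = {}
    for step in steps:
        earned[step.phase] = earned.get(step.phase, 0) + step.outcome.value
        possible[step.phase] = possible.get(step.phase, 0) + Outcome.ALERTED.value
    return {phase: earned[phase] / possible[phase] for phase in earned}


# Illustrative outcomes only; these are not results from our evaluation.
timeline = [
    Step("Initial access", "Exploit GitLab RCE", Outcome.ALERTED),
    Step("Credential access", "Get K8s token", Outcome.LOGGED),
    Step("Credential access", "Get environment variables", Outcome.MISSED),
    Step("Exfiltration", "Exfiltration over C2 channel", Outcome.LOGGED),
]
print(score_by_phase(timeline))
# → {'Initial access': 1.0, 'Credential access': 0.25, 'Exfiltration': 0.5}
```

Separating "logged" from "alerted" keeps the distinction between detection capability and log richness visible in the score.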

What we learned

Don't overestimate – test commercial products yourself

Identifying and addressing detection gaps (collaboration with vendors)
Our evaluation revealed that several critical scenarios and events were not detected or not logged. Consequently, we held meetings with the vendor and submitted multiple improvement requests. As a result, the vendor enhanced the product by adding new features and improving detection capabilities, with many issues identified during our evaluation subsequently addressed.
Understanding the limitations
Many modern runtime security tools use eBPF to monitor Linux system calls for detection. However, many commands issued through a C2 framework run inside the agent's existing process rather than spawning a new one, so there is no process-creation event to key on. As a result, tracing these attack events proved challenging.
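
The difference can be illustrated with two ways of listing a directory. This is a simplified sketch for Linux, not code from any C2 framework or security product; the function names are hypothetical. The first variant spawns `ls`, which creates a new process that process-creation monitoring can observe. The second does the same work inside the calling process, which is roughly how an agent's built-in commands behave.

```python
import os
import subprocess
import tempfile


def list_by_spawning(path):
    """Run `ls` as a child process; process-creation monitoring sees a new process."""
    result = subprocess.run(
        ["ls", "-A", path], capture_output=True, text=True, check=True
    )
    return sorted(result.stdout.splitlines())


def list_in_process(path):
    """Read the directory from the current process; no new process is created."""
    return sorted(os.listdir(path))


# Both calls return the same listing, but only the first leaves a process event.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.txt", "b.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    assert list_by_spawning(demo_dir) == list_in_process(demo_dir)
```

Detecting the second variant means looking at lower-level signals such as file access or network activity from the already-running process, which is typically noisier than matching on process creation.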
Recognizing tool boundaries
Our findings highlighted that, during incident response, relying solely on runtime security tools is insufficient. It is essential to combine them with other logs, such as Kubernetes audit logs and cloud logs, to gain a comprehensive view.
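
Combining sources mostly comes down to putting their events on one timeline. A minimal sketch of that step follows, assuming each source has already been exported as a time-sorted list of dictionaries with an ISO 8601 `ts` field. The field names and sample events are hypothetical and do not reflect the real schema of Kubernetes audit logs or of any runtime security product.

```python
import heapq
from datetime import datetime


def merge_timelines(**sources):
    """Merge per-source event lists (each sorted by time) into one ordered timeline.

    Every event is copied and tagged with the name of the source it came from.
    """
    tagged = [
        [dict(event, source=name) for event in events]
        for name, events in sources.items()
    ]
    return list(
        heapq.merge(*tagged, key=lambda e: datetime.fromisoformat(e["ts"]))
    )


# Hypothetical sample events for illustration only.
runtime_events = [
    {"ts": "2024-01-01T10:00:05+00:00", "event": "exec /bin/sh"},
    {"ts": "2024-01-01T10:00:20+00:00", "event": "connect 10.0.0.5:5432"},
]
k8s_audit_events = [
    {"ts": "2024-01-01T10:00:10+00:00", "event": "get secrets"},
]

for entry in merge_timelines(runtime=runtime_events, k8s_audit=k8s_audit_events):
    print(entry["ts"], entry["source"], entry["event"])
```

`heapq.merge` is used because it assumes the inputs are already sorted and merges them lazily, which suits large log exports better than concatenating and re-sorting everything.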

The importance of continuous runtime event logging in Kubernetes

In Kubernetes environments, there is a risk of losing forensic data when containers terminate, making continuous logging indispensable. Our evaluation confirmed that establishing a scalable, persistent logging infrastructure is crucial. Without proper runtime security tools, a significant amount of critical information could be lost post-attack.

Summary

We do not simply install security tools – we evaluate their utility to help ensure that our customers can safely use GitLab.com. Thorough product assessments like the one outlined above not only reveal unique use cases and areas for improvement that vendors might overlook, but also provide valuable insights that help both the vendor and internal teams decide how best to use the tool.
