March 26, 2024
In 2020, there were an estimated 59 trillion gigabytes of data in the world, most of which was created in the latter half of the 2010s. This figure continues to grow. To convert this raw, chaotic data into valuable intelligence, we use data mining tools and analytical techniques. Digital Shadows (now ReliaQuest) routinely uses several of these tools and techniques to help its clients identify trends within the cyber threat landscape. This has included our recent blogs on vulnerability intelligence and initial access brokers (IABs). In this blog, we’ll detail some of the common techniques used to support our research.
The guiding principle behind data analysis is the data, information and intelligence pipeline. Raw data comes in many forms, including text-based, numerical, date, Boolean and many more. To be useful, it must be converted to information. This is achieved by cleaning the data and then running statistical tests and/or analytical algorithms on it. The results of the analysis are then interpreted to present an overall view of the intelligence picture. Ultimately, this process exists to help anyone who needs to make security decisions make more informed ones.
This blog is part of a two-blog series in which we’ll dive into how we use data analysis in threat intelligence here at Digital Shadows (now ReliaQuest). Today we’ll focus on the initial steps we take before working on our data and on some use cases related to data visualization and data analysis. These models will help us cover the basics of data analysis before we delve into the more advanced techniques.
Data can come from many sources. The first step in data mining is finding where the data is located and determining whether it is appropriate and sufficient for the intended analysis. This will involve studying the data schema and structure (or lack thereof) and having an unambiguous knowledge of how the data was collected. Knowing this is fundamental for determining what conclusions the data can theoretically support. The following factors must also be considered: scope for sample bias, collection gaps and sample size.
Depending on the analysis being performed, the presence of sample bias doesn’t necessarily make a dataset unsuitable if the conclusions are heavily caveated. An incomplete conclusion drawn from a slightly biased sample can still offer some useful insights. Data from multiple sources can be combined to perform an integrated analysis. When performing such a task, it is vital to consider how the datasets are related.
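As a minimal sketch of combining two sources, assuming pandas and two small hypothetical tables that share a `cve_id` key (the table names, column names and values are illustrative, not taken from our datasets), an outer join with relationship checks might look like this:

```python
import pandas as pd

# Hypothetical source 1: one row per vulnerability with its score.
scores = pd.DataFrame({
    "cve_id": ["CVE-A", "CVE-B", "CVE-C"],
    "cvss": [9.8, 5.3, 7.5],
})

# Hypothetical source 2: one row per vulnerability with a mention count.
sightings = pd.DataFrame({
    "cve_id": ["CVE-A", "CVE-C", "CVE-D"],
    "mentions": [12, 3, 7],
})

merged = scores.merge(
    sightings,
    on="cve_id",
    how="outer",            # keep rows that appear in only one source
    validate="one_to_one",  # raise if the key is duplicated on either side
    indicator=True,         # adds a "_merge" column describing each row's origin
)

# Count how many rows matched in both sources versus only one of them.
match_counts = merged["_merge"].value_counts().to_dict()
print(merged)
print(match_counts)
```

The `validate` argument makes the assumed relationship between the datasets explicit, and the `_merge` column shows where collection gaps exist in one source but not the other.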
After the dataset has been extracted and is known to be appropriate for the task at hand, it must be cleaned. The data cleaning actions required will depend on the type of data in each column.
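For illustration only, here is a sketch of a few typical cleaning steps in pandas, assuming a small hypothetical table of access listings with inconsistent text, unparseable numbers and dates, and a duplicate row. The column names, values and the choice of steps are assumptions made for this example.

```python
import pandas as pd

# Hypothetical raw data with common quality problems.
raw = pd.DataFrame({
    "access_type": [" RDP", "rdp", "WebShell", "VPN ", "WebShell"],
    "price_usd": ["500", "500", "n/a", "1200", "3000"],
    "listed_on": ["2021-01-05", "2021-01-05", "2021-02-10", "not a date", "2021-03-01"],
})

clean = raw.copy()

# Text columns: trim whitespace and normalize case so categories match.
clean["access_type"] = clean["access_type"].str.strip().str.lower()

# Numeric columns: convert to numbers; unparseable values become NaN.
clean["price_usd"] = pd.to_numeric(clean["price_usd"], errors="coerce")

# Date columns: parse to datetimes; unparseable values become NaT.
clean["listed_on"] = pd.to_datetime(
    clean["listed_on"], format="%Y-%m-%d", errors="coerce"
)

# Remove exact duplicates, then rows missing the value under analysis.
clean = clean.drop_duplicates()
clean = clean.dropna(subset=["price_usd"]).reset_index(drop=True)

print(clean)
```

Whether to drop or impute missing values is a judgment call that depends on the analysis; dropping is used here only to keep the sketch short.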
After finding, extracting, joining and cleaning the data, the analysis can commence.
When sampling a continuous variable, most values will typically be found near the average. As one moves away from the average in either direction, the frequency of values decreases. Extreme values are rare. This is because many continuous variables are the sum of multiple independent factors, and there are more possible combinations that produce middle values than extreme ones.
To model these values, we often use the “normal distribution”. A normal distribution is a probability distribution used to model phenomena that have a default behavior and cumulative possible deviations from that behavior. Wherever one sees a mean value (i.e. the arithmetic average of a set of values), there is often a bell curve behind it. This principle underpins many statistical analysis techniques.
A normal distribution has a mean equal to its median and mode. About 68% of the data will fall within one standard deviation of the mean, 95% within two standard deviations and 99.7% within three standard deviations. Real-world data often deviates from this ideal shape: it can show skew, in which the peak of the curve is biased in a particular direction, as well as kurtosis, in which the tails are heavier or lighter than those of a true normal curve.
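The standard deviation percentages can be checked directly. The sketch below uses Python’s standard-library `statistics.NormalDist`; the helper name `share_within` is our own and purely illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Probability that a value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
```

Because the normal distribution is symmetric, the same result holds for any mean and standard deviation, not only the standard case used here.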
Normal distributions are very common, and visualizing them can help us determine the overall spread of the data. For example, the figure above shows the distribution of CVSS scores. While it may not be a very smooth bell curve, it does show that most vulnerabilities have a score between 5 and 7 and that extreme CVSS scores are rare.
The graphs used to visualize and show patterns in data come in many forms. The best graphs are simple and intuitive, while simultaneously conveying as much useful information as possible. They must be able to convey a message without requiring more than a couple of lines of explanatory text below.
The most appropriate graph type will depend on the type of analysis being performed and the conclusions one wishes to convey. Bar graphs are mainly used to compare categories or to show a trend over time. Line graphs are normally used to compare continuous variables, but can also be used to depict several trends over time without overcrowding the page. In addition to these commonly used graphs, there are highly specialized plots.
One example that has been used in a Digital Shadows (now ReliaQuest) blog on Initial Access Brokers is the boxplot. This plot is used to visualize and compare several distributions side by side. The figure below depicts a boxplot used to compare the price distributions of several types of initial access.
The box indicates the interquartile range, where half of the data points lie. The whiskers on the ends of the box indicate the extreme ends of the distribution, where each whisker accounts for a quarter of the data. Outliers are usually depicted separately, but have been omitted from this particular analysis.
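As a rough sketch of the numbers behind a single box, the five summary values can be computed with Python’s standard library. The `box_stats` helper and the price list are hypothetical, and the whiskers are assumed to extend to the minimum and maximum because outliers are not treated separately here.

```python
from statistics import quantiles


def box_stats(values):
    """Return the summary values a boxplot draws for one group."""
    ordered = sorted(values)
    # Quartiles split the ordered data into four equal parts.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],    # end of the lower whisker
        "q1": q1,             # bottom edge of the box
        "median": median,     # line inside the box
        "q3": q3,             # top edge of the box
        "max": ordered[-1],   # end of the upper whisker
        "iqr": q3 - q1,       # height of the box (middle half of the data)
    }


# Hypothetical listing prices for one access type, for illustration only.
prices = [100, 200, 300, 400, 500, 600, 700, 800, 900]
stats = box_stats(prices)
print(stats)
```

Plotting libraries usually cap whiskers at 1.5 times the interquartile range and draw anything beyond as outliers, so their output can differ from this simplified version.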
From this graph we can see that WebShell is the most valuable access type on average and shows the greatest spread. We can infer from this that WebShell allows a threat actor the highest level of access on average, but that the level of access also varies greatly. RDP shows the lowest average value despite offering effectively a “hands on keyboard” level of access to a machine. From this it can be inferred that most RDP offerings are low-privileged machines, and are of little value to threat actors.
Null-hypothesis-based statistical testing is a model used to determine whether the relationship seen between samples is reflected in the population or is down to sampling error. The starting assumption, called the null hypothesis, is that there is no real relationship between the samples and that any relationship observed is due to chance alone. The type of test used will depend on the type of relationship being tested for, the type of data within each sample and whether population statistics are known. The table below summarizes some of the most commonly used tests.
The output figures of these statistical tests are converted into a ‘p’ value. A p value is the probability, expressed as a decimal, of observing results at least as extreme if the null hypothesis is correct. If this value is less than 0.05 (i.e. there is a less than 5% chance of us seeing the result if there is no significant difference between the populations) then we reject the null hypothesis and assume the results to be statistically significant.
An example of ANOVA in action can be seen in the IAB graph above, where it shows a significant difference between the types of access, as indicated by the presence of p<0.05 in the plot title. This tells us that the differences between access types are unlikely to be due to sampling error alone, so we can draw conclusions from the graph.
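To illustrate the mechanics, here is a minimal one-way ANOVA sketch using `scipy.stats.f_oneway`. The three price samples are invented for the example and are not the data behind the IAB graph; the 0.05 threshold follows the convention described above.

```python
from scipy.stats import f_oneway

# Hypothetical price samples for three access types (illustrative values only).
group_a = [100, 120, 110, 130, 115]
group_b = [300, 320, 310, 330, 305]
group_c = [105, 125, 112, 128, 118]

# One-way ANOVA tests the null hypothesis that all group means are equal.
f_stat, p_value = f_oneway(group_a, group_b, group_c)

alpha = 0.05  # conventional significance threshold
reject_null = p_value < alpha

print(f"F = {f_stat:.2f}, p = {p_value:.3g}")
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

A significant result only says that at least one group mean differs from the others; identifying which groups differ requires a follow-up (post hoc) comparison.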
This blog covered the early stages of the data analysis process, spanning data collection through to basic analytical techniques. Some data problems, however, require more advanced solutions. Such problems often require the use of machine learning methods, which will be covered in the second part of this blog series. If you’d like to access Digital Shadows (now ReliaQuest)’s constantly updated threat intelligence library, which provides insights on a wide range of cyber threats, sign up for a demo of SearchLight (now ReliaQuest’s GreyMatter Digital Risk Protection) here.