Richard will be presenting ‘Asset Discovery: Making Sense of the Ocean of OSINT’ at 13.50 on 9th August 2019 in Recon Village.

 

When performing OSINT reconnaissance against a target, it is often very difficult to accurately define the scope. There are so many sources of information and so many diverse types of data that it quickly becomes overwhelming. While there are many excellent OSINT tools already available to the discerning OSINTer, their focus is usually on breadth of collection. We subscribe to the UNIX philosophy of “do one thing and do it well”. Our experience is that asset traceability and narrowly focused discovery produce the best results. To that end, we’ve developed a tool, the “Offensive Orca”, which we will be releasing at the Recon Village workshop at DEF CON.

 

The end goal of our targeting is to discover vulnerable or misconfigured systems. By vulnerable, we mean that a network service running on a machine is vulnerable to a (public) exploit. By misconfigured, we mean that a network service is revealing sensitive information. Setting the goal of our OSINT research explicitly upfront helps us to discard unnecessary data sources and ensures that what we collect is useful.

 

Scoping

To achieve our goal, we need to find the systems belonging to our target. Scoping is critical here, because the consequences of misidentifying a system are severe. To have confidence in our targeting, we must be able to trace a discovered asset back to the initial piece of asset data. No engagement starts in a vacuum: any reconnaissance engagement starts with an initial pool of asset data. Small pieces of information, such as company name, domain names and IP ranges, are typically provided to drive an engagement. Depending on the engagement, you may receive more or fewer of these initial pieces of asset data, and they will be more or less reliable!

For each piece of asset data, a lookup needs to be performed, e.g., from a company name to a set of domains and IP ranges/addresses, from a domain to a list of subdomains, and so on. When we look up a piece of asset data, every result generated is stored in the Orca’s database along with the ID of the piece of data that was used to seed that lookup.
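
The idea of storing each result together with the ID of its seed can be sketched as follows. This is a minimal illustration and not the Orca's actual schema: the table name, the column names and the helpers `open_store`, `add_asset` and `trace` are all assumptions made for the example.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a (hypothetical) asset store; every row remembers the row that seeded it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS asset (
               id      INTEGER PRIMARY KEY,
               kind    TEXT NOT NULL,   -- e.g. company, domain, hostname, ip_range
               value   TEXT NOT NULL,
               source  TEXT NOT NULL,   -- which lookup produced this row
               seed_id INTEGER REFERENCES asset(id)  -- NULL for initial asset data
           )"""
    )
    return conn


def add_asset(conn, kind, value, source, seed_id=None):
    """Store one lookup result and return its ID for use as a later seed."""
    cur = conn.execute(
        "INSERT INTO asset (kind, value, source, seed_id) VALUES (?, ?, ?, ?)",
        (kind, value, source, seed_id),
    )
    return cur.lastrowid


def trace(conn, asset_id):
    """Walk from a discovered asset back to the initial piece of asset data."""
    chain = []
    while asset_id is not None:
        row = conn.execute(
            "SELECT kind, value, seed_id FROM asset WHERE id = ?", (asset_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown asset id {asset_id}")
        kind, value, asset_id = row  # continue with the seed of this row
        chain.append((kind, value))
    return chain
```

With a store like this, any hostname in the final report can be explained by calling `trace` on its ID and reading the chain back to the company name or domain supplied at the start of the engagement.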

To discover domains from a company name, we can automatically Google the company name and check which domains are returned in the results. We can also use SHODAN to perform an organizational search to discover domains from a company name.

 

Manual Verification

We immediately hit our first problem. How do we know that the domains returned are associated in any way with our target? The tragic answer is that we don’t, without manual checking. That’s why the Orca prompts the operator after each domain so that the necessary checks can be made. It is a time-consuming and frustrating task, but doing that initial work up-front saves a world of pain later. There are a few tips’n’tricks for this, but it’s ultimately target-dependent and requires some context about the target, such as the sector and geography it operates in. The importance of this cannot be overstated. If the initial asset list is not appropriately curated, then it will be hard to have confidence in any future results.
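
A per-domain confirmation loop of this kind might look like the sketch below. It is not the Orca's code: the function name `curate` and the prompt wording are assumptions, and the `ask` parameter exists only so the loop can be driven by something other than the keyboard.

```python
def curate(candidates, ask=input):
    """Keep only the candidate domains that the operator explicitly confirms.

    Anything other than "y" or "yes" is treated as a rejection, so an
    unchecked domain never slips into the curated list by default.
    """
    accepted = []
    for domain in candidates:
        answer = ask(f"Does {domain} belong to the target? [y/N] ")
        if answer.strip().lower() in ("y", "yes"):
            accepted.append(domain)
    return accepted
```

Defaulting to rejection is a deliberate choice here: a missed asset can be found again later, whereas a wrongly accepted one undermines every result derived from it.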

 

Discovering Hostnames

Hostnames can be discovered by a process of subdomain enumeration. There are excellent data sources that record which subdomains/hostnames exist for a particular domain. Our two preferred sources are Rapid7’s Forward DNS data set (https://opendata.rapid7.com/sonar.fdns_v2/) and Certificate Transparency logs (https://crt.sh/).

[Figure: crt.sh certificate search]

 

Combining these two sources is very powerful. The Orca can do this for you. Certificate Transparency is particularly interesting as it is a real-time stream, so there are cases where you can catch a machine having a certificate issued before a full managed security configuration has been applied. The OWASP Amass tool (https://github.com/OWASP/Amass) is our current go-to tool for performing this process when we are not using the Orca. The advantage of this subdomain enumeration process is that if a hostname belongs to a domain, we can have high confidence that the two are related. This dramatically cuts down on false positives.
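
Combining the two sources can be sketched as a merge that normalises names, keeps only hostnames that genuinely sit under the validated domain, and remembers which source reported each one. This is an illustrative sketch, not how the Orca or Amass does it: the function names and the `"fdns"`/`"ct"` labels are assumptions, and the inputs are assumed to be plain lists of names already extracted from each data set.

```python
def normalise(name):
    """Lower-case a DNS name, drop a trailing dot and a leading wildcard label."""
    name = name.strip().lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    return name


def in_domain(hostname, domain):
    """True if hostname is the domain itself or a subdomain of it."""
    return hostname == domain or hostname.endswith("." + domain)


def merge_hostnames(domain, fdns_names, ct_names):
    """Return {hostname: set of sources} for names that belong to domain."""
    domain = normalise(domain)
    merged = {}
    for source, names in (("fdns", fdns_names), ("ct", ct_names)):
        for raw in names:
            host = normalise(raw)
            # The suffix check is what ties a hostname back to the curated
            # domain; look-alikes such as example.com.evil.net are dropped.
            if host and in_domain(host, domain):
                merged.setdefault(host, set()).add(source)
    return merged
```

Keeping the set of sources per hostname is optional, but it preserves traceability: a name seen only in Certificate Transparency may be newly issued and worth a closer look.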

IP ranges can be found via free-text searches of WHOIS data, especially the organization name or net name. This is also an error-prone process. As with the previous section, the operator must curate the results of this search. This WHOIS data can be collected manually from the responsible RIRs, provided a convincing use case can be made, or access can be purchased from one of several WHOIS data providers. A note of caution around cloud providers: unless the operator is extremely sure about the provenance of the IP range, it is recommended to exclude cloud provider ranges from your asset discovery process.
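
Excluding cloud provider ranges can be sketched with Python's standard `ipaddress` module. The `CLOUD_RANGES` list below is a placeholder built from reserved documentation address blocks, not real provider data, and `split_by_cloud` is a hypothetical helper; in practice the list would be loaded from the ranges that the relevant providers publish.

```python
import ipaddress

# Placeholder values only (documentation address blocks), standing in for
# the ranges published by whichever cloud providers are relevant.
CLOUD_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def split_by_cloud(candidate_ranges, cloud_ranges=CLOUD_RANGES):
    """Split candidate ranges into (keep, review).

    A candidate goes to "review" if it overlaps any cloud range, so the
    operator can decide whether its ownership is certain enough to keep.
    """
    keep, review = [], []
    for text in candidate_ranges:
        net = ipaddress.ip_network(text, strict=False)
        overlaps_cloud = any(
            net.version == cloud.version and net.overlaps(cloud)
            for cloud in cloud_ranges
        )
        (review if overlaps_cloud else keep).append(net)
    return keep, review
```

Overlapping ranges are set aside for review rather than silently deleted, which matches the operator-driven curation used for the WHOIS results.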

 

Discovering Vulnerabilities

Once a curated set of hostnames and IP addresses has been discovered, we need to work out which services are running on these hosts. If we are conducting a passive reconnaissance exercise, we need to use a third-party service such as SHODAN. If there is no such requirement, we can use an active scanning tool such as masscan or nmap. SHODAN now returns CVE information, which greatly assists the lookup process. Previously, an operator would have to take the CPE (Common Platform Enumeration) information, e.g., “cpe:/a:microsoft:internet_explorer:8.0.6001:beta”, and look it up in a CVE database such as those maintained by MITRE.
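
To make the manual lookup step concrete, the sketch below splits a CPE URI of the form shown above into its leading components (part, vendor, product, version, update) so they can be used as search terms. The `Cpe` type and `parse_cpe_uri` function are names invented for this example, and only the first five fields of the URI form are handled.

```python
from typing import NamedTuple
from urllib.parse import unquote


class Cpe(NamedTuple):
    part: str      # "a" application, "o" operating system, "h" hardware
    vendor: str
    product: str
    version: str
    update: str


def parse_cpe_uri(uri: str) -> Cpe:
    """Split a 'cpe:/...' URI into its first five colon-separated fields."""
    prefix = "cpe:/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a CPE URI: {uri!r}")
    fields = uri[len(prefix):].split(":")
    fields += [""] * (5 - len(fields))  # trailing fields may be omitted
    # Field values in the URI form may be percent-encoded.
    return Cpe(*(unquote(field) for field in fields[:5]))
```

For the example in the text, `parse_cpe_uri("cpe:/a:microsoft:internet_explorer:8.0.6001:beta")` yields an application (`part == "a"`) from vendor `microsoft`, product `internet_explorer`, version `8.0.6001`, update `beta`.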

When a list of CVEs has been created for our validated set of hosts, it is worth considering how we assess this list. Not all CVEs are remotely exploitable and, because we are performing OSINT against a remote target, local-only vulnerabilities and exploits are not directly useful. We typically use third-party services such as ExploitDB or the Metasploit exploit collection to see which of the vulnerabilities we have discovered have public exploits available. Most don’t. By restricting ourselves to remote services that are directly exploitable in practice, we can avoid the alert fatigue associated with a high number of theoretical vulnerabilities.
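
The filtering described above reduces to two questions per finding: is it remotely exploitable, and is there a public exploit? A minimal sketch follows, assuming both answers have already been looked up and recorded. The `Finding` type, its field names and the `actionable` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    host: str
    cve: str
    remote: bool          # exploitable over the network (assumed pre-computed)
    public_exploit: bool  # e.g. listed in ExploitDB or Metasploit (assumed pre-computed)


def actionable(findings):
    """Keep only remotely exploitable findings that have a public exploit."""
    return [f for f in findings if f.remote and f.public_exploit]


# Placeholder identifiers, not real CVEs.
findings = [
    Finding("vpn.example.com", "CVE-0000-0001", remote=True, public_exploit=True),
    Finding("vpn.example.com", "CVE-0000-0002", remote=True, public_exploit=False),
    Finding("mail.example.com", "CVE-0000-0003", remote=False, public_exploit=True),
]
shortlist = actionable(findings)
```

In this made-up example only the first finding survives, which mirrors the point that most discovered CVEs drop out once both conditions are applied.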

We now have a set of curated findings that we can deliver to a client, use ourselves for follow-up operations, feed into an automated system, and so on.

 

Comprehensive, Narrowly-Scoped

In summary, our OSINT approach is centered on comprehensive asset discovery coupled with narrow scoping to avoid false positives. By setting an explicit and clear goal upfront about the results we want, namely exploitable or misconfigured systems, we can avoid a lot of the noise generated by a typical OSINT discovery process. We use a standard reporting format, an Excel spreadsheet, to give consumers of our OSINT maximum flexibility and easy integration with their existing workflows when processing the results of our work.

 

To see how SearchLight can help you discover assets online, stop by our booth at Black Hat 2019, #1014. Or check out our product announcement from CEO, Alastair Paterson, for more details.

 
