Uncertainties in the Language of Uncertainty – and why we need to talk about it
February 25, 2016
If you know much about Digital Shadows SearchLight, you will know that one of our strengths in providing cyber situational awareness is the human in the loop. Having worked here for a while now, I cannot imagine providing assessments without human effort in the analysis. While the concept of “noise to signal” is brilliant, there are also important cognitive and external limitations to consider, on top of intelligence that is often ambiguous and conflicting.
The whole point of Language of Uncertainty (LoU), also known as Words of Estimative Probability (WEP), is to express a level of confidence in a probabilistic judgement and to provide parameters for each level of confidence. After all, even the term “possible” can be interpreted very differently by different consumers. In more serious cases, failing to define these terms could lead to intelligence failure.
The concept of LoU was first proposed by Sherman Kent, prompted by ambiguities highlighted in the U.S. National Intelligence Estimates (NIE). Digital Shadows uses LoU across all intelligence products, and the table below provides an example of how each level of confidence is ranked:
Although analysts make estimates all the time based on open source and deep web intelligence, it is very difficult to determine confidence in, for instance, the ownership of a LinkedIn profile. Anyone can sign up and there is rarely a way to determine the account holder. Consequently, the analyst has to assess the probability of it being legitimate or fake.
The struggle becomes really apparent when the analyst, who may have encountered thousands of such profiles, knows of reported cases where threat actors have made a significant effort to create spoof LinkedIn profiles of high-profile executives and build a credible-looking network. A consumer who is not mindful of those spoof profile cases may assign completely different probabilities, so their probabilistic judgement may be quite different from the analyst’s. Both the analyst’s and the consumer’s thinking here is quite obviously biased, because both rely on their historical knowledge.
The other obvious problem is that “realistic possibility”, which refers to 25-50% confidence, can be interpreted by some readers as “likely”, if not “almost certain”. In addition, looking at the table and the three examples below, the gap between “realistic possibility” and “likely” can be as little as 5% or as much as 45%, depending on where in each band the estimate falls. Naturally, the room for misinterpretation and error is quite substantial. LoU must be used consistently and transparently according to available intelligence, and efforts must be made to ensure consumers understand the thinking behind the assessments. Below is how the analyst would articulate the various probabilities:
1) Profile on LinkedIn for John Smith that is fake. The profile has limited connections and two other profiles with the same VIP name and job title were found.
2) Profile on LinkedIn for John Smith that is likely fake. The profile has limited connections and two other profiles with the same VIP name and job title were found.
3) Profile on LinkedIn for John Smith; there is a realistic possibility that it is fake. The profile has 500+ connections and recommendations provided by other employees.
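To make the band arithmetic above concrete, here is a minimal Python sketch. Only the 25-50% band for “realistic possibility” comes from this post; the 55-70% band for “likely” and the names `BANDS`, `gap_range` and `term_for` are assumptions for illustration, chosen simply because they reproduce the 5% and 45% figures mentioned earlier. It is not a description of how Digital Shadows implements LoU.

```python
# Probability bands as (lower, upper) percentages.
# Only "realistic possibility" is stated in the post; the "likely" band is an
# ASSUMPTION picked so that the 5% and 45% gaps can be reproduced.
BANDS = {
    "realistic possibility": (25, 50),  # from the post
    "likely": (55, 70),                 # hypothetical, for illustration only
}


def gap_range(lower_term, higher_term, bands=BANDS):
    """Return the (smallest, largest) gap in percentage points between two bands."""
    lower_min, lower_max = bands[lower_term]
    higher_min, higher_max = bands[higher_term]
    smallest = higher_min - lower_max  # closest edges of the two bands
    largest = higher_max - lower_min   # farthest edges of the two bands
    return smallest, largest


def term_for(probability, bands=BANDS):
    """Return the term whose band contains the probability, or None if none does."""
    for term, (lower, upper) in bands.items():
        if lower <= probability <= upper:
            return term
    return None


print(gap_range("realistic possibility", "likely"))  # (5, 45)
print(term_for(30))  # realistic possibility
print(term_for(52))  # None: falls between the two bands
```

The last call shows a side effect of non-contiguous bands: an estimate that falls between them has no term at all, which is one more place where an analyst and a consumer can end up reading the same assessment differently.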
So what’s the solution? This question has been widely discussed across the Intelligence Community for years and there is still no consensus. LoU may be imperfect, but it enables analysts to explicitly express what they know (the LinkedIn profile and its public contents), what they don’t know (who the account holder is) and what they think (LoU).
In my upcoming posts, I will talk about some of the daily challenges of a cyber intelligence analyst and introduce the analytic tools and techniques used here at Digital Shadows. Stay tuned!