In the current era of cyber attacks and cyber warfare, it is vital to continuously monitor and scrutinize ongoing threats and day-to-day malicious activity. An essential component of threat intelligence analysis at any level is the ability to overcome bias and analyze information objectively. Converting data into information that can be acted upon often requires analysis. In some cases the analysis is simple, for instance parsing a feed into a firewall deny-and-alert ruleset. In other cases it involves extracting the pertinent information from a larger work, such as a report, and understanding which components apply to the organization’s assets.
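The simple case mentioned above, turning a feed into a firewall deny-and-alert ruleset, can be sketched as follows. This is a minimal illustration only: the feed layout (one IP address or CIDR block per line, with `#` comments) and the `deny log src` rule syntax are assumptions made for the example and do not correspond to any particular feed or firewall product.

```python
import ipaddress

def feed_to_rules(feed_lines):
    """Turn raw feed lines into generic deny-and-alert rules.

    Comments, blank lines, duplicates, and entries that are not valid
    IP addresses or CIDR blocks are skipped. The rule syntax is
    illustrative, not that of a real firewall.
    """
    seen = set()
    rules = []
    for line in feed_lines:
        entry = line.split("#", 1)[0].strip()  # drop trailing comments
        if not entry:
            continue
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # not an IP address or CIDR block
        if network in seen:
            continue  # duplicate indicator
        seen.add(network)
        rules.append(f"deny log src {network}")
    return rules

# Hypothetical feed content using reserved documentation address ranges.
sample_feed = [
    "# hypothetical feed",
    "203.0.113.7",
    "198.51.100.0/24  # documentation range",
    "not-an-ip",
    "203.0.113.7",
]
rules = feed_to_rules(sample_feed)
for rule in rules:
    print(rule)
```

Even in this simple case the analysis step adds value: malformed and duplicate entries are discarded before anything reaches the ruleset.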
As a generic knowledge-based activity, analysis can be broken down into a taxonomy of sub-types that encompasses categorization, parsing, evaluation, monitoring, and prediction. During the analysis stage, raw data is converted into information in the form of trends, patterns, clusters, sequences, and so on. This is accomplished through a sequence of primitive inferences such as selection, cataloging, abstraction, specification, decomposition, matching, comparison, instantiation, interconnection, and transformation. If the information produced during the analysis phase provides the understanding needed to mitigate a harmful event, it can be termed intelligence.
Cyber threat intelligence is the output of analysis based on the identification, collection, and enrichment of relevant data and information. Usually, the analysis consists of findings, facts, and forecasts that describe the element of study and allow the estimation and anticipation of events and outcomes. The analysis should be objective, timely, and, most importantly, accurate. A key role of the analyst is to look for opportunities to produce new kinds of intelligence by synthesizing existing intelligence. For instance, an analyst might spend time reading through white papers to extract indicators of compromise (IOCs) that can be passed to network defenders. After reading such papers, the analyst might identify trends that can be drawn together into a strategic intelligence product for upper management.
Interplay between collection and analysis often occurs when analysts realize that the collection is not providing the required raw material, or that different information needs to be collected for proper analysis. Analysts typically apply four basic types of reasoning to produce accurate intelligence: deduction, induction, abduction, and the scientific method. The analyst should remain aware of the various analytical pitfalls, as bias and misperception can distort the analysis. When these pitfalls are avoided, the result is value-added, actionable information tailored to a specific need.
Analysis is performed by a combination of human analysts and machines. Machines usually perform simpler, high-volume tasks that reduce a huge quantity of input data to a more manageable subset. Human analysts then apply critical judgment to this filtered data to ensure that the final intelligence product contains as few false positives as possible.
Depending on the intelligence requirements, analytical strategies can be either hypothesis-driven or data-driven. In data-driven analysis, machines play the leading, number-crunching role. In hypothesis-driven analysis, human analysts apply their curiosity, intuition, and imagination. Hypothesis-driven analysis is considered the more effective and powerful approach, as it combines human creativity with systematic machine processing.
Machine-based analytical methods
Using machines to carry out or support threat intelligence analysis is considered a mature discipline. Machine-based analytical techniques are categorized by the type of threat being analysed, as follows.
Known knowns are threats that have been encountered before and are identified by looking for the same attributes. Depending on the analyst’s knowledge and technical skill, such a threat can be expressed as a production rule or another form of machine-executable procedure.
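A production rule of the kind described above can be sketched as a set of attribute conditions paired with an action. The class name `ProductionRule`, the event fields, and the two sample rules are hypothetical and chosen only for illustration; real detection rule formats differ.

```python
from dataclasses import dataclass

@dataclass
class ProductionRule:
    """IF every condition holds for an event THEN take the action."""
    name: str
    conditions: dict  # attribute name -> required value
    action: str

    def matches(self, event):
        # A rule fires only when all of its conditions are satisfied.
        return all(event.get(key) == value
                   for key, value in self.conditions.items())

def fire_rules(rules, event):
    """Return (rule name, action) for every rule the event satisfies."""
    return [(rule.name, rule.action) for rule in rules if rule.matches(event)]

# Hypothetical rules encoding attributes of previously seen threats.
RULES = [
    ProductionRule(
        "office-spawns-powershell",
        {"parent": "winword.exe", "process": "powershell.exe"},
        "alert",
    ),
    ProductionRule(
        "known-bad-domain",
        {"dns_query": "evil-updates.example"},
        "deny-and-alert",
    ),
]

# Hypothetical observed event.
event = {"host": "ws-042", "parent": "winword.exe", "process": "powershell.exe"}
fired = fire_rules(RULES, event)
print(fired)
```

Because the rule only checks for attributes already seen, it will miss the same threat once those attributes change.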
Unknown knowns are threats that are known about but have never been observed, or that were observed previously but can no longer be found because their attributes have changed. They can be discovered using matching techniques such as:
• Hard matching – A threat is discovered by matching against a repeated identifier.
• Fuzzy matching – Repeated identifiers are resolved through a looser form of matching, which returns a list of results ranked by likely relevance even when the spellings or words do not match exactly.
• Geo-matching – Geolocation data is used to identify clusters of significant activity, or hotspots.
• Social network analysis – Networks of new, unknown threats are identified through their associations with other, known threats.
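The four matching techniques listed above can be illustrated with a small sketch using only the Python standard library. Every identifier, coordinate, link, and threshold here is an assumption made for the example. Fuzzy matching is approximated with `difflib.SequenceMatcher`, geo-matching with a simple latitude/longitude grid count, and social network analysis with a one-hop association lookup; production systems would use more robust methods.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical known-bad identifiers (reserved ".example" names only).
KNOWN_BAD = {"evil-updates.example", "login-secure.example"}

def hard_match(identifier, known=KNOWN_BAD):
    """Hard matching: the identifier must repeat exactly."""
    return identifier in known

def fuzzy_match(identifier, known=KNOWN_BAD, threshold=0.8):
    """Fuzzy matching: (score, name) pairs at or above the threshold,
    ranked from most to least similar."""
    scored = ((SequenceMatcher(None, identifier, k).ratio(), k) for k in known)
    return sorted((pair for pair in scored if pair[0] >= threshold),
                  reverse=True)

def hotspots(points, cell_deg=1.0, min_events=3):
    """Geo-matching: grid cells containing at least min_events events."""
    cells = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in points
    )
    return {cell for cell, count in cells.items() if count >= min_events}

def associates_of_known(links, known):
    """Social network analysis (one hop): unknown nodes directly
    linked to a known threat."""
    found = set()
    for a, b in links:
        if a in known and b not in known:
            found.add(b)
        elif b in known and a not in known:
            found.add(a)
    return found

candidate = "evil-update.example"  # one character short of a known name
exact = hard_match(candidate)
near = [name for _, name in fuzzy_match(candidate)]
busy_cells = hotspots([(51.5, -0.1), (51.6, -0.2), (51.4, -0.3), (40.7, -74.0)])
linked = associates_of_known(
    [("actor-a", "host-1"), ("host-1", "host-2"), ("actor-a", "host-3")],
    known={"actor-a"},
)
print(exact, near, busy_cells, linked)
```

In this sample the altered domain escapes the hard match but is recovered by the fuzzy match, which is the situation the techniques above are meant to handle.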
Unknown unknowns are threats that have not been encountered before. They can be identified using two types of techniques:
• Supervised learning – The machine induces threat attributes (in the form of rules) from historical examples where the outcome (threat or no threat) is known.
• Unsupervised learning – The machine applies an automated learning technique to a large quantity of data in order to identify threat attributes for itself.
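The difference between the two approaches can be shown with a deliberately small sketch. The supervised part induces a one-feature threshold rule (a decision stump) from labelled history, and the unsupervised part flags outliers in unlabelled data using the median absolute deviation. Both techniques, the feature (failed logins per hour), and all of the numbers are assumptions chosen for illustration; the text above does not prescribe specific algorithms.

```python
from statistics import median

def induce_threshold_rule(examples):
    """Supervised: examples are (value, is_threat) pairs.

    Returns the threshold t for the rule "value >= t means threat"
    that makes the fewest mistakes on the labelled examples.
    """
    best_t, best_errors = None, len(examples) + 1
    for t in sorted({value for value, _ in examples}):
        errors = sum((value >= t) != label for value, label in examples)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

def flag_outliers(values, cutoff=3.5):
    """Unsupervised: values whose modified z-score, based on the
    median absolute deviation (MAD), exceeds the cutoff."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread, so nothing can be called unusual
    return [v for v in values if 0.6745 * abs(v - med) / mad > cutoff]

# Hypothetical history: failed logins per hour with a known outcome.
labelled = [(2, False), (3, False), (5, False), (40, True),
            (55, True), (4, False), (60, True)]
threshold = induce_threshold_rule(labelled)

# Hypothetical unlabelled observations of the same feature.
unlabelled = [2, 3, 5, 4, 3, 2, 4, 58]
unusual = flag_outliers(unlabelled)
print(threshold, unusual)
```

The supervised rule depends on someone having labelled past events, whereas the outlier check needs no labels but can only say that a value is unusual, not that it is malicious. A human analyst still has to judge the flagged items.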