Big Data Analytics

Reveal Actionable Insights From Petabytes of Internet Data

Derived Data Sets

RiskIQ extracts and analyzes the petabytes of data collected from across the internet to create new data sets that aid in discovering, understanding, and mitigating risk. The results are searchable data sets such as Host Pairs, which shows the relationship between parent and child pages in a redirect sequence, and lists such as the RiskIQ Accomplice List, which showcases websites and URLs that are not malicious themselves but link or redirect to URLs that host malware, phishing, or scams. We also maintain our own Phish List, Scam List, and Zero-Day List, which include never-before-seen URLs and pages hosting phishing, scam, or malware content that we find while crawling the internet with our virtual users.
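
To make the Host Pairs idea concrete, here is a minimal sketch of how parent/child pairs could be derived from a single redirect sequence. It is an illustration only: the function name `host_pairs`, the example URLs, and the choice to record a pair only when a hop crosses hosts are assumptions, not a description of RiskIQ's actual data model.

```python
from urllib.parse import urlsplit


def host_pairs(redirect_chain):
    """Return (parent_host, child_host) tuples for hops that change host.

    `redirect_chain` is an ordered list of URLs observed during one crawl,
    from the first page requested to the final landing page.
    """
    hosts = [urlsplit(url).hostname for url in redirect_chain]
    pairs = []
    # Walk consecutive hops: each URL is the parent of the one after it.
    for parent, child in zip(hosts, hosts[1:]):
        if parent and child and parent != child:
            pairs.append((parent, child))
    return pairs


# Hypothetical redirect sequence using reserved example domains.
chain = [
    "http://short.example/abc",
    "https://tracker.example/r?id=1",
    "https://tracker.example/r2",
    "https://landing.example/offer",
]
pairs = host_pairs(chain)
```

Under these assumptions, `pairs` holds two entries: `short.example` to `tracker.example`, and `tracker.example` to `landing.example`. The same-host hop in the middle is skipped.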

White Paper: Using Internet Data Sets to Understand Digital Threats

Correlation

The RiskIQ platform uses correlation algorithms to continuously improve our detection capabilities and virtual user technology.

As the platform and virtual users crawl more websites every day, RiskIQ’s analytic capabilities become better tuned and more confident. This allows for accurate, automated detection and confirmation of phishing pages, imposters, and scams without the need for human intervention. When the platform is less confident in a correlation or decision, RiskIQ security analysts step in to review and confirm events and detections. This ensures that our customers are protected and that the platform evolves intelligently.
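
The split between automated confirmation and analyst review described above can be sketched as a simple confidence threshold. Everything here is hypothetical: the `Detection` fields, the `route` function, and the 0.90 cutoff are placeholders chosen for illustration, not values or names taken from the RiskIQ platform.

```python
from dataclasses import dataclass

# Hypothetical cutoff: detections at or above it are confirmed automatically.
AUTO_CONFIRM_THRESHOLD = 0.90


@dataclass
class Detection:
    url: str
    label: str          # e.g. "phishing", "scam"
    confidence: float   # model confidence in the range 0.0 to 1.0


def route(detection, threshold=AUTO_CONFIRM_THRESHOLD):
    """Decide whether a detection is confirmed automatically or queued for an analyst."""
    if detection.confidence >= threshold:
        return "auto_confirm"
    return "analyst_review"


confident = Detection("https://login.example/verify", "phishing", 0.97)
uncertain = Detection("https://shop.example/sale", "scam", 0.62)
decisions = [route(confident), route(uncertain)]
```

In this sketch the first detection is confirmed without review and the second is queued for an analyst. A real system would presumably feed analyst verdicts back into the models as well.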

Threat Research

RiskIQ’s senior security research team focuses on identifying and investigating new and emerging threats, such as Magecart. The team’s research feeds into the RiskIQ platform to improve detection and help protect our customers against these threats. Once we find and confirm a new threat, we alert our customers and create or modify our learning algorithms to automatically detect and classify pages containing it.

Outside of the RiskIQ platform, RiskIQ’s threat researchers are highly regarded in the information security community. The team publishes research and shares the information we have about threat actor groups with the broader community through news outlets, reports, and PassiveTotal® public projects.

Data Science

The internet is a large place, and making sense of it is a daunting challenge. At RiskIQ, we use this huge amount of internet data in unique and innovative ways. At the core of many of our products is the continuous improvement of how we interact with and use this data. Our data science team focuses on bringing new insight to internet data and finding ways to connect seemingly disparate data sets. By learning from the vast amount of data, we can fine-tune our correlation algorithms to detect and alert users to malicious content and infrastructure even before a site is fully weaponized.

Mobile App Analysis

RiskIQ virtual users take inventory of mobile app stores and download the applications they encounter on the web as if they were using a mobile device. Using these techniques, the platform can link apps between stores and across publishers and platforms. Once the applications are downloaded and inventoried, RiskIQ analyzes them to determine whether brand-infringing elements, spyware, or malware are hiding within the code.

Web Page Comparison and Hashing

The RiskIQ platform finds similarities between websites to locate brand and copyright infringement, phishing pages that look like official pages, and other malicious activity on the web. Using the similarity between pages, RiskIQ determines whether two or more websites share similar structure, content, or components. Coupled with our correlation models, this allows us to classify pages and generate events for customers.
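
One common way to compare page structure, shown here only as a sketch, is to hash overlapping runs of HTML tags (shingles) and measure how many hashes two pages share (Jaccard similarity). The source does not say which comparison or hashing method RiskIQ uses, so the technique, the names `structure_shingles` and `jaccard`, and the sample markup below are all assumptions.

```python
import hashlib
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the sequence of opening tags, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def structure_shingles(html, k=3):
    """Return a set of hashes, one per run of k consecutive opening tags."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    tags = collector.tags
    shingles = set()
    for i in range(max(len(tags) - k + 1, 0)):
        window = " ".join(tags[i:i + k])
        shingles.add(hashlib.sha1(window.encode("utf-8")).hexdigest())
    return shingles


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical pages: a login form, a look-alike with different text and
# attributes, and an unrelated article page.
official = "<html><body><form><input><input><button>Log in</button></form></body></html>"
lookalike = ("<html><body><form action='/steal'><input name='u'><input name='p'>"
             "<button>Sign in</button></form></body></html>")
article = "<html><body><h1>News</h1><p>One</p><p>Two</p></body></html>"

same_layout = jaccard(structure_shingles(official), structure_shingles(lookalike))
different_layout = jaccard(structure_shingles(official), structure_shingles(article))
```

Because only the tag sequence is hashed, the look-alike scores 1.0 against the official page despite its different text and attributes, while the article page scores 0.0. A production system would likely combine several such signals (content, scripts, images) instead of relying on structure alone.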

Web Components

When RiskIQ virtual users crawl a website, we extract information about the frameworks and components of the website itself. This can include the CMS that manages the content, the operating system of the web server hosting it, server software and application frameworks such as Apache, and other information. This information helps identify elements that might be compromised due to vulnerabilities.
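
As a rough illustration of component extraction, the sketch below reads two widely used HTTP response headers (`Server` and `X-Powered-By`) and the HTML `generator` meta tag that many CMSs emit. The function name `extract_components`, the output keys, and the sample response are hypothetical, and the source does not describe how RiskIQ's crawlers actually identify components.

```python
import re

# Matches <meta name="generator" content="..."> with name before content.
GENERATOR_RE = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def extract_components(headers, html):
    """Return a dict of components inferred from response headers and HTML.

    `headers` is assumed to be a plain dict with canonical header names;
    a real crawler would need case-insensitive lookup.
    """
    components = {}
    if headers.get("Server"):
        components["server"] = headers["Server"]
    if headers.get("X-Powered-By"):
        components["framework"] = headers["X-Powered-By"]
    match = GENERATOR_RE.search(html)
    if match:
        components["cms"] = match.group(1)
    return components


# Hypothetical response captured during a crawl.
sample_headers = {"Server": "Apache/2.4.41 (Ubuntu)", "X-Powered-By": "PHP/7.4.3"}
sample_html = '<html><head><meta name="generator" content="WordPress 5.8"></head></html>'
found = extract_components(sample_headers, sample_html)
```

Here `found` reports the server software, the application runtime, and the CMS with its version. Matching such version strings against known vulnerabilities would be the natural next step, though these values can be absent or deliberately misleading.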