GDPR is now less than 30 days away, and while businesses are scrambling to ensure they are compliant, another discussion is happening among analysts in the information security space: what’s going to happen to WHOIS? WHOIS is greatly celebrated for its ability to form connections and break open cyber threat investigations. It’s not completely clear whether it will go away entirely due to GDPR, but one thing’s for sure: it won’t remain what it is today.
For anyone who’s not been following the ICANN news or registrar changes, the concept of losing WHOIS may come as a surprise. Regulators have their sights on WHOIS because of changes to what GDPR considers personal or private information. WHOIS, commonly thought of as the phone book of the Internet, serves as a registry of personal information for those who’ve registered domains on the Internet. It is available to anyone for query and is considered a big leak of privacy.
To the casual observer, it makes sense to remove WHOIS from public view or, at the very least, to hide data deemed personal. These changes, however, make it difficult for cyber threat analysts to differentiate between legitimate, compromised, and malicious domains. Additionally, without point-of-contact information for a domain owner, it’s even more difficult to communicate when a website may be compromised or infringing on a company’s trademarks or brand.
Some of you may be thinking to yourself, “well, my domain is privacy protected, doesn’t that already hide contact details?”, and the answer is yes. Over the past couple of years, analysts have seen a rise in the use of privacy protection services which ultimately render the analytical content of the WHOIS record less useful, but this is not the norm for all the tens of thousands of domains being registered every day.
One proposal to minimize WHOIS disruption while still respecting privacy concerns would require individual email addresses to be hashed using the same cryptographic hash algorithm across databases. The idea is that the registrant email would be hashed uniformly, allowing analysts to pivot off it while still obscuring the personal email address itself.
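To make the proposal concrete, here is a minimal sketch of uniform email hashing. It assumes SHA-256 and a simple normalization step (trim whitespace, lowercase); the proposal itself does not name an algorithm, and the records, field names, and `hash_email` and `pivot` helpers are hypothetical.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return a hex SHA-256 digest of a normalized email address.

    Assumption: every database applies the same normalization
    (strip whitespace, lowercase) before hashing.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical registration records held by different registrars.
records = [
    {"domain": "example-one.test", "registrant_email": "Owner@Example.test"},
    {"domain": "example-two.test", "registrant_email": "owner@example.test "},
    {"domain": "unrelated.test", "registrant_email": "someone.else@example.test"},
]

# What a public directory would expose: the hash, not the address.
published = [
    {"domain": r["domain"], "registrant_email_hash": hash_email(r["registrant_email"])}
    for r in records
]


def pivot(published_records, email_hash):
    """Return every domain whose registrant email hash matches."""
    return sorted(
        r["domain"]
        for r in published_records
        if r["registrant_email_hash"] == email_hash
    )


# Pivot from one domain's hash to find others registered with the same email.
related = pivot(published, published[0]["registrant_email_hash"])
print(related)
```

Because the hash is uniform and unsalted, anyone who already knows or guesses an address can compute its hash and look it up, so a scheme like this obscures the email rather than making it anonymous.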
As an experiment, we implemented an extreme version of this concept (hashing all the fields) inside PassiveTotal, RiskIQ’s threat analysis platform, and demonstrated that connections could still be made, but that a lot of contextual data is lost. Furthermore, there is no consensus that providing this pivoting mechanism in a public WHOIS directory would be GDPR-compliant, as it may allow connections to be drawn that would identify a person not otherwise identifiable.
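The trade-off in hashing every field can be illustrated with a small sketch. This is not the PassiveTotal implementation; it assumes SHA-256, assumes the domain name itself stays readable, and uses hypothetical records and helper names.

```python
import hashlib
from collections import defaultdict


def hash_value(value: str) -> str:
    """Hex SHA-256 digest of a trimmed, lowercased field value."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


def hash_record(record: dict) -> dict:
    """Hash every field except the domain (assumed to remain public)."""
    return {
        field: (value if field == "domain" else hash_value(value))
        for field, value in record.items()
    }


def shared_values(hashed_records):
    """Map (field, hashed value) to the domains sharing it (two or more)."""
    index = defaultdict(set)
    for record in hashed_records:
        for field, value in record.items():
            if field != "domain":
                index[(field, value)].add(record["domain"])
    return {key: sorted(domains) for key, domains in index.items() if len(domains) > 1}


# Hypothetical WHOIS records: two registrants reuse one phone number.
raw_records = [
    {"domain": "alpha.test", "name": "Alice Example",
     "email": "alice@example.test", "phone": "+1.5555550100"},
    {"domain": "beta.test", "name": "Bob Example",
     "email": "bob@example.test", "phone": "+1.5555550100"},
    {"domain": "gamma.test", "name": "Carol Example",
     "email": "carol@example.test", "phone": "+1.5555550199"},
]

hashed_records = [hash_record(r) for r in raw_records]
links = shared_values(hashed_records)

for (field, value), domains in links.items():
    # The analyst sees that the domains are linked, but only an opaque digest.
    print(field, value[:12], domains)
```

The link between the first two domains survives, yet the analyst can no longer read the phone number, its country code, or the registrant names, which is the kind of context that is lost.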
Assuming WHOIS is just going away completely unless and until an accredited access model is implemented, not all hope is lost. Fortunately, RiskIQ and many others within the space have recognized the value in having multiple data sets to aid in threat investigations. RiskIQ has made it a core part of its business to collect as much Internet data as necessary for threat intelligence and incident response and currently has 11 data sets beyond WHOIS including passive DNS, SSL certificates, subdomains, OSINT, host pairs, trackers and more. While these data sets aren’t a complete substitute for WHOIS, they will often surface more information or connections that would have otherwise gone unnoticed.
At RiskIQ, we believe that doing any work—good or bad—on the Internet will result in “signals,” pieces of information generated from performing any action, which can then be used to form analyst connections. Using a process we define as “Infrastructure Analysis,” it’s possible for anyone to use a starting indicator like an IP address and easily pivot around to discover related entities.
In the above image, we define the starting point as a piece of malware. Within that malware, maybe we identify an IP address and an SSL certificate used to encrypt command and control traffic. Maybe that SSL certificate includes a domain for which it was issued and an IP address where it was hosted. And finally, maybe that IP address has a different domain connected to it through passive DNS, or the domain has a unique tracking script within the web page it’s hosting.
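A chain like this can be modeled as a walk over a graph of observations. The sketch below is one possible way to express it, using a breadth-first search; the indicators, data set labels, and function names are hypothetical and are not taken from any RiskIQ API.

```python
from collections import defaultdict, deque

# Hypothetical observations: (indicator, data set that links them, indicator).
observations = [
    ("malware:sample-1", "static analysis", "ip:192.0.2.10"),
    ("malware:sample-1", "static analysis", "cert:aa11"),
    ("cert:aa11", "SSL certificates", "domain:c2.example.test"),
    ("cert:aa11", "SSL certificates", "ip:192.0.2.20"),
    ("ip:192.0.2.20", "passive DNS", "domain:other.example.test"),
    ("domain:c2.example.test", "trackers", "tracker:UA-0000-1"),
]


def build_graph(rows):
    """Build an undirected adjacency list so pivots work in both directions."""
    graph = defaultdict(list)
    for left, source, right in rows:
        graph[left].append((source, right))
        graph[right].append((source, left))
    return graph


def pivot_chain(graph, start, max_hops=3):
    """Breadth-first search from a starting indicator.

    Returns {indicator: path}, where path is the list of
    (data set, indicator) steps taken from the start.
    """
    paths = {start: []}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if len(paths[current]) >= max_hops:
            continue  # do not expand beyond the hop limit
        for source, neighbor in graph[current]:
            if neighbor not in paths:
                paths[neighbor] = paths[current] + [(source, neighbor)]
                queue.append(neighbor)
    return paths


graph = build_graph(observations)
paths = pivot_chain(graph, "malware:sample-1")

for indicator, path in paths.items():
    steps = " -> ".join(f"[{source}] {node}" for source, node in path)
    print(indicator, ":", steps or "(start)")
```

Recording the data set on each edge keeps the supporting evidence for every hop, and the hop limit keeps the search from wandering onto shared hosting or other noisy infrastructure.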
Security teams who subscribe to using more data sets in their investigations know the value of forming chains like the one described above. More data ultimately results in more connections or more supporting evidence for an analyst hypothesis. If WHOIS continues to go “dark” temporarily (and we hope it doesn’t go dark completely), we are, relatively speaking, still in a great position to enable defenders to protect their organizations and accelerate their investigations.
PassiveTotal's ever-expanding data provides new context to adversaries’ infrastructure and now includes more in-depth monitoring capabilities. Security teams can be alerted in real time to changes in DNS and domain resolution, WHOIS registration, and the appearance of other new keywords of interest. The latest release also includes a project workflow to quickly organize and group related threat infrastructure components found during investigations. This allows analysts and research teams to be more efficient and agile in their investigations. To try it for yourself, sign up for RiskIQ Community today.