Typosquatting, also called URL hijacking, a sting site, or a fake URL, is a form of cybersquatting, and possibly brandjacking, that relies on mistakes such as typographical errors made by internet users when entering a website address into a web browser. A user who accidentally enters an incorrect website address may be led to any URL, including an alternative website owned by a cybersquatter.
The typosquatter’s URL will usually be one of five kinds, all similar to the victim site address (e.g. example.com):
- A common misspelling, or foreign language spelling, of the intended site: exemple.com
- A misspelling based on typos: examlpe.com
- A differently phrased domain name: examples.com
- A different top-level domain: example.org
- An abuse of a country code top-level domain (ccTLD): example.cm by using .cm, example.co by using .co, or example.om by using .om. A person who accidentally leaves out a letter of .com could arrive at the fake URL’s website.
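As a rough illustration of the list above, the following Python sketch generates a few look-alike candidates for a domain. The function name `typo_variants` and the short list of alternative TLDs are assumptions made for this example, not part of any real tool. It covers only the mechanical kinds (swapped or dropped letters, a plural form, a different TLD, and a letter dropped from the TLD), not foreign-language spellings.

```python
from __future__ import annotations


def typo_variants(domain: str) -> set[str]:
    """Return a small set of look-alike candidates for a two-label domain.

    Hypothetical helper for illustration; real typosquats take many more forms.
    """
    label, _, tld = domain.lower().rpartition(".")
    variants: set[str] = set()

    # Typo: two adjacent letters swapped (example -> examlpe).
    for i in range(len(label) - 1):
        swapped = label[:i] + label[i + 1] + label[i] + label[i + 2:]
        variants.add(f"{swapped}.{tld}")

    # Typo: one letter left out of the label (example -> exmple).
    for i in range(len(label)):
        variants.add(f"{label[:i]}{label[i + 1:]}.{tld}")

    # Differently phrased name: a simple plural (example -> examples).
    variants.add(f"{label}s.{tld}")

    # Different top-level domain (the alternatives here are an assumed sample).
    for alt in ("org", "net"):
        variants.add(f"{label}.{alt}")

    # One letter dropped from the TLD (.com -> .cm, .co, .om).
    # A two-letter result is only a risk if that ccTLD actually exists.
    for i in range(len(tld)):
        short = tld[:i] + tld[i + 1:]
        if len(short) == 2:
            variants.add(f"{label}.{short}")

    variants.discard(domain.lower())  # never report the real domain itself
    return variants


print("examlpe.com" in typo_variants("example.com"))  # True
```

A brand owner could compare such a candidate list against registered domains, though whether a given candidate is actually registered or malicious has to be checked separately.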
Once on the typosquatter’s site, the user may also be tricked into thinking that they are in fact on the real site, through the use of copied or similar logos, website layouts, or content. Spam emails sometimes use typosquatting URLs to trick users into visiting malicious sites that look like a given bank’s site, for instance.
Host – A host is a unique computer with a Web server (for RiskIQ purposes) that serves the pages for one or more Web sites. Without diving too deep, a host is identified either by its full canonical name (www.domain.com) or by a “naked domain” (domain.com). These count as two unique hosts because they can be served by different web servers.
- Full canonical - http://www.domain.com/path/index.htm
- Naked domain - http://domain.com/path/index.htm
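To make the distinction concrete, here is a minimal Python sketch that extracts the host from each of the two URLs above using the standard library's `urllib.parse.urlsplit`. The helper name `host_of` is an assumption for this example.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased host part of a URL, or '' if there is none."""
    return (urlsplit(url).hostname or "").lower()


canonical = host_of("http://www.domain.com/path/index.htm")
naked = host_of("http://domain.com/path/index.htm")

# The path is identical, but the hosts are different strings,
# so they are treated as two separate hosts.
print(canonical, naked, canonical == naked)  # www.domain.com domain.com False
```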
Domain – A domain is a sequence of labels joined by the full stop (dot, period). Domains are read from right to left, starting with the TLD, followed by the unique domain label.
Sub-Domains – A parent domain can have multiple unique hosts beneath it. To distinguish these hosts, the fully qualified domain name (FQDN) is used. In the hierarchy they are written as subdomain.domain.TLD, read from right to left. These unique hosts are RiskIQ digital web assets.
TLD – Top-level domain: the extension that ends a domain, traditionally two or three letters long. If the TLD is different, then the domain is different and the host will be unique. Common TLDs are .com, .org, .gov, .mil, .edu, and .net. There are also country code TLDs (ccTLDs) such as .us, .mx, and .uk. A newer category of generic top-level domains (gTLDs) is adding thousands of new TLDs into the mix:
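The right-to-left reading can be sketched in Python as follows. Both function names are hypothetical and chosen for this example only. The check compares whole labels, so a name that merely ends with the same characters is not mistaken for a subdomain.

```python
from __future__ import annotations


def labels_right_to_left(fqdn: str) -> list[str]:
    """Split a domain name on dots and return its labels, TLD first."""
    return fqdn.rstrip(".").lower().split(".")[::-1]


def is_subdomain_of(host: str, parent: str) -> bool:
    """True if host sits strictly below parent in the hierarchy."""
    h = labels_right_to_left(host)
    p = labels_right_to_left(parent)
    return len(h) > len(p) and h[:len(p)] == p


print(labels_right_to_left("subdomain.domain.tld"))
# ['tld', 'domain', 'subdomain']
print(is_subdomain_of("subdomain.domain.tld", "domain.tld"))  # True
print(is_subdomain_of("notdomain.tld", "domain.tld"))         # False
```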
- Traditional TLD - http://www.domain.com/path/index.htm
- ccTLD - http://www.domain.co.uk/path/index.htm
- gTLD - http://www.domain.healthcare/path/index.htm
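The three categories above can be told apart with a rough heuristic, sketched here in Python. The function name `tld_category` and the hard-coded set of traditional TLDs are assumptions for this example. The two-letter rule only says a TLD has the shape of a country code, not that it is actually delegated, so an authoritative check would consult the IANA root zone database instead.

```python
from urllib.parse import urlsplit

# Assumed sample: the traditional TLDs named in the text.
TRADITIONAL_TLDS = {"com", "org", "gov", "mil", "edu", "net"}


def tld_category(url: str) -> str:
    """Classify the last label of a URL's host (heuristic, not authoritative)."""
    host = (urlsplit(url).hostname or "").rstrip(".").lower()
    tld = host.rpartition(".")[2]  # text after the final dot
    if not tld:
        return "unknown"
    if tld in TRADITIONAL_TLDS:
        return "traditional TLD"
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return "ccTLD"  # two-letter TLDs are reserved for country codes
    return "newer gTLD"


for url in (
    "http://www.domain.com/path/index.htm",
    "http://www.domain.co.uk/path/index.htm",
    "http://www.domain.healthcare/path/index.htm",
):
    print(tld_category(url))
# traditional TLD
# ccTLD
# newer gTLD
```

Note that in www.domain.co.uk the TLD is .uk alone; the .co part is a second-level label beneath it.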
Host Pair(s) – Two domains (a parent and a child) that shared a connection observed from a RiskIQ web crawl.
Parent – The parent in the host pair relationship is the domain that contains a source (redirect, iframe, image, or script) from another domain (Child).
Child – The child in the host pair relationship is the domain that is providing the source (redirect, iframe, image, or script) to the upstream domain (Parent).
A domain can have multiple children and multiple parents, and it can be both a child and a parent to other domains.
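One way to picture these relationships is as a directed graph. The Python sketch below stores parent-to-child and child-to-parent links side by side. The class `HostPairs`, its methods, and the sample domains are all hypothetical and invented for illustration; this is not RiskIQ's data model or API.

```python
from __future__ import annotations

from collections import defaultdict


class HostPairs:
    """Minimal in-memory store of (parent, child, cause) observations."""

    def __init__(self) -> None:
        self._children: dict[str, set[tuple[str, str]]] = defaultdict(set)
        self._parents: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def add(self, parent: str, child: str, cause: str) -> None:
        """Record that parent's page pulled in a source from child."""
        self._children[parent].add((child, cause))
        self._parents[child].add((parent, cause))

    def children_of(self, domain: str) -> set[tuple[str, str]]:
        return set(self._children.get(domain, ()))

    def parents_of(self, domain: str) -> set[tuple[str, str]]:
        return set(self._parents.get(domain, ()))


pairs = HostPairs()
# Invented sample observations using reserved example names.
pairs.add("shop.example", "cdn.example", "script")
pairs.add("shop.example", "ads.example", "iframe")
pairs.add("cdn.example", "fonts.example", "redirect")

# cdn.example is a child of shop.example and a parent of fonts.example.
print(pairs.parents_of("cdn.example"))   # {('shop.example', 'script')}
print(pairs.children_of("cdn.example"))  # {('fonts.example', 'redirect')}
```

Keeping both directions indexed makes it cheap to ask either "what does this domain load?" or "who loads from this domain?".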