🤖 AI Summary
This study characterizes the behavioral traits and infrastructure abuse patterns of approximately 1.52 million malicious domains detected by VirusTotal between January and May 2026. Integrating multi-source data, including WHOIS records, passive DNS, and IP/ASN information, the analysis spans eight dimensions, among them domain lifecycle, registration concentration, bulk registration, and brand impersonation. The work shows at scale that attackers activate domains within weeks of registration, concentrate registrations among a few registrars and top-level domains, heavily exploit Cloudflare for domain fronting, and most frequently impersonate WhatsApp and Google. The authors publicly release an annotated dataset to support further research on malicious domains.
📝 Abstract
We present a longitudinal study of approximately 1.52 million malicious domains observed on VirusTotal (VT) between January and May 2026. Domains were selected if they were detected by at least five independent VT scanning engines and had a first-seen date within the study window. We group the dataset into compromised domains and attacker-created domains, which account for approximately 89.3% of the dataset. Combining WHOIS registration records and passive DNS (PDNS) data with the VT dataset, we characterise attacker behaviour across eight dimensions: temporal distribution, compromised vs. attacker-created classification, domain age at first detection, registrar and TLD preferences, DNS query volume as a damage proxy, hosting infrastructure concentration (IP and ASN level), bulk registration patterns, and brand impersonation. Key findings include: the majority of attacker-created domains are short-lived registrations used within weeks of creation; a small number of registrars and TLDs account for most abuse; Cloudflare infrastructure is heavily exploited for domain fronting; bulk registration events involving thousands of domains from a single registrar on a single day are widespread; and several global brands, particularly WhatsApp and Google, are heavily impersonated. We share the annotated dataset in the GitHub repo https://github.com/mufimash/malicious_domains for further research.
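The selection rule, the domain-age measure, and the bulk-registration notion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `DomainRecord` fields, the exact window bounds (1 January to 31 May 2026), and the `threshold` parameter are assumptions made for illustration, and the released dataset may use different column names and cut-offs.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DomainRecord:
    """Hypothetical record layout; field names are assumptions."""
    name: str
    registrar: str
    created: date      # WHOIS creation date
    first_seen: date   # first-seen date on VT
    detections: int    # number of VT engines flagging the domain


MIN_DETECTIONS = 5                 # "at least five" engines, per the abstract
WINDOW_START = date(2026, 1, 1)    # assumed start of the study window
WINDOW_END = date(2026, 5, 31)     # assumed end of the study window


def in_study(rec: DomainRecord) -> bool:
    """Apply the two selection criteria: detection count and first-seen date."""
    return (rec.detections >= MIN_DETECTIONS
            and WINDOW_START <= rec.first_seen <= WINDOW_END)


def age_at_detection_days(rec: DomainRecord) -> int:
    """Days between WHOIS creation and first detection."""
    return (rec.first_seen - rec.created).days


def bulk_events(records, threshold):
    """Count domains per (registrar, creation day); keep groups at or above threshold."""
    counts = Counter((r.registrar, r.created) for r in records)
    return {key: n for key, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    sample = [
        DomainRecord("a.example", "RegA", date(2026, 2, 1), date(2026, 2, 10), 7),
        DomainRecord("b.example", "RegA", date(2026, 2, 1), date(2026, 2, 12), 5),
        DomainRecord("c.example", "RegB", date(2020, 1, 1), date(2026, 3, 1), 4),
    ]
    selected = [r for r in sample if in_study(r)]
    for r in selected:
        print(r.name, age_at_detection_days(r))
    print(bulk_events(selected, threshold=2))
```

The abstract describes bulk events as involving thousands of domains, so a realistic `threshold` would be far larger than the value of 2 used in this toy sample.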