A case study by Zynamics illustrates the trouble with assuming that each newly discovered piece of malware is new, unique or sophisticated. Using a similarity threshold of 50%, they found that the samples descended from only a few bots:
Files that exhibit a mutual similarity of more than 50% have been assigned to the same family. The next step was to have the files named by an anti-virus program (ClamAV). We replaced the MD5 sums with the names in the tree. The result was this graph.
The graph enables us to draw interesting conclusions:
- We could clearly assign several bots to a family even though ClamAV did not identify them.
- Many “distinct” bots show a strong similarity to other bots and should actually be assigned to one single family (e.g. Trojan.GoBot and Trojan.Downloader.Delf as well as Worm.Korgo.Y and Worm.Padobot.I). This seems to be due to problems in the naming process.
- Basically, all bots are representatives of two big families (GoBot, PadoBot) and three small families (Sasser, PoeBot, Crypt-8) as well as some “repairs”.
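
The family assignment described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Zynamics' actual method. The pairwise scores are assumed to come from some binary-comparison tool that is not shown here. The function name `group_families`, the sample names and the scores are all hypothetical. The sketch also assumes that "mutual similarity" is transitive, so two files land in the same family if a chain of pairs above the threshold links them.

```python
def group_families(similarity, threshold=0.5):
    """Group samples into families with a simple union-find.

    similarity: dict mapping (sample_a, sample_b) -> score in [0, 1].
    Assumption: the scores are precomputed by an external comparison tool.
    Two samples are linked when their score is strictly above the threshold.
    """
    samples = sorted({name for pair in similarity for name in pair})
    parent = {s: s for s in samples}

    def find(x):
        # Follow parent links to the family representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score > threshold:
            parent[find(a)] = find(b)  # merge the two families

    families = {}
    for s in samples:
        families.setdefault(find(s), []).append(s)
    return sorted(families.values())


# Hypothetical scores for six made-up samples (not data from the case study).
scores = {
    ("a", "b"): 0.8,
    ("b", "c"): 0.6,
    ("a", "c"): 0.3,
    ("d", "e"): 0.9,
    ("a", "d"): 0.1,
    ("c", "f"): 0.5,  # exactly 50% is not "more than 50%", so no link
}
families = group_families(scores)
print(families)
```

Replacing each sample name with the label an anti-virus scanner gives it would then show, as in the quote, where differently named samples end up in the same family.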