The old cliché "You’re not paranoid if they really are out to get you" turns out to apply quite nicely to the world of P2P file-sharing. A trio of intrepid researchers from the University of California-Riverside decided to see just how often a P2P user might be tracked by content owners. Their startling conclusion: "naive" users who take no precautions will exchange data with the "fake users" deployed to track them 100 percent of the time.
Anirban Banerjee, Michalis Faloutsos, and Laxmi Bhuyan collected more than 100GB of TCP header information from P2P networks back in early 2006 using a specially doctored client. The goal of the research was a simple one: to determine "how likely is it that a user will run into such a ‘fake user’ and thus run the risk of a lawsuit?" The results are outlined in a recent paper (PDF), "P2P: Is Big Brother Watching You?"
For years, P2P communities have suspected that affiliates of the RIAA, the MPAA, and others have been haunting P2P networks to look for those who might be swapping copyrighted files. It’s more than a hunch; it’s well documented that companies like SafeNet (formerly MediaSentry) engage in this sort of work, and that their testimony is routinely produced at trials. Such testimony helped to bring down Jammie Thomas, in fact.
But identifying these organizations is hard, since the nature of their business is to remain shadowy. P2P advocates have nonetheless spent years compiling "blocklists" of IP ranges that are suspected of belonging to such companies. Connect to a "user" who has an IP address in one of the blocklists and bam: you’ve just been tracked swapping a file.
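To make the blocklist idea concrete, here is a minimal Python sketch of the kind of check a blocklist-aware client might perform before connecting to a peer. It is an illustration only, not the researchers' tool or any real client's code: the entry labels, the `load_ranges` and `find_blocklisted` helpers, and the address ranges (taken from blocks reserved for documentation) are all hypothetical.

```python
import ipaddress

# Hypothetical blocklist entries: (label, first address, last address).
# These are documentation-only ranges, not real tracker addresses.
BLOCKLIST = [
    ("Example Tracker A", "192.0.2.0", "192.0.2.255"),
    ("Example Tracker B", "198.51.100.16", "198.51.100.31"),
]


def load_ranges(entries):
    """Convert dotted-quad range endpoints to integers for fast comparison."""
    ranges = []
    for label, first, last in entries:
        low = int(ipaddress.IPv4Address(first))
        high = int(ipaddress.IPv4Address(last))
        ranges.append((label, low, high))
    return ranges


def find_blocklisted(peer_ip, ranges):
    """Return the label of the first range containing peer_ip, else None."""
    value = int(ipaddress.IPv4Address(peer_ip))
    for label, low, high in ranges:
        if low <= value <= high:
            return label
    return None


RANGES = load_ranges(BLOCKLIST)

if __name__ == "__main__":
    for peer in ("192.0.2.77", "203.0.113.5"):
        hit = find_blocklisted(peer, RANGES)
        status = f"blocked ({hit})" if hit else "allowed"
        print(f"{peer}: {status}")
```

A client that refuses any connection for which `find_blocklisted` returns a label never exchanges data with a listed range, which is all a blocklist can promise; it says nothing about addresses that are not on the list.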
By parsing all of the TCP headers that they collected over the course of 90 days, the UC-Riverside researchers came to several conclusions:
- If you don’t use a blocklist, you will be tracked. Every one of the researchers’ test clients that did not use a blocklist soon connected to an IP address found within those lists. It turns out that 12 to 17 percent of all IP addresses on the network belonged to these blocklisted ranges.
- Trackers aren’t that hard to avoid. While "naive" clients may all connect to blocklisted users, it wasn’t that hard to stay away from the vast majority of such "fake users." Researchers found that "avoiding just the top 5 blocklisted IPs reduces the chance of being tracked to about 1 percent."
- Content owners hide their tracks. Much of this tracking work is farmed out from content owners to companies like SafeNet and BayTSP, which in turn take care to conceal their own identities. When the researchers ran reverse DNS lookups on the blocklisted ranges, they found that only 0.5 percent of those addresses resolved back to media companies in an obvious way.
- Meet the BOGONS. One of the strategies for remaining anonymous is to operate from BOGON IP ranges. These ranges are unallocated blocks of addresses that should ordinarily not be used on the public Internet. Of the top 15 blocklist entities that were discovered during testing, 12 were in BOGON ranges. The researchers note that "these sources deliberately wish to conceal their identities while serving files on P2P networks," and reverse DNS queries on these addresses produce little useful information.
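The first finding above, that every unprotected client eventually hit a blocklisted address, is what simple arithmetic would lead you to expect. The sketch below is a back-of-the-envelope model rather than the paper's method: it assumes each connection independently lands on a blocklisted address with a fixed probability equal to the reported 12 to 17 percent share, which real peer selection does not guarantee. The function name `chance_of_contact` is made up for this illustration.

```python
def chance_of_contact(blocklisted_fraction, connections):
    """Probability of at least one blocklisted peer in `connections` tries.

    Assumes every connection is an independent draw with the same
    probability of hitting a blocklisted address (a simplification).
    """
    if not 0.0 <= blocklisted_fraction <= 1.0:
        raise ValueError("blocklisted_fraction must be between 0 and 1")
    if connections < 0:
        raise ValueError("connections must not be negative")
    # Complement of "every single connection avoided the blocklist".
    return 1.0 - (1.0 - blocklisted_fraction) ** connections


if __name__ == "__main__":
    for fraction in (0.12, 0.17):
        for count in (1, 10, 50):
            chance = chance_of_contact(fraction, count)
            print(f"share={fraction:.2f} connections={count:2d} "
                  f"chance={chance:.1%}")
```

Under these assumptions, a 12 percent share gives roughly a 72 percent chance of contact after 10 connections and more than 99 percent after 50, so a client that stays on the network for any length of time is very likely to meet a blocklisted address.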
The takeaway here is simple: P2P users who don’t use the blocklists are just about guaranteed to be tracked by "fake users" operating out of those ranges, and that seems to open the door to possible litigation should the dice roll against them.
The study does have one major caveat, however: it does not attempt to determine whether the blocklists actually correspond to tracking organizations like SafeNet. The researchers note that "this would be interesting and challenging future work." While using a blocklist makes it easy to avoid connecting to IP addresses found on that list, it’s not clear that every range on the lists is really a tracker. Conversely, there’s no way to know whether addresses not on the list might in fact be tracking users.