How File Sharing Reveals Your Identity
Following the death of Napster, all of the file sharing networks that rose to mainstream popularity were decentralized. The most popular networks include Gnutella (which powers Limewire, BearShare, and Morpheus) and FastTrack (which powers KaZaA and Grokster). The decentralization provides legal protection for the companies that distribute the software, since they do not have to run any component of the network themselves: once you get the software, you become part of the network, and the network could survive even if the parent company disappears.
All of these networks operate as a web or mesh of neighboring node connections. Your node connects to a few other nodes in the network, and those nodes connect to a few other nodes, which in turn connect to a few other nodes, and so on. This layout is similar to a real-life social network: you know people, and those people know other people, who in turn know other people, and so on. A portion of one of these networks might look something like this:
When you search for files in the network, you send a search request to your neighbors, they send the request on to their neighbors, and so on. Eventually, your request reaches many nodes in the network. For example, you might send out a search for "metallica mp3". Lots of nodes receive your request, but only a few of them are actually sharing any Metallica music. Those that do have matches send their results back to your node. These results look something like this:
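As a rough illustration of both the flooding search and the shape of the results, here is a minimal Python sketch. It is not the code of any real client: the five-node mesh, the hop limit `ttl`, the function name `flood_search`, and the addresses (taken from the ranges reserved for documentation) are all assumptions made for the example.

```python
from collections import deque

def flood_search(neighbors, shared, origin, query, ttl=4):
    """Flood a query outward from origin; return (address, filename) hits."""
    terms = query.lower().split()
    seen = {origin}                      # nodes that already got the request
    queue = deque([(origin, ttl)])
    hits = []
    while queue:
        node, hops_left = queue.popleft()
        if node != origin:
            # A node answers only if one of its shared files matches every term.
            for name in shared.get(node, []):
                if all(term in name.lower() for term in terms):
                    hits.append((node, name))
        if hops_left == 0:
            continue                     # request has travelled far enough
        for peer in neighbors.get(node, []):
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return hits

# Hypothetical mesh: each key is a node's Internet address.
YOU = "198.51.100.7"
neighbors = {
    YOU: ["192.0.2.1", "192.0.2.2"],
    "192.0.2.1": [YOU, "203.0.113.5"],
    "192.0.2.2": [YOU, "203.0.113.5"],
    "203.0.113.5": ["192.0.2.1", "192.0.2.2", "203.0.113.9"],
    "203.0.113.9": ["203.0.113.5"],
}
shared = {
    "203.0.113.5": ["Metallica - One.mp3", "notes.txt"],
    "203.0.113.9": ["metallica_enter_sandman.mp3"],
}

results = flood_search(neighbors, shared, YOU, "metallica mp3")
for address, filename in results:
    print(address, filename)
```

Every hit carries the address of the node that answered, even though the searcher never connected to that node directly.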
Notice the "My Address" portion of these responses. The address listed is the Internet address of the computer that has the file. If you are unfamiliar with Internet addresses (commonly called IP addresses), they are like a "phone number" for a computer on the Internet. Computers use these addresses to make connections to each other over the Internet, and your node can use this address to make a connection to the node that is sharing the three Metallica files shown above. Also like phone numbers, Internet addresses can be traced to find out who owns them---we will cover this point in more detail in the next section. Suppose that the blue node is the node at 184.108.40.206 that returned the three Metallica results:
To download a file from the blue node, your node makes a direct connection to it using the address 184.108.40.206. After your node connects to the blue node, the blue node knows your computer's Internet address as well, say 18.104.22.168 (going back to the phone analogy, this is like the blue node using caller ID). The blue node sends the Metallica file to you over this connection, and then you close the connection. This connection, which is separate from the other neighbor connections in the network, would look like this:
Just by performing a search, you got the Internet address of someone who is sharing Metallica. When you downloaded a file from that person, they got your Internet address as well. Well, it is only an Internet address, right?
How the RIAA Finds People to Sue
Your Internet Service Provider, or ISP, provides you with your Internet address, much in the same way that your phone company provides you with your phone number. And, like a phone company, an ISP knows who is using each Internet address that it gives out. In general, your ISP will keep your identity private. So, though the person sharing Metallica might contact your ISP and ask, "Who is using 18.104.22.168?", your ISP will likely keep its lips sealed. That is, unless it is scared, and nothing scares an ISP more than the RIAA (except maybe the FBI and NSA, but so far, these organizations have yet to jump on the anti-file-sharing bandwagon).
Suppose that you are sharing a large collection of your favorite music, and assume that your collection contains more than 1000 songs. Also, suppose that most of the songs in this collection are "owned" by record labels that are represented by the RIAA. When someone searches for "mp3" in your file sharing network, your node returns a lot of results. Now suppose that one of the nodes in the network happens to be owned by the RIAA:
The RIAA performs a search in the network for songs that it cares about. Since RIAA record labels "own" the vast majority of music that is published throughout the world, we can simplify things by assuming that the RIAA cares about most songs. Thus, the RIAA performs a search for "mp3", and your node returns over 1000 results, which look something like this:
Though an average file sharing user would usually initiate a download in response to this gold mine of search results, the RIAA has all the information that it needs, so it stops right here. With the list of 1000+ infringing songs in hand, it files a subpoena against your ISP ("Who is using 18.104.22.168?") and demands that your ISP hand over your personal information. You can look at an example subpoena, courtesy of the Electronic Frontier Foundation's online subpoena database.
Once the RIAA has your personal information, it is ready to file a lawsuit against you for copyright infringement.
The Key Privacy Weakness
With standard file sharing networks, the key weakness is that Internet addresses for everyone who is sharing files are readily available. By sharing copyrighted files without permission, you may be breaking the law as it stands (though the legality of copyrighted file sharing is still up for debate). Returning to the phone analogy, using standard file sharing networks is much like making prank phone calls to someone who has caller ID---it is risky and stupid. With the wide use of caller ID, anyone who makes prank phone calls these days knows to dial "*67" before each call to hide his or her identity from the party being called (a feature commonly called "caller ID blocking").
Internet addresses are available in standard file sharing networks because they have to be: there is no way to make a direct connection to a node for a download without knowing that node's Internet address. Likewise, there is no way for a node to accept your download connection without also being able to determine your Internet address. Data transmission on the Internet simply works this way, and there is nothing like "caller ID blocking" built into the Internet. The only way to protect identities is to build something on top of the Internet to avoid direct connections between downloaders and uploaders, and thus avoid the necessity of sharing Internet addresses.
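The point that both ends of a direct connection learn each other's address can be seen with ordinary sockets. The following Python sketch runs an "uploader" and a "downloader" on the same machine over the loopback interface, so the only address involved is 127.0.0.1. The function name `upload_once` and the payload are made up for the example, and no real file sharing protocol is involved.

```python
import socket
import threading

def upload_once(server, payload, seen_addresses):
    """Accept one download connection and record who connected."""
    conn, peer = server.accept()      # peer is the downloader's (ip, port)
    seen_addresses.append(peer[0])
    with conn:
        conn.sendall(payload)

# The "uploader" listens on loopback; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
uploader_address = server.getsockname()

seen_by_uploader = []
worker = threading.Thread(
    target=upload_once,
    args=(server, b"pretend mp3 bytes", seen_by_uploader),
)
worker.start()

# The "downloader" has to know the uploader's address to connect at all.
with socket.create_connection(uploader_address) as client:
    received = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:                 # uploader closed the connection
            break
        received += chunk

worker.join()
server.close()
print("uploader saw downloader at:", seen_by_uploader[0])
```

The uploader did nothing special to learn the downloader's address: `accept()` hands it over as part of setting up the connection.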
How MUTE Protects Your Privacy
The main way that MUTE protects your privacy is by avoiding direct connections between downloaders and uploaders. Earlier, we described how search requests are sent around standard networks: you send a search to your neighbors, and they send it to their neighbors, who in turn send it to their neighbors, and so on. By using the network to route search requests, these networks deliver a particular request to many nodes without making direct connections to any of them. Of course, when it comes time to transfer a file, these networks use direct connections.
MUTE routes all messages, including search requests, search results, and file transfers, through the network of neighbor connections. Thus, though you know the Internet addresses of your neighbors, you do not know the Internet address of the node you are downloading from.
A map of a MUTE network looks identical to the maps of standard networks shown earlier. If you perform a search for "metallica mp3", you still might receive back three results, but these results look a little different:
Notice that the "My Address" portion of these responses no longer contains an Internet address. The address shown, which is abbreviated with "..." to fit in the table, is 7213D29781593840CF00CDD1E9A7A425AE16DCA5. This is a MUTE "virtual" address. Each node in the MUTE network has a virtual address that it generates randomly each time it starts up. Your neighbors in the network (those nodes that actually do know your Internet address) do not know what your virtual address is, so no one in the network can connect your virtual address to your Internet address, and thus no one can obtain your real-world identity.
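The example address above is 40 hexadecimal digits, which corresponds to 160 bits. The text does not say how MUTE produces its random addresses, so the following Python sketch is only one plausible way to generate an address of the same shape. The function name `new_virtual_address` is hypothetical.

```python
import secrets

def new_virtual_address():
    """Return a fresh random address: 20 random bytes as 40 uppercase hex digits."""
    return secrets.token_hex(20).upper()

# A node would call this once at startup and keep the value until it exits.
my_address = new_virtual_address()
print(my_address)
```

Because the value is drawn fresh at each startup and never derived from the Internet address, knowing one tells you nothing about the other.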
MUTE uses virtual addresses to route messages through the network using an ant-inspired technique. Thus, to download a Metallica file, your node would send a download request through the network to 7213D...6DCA5, and your node would mark that request with your own virtual address, say D1E9A59380CD425AE16D40CF0CA57A7213D29781. The node sharing Metallica would send the requested file back to you using your virtual address. The entire transfer is routed through the network, which would look something like this:
Though the transfer is routed through a node owned by the RIAA, all the node sees are the virtual addresses of you and your file sharing partner. The RIAA can send out a search for "mp3", and it will still get back 1000 results from you, but these results would look like this:
The RIAA can subpoena your ISP using your virtual address, but your ISP does not know who is using this address. Thus, the RIAA's standard tactic is useless in the MUTE network.
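The virtual-address routing described in this section can be sketched in a few lines of Python. This is a deliberately simplified stand-in for the ant-inspired technique, not MUTE's actual algorithm: each node simply remembers which neighbor a given virtual address was first heard through, and sends later messages for that address back the same way. The `Node` class, the five-node chain, and the short placeholder addresses for the middle nodes are all assumptions made for the example.

```python
import itertools

_message_ids = itertools.count()

class Node:
    def __init__(self, vaddr):
        self.vaddr = vaddr
        self.neighbors = []
        self.hint = {}       # virtual address -> neighbor it was first heard through
        self.inbox = []
        self.seen = set()

    def send(self, dst, body):
        """Originate a message; dst=None means a broadcast such as a search."""
        msg = (next(_message_ids), self.vaddr, dst, body)
        self.seen.add(msg[0])
        self._forward(msg, via=None)

    def receive(self, msg, via):
        msg_id, src, dst, body = msg
        if msg_id in self.seen:
            return                        # drop duplicates
        self.seen.add(msg_id)
        self.hint.setdefault(src, via)    # leave a trail back toward src
        if dst is None or dst == self.vaddr:
            self.inbox.append((src, body))
        if dst != self.vaddr:
            self._forward(msg, via)

    def _forward(self, msg, via):
        known = self.hint.get(msg[2])
        # Follow the trail if there is one; otherwise pass it to everyone else.
        targets = [known] if known is not None else [n for n in self.neighbors if n is not via]
        for neighbor in targets:
            neighbor.receive(msg, via=self)

def link(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)

# Hypothetical chain: you - n1 - riaa - n2 - sharer
you = Node("D1E9A59380CD425AE16D40CF0CA57A7213D29781")
sharer = Node("7213D29781593840CF00CDD1E9A7A425AE16DCA5")
n1, riaa, n2 = Node("N1"), Node("RIAA"), Node("N2")
for a, b in [(you, n1), (n1, riaa), (riaa, n2), (n2, sharer)]:
    link(a, b)

you.send(None, "search: metallica mp3")       # floods outward, leaving a trail
sharer.send(you.vaddr, "result: Metallica")   # follows the trail back
print(you.inbox)
```

In this run the node in the middle ends up with routing hints that point only at its own two neighbors. It can tell which direction each virtual address lies in, but not which machine owns it.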
Another Possible Spy Tactic
Given that the standard search-and-subpoena tactic does not work in a MUTE network, the RIAA might try to target individual nodes with more intense monitoring. For example, the RIAA might set up a computer on your local network that would listen to all of your Internet traffic (this is similar to the FBI tapping your phone line). If the RIAA listened to all of your traffic, they would see everything that you sent to each of your neighbors in the MUTE network, which would look something like this:
Notice that the RIAA now has you cornered: it can see that there are download requests coming out of your node, but no corresponding requests coming into your node. In other words, you must be generating the requests and not simply passing on requests that you have received from your neighbors. Using this tactic, the RIAA could determine the Internet address associated with your virtual address and then file a subpoena. How does MUTE block this tactic?
MUTE protects the contents of each neighbor connection in the network using military-grade encryption. Though the RIAA might tap your network and see all of your Internet traffic, all MUTE messages would be unreadable. Thus, the RIAA would not be able to corner you in the network or obtain an Internet address in connection with your virtual address.
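The cornering check that encryption is meant to defeat can be sketched as follows. This is a hypothetical model, not a description of any real monitoring tool: it assumes the eavesdropper can read every message and that each request carries an identifier (`request_id`, a name invented for the example) that stays the same as the request is forwarded.

```python
def locally_originated(observed):
    """Return request ids that left the node without ever having entered it.

    observed is a list of (direction, request_id) pairs, where direction is
    "in" for traffic arriving at the node and "out" for traffic leaving it.
    """
    came_in = {rid for direction, rid in observed if direction == "in"}
    went_out = {rid for direction, rid in observed if direction == "out"}
    return sorted(went_out - came_in)

# Hypothetical tap on one node's traffic, readable because it is unencrypted.
tapped = [
    ("in", "req-17"),    # arrived from one neighbor...
    ("out", "req-17"),   # ...and was passed on to another: just forwarding
    ("out", "req-42"),   # left the node but never arrived: started here
    ("out", "req-42"),   # same request sent to a second neighbor
]
print(locally_originated(tapped))
```

The check depends entirely on being able to read the identifiers. If every message on the wire is unreadable, the eavesdropper has nothing to compare between the incoming and outgoing traffic.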
Of course, your neighbors are able to decrypt the messages you send through them. Thus, if the RIAA were able to hijack every single one of your neighbor nodes, it could again corner you and link your Internet address to your virtual address. However, it is unlikely that the RIAA would be able to take over a large number of nodes in the network, and since you discover your neighbors in a somewhat randomized way, it is unlikely that every single one of your neighbors would be an RIAA node.
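To put a rough number on "unlikely", here is a back-of-the-envelope calculation in Python. It rests on a strong simplifying assumption that the text does not state: that each of your neighbors is picked independently and uniformly at random from the whole network. The function name and the sample figures are made up for the example.

```python
def chance_all_neighbors_hostile(hostile_fraction, neighbor_count):
    """Probability that every neighbor is hostile, assuming independent uniform picks."""
    return hostile_fraction ** neighbor_count

# Hypothetical figures: one node in ten is hostile, and you have five neighbors.
print(chance_all_neighbors_hostile(0.1, 5))
```

Under this model the chance shrinks geometrically with each additional neighbor, so even a sizable hostile presence in the network rarely surrounds any one node.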