SOPA and Network Architecture

February 1, 2012

18 January 2012 was ‘The Day the Internet Stood Still’, as thousands of sites around the world were either blacked out or operated with protest notices on display. And all because of a couple of pieces of legislation being discussed in the USA, the Stop Online Piracy Act and the Protect IP Act (SOPA[1] and PIPA[2] respectively). The high profile of the protesters – a list that includes Wikipedia, Google, Reddit and Wired, amongst others – translated into a level of coverage rarely seen for a technology story. Unfortunately, this also meant that a legal topic became the subject of exaggerations, inaccuracies and outright falsehoods from both sides of the Internet piracy debate. Just a day later, the Internet started trying to absorb the shocking news that the digital locker website Megaupload had been shut down by American authorities, and that its owner and several employees had been arrested in New Zealand, where they are currently awaiting extradition.

These two important news items can be seen as part of the global debate about Internet regulation that is currently taking place. On one side we have those who advocate an open web, while on the other we have a growing number of content owners who want to exercise some form of control over some of the most serious excesses. While I have publicly expressed some misgivings about the focus of the current debate,[3] it is undeniable that with regard to SOPA and PIPA there is indeed room for concern outside the USA about the two pieces of legislation (not to mention concern inside that country), as it is very possible that SOPA and PIPA could have serious extraterritorial consequences. This point has been constantly repeated by those opposing the bills in the days following the Internet blackout, but the danger is much graver than most people think. The fact is that the existing network architecture of the Internet relies heavily on the USA, and any legislation that affects the core would have cascading consequences elsewhere. Allow me to elaborate.

There is a branch of research in physics and mathematics called network theory, which is the systematic study of any netlike or complex system or collection of interrelated things; networks are broken into their basic elements (namely nodes and links), and studied to discern patterns. In network theory, there is a concept called centrality, which measures the importance of a node in any given network. This is calculated from the number of links a node has to neighbouring nodes, the length of the shortest paths to other nodes in the network, and the average shortest path. A node is said to be central in a network if it is linked to a large number of other nodes, if it can be connected to other nodes quickly (the six degrees of separation phenomenon), and if the average distance to other nodes is short. When plotting charts describing networks, central nodes can sometimes be easily identified, as in the chart on network centrality below (which can be downloaded from the panel opposite in more legible form: see Guadamuz fig 1), where blue indicates more central nodes:

Fig 1. Network centrality.[4]
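As a rough illustration of two of the measures just described, the following Python sketch counts each node's links and computes its average shortest-path distance to every other node. The six-node network and the names in it (such as "hub") are invented for the example and do not come from the chart above; the function names are likewise only illustrative.

```python
from collections import deque

# Hypothetical toy network: "hub" is linked to most nodes, "e" sits on the edge.
graph = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub", "e"},
    "e": {"d"},
}


def shortest_path_lengths(graph, source):
    """Breadth-first search: number of hops from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degree(graph, node):
    """Number of links the node has to neighbouring nodes."""
    return len(graph[node])


def average_distance(graph, node):
    """Average shortest-path length from the node to every other node."""
    dist = shortest_path_lengths(graph, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)


for node in sorted(graph):
    print(node, degree(graph, node), round(average_distance(graph, node), 2))
```

In this toy network the node with the most links is also the one with the shortest average distance to the rest, which is the sense in which a node is called central; the peripheral node "e" scores worst on both counts.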

Node and hub centrality is an important indication that there is a power law[5] at work in a network: a high concentration of centrality in some nodes may give rise to a scale-free network, where some nodes are more important than others.[6] There is not enough space here to explain in detail what a scale-free network is; suffice it to say that it is one in which some nodes have considerably more links than the average would lead one to expect, so these types of networks result in hubs and even super-hubs that act as important connectors in the system’s structure. The Internet is a scale-free network, so centrality comes into play in two ways. First, we have the physical network: the wires, routers and hubs that make up its physical architecture. Second, there is the logical level of centrality, which consists of websites, links and hyperlinks, but also includes the DNS system and Internet governance structures. It should be no surprise to anyone to learn that, any way you look at the Internet, the USA is extremely central. Take for example this picture of the global submarine cable network (which can be downloaded from the panel opposite in more legible form: see Guadamuz fig 2):

Fig 2. Mapping physical centrality.[7]
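To make the earlier point about hubs and super-hubs more concrete, here is a minimal Python sketch of one well-known way in which such networks can arise: preferential attachment, where each new node links to existing nodes with a probability proportional to the links they already have. This growth model is an assumption brought in for illustration and is not described in this article; the function name, the parameters and the network size of 2,000 nodes are all hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, links_per_node=2, seed=42):
    """Grow a network in which new nodes prefer already well-connected nodes."""
    rng = random.Random(seed)
    # Start from a small, fully connected core.
    core = links_per_node + 1
    edges = [(i, j) for i in range(core) for j in range(i)]
    # Each node appears in `endpoints` once per link it has, so a uniform
    # draw from this list picks a node in proportion to its number of links.
    endpoints = [node for edge in edges for node in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


edges = preferential_attachment(2000)
degrees = Counter(node for edge in edges for node in edge)
values = sorted(degrees.values(), reverse=True)
mean_degree = sum(values) / len(values)

print("mean number of links:", round(mean_degree, 2))
print("five best-connected nodes:", values[:5])
print("median number of links:", values[len(values) // 2])
```

Under these assumptions, a handful of early nodes typically end up with many times the average number of links while most nodes keep only a few, which is the hub pattern that a scale-free network displays.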

This is just one aspect of the dominance that the USA has in the Internet’s infrastructure. Things get even more interesting when you look at the logical architecture, where the USA has managed to remain considerably ahead of other countries. While anyone can set up a server by just installing the appropriate software onto any computer connected to the Internet, you need a registrar if you want a domain name that resolves in the system (such as technollama.co.uk). Most top-level domains are registered in the USA (.com, .org, .net, .biz), and statistics show that the US is the country with the most domain names registered under its jurisdiction, with 78,453,258 in late January 2012.[8] The next closest country is Germany, with over six million registrations. In fact, even combining all of the other countries in the world, you cannot reach the total of domains registered in the USA.

A similar picture emerges with regard to hosting, ie where content is actually placed on a server. For example, a UK provider registered my own blog’s domain name, but a US company hosts it. Having conducted a traceroute of the path that any user would take to reach the blog’s contents, I have to assume that the content is also hosted in the USA. In fact, 9 out of the top 10 hosting companies are American and, of these, the largest host in the world is GoDaddy (WildWestDomains in the following chart, which can be downloaded from the panel opposite in more legible form: see Guadamuz fig 3).

 

Fig 3. Top hosting companies in the world.[9]

The end result is a skewed map of the world, where whatever happens in the United States affects the Internet disproportionately. Take what happened to Megaupload, the aforementioned digital locker site. While the company was registered in Hong Kong, and most of its operations ran from New Zealand, its .com domain was registered with an American company named DotRegistrar. Similarly, the site had also contracted some of its hosting services from companies in the States, where at some point it had leased more than 1,000 servers from companies like Carpathia Hosting and Cogent Communications. This state of affairs opened up Megaupload to enforcement by American authorities, which is precisely what took place.

One does not need to know anything about networks to know that DNS registration and hosting are the Achilles heel of both pirating and legitimate sites. However, centrality has an important lesson to teach us with regard to the dangers of SOPA and PIPA in other ways. Imagine that most digital lockers, torrent trackers and other sites operating on the shadier side of the Web were to pack up and leave the USA tomorrow, but the USA retained its central role in the Web’s architecture. This is precisely where SOPA wants to step in, as it would create several provisions that would try to reach websites that are currently outside US jurisdiction, ie are registered and hosted elsewhere in the world. The most controversial provision in SOPA (though there is talk of it being removed) was contained in s 102, which stated that:

“A service provider shall take technically feasible and reasonable measures designed to prevent access by its subscribers located within the United States to the foreign infringing site (or portion thereof) that is subject to the order, including measures designed to prevent the domain name of the foreign infringing site (or portion thereof) from resolving to that domain name’s Internet Protocol address. Such actions shall be taken as expeditiously as possible, but in any case within 5 days after being served with a copy of the order, or within such time as the court may order.”[10]

This would have established a filtering responsibility for ISPs and other intermediaries against alleged copyright infringers. The problem with such a feature when introduced in such a central country is that it could easily trickle downstream to other physical and logical clients elsewhere, which would mean that SOPA would be used to filter content for all of us. This is not as far-fetched as it may sound. So far we have not witnessed too many incidents of downstream filtering because most of those practices take place in countries that are not central at all to the Web’s inner workings. Seen from an architectural perspective, countries like China, Syria, Egypt and Pakistan, where national filtering takes place, are mostly large Intranets, which is why it was so easy for Egyptian authorities to shut down all Web access during that country’s revolution. The US has an entirely different role in the network, so anything that is filtered there could end up being filtered in places that have never heard of SOPA, be it justified or not. I cannot imagine that this is a bug and not a feature; some in the copyright industries seem determined to export their draconian models to other territories.
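The downstream effect described above can be sketched with a deliberately simplified Python model of name resolution, in which a resolver either answers from its own records or passes the question to the resolver it depends on. Real DNS is considerably more involved (root servers, caching, recursion), and nothing here reflects how any actual provider is configured: the class, the resolver names, the .example domains and the 192.0.2.x addresses are all hypothetical placeholders.

```python
class Resolver:
    """Toy resolver: answers from its own records, otherwise asks upstream."""

    def __init__(self, name, records=None, upstream=None, blocklist=None):
        self.name = name
        self.records = records or {}
        self.upstream = upstream
        self.blocklist = set(blocklist or [])

    def resolve(self, domain):
        if domain in self.blocklist:
            # Filtered: behave as if the name did not resolve at all.
            return None
        if domain in self.records:
            return self.records[domain]
        if self.upstream is not None:
            return self.upstream.resolve(domain)
        return None


# Hypothetical set-up: one source of truth, one filtering provider in a
# central position, and two foreign ISPs that differ only in their upstream.
authoritative = Resolver("authoritative", records={
    "lawful.example": "192.0.2.10",
    "accused.example": "192.0.2.20",
})
us_provider = Resolver("us-provider", upstream=authoritative,
                       blocklist={"accused.example"})
foreign_isp = Resolver("foreign-isp", upstream=us_provider)
independent_isp = Resolver("independent-isp", upstream=authoritative)

for isp in (foreign_isp, independent_isp):
    for domain in ("lawful.example", "accused.example"):
        print(isp.name, domain, isp.resolve(domain))
```

In this sketch the foreign ISP applies no filter of its own, yet its users can no longer resolve the blocked name simply because its queries pass through the filtering provider, while the ISP with a different upstream is unaffected.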

The optimist in me wants to believe that we have seen the last of SOPA and PIPA. Unfortunately, Megaupload proved that things are about to get bumpier. One thing is certain: despite some exaggerations, badly drafted and/or misguided pieces of legislation may still pose global problems.

Andrés Guadamuz is Vice-president of Innova Technology, Costa Rica, alguadamuz@innovatechnology.net. His latest book, Networks, Complexity and Internet Regulation: Scale-free Law, was published by Edward Elgar Publishing Ltd in October 2011. Read more from Andrés at his technollama blog: www.technollama.co.uk 



[1] http://thomas.loc.gov/cgi-bin/query/z?c112:H.R.3261:.

[2] http://thomas.loc.gov/cgi-bin/query/z?c112:S.968:.

[3] See: http://www.technollama.co.uk/web-activism-grows-up-but-beware-its-narrow-focus.

[4] Taken from Wikipedia: http://en.wikipedia.org/wiki/File:Graph_betweenness.svg.

[5] http://en.wikipedia.org/wiki/Power_law

[6] If you are interested in some of these concepts, I go through most of the basic ideas in network theory in my book entitled “Networks, Complexity and Internet Regulation: Scale-Free Law”, published by Edward Elgar.

[7] http://www.submarinecablemap.com/.

[8] http://www.webhosting.info/domains/country_stats/.

[9] http://www.webhosting.info/webhosts/tophosts/Country/US.

[10] http://thomas.loc.gov/cgi-bin/query/F?c112:1:./temp/%7Ec11233QJJ4:e11714:.