Intranets and the Internet

It is early into 2017 and at fintech events we can still hear a variety of analogies used to describe what blockchains and distributed ledger technology (DLT) are and are not.

One of the more helpful ones is from Peter Shiau (formerly of Blockstack.io) who used an automobile analogy involving the Model T to describe magic internet chains:1

The Ford Motor Company is well known for its production engineering innovation that gave us the Model T. To this day, the Ford Model T is one of the best selling automobiles of all-time thanks to the sheer number produced and affordability for American middle class families.  And while it was remarkable that Ford was able to sell so many cars, it is well understood Ford’s true innovation was not the Model T but in fact the modern assembly line.

It was this breakthrough that enabled Ford to build a new car every 93 minutes, far more quickly than any of its competitors. Not unlike the Model T, cryptocurrencies like Bitcaoin, are every bit the product of a similar innovative process breakthrough that today we call a “blockchain.”

Carrying the analogy a little further, what is even more powerful about this modern equivalent of the assembly line is that it is not just useful for building cars but also vans and trucks and boats and planes. In just the same way, a blockchain is not just useful for creating a cryptocurrency, but can be applied to a many different processes that multiple parties might rely on to reach agreement on the truth about something.

Less helpful, but all the same plentiful, are the many red herrings and false equivalences that conferences attendees are subjected to.

Arguably, the least accurate analogy is that public blockchains can be understood as being “like the internet” while private blockchains “are like intranets”.

Why is this one so wrong and worthy of comment?

Because it is exactly backwards.

For example, if you want to use a cryptocurrency like Bitcoin, you have to use bitcoin; and if you want to use Ethereum, you have to use ether.  They are not interoperable.  You have to use their proprietary token in order play in their walled garden.

As described in detail below, the internet is actually a bunch of private networks of internet service providers (ISPs) that have legal agreements with the end users, cooperate through “peering” agreements with other ISPs, and communicate via a common, standardized routing protocols such as BGP which publishes autonomous system numbers (ASNs).

In this respect, what is commonly called “the Internet” is closer to interoperable private, distributed ledger networks sharing a common or interoperable communication technology than anarchic, public cryptocurrency blockchain networks, which behave more like independent isolated networks.

Or in short: by design, cryptocurrencies are intranet islands whereas permissioned distributed ledgers — with interoperability hooks (“peering” agreements) — are more like the internet.2

Sidebar

Let’s do a short hands-on activity to see why the original analogy used at fintech conferences is a false equivalence with implications for how we need to frame the conversation and manage expectations in order to integrate DLT in to our reference and business architecture.

If you are using a Windows-based PC, open up a Command window.  If you’re using a Mac or Android device, go to a store and buy a Windows-based PC.

Once you have your Command window open, type in a very simple command:

tracert: www.google.com

Wait a few seconds and count the hops as your signal traces the route through various network switches and servers until you finally land on your destination.  From my abode in the SF area, it took 10 hops to land at Google and 7 hops to land at Microsoft.

If you did this exercise in most developed countries, then the switches and servers your signal zigged and zagged through were largely comprised of privately owned and operated networks called ISPs.  That is to say, what is generally described as “the internet” is just a bunch of privately run networks connected to one another via several types of agreements such as: transit agreements, peering agreements, and interconnect agreements.

By far the most widely used agreement is still done via the proverbial “handshake.”  In fact, according to a 2012 OECD report, 99.5% of internet traffic agreements are done via handshakes.  There is also depeering, but more on that later.

What do all these agreements look like in practice?

According to the 2016 Survey of Internet Carrier Interconnection Agreements (pdf):

The Internet, or network of networks, consists of 7,557 Internet Service Provider (ISP) or carrier networks, which are interconnected in a sparse mesh. Each of the interconnecting links takes one of two forms: transit or peering. Transit agreements are commercial contracts in which, typically, a customer pays a service provider for access to the Internet; these agreements are most prevalent at the edges of the Internet, where the topology consists primarily of singly connected “leaf” networks that are principally concerned with the delivery of their own traffic. Transit agreements have been widely studied and are not the subject of this report. Peering agreements – the value-creation engine of the Internet – are the carrier interconnection agreements that allow carriers to exchange traffic bound for one another’s customers; they are most common in the core of the Internet, where the topology consists of densely interconnected networks that are principally concerned with the carriage of traffic on behalf of the networks which are their customers.

Colloquially it is a lot easier to say “I want to use the Internet” instead of saying “I want to connect with 7,557 ISPs interconnected in a sparse mesh.”

Back to topology, each ISP is able to pass along traffic that originated from other networks, even if these external networks and the traffic therein originate from foreign countries, because the physical systems can speak to one another via standardized transport protocols like TCP and UDP and route via BGP.3 4

Thus there is no such thing as a physical “internet rail,” only an amalgam of privately and publicly owned networks stitched together.

And each year there is inevitably tension between one more ISP and consequently depeering takes place.  A research paper published in 2014 identified 26 such depeering examples and noted that while depeering exists:

Agreements are very quite affair and are not documented for, they are mostly handshake agreements where parties mutually agree  without  any  on  record  documentation.  This  argument is supported by the fact that 141,512 Internet Interconnection Agreements out of 142,210 Internet Agreements examined till March 2011 were Handshake Agreements.

This is the main reason you do not hear of disputes and disagreements between ISPs, this also dovetails into the “net neutrality” topic which is beyond the scope of this post.

Intranets

Just as the internet is an imperfect analogy for blockchains and DLT in general, so is its offspring the “intranet” is a poor analogy for a permissioned blockchains.  As noted above, the internet is a cluster of several thousand ISPs that typically build business models off of a variety of service plans in both the consumer and corporate environments.

Some of these server plans target corporate environments and also includes building and maintaining “private” intranets.

What is an intranet?

An intranet is a private network accessible only to an organization’s staff. Generally a wide range of information and services from the organization’s internal IT systems are available that would not be available to the public from the Internet. (Source)

And while more and more companies migrate some portion of their operations and work flows onto public and private “clouds,” intranets are expected to be maintained given their continued utility.  From an infrastructure standpoint, notwithstanding that an intranet could be maintained one or more more servers through Software Defined Networks (SDNs), it is still a subset of a mash up of ISPs and mesh networks.

What does this have to do with magic internet chains?

A private blockchain or private distributed ledger, is a nebulous term which typically means that the validation process for transactions is maintained by known, identified participants, not pseudonymous participants.  Depending on the architecture, it can also achieve the level of privacy that is associated with an intranet while staying clear of the hazards associated with preserving true pseudonymity.

Why is the “intranet” analogy so misleading and harmful?

For multiple reasons.

For starters, it is not really valid to make a sweeping generalization of all identity-based blockchains and distributed ledgers, as each is architected around specific use-cases and requirements.  For instance, some vendors insist on installing on-premise nodes behind the firewall of an enterprise.  Some vendors setup and run a centralized blockchain, from one or two nodes, for an enterprise. Some others tap into existing operational practices such as utilizing VPN connections.  And others spin up nodes on public clouds in data centers which are then operated by the enterprise.

There are likely more configurations, but as noted above: from a topological perspective in some cases these private blockchains and distributed ledgers operate within an intranet, or on an ISP, or even as an extranet.

Fundamentally the biggest difference between using an ISP (“the internet”) and using an intranet is about accessibility, who has access rights.  And this is where identity comes into play: most ISPs require the account holder to provide identification materials for what is effectively KYC compliance.

Thus while you may be visit a coffee shop like Starbucks who provides “free” access, Starbucks itself is an identified account holder with an ISP and the ISP could remove Starbucks access for violating its terms of service.  Similarly, most coffee shops, airports, schools, etc. require users to accept a terms of service acknowledging that their access can be revoked for violating it.

Source: FireFox 51.0.1

In short, both the internet and intranet are in effect part of identity and permission-based networks.  There is no such thing as an identity-less internet, only tools to mask the users identity (e.g., Tor, Peerblock, Whisper).  In the same way that, “private” intranets are a fallacy.

Anarchic chains, which were designed to operate cryptocurrencies like Bitcoin, attempt to create an identity-less network on top of an identifiable network, hence the reason people involved in illicit activities can sometimes be caught.

Identity

Interestingly, where the internet analogy does hold up is in how public, anarchic blockchains are no less challenged by the effort and complexity of truly masking identity. I mentioned this in a footnote in the previous post, but it deserves being highlighted once more. Anarchic blockchains inspired by cryptocurrencies such as Bitcoin, used blocks because Satoshi wanted identity-free consensus (e.g., pseudonymity).  That implies miners can come and go at will, without any kind of registration, which eliminated the choice of using any existing consensus algorithm.

As a result, Satoshi’s solution was proof-of-work (PoW).  However, PoW is susceptible to collisions (e.g., orphan blocks).  When a collision occurs you have to wait longer to obtain the same level of work done on a transaction. Thus you want to minimize them, which resulted in finding a PoW on average every ten minutes.  This means that in a network with one minute propagation delays, not unlikely in a very large network (BGP sees such propagation times) then you waste ~10% of total work done, which was considered an acceptable loss rate in 2008 when Satoshi was designing and tweaking the parameters of the system.

Distributed ledgers such as Corda, use a different design and exist precisely as an identified network, where members cannot just come and go at will, and do have to register. With Corda, the team also assumes relatively low propagation times between members of a notary cluster.  One of the key differences between mere PoW (i.e. hashcash) and a blockchain is that in the latter, each block references the prior – thus PoWs aggregate.  It can be tough to do that unless all transactions are visible to everyone and there is a single agreed upon blockchain but if you do not, you will not get enough PoW to yield any meaningful security

When fintech panels talk about the notion of “open” or “closed” networks, this is really a red herring because what is being ignored is how identity and permission work and are maintained on different types of networks.

From the standpoint of miner validation, in practice cryptocurrencies like Bitcoin are effectively permission-based: the only entity that validates a transaction is effectively 1 in 20 semi-static pools each day.  And the miners/hashers within those pools almost never individually generate the appropriate/winning hash towards finding a block.  Each miner generates trillions of invalid hashes each week and are rewarded with shares of a reward as the reward comes in.

And if you want to change something or possibly insert a transaction, you need hashrate to do so.  Not just anyone running a validating node can effect change.

More to the point, nearly all of these pools and many of the largest miners have self-doxxed themselves.  They have linked their real world identities to a pseudonymous network whose goals were to mask identities via a purposefully expensive PoW process.  As a result, their energy and telecommunication access can be revoked by ISPs, energy companies, and governments.  Therefore calling anarchic or public blockchains “open” is more of a marketing gimmick than anything else at this stage.

Clarity

AOL and CompuServe were early, successful ISPs; not intranets.5  Conflating these terms makes it confusing for users to understand the core technology and identify the best fit use-cases. 6

Alongside the evolution of both the “cloud” and ISP markets, it will be very interesting to watch the evolution of “sovereign” networks and how they seek to address the issue of identity.

Why?

Because of national and supranational laws like General Data Protection Regulation (GDPR) that impacts all network users irrespective of origin.

For instance, Marley Gray (Principal Program Manager Blockchain at Microsoft) recently explained in an interview (above) how in order to comply with various data regulations (data custody and sovereignty), Microsoft acquired fiber links that do not interact with the “public” internet.  That is to say, by moving data through physically segregated “dark” networks, Microsoft can comply with requirements of its regulated customers.

And that is what is missing from most fintech panels on this topic: at the end of the day who is the customer and end-user.

If it is cypherpunks and anarchists, then anarchic chains are built around their need for pseudonymous interactions.  If it is regulated enterprises, then identity-based systems are built around the need for SLAs and so forth.  The two worlds will continue to co-exist, but each network has different utility and comparative advantage.

Acknowledgements: I would like to thank Mike Hearn, Stephen Lane-Smith, Antony Lewis, Marcus Lim, Grant McDaniel, Emily Rutland, Kevin Rutter, and Peter Shiau for their constructive feedback. This was originally sent to R3 members on March 31, 2017.

Endnotes

  1. His analogy is reused with permission. []
  2. From a network perspective, some of the integration and interop challenges facing DLT platforms could be similar to the harried IPv4 vs IPv6 coexistence over the past decade.  Who runs the validating nodes, the bridges — the links between the chains and ledgers — still has to be sorted out.  One reviewer noted that: If you equate IPv4 (TCP/UDP/ICMP) to DLTv4 where BGPv4 enables IPv4 networks to interact, we need an equivalent for BPGv4, say DLTGPv4 (DLT Gateway Protocol) for DLTv4 fabrics (ISPv4s) to interact and the same thing for IPv6 and DLTv6 where DLTv6 is a different DLT technology than DLTv4.  So the basic challenge here is solving integration of like DLT networks. []
  3. Venture capitalists such as Marc Andreessen and Fred Wilson have stated at times that they would have supported or invested in something akin to TCPIPcoins or BGPcoins.  That is to say, in retrospect the missing element from the “internet stack” is a cryptocurrency.  This is arguably flawed on many levels and if attempted, would likely have stagnated the growth and adoption of the internet, see page 18-19. []
  4. One reviewer noted that: Because of the IPv4 address restrictions (address space has been allocated – relying on auctions etc for organizations to acquire IPv4 addresses), some sites now only have an IPv6 address.  Most devices today are dual stack (support IPv4 and IPv6), but many ISPs and older devices still only support IPv4 creating issues for individuals to access IPv6 resulting in the development of various approaches for IPv4 to IPv6 (e.g. GW46 – my generic label).  I think, the question with DLTGW46 is whether to go dual stack or facilitate transformation between v4 and v6. []
  5. A reviewer who previously worked at AOL in the mid ’90s noted that: “In its early days, AOL was effectively a walled garden.  For example, it had its own proprietary markup language called RAINMAN for displaying content. And access to the internet was carefully managed at first because AOL wanted its members to stay inside where content was curated and cultural norms relatively safer — and also desirable for obvious business reasons.” []
  6. One reviewer commented: “In my opinion, the “internet” cannot be created by a single party. It is an emergent entity that is the product of multiple ISPs that agree to peer – thus the World Wide Web. DLT-based and blockchain-based services first need to develop into their own robust ecosystems to serve their own members. Eventually, these ecosystems will want to connect because the value of assets and processes in multiple ecosystems will increase when combined.” []

A brief history of R3 – the Distributed Ledger Group

What’s in a name?

I was at an event last week and someone pulled me aside asked: why do you guys at R3 typically stress the phrase “distributed ledger” instead of “blockchain”?

The short answer is that they are not the same thing.

In simplest terms: a blockchain involves stringing together a chain of containers called blocks, which bundle transactions together like batch processing, whereas a distributed ledger, like Corda, does not and instead validates each transaction (or agreement) individually.1

The longer answer involves telling the backstory of what the R3 consortium is in order to highlight the emphasis behind the term “distributed ledger.”

Inspired by IMF report, page 8

Genesis

R3 (formerly R3 CEV) started out as a family office in 2014.2 The “3” stood for the number of co-founders: David Rutter (CEO), Todd McDonald (COO), and Jesse Edwards (CFO). The “R” is the first initial of the CEO’s last name.  Very creative!

During the first year of its existence, R3 primarily looked at early stage startups in the fintech space.  The “CEV” was an acronym: “crypto” and “consulting,” “exchanges,” and “ventures.”

Throughout 2014, the family office kept hearing about how cryptocurrency companies were going to obliterate financial institutions and enterprises.  So to better understand the ecosystem and drill into the enthusiasm around cryptocurrencies, R3 organized and held a series of round tables.

The first was held on September 23, 2014 in NYC and included talks from representatives of: DRW, Align Commerce, Perkins Coie, Boost VC, and Fintech Collective.  Also in attendance were representatives from eight different banks.

The second round table was held on December 11, 2014 in Palo Alto and included talks from representatives of: Stanford, Andreessen Horowitz, Xapo, BitGo, Chain, Ripple, Mirror, and myself.  Also in attendance were representatives from 11 different banks.

By the close of 2014, several people (including myself) had joined R3 as advisors and the family office had invested in several fintech startups including Align Commerce.

During the first quarter of 2015, David and his co-founders launched two new initiatives.  The first was LiquidityEdge, a broker-dealer based in NYC that built a new electronic trading platform for US Treasurys.3  It is doing well and is wholly unrelated to R3’s current DLT efforts.

The second initiative was the incorporation of the Distributed Ledger Group (DLG) in Delaware in February 2015.  By February, the family office had also stopped actively investing in companies in order to focus on both LiquidityEdge and DLG.

In April 2015 I published Consensus-as-a-Service (CaaS) which, at the time, was the first paper articulating the differences between what became known as “permissioned” and “permissionless” blockchains and distributed ledgers.  This paper was then circulated to various banks that the small R3 team regularly interacted with.

The following month, on May 13, 2015, a third and final round table was held in NYC and included talks from representatives of Hyperledger (the company), Blockstack, Align Commerce and the Bank of England.  Also in attendance were representatives from 15 banks as well as a market infrastructure operator and a fintech VC firm.  In addition to the CaaS paper, the specific use-case that was discussed involved FX settlement.4

The transition from a working group to a commercial entity was formalized in August and the Distributed Ledger Group officially launched on September 1, 2015 although the first press release was not until September 15.  In fact, you can still find announcements in which the DLG name was used in place of R3.

By the end of November, phase one of the DLG consortium – now known as the R3 consortium – had come to a conclusion with the admission of 42 members.  Because of how the organization was originally structured, no further admissions were made until the following spring (SBI was the first new member in Phase 2).

So what does this all have to do with “distributed ledgers” versus “blockchains”?

Well, for starters, we could have easily (re)named or (re)branded ourselves the “Blockchain Group” or “Blockchain Banking Group” as there are any number of ways to plug that seemingly undefinable noun into articles of incorporation.  In fact, DistributedLedgerGroup.com still exists and points to R3members.com.5 So why was R3 chosen?  Because it is a bit of a mouthful to say DistributedLedgerGroup!

Corda’s genesis

Upon launch, the architecture workstream lead by our team in London (which by headcount is now our largest office), formally recognized that the current hype that was trending around “blockchains” had distinct limitations.  Blockchains as a whole were designed around a specific use-case – originally enabling censorship-resistant cryptocurrencies. This particular use-case is not something that regulated financial institutions, such as our members, had a need for.

While I could spend pages retracing all of the thought processes and discussions surrounding the genesis of what became Corda, Richard Brown’s view (as early as September 2015) was that there were certain elements of blockchains that could be repurposed in other environments, and that simply forking or cloning an existing blockchain – designed around the needs of cryptocurrencies – was a non-starter.  At the end of that same month, I briefly wrote about this view in a post laying out the Global Fabric for Finance (G3F), an acronym that unfortunately never took off. In the post I specifically stated that, “[i]t also bears mentioning that the root layer may or may not even be a chain of hashed blocks.”

In October 2015, both James Carlyle and Mike Hearn formally joined the development team as Chief Engineer and lead platform engineer respectively.  During the fall and winter, in collaboration with our members, the architecture team was consumed in the arduous process of funneling and filtering the functional and non-functional requirements that regulated financial institutions had in relation to back office, post-trade processes.

By the end of Q1 2016, the architecture team gestated a brand new system called Corda.  On April 5, 2016, Richard published the first public explanation of what Corda was, what the design goals were and specifically pointed out that Corda was not a blockchain or a cryptocurrency.  Instead, Corda was a distributed ledger.

Prior to that date, I had personally spent dozens of hours clarifying what the difference between a blockchain and a distributed ledger was to reporters and at events, though that is a different story.  Unfortunately even after all these explanations, and even after Richard’s post, the Corda platform was still inappropriately lumped into the “blockchain” universe.

Following the open sourcing of Corda in November 2016, we formally cut the “CEV” initials entirely from the company name and are now known simply as R3.  Next year we plan to make things even shorter by removing either the R or 3, so watch out domain squatters!

Today

As of February 2017, the R3 consortium is formally split into two groups that share knowledge and resources: one group is focused on building out the Corda platform and the other, the Lab and Research Center, is focused on providing a suite of services to our consortium members.  I work on the services side, and as described in a previous post, my small team spends part of its time filtering vendors and projects for the Lab team which manages several dozen projects at any given time for our consortium members.

The Lab team has completed more than 20 projects in addition to 40 or so ongoing projects.  Altogether these involved (and in some cases still involve) working with a diverse set of platforms including Ethereum, Ripple, Fabric, Axoni, Symbiont and several others including Corda.  Since we are member driven and our members are interested in working and collaborating on a variety of different use-cases, it is likely that the services side will continue to experiment with a range of different technologies in the future.

Thus, while it is accurate to call R3 a technology company focused on building a distributed ledger platform and collaborating with enterprises to solve problems with technology, it is not accurate to pigeonhole it as a “blockchain company.”  Though that probably won’t stop the conflation from continuing to take place.

If you are interested in understanding the nuances between what a blockchain, a database, and a distributed ledger are, I highly recommend reading the multitude of posts penned by my colleagues Antony Lewis and Richard Brown.

  1. Blockchains inspired by cryptocurrencies such as Bitcoin used blocks because Satoshi wanted identity-free consensus (e.g., pseudonymity).  That implies miners can come and go at will, without any kind of registration, which eliminated the choice of using any existing consensus algorithm.

    As a result, Satoshi’s solution was proof-of-work (PoW).  However, PoW is susceptible to collisions (e.g., orphan blocks).  When a collision occurs you have to wait longer to obtain the same level of work done on a transaction. Thus you want to minimize them, which resulted in finding a PoW on average every ten minutes.  This means that in a network with one minute propagation delays, not unlikely in a very large network (BGP sees such propagation times) then you waste ~10% of total work done, which was considered an acceptable loss rate in 2008 when Satoshi was designing and tweaking the parameters of the system.

    Distributed ledgers such as Corda, use a different design because it is an identified network, where members cannot just come and go at will, and do have to register. With Corda, the team also assumes relatively low propagation times between members of a notary cluster.  One of the key differences between mere PoW (i.e. hashcash) and a blockchain is that in the latter, each block references the prior – thus PoWs aggregate.  It can be tough to do that unless all transactions are visible to everyone and there is a single agreed upon blockchain but if you do not, you will not get enough PoW to yield any meaningful security. []

  2. The R3CEV.com domain was created on August 13, 2014. []
  3. It may look like an odd spelling, but Treasurys is the correct spelling. []
  4. At the time, I was an advisor to Hyperledger which was acquired by Digital Asset the following month. []
  5. The DistributedLedgerGroup.com domain was created on December 23, 2014 and R3members.com was created on March 15, 2016. []

Non-technical Corda whitepaper released

Earlier today our architecture team released its first public whitepaper on Corda.

The WSJ covered it here and here.

Consequently I am somewhat puzzled by news stories that still refer to a “blockchain” as “Bitcoin technology.”  After all, we don’t refer to combustion engines in cars as “horse-powered technology” or an airplane turbine engine as “bird-powered technology.”

A more accurate phrase would be to say something like, “a blockchain is a type of data structure popularized by cryptocurrencies such as Bitcoin and Ethereum.”  After all, chronologically someone prior to Satoshi could have assembled the pieces of a blockchain into a blockchain and used it for different purposes than censorship-resistant e-cash.  In fact, both Guardtime and Z/Yen Group claim to have done so pre-2008, and neither involves ‘proof-of-work.’

Fun fact: Corda is not a blockchain, but is instead a distributed ledger.