Thursday, January 17, 2013

Open Compute Project Disaggregates the Data Center

The Open Compute Project (OCP), which was launched by Facebook in April 2011, has moved from early design work to seeing the first commercial products based on its specifications hit the open market.

Facebook's original idea was to share its specifications for "vanity-free" servers built on barebones AMD and Intel motherboards and optimized for low power consumption in high-density data centers.

The Open Compute Project has evolved to challenge the industry to rethink all aspects of data center design, from motherboards, to storage, I/O, power, cooling and rack design.

A big focus is on "breaking the monolith" by developing a next-generation "disaggregated rack", where compute, network and storage are separate modules that can be scaled and upgraded independently of one another.

This week, the group attracted a full house of 2,000 attendees to the Santa Clara Convention Center in Silicon Valley for its fourth Open Compute Summit.

Here are some notes taken from the keynotes (a webcast is online):


  • Orange and NTT are the latest carriers to join the Open Compute Project.
  • During the first 9 months of 2012, Facebook spent $1.0 billion on servers, networking equipment, storage infrastructure and the construction of data centers.  Facebook is accelerating its data center operations to handle the data deluge from a billion users, its growing number of services, the widespread adoption of smartphones, and the fact that 82% of users are outside the United States while most of its data centers are domestic.
  • Facebook currently stores about 240 billion unique photos and is adding 350 million new photos per day, which means 7PB of new storage is consumed by Facebook photos per month.  This is accelerating with the adoption of smartphones.  Keeping all of the photos online all of the time is a Big Data challenge requiring changes to the data center, the servers and the software.
  • Rackspace will design and build its own infrastructure using OCP designs as a starting point.
  • Riot Games will purchase systems based on the OCP design.
  • Avnet, Delta, Emerson, and Sanmina introduced products for the Open Rack design.
  • Intel is working on silicon photonics for Open Rack and is contributing designs that enable 100 Gbps interconnects.
  • Fusion-io is contributing some design work on its new 3.2TB ioScale flash card to OCP.
  • Facebook has developed a new common slot architecture specification for motherboards that it calls “Group Hug”.  The idea is to create boards that are completely CPU silicon vendor-neutral and will last through multiple processor generations. The specification uses a simple PCIe x8 connector to link the SOCs to the board.
  • AMD, Applied Micro, Calxeda, and Intel are supporting the Group Hug board idea.
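
The photo figures in the notes above imply an average storage footprint per new photo. The following is a minimal back-of-the-envelope sketch, not Facebook's own calculation: it assumes a 30-day month and decimal petabytes (10^15 bytes), neither of which is stated in the keynote notes, and the function name is purely illustrative. The notes also do not say whether the 7PB covers multiple stored sizes or replicas of each photo, so the result is only the implied average consumption per upload.

```python
# Figures quoted in the keynote notes.
PHOTOS_PER_DAY = 350_000_000
NEW_STORAGE_PER_MONTH_PB = 7

# Assumptions (not from the source): a 30-day month and decimal petabytes.
DAYS_PER_MONTH = 30
BYTES_PER_PB = 10**15


def implied_bytes_per_photo(photos_per_day, storage_pb_per_month,
                            days_per_month=DAYS_PER_MONTH):
    """Average bytes of new storage consumed per uploaded photo."""
    photos_per_month = photos_per_day * days_per_month
    return storage_pb_per_month * BYTES_PER_PB / photos_per_month


bytes_per_photo = implied_bytes_per_photo(PHOTOS_PER_DAY, NEW_STORAGE_PER_MONTH_PB)
print(f"{bytes_per_photo / 1e6:.2f} MB per photo")  # prints "0.67 MB per photo"
```

Under these assumptions, each new photo accounts for roughly two-thirds of a megabyte of new storage.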

Key components that Facebook is looking for in a disaggregated rack:

  • Compute Nodes -- each compute node would use 2 processors, 8 or 16 DIMM slots, no hard drive but instead a small flash boot partition, and a big NIC of 10 Gbps or more.  Moore's Law means that Facebook would expect to depreciate these modules over a 2-year period rather than the typical 3-year replacement cycle.
  • RAM Sled -- a rack tray on rails offering 128GB to 512GB with an FPGA, mobile processor or desktop processor on-board.  Performance should range from 450k to 1 million key/value gets/second. The cost should be $500 to $700 excluding RAM costs.
  • Storage Sled -- a rack tray on rails holding 15 spinning disk drives.  Instead of a SAS expander, the sled would use a small server.  The cost should be $500 to $700 excluding drives, which should cost less than $0.01 per GB.
  • Flash Sled -- a rack tray on rails offering 500GB to 7TB of flash. The cost should be $500 to $700 excluding the cost of the flash.  Facebook would like to keep these sleds in service longer and depreciate them over the read/write lifetime of the flash, perhaps 4-6 years.
The disaggregated rack will be helpful for Facebook's newly announced Graph Search service because the optimum ratio of RAM to flash is changing quickly as prices decline.
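
To illustrate why a longer depreciation period matters for the flash sled, here is a rough sketch of the annualized chassis cost using the figures from the list above ($500 to $700, kept 4 to 6 years). It assumes simple straight-line depreciation with no residual value, which the keynote notes do not specify, and it excludes the flash itself, as the quoted cost target does. The function name is hypothetical.

```python
def annual_cost(purchase_cost_usd, lifetime_years):
    """Straight-line annual depreciation (an assumed accounting model)."""
    return purchase_cost_usd / lifetime_years


# Flash sled targets from the notes: $500 to $700, depreciated over 4-6 years.
best_case = annual_cost(500, 6)   # cheapest sled kept for the longest period
worst_case = annual_cost(700, 4)  # priciest sled kept for the shortest period

print(f"${best_case:.0f} to ${worst_case:.0f} per year")  # prints "$83 to $175 per year"
```

The same function can be applied to any other module once a cost target and depreciation period are both known; the notes give a 2-year period for compute nodes but no cost figure.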

http://www.opencompute.org