Thursday, August 30, 2018

Database for the Instant Experience -- a profile of Redis Labs

The user experience is the ultimate test of network performance. For many applications, this often comes down to the lag after clicking and before the screen refreshes. We can trace the packets back from the user's handset, through the RAN, mobile core, metro transport, and perhaps long-haul optical backbone to a cloud data center. However, even if this path traverses the very latest-generation infrastructure, if it ends up triggering a search in an archaic database, the delayed response time will be more harmful to the user experience than the network latency. Some databases are optimized for performance. Redis, an open source, in-memory, high-performance database, claims to be the fastest -- a database for the Instant Experience. I recently sat down with Ofer Bengal to discuss Redis, Redis Labs and the implications for networking and hyperscale clouds.



Jim Carroll:  The database market has been dominated by a few large players for a very long time. When did this space start to open up, and what inspired Redis Labs to jump into this business?

Ofer Bengal: The database segment of the software market had been on a stable trajectory for decades. If you had asked me ten years ago if it made sense to create a new database company, I would have said that it would be insane to try. But cracks started to open when large Internet companies such as Amazon and Facebook, which generated huge amounts of data and had very stringent performance requirements, realized that the relational databases provided by market leaders like Oracle were not good enough for their modern use cases. With a relational database, when the amount of data grows beyond the size of a single server, it is very complex to cluster and performance goes down dramatically.

About fifteen years ago, a number of Internet companies started to develop internal solutions to these problems. Later on, the open source community stepped in to address these challenges and a new breed of databases was born, which today is broadly categorized under "unstructured" or "NoSQL" databases.

Redis Labs was started in a bit of an unusual way, and not as a database company. The original idea was to improve application performance, because we, the founders, came from that space. We always knew that databases were the main bottleneck in app performance and looked for ways to improve that. So, we started with database caching. At that time, Memcached was a very popular open source caching system for accelerating database performance. We decided to improve it and make it more robust and enterprise-ready. And that's how we started the company.

In 2011, when we started to develop the product, we discovered a fairly new open source project by the name "Redis" (which stands for "Remote Dictionary Server"), which was started by Salvatore Sanfilippo, an Italian developer who lives in Sicily to this day. He essentially created his own in-memory database for a certain project he worked on and released it as open source. We decided to adopt it as the engine under the hood for what we were doing. However, shortly thereafter we started to see the amazing adoption of this open source database. After a while, it was clear we were in the wrong business, and so we decided to focus on Redis as our main product and became a Redis company. Salvatore Sanfilippo later joined the company and continues to lead the development of the open source project, with a group of developers. A much larger R&D team develops Redis Enterprise, our commercial offering.

Jim Carroll: To be clear, there is an open source Redis community and there's a company called Redis Labs, right?

Ofer Bengal: Yes. Both the open source Redis and Redis Enterprise are developed by Redis Labs, but by two separate development teams. This is because developing open source code requires a different mindset than developing an end-to-end solution suitable for enterprise deployment.
 
Jim Carroll: Tell us more about Redis Labs, the company.

Ofer Bengal: We have a monumental number of open source Redis downloads. Its adoption has spread so widely that today you find it in most companies in the world. Our mission, at Redis Labs, is to help our customers unlock answers from their data. As a result, we invest equally in both open source Redis and enterprise-grade Redis, Redis Enterprise, and deliver disruptive capabilities that will help our customers find answers to their challenges and help them deliver the best application and service for their customers. We are passionate about our customers, community, people and our product. We're seeing a noticeable trend where enterprises that adopt OSS Redis are maturing their implementation with Redis Enterprise, to better handle scale, high availability, durability and data persistence. We have customers from all industry verticals, including six of the Fortune 10 companies and about 40% of the Fortune 100 companies. To give you a few examples of some of our customers, we have AMEX, Walmart, DreamWorks, Intuit, Vodafone, Microsoft, TD Bank, C.H. Robinson, Home Depot, Kohl's, Atlassian, eHarmony – I could go on.

Redis Labs now has over 220 employees across our Mountain View, CA headquarters, R&D center in Israel, London sales office and other locations around the world. We’ve completed a few investment rounds, totaling $80 million from Bain Capital Ventures, Goldman Sachs, Viola Ventures (Israel) and Dell Technologies Capital.

Jim Carroll: So, how can you grow and profit in an open source market as a software company?

Ofer Bengal:  The market for databases has changed a lot. Twenty years ago, if a company adopted Oracle, for example, any software development project carried out in that company had to be built with this database. This is not the case anymore. Digital transformation and cloud adoption are disrupting this very traditional model and driving the modernization of applications. New-age developers now have the flexibility to select their preferred solutions and tools for their specific problem at hand or use cases. They are looking for the best-of-breed database to meet each use case of their application. With the evolution of microservices, which is the modern way of building apps, this is even more evident. Each microservice may use a different database, so you end up with multiple databases for the same application. A simple smartphone application, for instance, may use four, five or even six different databases. These technological evolutions opened the market to database innovations.

In the past, most databases were relational, where the data is modeled in tables, and tables are associated with one another. This structure, while still relevant for some use cases, does not satisfy the needs of today’s modern applications.

Today, there are many flavors of unstructured NoSQL databases, starting with simple key value databases like DynamoDB, document-based databases like MongoDB, column-based databases like Cassandra, graph databases like Neo4j, and others.  Each one is good for certain use cases. There is also a new trend called multi-model databases, which means that a single database can support different data modeling techniques, such as relational, document, graph, etc.  The current race in the database world is about becoming the optimal multi-model database.

Tying it all together, how do we expect to grow as an organization and profit in an open source market? We have never settled for the status quo. We looked at today’s environments and the challenges that come with them and have figured out a way to deliver Redis as a multi-model database. We continually strive to lead and disrupt this market. With the introduction of modules, customers can now use Redis Enterprise as a key-value store, document store, graph database, and for search and so much more. As a result, Redis Enterprise is the best-of-breed database suited to cater to the needs of modern-day applications. In addition to that, Redis Enterprise delivers the simplicity, ease of scale and high availability large enterprises desire. This has helped us become a well-loved database and a profitable business.

Jim Carroll: What makes Redis different from the others?

Ofer Bengal: Redis is by far the fastest and most powerful database. It was built from day one for optimal performance: besides processing entirely in RAM (or any of the new memory technologies), everything is written in C, a low-level programming language. All the commands, data types, etc., are optimized for performance. All this makes Redis super-fast. For example, from a single, average-size cloud instance on Amazon, you can easily generate 1.5 million transactions per second at sub-millisecond latency. Can you imagine that? This means that the average latency of those 1.5 million transactions will be less than one millisecond. There is no database that comes even close to this performance. You may ask, what is the importance of this? Well, the speed of the database is by far the major factor influencing application performance and Redis can guarantee instant application response.

Jim Carroll: How are you tracking the popularity of Redis?

Ofer Bengal: If you look at DockerHub, which is the marketplace for Docker containers, you can see the stats on how many containers of each type were launched there. The last time I checked, over 882 million Redis containers had been launched on DockerHub. This compares to about 418 million for MySQL and about 642 million for MongoDB. So, Redis is way more popular than both MongoDB and MySQL. And we have many other similar data points confirming the popularity of Redis.

Jim Carroll: If Redis puts everything in RAM, how do you scale? RAM is an expensive resource, and aren’t you limited by the amount that you can fit in one system?

Ofer Bengal: We developed very advanced clustering technology which enables Redis Enterprise to scale infinitely. We have customers that have tens of terabytes of data in RAM. The notion that RAM is tiny and used only for very special purposes is no longer true, and as I said, we see many customers with extremely large datasets in RAM. Furthermore, we developed a technology for running Redis on Flash, with near-RAM performance at 20% of the server cost. The intelligent data tiering that Redis on Flash delivers allows our customers to keep their most used data in RAM while moving the less utilized data onto cheaper flash storage. This has organizations such as Whitepages saving over 80% of their infrastructure costs, with little compromise to performance.
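To make the tiering idea concrete, here is a minimal sketch, assuming a simple least-recently-used policy: the most recently touched keys stay in a small "RAM" tier and everything else is demoted to a "flash" tier. The `TieredStore` class and its two dictionaries are hypothetical names for illustration only; this is not how Redis on Flash is implemented, and both tiers are ordinary in-process Python objects.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier key-value store: hot keys in 'RAM', cold keys in 'flash'.

    Illustration only. Both tiers are plain Python containers, and the
    least-recently-used policy is an assumption, not Redis on Flash's
    actual algorithm.
    """

    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # ordered from least to most recently used
        self.flash = {}           # stands in for cheaper, slower storage

    def set(self, key, value):
        self.flash.pop(key, None)   # a key lives in exactly one tier
        self.ram[key] = value
        self.ram.move_to_end(key)   # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.flash:
            value = self.flash.pop(key)  # promote a cold key on access
            self.ram[key] = value
            self._evict()
            return value
        return None

    def _evict(self):
        # Demote least recently used keys until the RAM tier fits again.
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_value = self.ram.popitem(last=False)
            self.flash[cold_key] = cold_value


store = TieredStore(ram_capacity=2)
store.set("a", 1)
store.set("b", 2)
store.get("a")        # "a" becomes the most recently used key
store.set("c", 3)     # RAM tier is full, so "b" is demoted to flash
hot_keys = list(store.ram)
cold_keys = list(store.flash)
```

After the last call, `hot_keys` holds the two most recently used keys and `cold_keys` holds the demoted one; reading a cold key promotes it back into the RAM tier.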

In addition to that, we’re working very closely with Intel on their Optane™ DC persistent memory based on 3D XPoint™. As this technology becomes mainstream, the majority of the database market will have to move to being in-memory.


Jim Carroll: What about the resiliency challenge? How does Redis deal with outages?

Ofer Bengal: Normally with memory-based systems, if something goes wrong with a node or a cluster, there is a risk of losing data. This is not the case with Redis Enterprise, because it is redundant and persistent. You can write everything to disk without slowing down database operations. This is important to note because persisting to disk is a major technological challenge due to the bottleneck of writing to disk. We developed a persistence technology that preserves Redis' super-fast performance, while still writing everything to disk. In case of memory failures, you can read everything from disk. On top of that, the entire dataset is replicated in memory. Each database can have multiple such replicas, so if one node fails, we instantly fail over to a replica. With this and some other provisions, we provide several layers of resiliency.
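The general idea of recovering an in-memory dataset from disk can be sketched with an append-only log that is replayed on restart. This is a minimal illustration under stated assumptions: `LoggedStore`, the JSON-lines log format and the file name are all hypothetical, and the sketch writes synchronously on every update, which is exactly the disk bottleneck described above and not the persistence technology Redis Enterprise uses.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy in-memory store that logs every write and replays the log on start.

    Illustration only: the class name and log format are assumptions, and
    each write blocks on the file append.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        if os.path.exists(log_path):
            # Rebuild the in-memory state by replaying writes in order.
            with open(log_path, "r", encoding="utf-8") as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value

    def set(self, key, value):
        # Record the write on disk first, then apply it in memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


log_path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
primary = LoggedStore(log_path)
primary.set("balance", 100)
primary.set("balance", 70)
del primary                         # simulate losing the in-memory copy

recovered = LoggedStore(log_path)   # a fresh process would do the same
recovered_balance = recovered.get("balance")
```

Replaying the log in order means the last write to each key wins, so the recovered store ends up with the most recent value.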

We have been running our database-as-a-service for five years now, with thousands of customers, and never lost a customer's data, even when cloud nodes failed.

Jim Carroll: So how is the market for in-memory databases developing? Can you give some examples of applications that run best in memory?

Ofer Bengal: Any customer-facing application today needs to be fast. The new generation of end users expects an instant experience from all their apps and does not tolerate slow responses, whether caused by the application or by the network.

You may ask "how is 'instant experience' defined?" Let’s take an everyday example to illustrate what ‘instant’ really means. When browsing on your mobile device, how long are you willing to wait before your information is served to you? What we have found is that the expected time from tapping your smartphone or clicking on your laptop until you get the response should not be more than 100 milliseconds. As end consumers, we are all dissatisfied with waiting and we expect information to be served instantly. What really happens behind the scenes, however, is that once you tap your phone, a query goes over the Internet to a remote application server, which processes the request and may generate several database queries. The response is then transmitted back over the Internet to your phone.

Now, the round trip over the Internet (on a "good" Internet day) is at least 50 milliseconds, and the app server needs at least 50 milliseconds to process your request. This means that at the database layer, the response time should be sub-millisecond or you’re pretty much exceeding what is considered the acceptable standard wait time of 100 milliseconds. At a time of increasing digitization, consumers expect instant access to the service, and anything less will directly impact the bottom line. And, as I already mentioned, Redis is the only database that can respond in less than one millisecond, under almost any load of transactions.
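The budget arithmetic can be worked through in a few lines. The 100 ms target and the two 50 ms figures come from the discussion above; the number of database queries per request and the per-query latencies are hypothetical values chosen only to show how quickly the budget is exceeded.

```python
# Figures quoted in the interview.
BUDGET_MS = 100.0        # acceptable tap-to-response time
NETWORK_RTT_MS = 50.0    # Internet round trip on a good day
APP_SERVER_MS = 50.0     # application server processing time


def total_response_ms(db_queries, db_latency_ms):
    """End-to-end time when the database queries run one after another."""
    return NETWORK_RTT_MS + APP_SERVER_MS + db_queries * db_latency_ms


def overshoot_ms(db_queries, db_latency_ms):
    """How far the response exceeds the 100 ms budget (0 if within it)."""
    return max(0.0, total_response_ms(db_queries, db_latency_ms) - BUDGET_MS)


# Hypothetical request that triggers five sequential database queries.
fast_db_overshoot = overshoot_ms(db_queries=5, db_latency_ms=0.5)
slow_db_overshoot = overshoot_ms(db_queries=5, db_latency_ms=20.0)
```

With network and application time already consuming the whole budget, every millisecond spent in the database is overshoot: 2.5 ms with sub-millisecond queries versus 100 ms with 20 ms queries in this example.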

Let me give you some use case examples. Companies in the finance industry (banks, financial institutions) have been using relational databases for years. Any change, such as replacing an Oracle database, is analogous to open heart surgery. But when it comes to new customer-facing banking applications, such as checking your account status or transferring funds, they would like to deliver an instant experience. Many banks are now moving this type of application to other databases, and Redis is often chosen for its blazing-fast performance, bar none.

As I mentioned earlier, the world is moving to microservices. Redis Enterprise fits the needs of this architecture quite nicely as a multi-model database. In addition, Redis is very popular for messaging, queuing and time series capabilities. It is also strong when you need fast data ingest, for example, when massive amounts of data are coming in from IoT devices, or in other cases where you have huge amounts of data that need to be ingested into your system. What started off as a solution for caching has, over the course of the last few years, evolved into an enterprise data platform.

Jim Carroll: You mentioned microservices, and that word is almost becoming synonymous with containers. And when you mention containers, everybody wants to talk about Kubernetes, and managing clusters of containers in the cloud. How does this align with Redis?

Ofer Bengal: Redis Enterprise maintains a unified deployment across all Kubernetes environments, such as Red Hat OpenShift, Pivotal Container Service (PKS), Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS), Amazon Elastic Container Service for Kubernetes (EKS) and vanilla Kubernetes. It guarantees that each Redis Enterprise node (with one or more open source servers) resides on a pod that is hosted on a different VM or physical server. And by using the latest Kubernetes primitives, Redis Enterprise can now be run as a stateful service across these environments.

We use a layered architecture that splits responsibilities between tasks that Kubernetes does efficiently, such as node auto-healing and node scaling, tasks that Redis Enterprise cluster is good at, such as failover, shard level scaling, configuration and Redis monitoring functions, and tasks that both can orchestrate together, such as service discovery and rolling upgrades with zero downtime.

Jim Carroll: How are the public cloud providers supporting Redis?

Ofer Bengal:  Most cloud providers, such as AWS, Azure and Google, have launched their own versions of Redis database-as-a-service, based on open source Redis, although they hardly contribute to it.

Redis Labs, the major contributor to open source Redis, has launched services on all those clouds, based on Redis Enterprise.  There is a very big difference between open source Redis and Redis Enterprise, especially if you need enterprise-level robustness.

Jim Carroll: So what is the secret sauce that you add on top of open source Redis?

Ofer Bengal: Redis Enterprise brings many additional capabilities to open source Redis. For example, as I mentioned earlier, sometimes an installation requires terabytes of RAM, which can get quite expensive. We have built-in capabilities in Redis Enterprise that allow our customers to run Redis on SSDs with almost the same performance as RAM. This is great for reducing the customer's total cost of ownership. By providing this capability, we can cut the underlying infrastructure costs by up to 80%. For the past few years, we’ve been working with most vendors of advanced memory technologies such as NVMe and Intel’s 3D XPoint. We will be one of the first database vendors to take advantage of these new memory technologies as they become more and more popular. Databases like Oracle, which were designed to write to disk, will have to undergo a major facelift in order to take advantage of these new memory technologies.

Another big advantage Redis Enterprise delivers is high availability. With Redis Enterprise, you can create multiple replicas in the same data center, across data centers, across regions, and across clouds. You can also replicate between cloud and on-premises servers. Our failover mechanism, which completes in single-digit seconds, guarantees service continuity.

Another differentiator is our active-active global distribution capability. If you would like to deploy an application both in the U.S. and Europe, for example, you will have application servers in a European data center and in a U.S. data center. But what about the database? Would it be a single database for those two locations? While this helps avoid data inconsistency, it’s terrible when it comes to performance, for at least one of these two data centers. If you have a separate database in each data center, performance may improve, but at the risk of consistency. Let’s assume that you and your wife share the same bank account, and that you are in the U.S. and she is traveling in Europe. What if both of you withdraw funds at an ATM at about the same time? If the app servers in the U.S. and Europe are linked to the same database, there is no problem, but if the bank's app uses two databases (one in the U.S. and one in Europe), how would they prevent an overdraft? Having a globally distributed database with full sync is a major challenge. If you try to do conflict resolution over the Internet between Europe and the U.S., database operation will slow down dramatically, which is a no-go for the instant experience end users demand. So, we developed a unique technology for Redis Enterprise based on the mathematically proven CRDT (conflict-free replicated data type) concept, developed in universities. Today, with Redis Enterprise, our customers can deploy a global database in multiple data centers around the world while assuring local latency and strong eventual consistency. Each one works as if it is fully independent, but behind the scenes we ensure they are all in sync.
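To give a feel for the CRDT concept, here is a minimal sketch of one textbook CRDT, a PN-counter, applied to the two-region account example. It is an illustration of the general idea only, not Redis Enterprise's implementation: the `PNCounter` class and the replica names are hypothetical, and the sketch shows that replicas converge without coordination, not how a bank would enforce an overdraft limit.

```python
class PNCounter:
    """Textbook PN-counter CRDT: one increment and one decrement tally
    per replica, merged by taking the per-replica maximum.

    Illustration only; class and replica names are assumptions.
    """

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.incs = {}  # replica id -> total amount added at that replica
        self.decs = {}  # replica id -> total amount removed at that replica

    def increment(self, amount=1):
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + amount

    def decrement(self, amount=1):
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        # Tallies only grow, so the per-replica maximum is always the
        # newest information. Merging is commutative, associative and
        # idempotent, which is what makes replicas converge.
        for rid, n in other.incs.items():
            self.incs[rid] = max(self.incs.get(rid, 0), n)
        for rid, n in other.decs.items():
            self.decs[rid] = max(self.decs.get(rid, 0), n)


us = PNCounter("us")
eu = PNCounter("eu")

us.increment(100)   # deposit recorded in the U.S. data center
eu.merge(us)        # Europe learns about the deposit

us.decrement(30)    # concurrent withdrawals, each applied locally
eu.decrement(50)    # with no cross-Atlantic round trip

before_sync = (us.value(), eu.value())

us.merge(eu)        # background sync in both directions
eu.merge(us)
after_sync = (us.value(), eu.value())
```

Each replica answers at local latency and the two briefly disagree (`before_sync`), but once they exchange state in either order they report the same balance (`after_sync`), which is the "strong eventual consistency" property mentioned above.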

Jim Carroll: What is the ultimate ambition of this company?

Ofer Bengal: We have the opportunity to build a very big software company. I’m not a kid anymore and I do not live on fantasies. Look at the database market – it’s huge! It is projected to grow to $50–$60 billion (depending on which analyst firm you ask) in sales in 2020. It is the largest segment in the software business, twice the size of the security/cyber market. The crack in the database market that opened up with NoSQL will represent 10% of this market in the near term. However, the borderline between SQL and NoSQL is blurring, as companies such as Oracle add NoSQL capabilities and NoSQL vendors add SQL capabilities. I think that over time, it will become a single large market. Redis Labs provides a true multi-model database. We support key-value with multiple data structures, graph, search, JSON (document based), all with built-in functionality, not just APIs. We constantly increase the use case coverage of our database, and that is ultimately the name of the game in this business. Couple all that with Redis' blazing-fast performance, the massive adoption of open source Redis and the fact that it is the "most loved database" (according to StackOverflow), and you would agree that we have a once-in-a-lifetime opportunity!