Title: Reliability in Erasure-Coded Distributed Storage Systems
Consumers are engaging in more social networking and e-commerce activity and are increasingly storing their documents and media in online storage. Businesses are relying on Big Data analytics for business intelligence and are migrating their traditional IT infrastructure to the cloud. These trends are causing online data storage demand to rise faster than Moore's Law. Erasure coding techniques are widely used for distributed data storage because they provide space-optimal data redundancy to protect against data loss. Companies like Facebook report that repair traffic (generated when rebuilding failed disks in a distributed storage system) is a significant fraction of total cluster network traffic. In this talk, I will present a new repair mechanism called "opportunistic repair" that improves the lifetime (and hence reliability) of the storage system. Opportunistic repair can improve the mean time to data loss of a system with a (51,30) erasure code by 17 orders of magnitude compared to a standard Reed-Solomon code. This is joint work with Chao Tian, Vinay Vaishampayan, and Robin Chen.
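To make the trade-off concrete, here is a minimal cost-accounting sketch (our illustration, not from the talk; block size is a hypothetical parameter): a classic (n, k) Reed-Solomon code stores n blocks per k data blocks, and repairing one lost block requires reading k surviving blocks.

```python
# Illustrative sketch: storage and repair costs of an (n, k) erasure code
# versus replication. The 1 GB block size is a hypothetical parameter.

def erasure_costs(n, k, block_size_gb=1.0):
    """Return (storage overhead factor, data read to repair one lost block)."""
    overhead = n / k                  # total storage / original data
    repair_read = k * block_size_gb  # classic RS repair reads k blocks
    return overhead, repair_read

# The (51, 30) code from the talk: tolerates any 21 erasures at 1.7x overhead,
# but a single-block repair reads 30 blocks.
overhead, repair = erasure_costs(51, 30)
print(f"storage overhead: {overhead:.2f}x, repair read: {repair:.0f} GB")

# 3-way replication: 3x overhead, but repair reads only one 1 GB copy.
```

This is why repair traffic dominates: the code is far more space-efficient than replication, but each repair touches k times more data.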
Title: Demystifying and Controlling the Performance of Big Data Jobs
Although there is tremendous interest in designing improved data center architectures for big data, very little is known about the characteristics of Hadoop jobs. In this talk, I will present an empirical study of Hadoop jobs in 5 different data centers, describing the job-level and task-level properties of the applications within these data centers and their impact on cluster server and storage utilization. Then, I will present a straw-man approach for introducing caching into Hadoop and discuss the implications of this caching mechanism for the data center, specifically for energy savings, cluster utilization, and network traffic.
Title: Energy Efficient Content Distribution in an ISP Network
We study the problem of reducing power consumption in an Internet Service Provider (ISP) network by designing the content distribution infrastructure managed by the operator. We propose an algorithm to optimally decide where to cache the content inside the ISP network. We evaluate our solution over two case studies driven by operators' feedback. Results show that the energy-efficient design of the content infrastructure brings substantial savings, both in terms of energy and in terms of bandwidth required at the peering point of the operator. Moreover, we study the impact of the content characteristics and the power consumption models. Finally, we derive some insights for the design of future energy-aware networks.
Title: Network Coded Storage with Multi-Resolution Codes
Directly managing traffic-induced data unavailability can help reduce energy consumption in modern storage networks. Coded storage is a promising technique in this area, although the practical gains of such systems when coding across different elements of the storage stack are not fully understood. Focusing on larger video-layer elements, we propose and explore a coded resolution-aware storage (CRS) scheme for data center video-streaming applications. In particular, files stored on drives are network-coded across multi-resolution layers in advance. We explore the characteristics of this concept and compare it to an uncoded resolution-aware storage (URS) scheme, as well as to classic storage systems. We illustrate that the CRS scheme can in some cases provide an order of magnitude of gain in saturation probability over URS and classic systems.
Title: GreenSwitch: Managing Datacenters Powered by Renewable Energy
Several companies have recently announced plans to build "green" datacenters, i.e., datacenters partially or completely powered by renewable energy. These datacenters will either generate their own renewable energy or draw it directly from an existing nearby plant. Besides reducing carbon footprints, renewable energy can potentially reduce energy costs, peak power costs, or both. However, many renewable sources are intermittent, which requires approaches for tackling the variability of the energy supply. One approach is to use batteries and/or the electrical grid as a backup for the renewable energy. It may also be possible to adapt the workload to match the renewable energy supply. For the highest benefits, green datacenter operators must intelligently manage their workloads and the sources of energy at their disposal.
In this talk, we first present Parasol, a prototype green datacenter that we have built as a research platform. Parasol comprises a small container, a set of solar panels, a battery bank, and a grid-tie. Second, we describe GreenSwitch, our model-based approach for dynamically scheduling the workload and selecting the source of energy to use. Our real experiments with Parasol, GreenSwitch, and MapReduce workloads demonstrate that intelligent workload and energy source management can produce significant cost reductions. Our results also isolate the cost implications of peak power management, storing energy on the grid, and the ability to delay the MapReduce jobs. Finally, our results demonstrate that careful workload and energy source management can minimize the negative impact of electrical grid outages.
Title: Delay-Storage Trade-offs in Distributed Storage Systems
In this talk, we show how coding in distributed storage reduces expected download time, in addition to providing reliability against disk failures. For the same total storage used, coding exploits the diversity in storage better than simple replication, and hence gives faster download. We use a novel fork-join queuing framework to model multiple users requesting the content simultaneously, and derive bounds on the expected download time. Our system model and results are a novel generalization of the fork-join system that is studied in queuing theory literature. The results demonstrate the fundamental trade-off between the expected download time and storage space.
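The diversity gain can be illustrated analytically (a minimal sketch of the abstract's claim, with hypothetical parameters): when each server's download time is i.i.d. exponential with rate mu, the time to collect any k of n chunks is the k-th order statistic of n exponentials, with expectation (H_n - H_{n-k}) / mu, where H_m is the m-th harmonic number.

```python
# Expected time to finish k of n parallel downloads with i.i.d. Exp(mu)
# service times, via the order-statistic formula for exponentials.

def harmonic(m):
    return sum(1.0 / i for i in range(1, m + 1))

def expected_download(n, k, mu=1.0):
    """Expected time until k of n parallel exponential downloads complete."""
    return (harmonic(n) - harmonic(n - k)) / mu

# Same total storage (2x the file size):
rep = expected_download(2, 1, mu=1.0)    # 2 full replicas, need any 1
coded = expected_download(4, 2, mu=2.0)  # (4,2) code, half-size chunks
print(rep, coded)  # 0.5 vs ~0.29: coding downloads faster at equal storage
```

The (4,2) coded system wins because it waits for the best 2 of 4 half-size chunks rather than the best 1 of 2 full copies, exploiting more parallelism for the same storage budget.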
Title: How to Reach Flexibility in Distributed Storage Systems
For an erasure-correcting code used in a distributed storage system, it is usually necessary to compromise between complexity and the ratio R between the number of parity disks and the number of failures that can be simultaneously reconstructed. In this talk, I present two theoretical studies that complete the state of the art. The first study deals with the computational complexity of data updates. Using the notions of average and maximum update complexity, we determine whether the linear codes used for data storage are update-efficient or not. We particularly analyze sparse-graph codes, starting from the tree representation of these codes. In the second study, we examine the impact on the network load when the code parameters are changed. This can happen when the status of a file changes (from hot to cold, for example) or when the size of the network is modified by adding disks. Carried out for MDS codes and a particular sparse-graph code, this study shows the exact evolution of the network load during and after the data migration process. To finish the talk, I will give some practical results obtained with a flexible and efficient erasure code.
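As a hypothetical illustration of the update-complexity metrics (our sketch, not the talk's analysis): for a binary linear code with generator matrix G, changing message symbol i changes exactly the codeword positions where row i of G is nonzero, so the average and maximum row weights measure update cost.

```python
# Average and maximum update complexity of a binary linear code, read off
# directly from the row weights of its generator matrix.

def update_complexity(G):
    """Return (average, maximum) number of code symbols touched per update."""
    weights = [sum(row) for row in G]  # Hamming weight of each row
    return sum(weights) / len(weights), max(weights)

# Systematic generator matrix of the (7, 4) Hamming code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
avg, mx = update_complexity(G)
print(avg, mx)  # 3.25 4
```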
Title: When Do Redundant Requests Reduce Latency?
Several systems possess the flexibility to serve requests in more than one way. For instance, a distributed storage system storing multiple replicas of the data can serve a request from any of the multiple servers that store the requested data, or a computational task may be performed in a compute-cluster by any one of multiple processors. In such systems, the latency of serving the requests may potentially be reduced by sending "redundant requests": a request may be sent to more servers than needed, and it is deemed served when the requisite number of servers complete service. Such a mechanism trades off the possibility of faster execution of at least one copy of the request with the increase in the delay due to an increased load on the system. Due to this tradeoff, it is unclear when redundant requests may actually help. Several recent works empirically evaluate the latency performance of redundant requests in diverse settings.
This work aims at an analytical study of the latency performance of redundant requests, with the primary goals of characterizing under what scenarios sending redundant requests will help (and under what scenarios they will not help), as well as designing optimal redundant-requesting policies. We first present a model that captures the key features of such systems. We show that when service times are i.i.d. memoryless or "heavier", and when the additional copies of already-completed jobs can be removed instantly, redundant requests reduce the average latency. On the other hand, when service times are "lighter" or when service times are memoryless and removal of jobs is not instantaneous, then not having any redundancy in the requests is optimal under high loads. Our results hold for arbitrary arrival processes.
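The memoryless case can be checked with a toy Monte Carlo experiment (our illustration of the intuition, not the paper's model): with i.i.d. exponential service and instantaneous cancellation, a request sent to r servers completes in the minimum of r exponentials, i.e. an exponential with r times the rate, and memorylessness means the cancelled copies waste no expected work.

```python
# Toy Monte Carlo: mean completion time of a request sent redundantly to
# r servers with i.i.d. Exp(mu) service, losers cancelled instantly.

import random

def simulate(r, mu=1.0, trials=200_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(mu) for _ in range(r))
    return total / trials

print(simulate(1))  # ~1.0
print(simulate(2))  # ~0.5: two redundant copies halve expected service time
```

Note this toy ignores queueing load entirely; the paper's contribution is precisely to account for the load that redundant copies add, which is what makes "lighter-than-memoryless" service or slow cancellation flip the conclusion.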
Title: Joint DVFS/Low-power States via Queuing-theoretic Analysis
Large-scale data centers play a central role in the world's information infrastructure. As the workloads allocated to data centers increase, so does the economic and environmental footprint of these processing clusters. In this talk, I'll present some interesting findings on different power control policies widely deployed in data centers. The approach is based on a queuing-theoretic analysis that determines which low-power state to use and how fast the system should run in each possible workload scenario. The results suggest a coordinated approach in which different power management schemes are jointly optimized for maximum power efficiency under given performance constraints.
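A numeric sketch of the kind of trade-off such an analysis captures (our hypothetical example, with made-up parameters): model the server as an M/M/1 queue running at speed s, so mean response time is 1/(s - lambda); with a convex power curve P(s) = s**alpha, the cheapest speed meeting a response-time target D is the slowest feasible one.

```python
# DVFS speed selection for an M/M/1 server: mean response time 1/(s - lam)
# must stay below target D; power grows as s**alpha, so run as slow as allowed.

def optimal_speed(lam, D):
    """Slowest speed s satisfying 1 / (s - lam) <= D."""
    return lam + 1.0 / D

def power(s, alpha=3.0):
    return s ** alpha  # cubic dynamic power, a common DVFS approximation

lam, D = 0.5, 2.0  # hypothetical arrival rate and latency target
s = optimal_speed(lam, D)
print(s, power(s))  # 1.0 1.0: any faster speed only wastes power
```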
Title: Data Center Demand Response: Coordinating IT and the Smart Grid
It is now recognized that data centers are a significant consumer of energy resources and a substantial source of greenhouse gas emissions. Statistics abound: the growth rate of data center electricity usage in the US is more than 10 times the growth rate of the total US electricity usage, and the Internet produces emissions comparable to the airline industry. In this talk, I will briefly introduce our work on utilizing the spatial and temporal demand flexibilities of data centers to reduce their electricity bills, to better utilize local renewable generations, and more importantly, to provide demand response services to the power grid.
Title: Fundamental Limits of Caching
Caching is a technique to reduce peak traffic rates by prefetching popular content in memories at the end users. In this talk, we introduce a new formulation of the caching problem focusing on its basic structure. For this setting, we propose a novel coded caching approach that can achieve a significantly larger reduction in peak rate compared to previously known caching schemes. In particular, the improvement can be on the order of the number of end users in the network. Moreover, we argue that the performance of the proposed scheme is within a constant factor from the information-theoretic optimum for all values of the problem parameters.
Joint work with Mohammad Maddah-Ali.
Title: Local Reconstruction and Availability
Modern distributed file systems are deploying erasure codes to increase storage efficiency compared to block replication. Initially, classical error-correcting codes (such as Reed-Solomon) were used, but the benefits of custom code designs are now clear. Multiple companies, including Microsoft and Facebook, are designing and deploying distributed storage codes. However, the fundamental information-theoretic limits are not yet fully understood. Further, there has been significant activity on explicit code designs that are repair-efficient and provide high data availability. We present some new results, open problems, and directions for modern distributed storage codes.
Title: Update Efficiency
In this short talk, I'll discuss two performance metrics that need to be taken into consideration when designing codes to store `hot data' on distributed storage systems (DSS). First, I'll talk about the notion of update efficiency in DSS. Update-efficient codes allow frequently changing data to be stored without incurring a large overhead to maintain consistency in the system.
I'll then discuss the problem of designing codes that support multiple parallel requests to data blocks. These codes try to distribute the load of parallel requests across multiple nodes and ensure small response times for end users.
Title: Reducing Energy Usage Through a Novel File Synchronization Algorithm
One of the major causes of the significant energy costs incurred by modern data storage centers is the need to store many versions of the same file. That is, when a file is edited, both the original and the edited versions must be stored, even if the edits are limited or few in number. This effect is due to the lack of practical synchronization algorithms. In this talk, we will describe a novel synchronization algorithm which is the subject of our recent work. Although our protocol is inspired by information-theoretic techniques and has optimal theoretical performance, it also has the benefit of a relatively simple practical implementation. We describe our algorithm and its implementation and discuss its advantages and performance against existing practical tools such as rsync. In particular, our protocol significantly reduces the number of exchanged bits required for synchronization, greatly improving synchronization energy efficiency.
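A toy fixed-block synchronizer conveys the basic idea (our illustration only, far simpler than the talk's protocol: fixed blocks cannot handle insertions that shift content, which rolling-hash and information-theoretic schemes address): the receiver sends hashes of its blocks, and the sender replies only with the blocks that differ.

```python
# Toy block-level sync: few edits => few blocks (and few bytes) exchanged.

import hashlib

BLOCK = 4  # tiny block size, for illustration only

def block_hashes(data):
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def delta(new, old_hashes):
    """Blocks of `new` whose hash differs from the receiver's copy."""
    out = {}
    for idx, h in enumerate(block_hashes(new)):
        if idx >= len(old_hashes) or old_hashes[idx] != h:
            out[idx] = new[idx * BLOCK:(idx + 1) * BLOCK]
    return out

def apply_delta(old, d, new_len):
    buf = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for idx, blk in d.items():
        buf[idx * BLOCK:idx * BLOCK + len(blk)] = blk
    return bytes(buf)

old = b"hello world, stored!"
new = b"hello World, stored!"
d = delta(new, block_hashes(old))
print(len(d))  # 1: a one-character edit ships a single block, not the file
assert apply_delta(old, d, len(new)) == new
```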
Title: "Piggybacked" Erasure Codes for Distributed Storage: Findings from the Facebook Warehouse Cluster
Erasure codes, such as Reed-Solomon codes, are increasingly being employed in data centers to combat the cost of reliably storing large amounts of data. Although these codes provide optimal storage efficiency, they require substantial network and disk usage during recovery of unavailable data. We shall first present a study of the impact of recovery operations for erasure-coded data on the data-center network, based on measurements from Facebook. Our measurements reveal that recovery of Reed-Solomon-coded data results in a significant increase in network traffic, more than a hundred terabytes per day in Facebook's Warehouse cluster.
To address this issue, we construct new storage codes using a new "Piggybacking" technique, which we have also implemented in the Hadoop Distributed File System (HDFS). Our codes reduce the network and disk usage during recovery by 35%, while also being storage optimal and supporting arbitrary design parameters.
Title: Topology Design of Energy-Efficient Data Centers
Saving power in datacenter networks has become a pressing issue. In this work, we consider how to choose the right switch size that can potentially save the most power during the expected operation of the network. We consider sleep mode operation, where components such as ports and switches are turned off when traffic demand is relatively moderate; and speed-scaling operation, where the power of a switch can be varied by adjusting its processing rate according to its traffic demand. We investigate the power-saving performance of different switch sizes and traffic demand patterns. Our findings with sleep mode reveal that deploying a large number of small switches is more energy-efficient than a small number of large switches when the traffic demand is relatively moderate or when servers exchanging traffic are in close proximity. With speed scaling, the reverse is generally true.