Storming Media: Pentagon Reports and Documents
Pentagon Reports: Fast. Definitive. Complete.

Reports by Author

Yanpei Chen


Click on the titles below to find US government-authored or -collected reports written by Yanpei Chen

Total Results: 3
Understanding TCP Incast and Its Implications for Big Data Workloads 06 Apr 2012 11 pages
Authors:  Yanpei Chen; Rean Griffith; David Zats; Randy H Katz; CALIFORNIA UNIV BERKELEY DEPT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
The full text of this report is available for sale. TCP incast is a recently identified network transport pathology that affects many-to-one communication patterns in datacenters. It is caused by a complex interplay between datacenter applications, the underlying switches, network topology, and TCP, which was originally designed for wide area networks. Incast increases the queuing delay of flows, and decreases application level throughput to far below the link bandwidth. The problem especially affects computing paradigms in which distributed processing cannot ...


Interactive Query Processing in Big Data Systems: A Cross Industry Study of MapReduce Workloads 02 Apr 2012 15 pages
Authors:  Yanpei Chen; Sara Alspaugh; Randy H Katz; CALIFORNIA UNIV BERKELEY DEPT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
The full text of this report is available for sale. Within the past few years, organizations in diverse industries have adopted MapReduce-based systems for large-scale data processing. Along with these new users, important new workloads have emerged which feature many small, short and increasingly interactive jobs in addition to the large long-running batch jobs for which MapReduce was originally designed. As interactive, large-scale query processing (e.g. OLAP) is a strength of the RDBMS community, it is important that lessons from ...


Design Insights for MapReduce from Diverse Production Workloads 25 Jan 2012 15 pages
Authors:  Yanpei Chen; Sara Alspaugh; Randy H Katz; CALIFORNIA UNIV BERKELEY DEPT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
The full text of this report is available for sale. In this paper, we analyze seven MapReduce workload traces from production clusters at Facebook and at Cloudera customers in e-commerce, telecommunications, media, and retail. Cumulatively, these traces comprise over a year's worth of data logged from over 5000 machines, and contain over two million jobs that perform 1.6 exabytes of I/O. Key observations include: input data forms up to 77% of all bytes, 90% of jobs access KB to GB ...

