Sharpe Ratio: Estimation, Confidence Intervals, and Hypothesis Testing

The authors survey and discuss methods proposed in the literature for estimating the Sharpe ratio; computing confidence intervals around a point estimate of the Sharpe ratio; and performing hypothesis testing on a single Sharpe ratio and on the difference between two Sharpe ratios.

The State of Open Data on School Bullying

How much of a problem is school bullying in NYC? The answer depends on who you ask. Data Clinic volunteers compared local surveys (where many students say bullying is happening) with federal data (where a majority of schools report zero incidents) to analyze these disparities for the 2013-14 school year.

Bringing Linux back to the Server BIOS with LinuxBoot

The NERF and Heads projects bring Linux back to the cloud servers’ boot ROMs by replacing nearly all of the vendor firmware with a reproducibly built Linux runtime that acts as a fast, flexible, and measured boot loader.

The Future of Pandas

Architecture overview for the future of the Python Pandas data analytics library.

BeakerX (for PyData NYC)

An overview of BeakerX, a collection of kernels and extensions to the Jupyter interactive computing platform.

Responsive and Scalable Real-time Data Analytics

Designing a system that can extract immediate insights from large amounts of data in real time requires a special way of thinking. This talk presents a “reactive” approach to designing real-time, responsive, and scalable data applications that can continuously compute analytics on the fly. It also highlights a case study as an example of reactive design in action.

Archival Storage at Two Sigma

The author presents CelFS, Two Sigma’s geo-distributed file system. Although CelFS has scaled to serve tens of petabytes of data, it uses physical partitioning to provide quality of service guarantees, it has a high replication overhead, and it cannot take advantage of outsourced cold storage. The talk further describes our response to these limitations in Jaks, a new storage system to reduce the TCO of CelFS and serve as the backend for other systems at Two Sigma.