Luis Ceze

University of Washington

9:15a – 10:00a

Bridging the Gap Between Deep Learning Models and “the Metal” with Apache TVM & VTA

There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms, such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs), requires significant manual effort. In this talk I will present our work on the TVM stack, which exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives, and memory latency hiding. It also automates the optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of code optimizations. Because changes in algorithms, models, operators, or numerical systems threaten the viability of specialized hardware accelerators, we developed VTA, a programmable deep learning architecture template tightly coupled to TVM. VTA achieves this flexibility via a parametrizable architecture, a two-level ISA, and a JIT compiler. TVM/VTA was recently incubated as an Apache Software Foundation project and is benefiting from a thriving community of developers.

About Luis:

Luis Ceze is a Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, Co-founder and CEO at OctoML, and Venture Partner at Madrona Venture Group. His research focuses on the intersection of computer architecture, programming languages, machine learning, and biology. His current focus is on approximate computing for efficient machine learning and DNA-based data storage. He co-directs the Molecular Information Systems Lab (MISL), the Systems and Architectures for Machine Learning lab (SAML), and the Sampa Lab for HW/SW co-design. He has co-authored over 100 papers in these areas and has had several papers selected as IEEE Micro Top Picks and CACM Research Highlights. His research has been featured prominently in the media, including the New York Times, Popular Science, MIT Technology Review, and the Wall Street Journal, among others. He is a recipient of an NSF CAREER Award, a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship, the IEEE TCCA Young Computer Architect Award, and the UIUC Distinguished Alumni Award.

Bin Liu


10:30a – 11:15a

HHVM Jumpstart: towards instant JIT warmup and agile cluster operations

HHVM is a virtual machine for running programs written in Hack (a dialect of PHP developed at Facebook). HHVM uses just-in-time (JIT) compilation to achieve great efficiency, yet the need for JIT warmup incurs nontrivial overhead in capacity, latency, and operational complexity. This talk describes Jumpstart, the latest in a series of efforts to warm up faster. In Jumpstart, profile data gathered on a small number of hosts are reused on the majority of other servers, allowing these servers to generate optimized code without going through the tiered compilation process. In addition to faster restarts with reduced overhead at the cluster level, Jumpstart also enables heavier-weight profiling and optimizations for greater efficiency by combining the ideas of ahead-of-time and JIT compilation.

About Bin:

Bin Liu is a software engineer at Facebook, where he works on HHVM and other efficiency initiatives. Prior to Facebook, he held various positions at Micron, Xilinx, and AutoESL, working on compilation onto customized hardware architectures. Bin holds a Bachelor’s degree from Tsinghua University and a PhD from UCLA. He was a recipient of two best-paper awards from the ACM Transactions on Design Automation of Electronic Systems, in 2012 and 2013.

Rodolph Perfetta


11:15a – 12:00p

Hardening your Runtime

Lately, ISAs have been gaining capabilities to address some of the shortcomings of languages such as C/C++. Features like memory tagging or control flow integrity may sound redundant in a managed language, but they can be used to great effect to increase security in runtimes. In this talk we’ll explore how those hardware extensions can be exploited, using Google V8 as an example.

About Rodolph:

Rodolph Perfetta is a runtime architect in Arm’s Open Source group. For the last 15 years his work has focused on runtime optimisations and enablement for the Arm architecture, including work on Google V8 and OpenJDK.

Henry Hamid Safi


1:00p – 1:45p

The Performance Journey of Serverless Azure Functions

This talk will give a brief overview of Serverless Azure Functions and then cover the key performance challenges in this area, with a deep focus on cold starts. We will share our story: where we were more than a year ago, what optimizations were put in place, how we leveraged .NET Core performance optimizations and other Azure features, and where we stand at present.

About Henry:

Henry Hamid Safi is a Principal Software Engineer at Microsoft. His focus is improving the performance of cloud-based frameworks built on top of Microsoft technologies. He is responsible for the performance of Serverless Azure Functions and all-up Azure App Services. He has been working on performance features of many critical Microsoft frameworks and applications ever since he joined Microsoft in 2006. Outside of work, he is an avid world traveler and always loves talking about outdoor activities (backpacking, cycling, and snowboarding) and traveling off the beaten path around the globe.

Yishai Galatzer


1:45p – 2:30p

Migrating AWS Lambda’s front end from Java 8 to Java 11

Explore the journey of migrating AWS Lambda’s front-end service from Java 8 to Java 11 using Amazon Corretto, a no-cost distribution of OpenJDK. We walk through the code and dependency changes required to migrate to Java 11, how we measured performance improvements, and how we safely deployed such a significant update to a large-scale service across multiple regions in production.

About Yishai:

Yishai Galatzer works in software engineering at Amazon Web Services, where he is the general manager for the Amazon Corretto OpenJDK distribution.

Suresh Srinivas


3:00p – 4:00p

Talk details coming soon.

About Suresh:

Suresh Srinivas is a Principal Engineer at Intel focused on runtimes. He has a PhD in Computer Science and 25 years of industry experience in JITs, HW/SW co-design, and runtimes. He is also a yoga and meditation teacher. Outside of work he volunteers in the community and enjoys hiking in the Pacific Northwest with his dog Luna.