In today's computing landscape, applications are ever more compute- and data-intensive, while hardware is increasingly diversifying with special-purpose accelerators optimized for "hot" domains. But as workloads shift at a breakneck pace, yesterday's accelerators, and the software stacks that support them, struggle to keep up. This makes it difficult for heterogeneous systems to evolve. It also keeps domain experts from fully harnessing these systems' potential.
This talk presents our ongoing research on making accelerators more evolvable through a co-design methodology that spans programming models, compilers, and hardware acceleration. I'll begin with Allo (PLDI'24), a new programming model and compiler for building dataflow accelerators. Allo is already in active use across academia and industry, with demonstrated success mapping AI workloads to FPGA and NPU backends. Next, I'll introduce UniSparse (OOPSLA'24), an intermediate language that provides a unified abstraction for customizing sparse matrix and tensor formats. We validate our approach across diverse hardware platforms, including Intel CPUs, NVIDIA GPUs, AMD FPGAs, and processing-in-memory (PIM) devices. Finally, I'll discuss our latest efforts to simplify the construction of accelerator compilers using differentiable techniques (ASPLOS'25 Best Paper) and emerging agentic methods.
Zhiru Zhang is a Professor in the School of ECE at Cornell University. He also serves as the lead PI for the "Heterogeneous Computing Platforms" theme in the SRC/DARPA JUMP 2.0 ACE Center for Evolvable Computing. His current research investigates new algorithms, design methodologies, and automation tools for heterogeneous computing. Dr. Zhang is an IEEE Fellow and has been honored with the Intel Outstanding Researcher Award, AWS AI Amazon Research Award, Facebook Research Award, Google Faculty Research Award, DAC Under-40 Innovators Award, DARPA Young Faculty Award, IEEE CEDA Ernest S. Kuh Early Career Award, and NSF CAREER Award. He has also received 10+ best paper awards from premier computer hardware conferences and research journals. Prior to joining Cornell, he co-founded AutoESL, a high-level synthesis start-up later acquired by Xilinx (now part of AMD). AutoESL's HLS tool evolved into Vivado HLS (now Vitis HLS), which is widely used for designing FPGA-based hardware accelerators.