Lei.Chat()

Lei Zhang

Senior Manager, Software Engineering, AMD AI Group

AI Compiler & Runtime. Currently: Triton, IREE, MLIR, LLVM. Previously: SPIR-V, Vulkan, Metal.

Recent Posts

Layout is a core concept in Triton for representing and optimizing distribution mappings from source problems to the target hardware compute and memory hierarchy. In this blog post I will talk about linear layout in Triton, the new mechanism that unifies the existing bespoke layouts used for different purposes. The aim is to provide motivation and an intuitive understanding of linear layout; I will rely on examples and illustrations rather than theory and proofs.

2024-12-31

16 min read

compiler, triton

triton

Triton Compiler Development Tips

Triton provides an elegant solution to program GPU kernels in Python, positioning itself as a critical component in the modern AI software stack. To deliver performance and portability, it leverages a compiler, whose capability determines its potential. Hacking the compiler internals is not a simple task. Here are some tips that will hopefully be useful to folks. I’ll try to keep this blog post updated periodically.

2024-12-25

10 min read

compiler, triton

triton

Leaving Google

Time flies: almost 9 years have passed since I joined Google. Now the time has come for me to leave and move on. While here, I have been super lucky to work mostly on open source projects that I can publicly talk about. So at the end of my tenure with Google, I’d like to reflect on and summarize the incredible journey, which I am super grateful for and have thoroughly enjoyed, before I forget some details.

2023-09-26

7 min read

Single-node ML Runtime Foundation

Previous blog posts gave an overview of the MLIR dialect hierarchy for kernel code generation (CodeGen) and zoomed in on the Linalg and Vector dialects among them. Now I will switch to discussing the runtime side a bit, in order to provide a holistic view of MLIR-based machine learning (ML) compilers. This one covers the foundation and basics, including the target landscape, runtime requirements, and designs to meet them.

2023-04-01

18 min read

runtime, mlir, ml-inference

ml-inference, compiler-development

MLIR Linalg Dialect and Patterns

I explained the Vector dialect and related patterns in the previous blog post. In this one, let us look one layer higher and talk about the Linalg dialect and the transformations around it.

2022-08-31

13 min read

compiler, ir, mlir, ml-inference

compiler-development