The initial blog post in this series captured my overall take on the evolution trends of compilers and IRs. It also touched on LLVM IR, SPIR-V, and MLIR, explaining the problems they are addressing and design focuses thereof. Today I will expand on MLIR and talk about its dialect hierarchy for machine learning (ML) compilers systematically.
This blog post talks about how to generate performant code for convolution ops using MLIR’s multiple levels of abstractions and transformations. I initially created it for targeting ARM Mali GPUs in IREE. But given it is just direct tiling and vectorization, it should be widely applicable. I will walk through the lowering steps, so if you are interested to know how to organize MLIR’s various dialects/patterns together to achieve similar tasks, this blog post might also be useful.