Generic Autotuning Technology for GPU Applications

Hybrid

7 - 11 March 2022

Venue: Oort


Graphics Processing Units (GPUs) are enabling and accelerating many applications
in science and industry. The performance of GPU applications strongly
depends on how the software has been optimized for the hardware. While
developing GPU applications is generally considered hard, developing highly optimized
GPU applications that can be auto-tuned is even more difficult, because
it involves creating many different implementations of the same program
through parameterizations. These parameterizations describe different ways to
parallelize an application across threads and thread blocks, using different data
layouts in various specialized memories, and applying different algorithms, code
optimizations, and transformations that in turn have parameters.

Together these parameters create vast, non-convex, non-continuous design
spaces that are infeasible to search by hand and would have to be searched
over and over again as the application is executed on different hardware or
different input problems. As such, automated performance tuning (auto-tuning)
techniques are often employed to optimize the source code of high-performance
libraries and applications for CPUs, e.g., ATLAS or FFTW, as well as for
GPUs.
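
To make the size of such a design space concrete, the following minimal sketch
enumerates a small tuning space as a Cartesian product of parameter values and
filters out invalid combinations. The parameter names, their values, and the
limit of 1024 threads per block are assumptions chosen for illustration; they
are not taken from any specific application or tuner.

```python
from itertools import product

# Hypothetical tunable parameters for an imaginary GPU kernel.
tune_params = {
    "block_size_x": [16, 32, 64, 128, 256],
    "block_size_y": [1, 2, 4, 8, 16],
    "tile_size": [1, 2, 4, 8],
    "use_shared_mem": [0, 1],
}


def is_valid(config):
    """Assumed hardware restriction: at most 1024 threads per thread block."""
    return config["block_size_x"] * config["block_size_y"] <= 1024


def enumerate_space(params, restriction):
    """Yield every combination of parameter values that passes the restriction."""
    names = list(params)
    for values in product(*params.values()):
        config = dict(zip(names, values))
        if restriction(config):
            yield config


space = list(enumerate_space(tune_params, is_valid))

# Unrestricted size is the product of the number of values per parameter.
total = 1
for values in tune_params.values():
    total *= len(values)

print(f"{len(space)} valid configurations out of {total}")
```

Even this toy example with four parameters yields well over a hundred valid
configurations. Each additional parameter multiplies the size of the space,
which is why searching it by hand quickly becomes infeasible.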

Generic auto-tuners aim to provide a universal solution that can be used for
different libraries and applications, as well as for code produced by high-level
programming languages or domain-specific languages. In this way, the effort
invested into auto-tuning implementations no longer has to be repeated when new
code is developed, and new results in auto-tuning research (e.g., novel tuning
space search techniques) no longer have to be implemented in multiple
special-purpose tuners.

Several generic auto-tuners have emerged in recent years and are under active
development. Each has its own merits and specific focus, and each seeks to
provide such a generalized solution to the auto-tuning problem. These
auto-tuners are responsible for generating multiple functionally equivalent
variants of the application source code, compiling them, and empirically
selecting the best one according to a given optimization objective.
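
The generate, check, measure, and select loop described above can be sketched
in a few lines. This is a simplified stand-in, not the method of any particular
auto-tuner: it runs on the CPU, a Python closure specialized on a chunk size
takes the place of generating and compiling a GPU kernel variant, exhaustive
search takes the place of a smarter search strategy, and run time is the
assumed optimization objective. All function names are hypothetical.

```python
import time


def make_variant(chunk):
    """Build one variant of a summation 'kernel', specialized for a chunk size."""
    def kernel(data):
        total = 0
        for start in range(0, len(data), chunk):
            total += sum(data[start:start + chunk])
        return total
    return kernel


def benchmark(kernel, data, repeats=3):
    """Return the best (lowest) wall-clock time over a few repeated runs."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        kernel(data)
        best = min(best, time.perf_counter() - t0)
    return best


def tune(chunk_sizes, data):
    """Exhaustively try every variant and return the fastest correct one."""
    reference = sum(data)
    results = {}
    for chunk in chunk_sizes:
        kernel = make_variant(chunk)
        if kernel(data) != reference:
            continue  # discard variants that are not functionally equivalent
        results[chunk] = benchmark(kernel, data)
    best = min(results, key=results.get)
    return best, results


data = list(range(100_000))
best_chunk, timings = tune([1, 16, 256, 4096], data)
print(f"best chunk size: {best_chunk}")
for chunk, seconds in timings.items():
    print(f"  chunk={chunk:5d}  time={seconds * 1e3:.3f} ms")
```

Which variant wins depends on the machine and the input, which is exactly why
the selection is made empirically rather than fixed in advance.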

There is an urgent need for this technology: GPU architectures are becoming
increasingly heterogeneous, and therefore even more difficult to optimize, and
the GPU market is rapidly diversifying after a long period of relative
stability, further increasing the need for tools that can deliver
performance-portable code.

Aim
The aim of this workshop is to bootstrap international collaboration between
research groups working on auto-tuning in different fields and create collaborations
between research groups in auto-tuning and research groups in high-level
programming languages and compilers. Examples of topics that we would like
to investigate during the workshop are:

• Comparison of generic auto-tuning methodologies and identifying open
research problems
• Using high-level programming approaches and generic auto-tuners in concert
• A common interface for auto-tuners to facilitate portability between tuning
frameworks
• A shared database of benchmarking data
• Methodology and metrics for comparing optimization algorithms for auto-tuning