Distributing and Parallelizing Non-canonical Loops
This work leverages an original dependency analysis to parallelize loops in imperative programs regardless of their form. Our algorithm distributes a loop into multiple loops that can be run in parallel, yielding execution-time gains similar to those of state-of-the-art automatic source-to-source parallelizers when both are applicable. Our graph-based algorithm is intuitive, language-agnostic, proven correct, and applicable to all types of loops. Importantly, it can be applied even if the loop iteration space is unknown at compile time, or more generally if the loop is not in canonical form or contains loop-carried dependencies. As contributions, we deliver the computational technique, a proof that it preserves semantic correctness, and experimental results quantifying the expected performance gains. Our benchmarks also show that many comparable tools cannot distribute the loops we optimize, and that our technique can be seamlessly integrated into compiler passes or other automatic parallelization suites.
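To make the idea of distributing a non-canonical loop concrete, here is a minimal hand-written sketch. It is not the authors' algorithm or tooling: the function names, the loop body, and the use of a thread pool are all assumptions chosen for illustration. The `while` loop below has no statically known trip count, and its body updates two accumulators that do not depend on each other, so it can be split into two loops that each keep their own copy of the loop condition and of the counter update.

```python
from concurrent.futures import ThreadPoolExecutor

MOD = 1_000_003  # hypothetical modulus, only to keep the product small


def original(n):
    """Single non-canonical loop updating two independent accumulators."""
    i, total, prod = 1, 0, 1
    while i * i < n:                    # trip count not given by a for-header
        total += i                      # reads i, total
        prod = (prod * (i + 1)) % MOD   # reads i, prod
        i += 2                          # counter update, needed by both
    return total, prod


def loop_total(n):
    """First distributed loop: condition, counter update, and `total`."""
    i, total = 1, 0
    while i * i < n:
        total += i
        i += 2
    return total


def loop_prod(n):
    """Second distributed loop: condition, counter update, and `prod`."""
    i, prod = 1, 1
    while i * i < n:
        prod = (prod * (i + 1)) % MOD
        i += 2
    return prod


def distributed(n):
    """Run the two distributed loops concurrently and combine the results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        f_total = pool.submit(loop_total, n)
        f_prod = pool.submit(loop_prod, n)
        return f_total.result(), f_prod.result()
```

Both versions compute the same pair of values, for example `original(10) == distributed(10)`. The thread pool here only illustrates the structure: in CPython, CPU-bound threads of this kind are not expected to run faster, whereas the gains described in the abstract concern compiled imperative code.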
Mon 16 Jan (displayed time zone: Eastern Time (US & Canada))
09:00 - 10:30
09:00 (60m) Keynote | Towards a Theoretical Understanding of Property-Directed Reachability | VMCAI | Sharon Shoham (Tel Aviv University)
10:00 (30m) Talk | Distributing and Parallelizing Non-canonical Loops | VMCAI | Clément Aubert (Augusta University), Thomas Rubiano (LIPN – UMR 7030 Université Sorbonne Paris Nord), Neea Rusch (Augusta University), Thomas Seiller (CNRS)