OpenMP User Conference 2019 in Edinburgh

The OpenMP User Conference 2019 was held at the University of Edinburgh and hosted by EPCC. Over two days, 10 speakers presented in person and one remotely to around 30 attendees.

Day 1- Tuesday 4th June

1. Hands-on Tutorial

I joined Programming Your GPU with OpenMP: A Hands-On Introduction, presented by Simon McIntosh-Smith, Professor of High Performance Computing at the University of Bristol. He started with an overview of OpenMP and of the terms used in the GPU programming model: the host, the device, and the clauses used to target the device. After that, we started the active learning by working on the ARM supercomputer called Isambard. Initially, we ran a basic serial program to add vectors, where the #pragma omp target directive was set on the processing loop. Usually, the serial version took more time than the parallel job. If a run shows a different result, it is worth investigating in depth.
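A minimal sketch of what such an offloaded vector addition can look like is shown here. It is not the tutorial's vadd source: the function name vadd_offload, the test values and the explicit map clauses are assumptions made for illustration. If no device is available, or OpenMP is not enabled at compile time, the loop simply runs on the host.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the tutorial): computes c = a + b with the
   loop offloaded to the default device, then checks the result on the host.
   Returns the number of wrong elements, or -1 if allocation fails. */
int vadd_offload(int n)
{
    float *a = malloc((size_t)n * sizeof *a);
    float *b = malloc((size_t)n * sizeof *b);
    float *c = malloc((size_t)n * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1;
    }

    for (int i = 0; i < n; i++) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
        c[i] = 0.0f;
    }

    /* map(to:)   copies the inputs host -> device before the region.
       map(from:) copies the output device -> host after the region. */
    #pragma omp target map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];

    int errors = 0;
    for (int i = 0; i < n; i++)
        if (c[i] != 3.0f * (float)i)
            errors++;

    free(a); free(b); free(c);
    return errors;
}
```

Calling vadd_offload(1024) from a main function and checking that it returns 0 is enough to try it. With only the target directive the loop still runs on a single device thread; spreading it over teams of threads is a separate step.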

It was also explained how to read the output of nvprof, the profiler in the CUDA toolkit. The command used was nvprof --print-gpu-trace ./vadd. The first two lines show the cost of offloading as a percentage. The calls from the host to the device are mostly greater than the calls from the device to the host once the reduction calculations have been done on the device.

Later, the levels of parallelism were explained, where the teams of threads are distributed following the flow: target → device → compute unit → processing element. The term "team of threads" applies inside the device. The jacobi_solver exercise was very useful for understanding how teams of threads work. In this example, a pointer to a fixed-size array of floats needs to be explicitly defined in the code.
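The idea of distributing a loop over teams of threads can be sketched as follows. This is not the jacobi_solver code from the tutorial, which is not reproduced in this post; it is a simplified, hypothetical 1D averaging sweep (stencil_sweep and stencil_check are made-up names) that only illustrates the combined target teams distribute parallel for construct and the explicit array sections needed when the data is reached through pointers.

```c
#include <stdlib.h>

/* Hypothetical 1D sweep: each interior point becomes the average of its two
   neighbours. "teams distribute" splits the iterations across teams (compute
   units); "parallel for" splits each team's share across its threads
   (processing elements). */
void stencil_sweep(const float *x, float *xnew, int n)
{
    /* x and xnew are pointers, so the array sections are given explicitly. */
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: xnew[0:n])
    for (int i = 1; i < n - 1; i++)
        xnew[i] = 0.5f * (x[i - 1] + x[i + 1]);
}

/* Hypothetical check: for a linear ramp x[i] = i, the average of the two
   neighbours equals x[i], so the sweep must leave interior points unchanged.
   Returns the number of mismatches, or -1 if allocation fails. */
int stencil_check(int n)
{
    float *x = malloc((size_t)n * sizeof *x);
    float *xnew = malloc((size_t)n * sizeof *xnew);
    if (!x || !xnew) {
        free(x); free(xnew);
        return -1;
    }

    for (int i = 0; i < n; i++) {
        x[i] = (float)i;
        xnew[i] = -1.0f;   /* sentinel; boundary entries keep this value */
    }

    stencil_sweep(x, xnew, n);

    int errors = 0;
    for (int i = 1; i < n - 1; i++)
        if (xnew[i] != x[i])
            errors++;
    if (xnew[0] != -1.0f || xnew[n - 1] != -1.0f)
        errors++;

    free(x); free(xnew);
    return errors;
}
```

The map(tofrom:) on xnew is a deliberate choice in this sketch: the boundary entries are never written on the device, so they have to be copied in to survive the copy back.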

After adding the OpenMP directives, the parallelised version ran twice as fast.

To control the memory movement, the target enter data and target exit data directives create a data region on the device. They are placed around the code that would otherwise swap data back and forth, because that exchange is expensive. The target enter data directive allocates memory on the device and copies data to it, and the target exit data directive copies the data back or destroys it.
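A small sketch of this pattern, under stated assumptions: the function name repeated_scale and the doubling kernel are hypothetical and stand in for the solver iterations. The point it illustrates is that the array is copied to the device once before the loop of kernels and copied back once at the end.

```c
#include <stdlib.h>

/* Hypothetical example: doubles every element of x `iters` times on the
   device while keeping x resident there between kernels.
   Returns the number of wrong elements, or -1 if allocation fails. */
int repeated_scale(int n, int iters)
{
    float *x = malloc((size_t)n * sizeof *x);
    if (!x)
        return -1;
    for (int i = 0; i < n; i++)
        x[i] = 1.0f;

    /* Allocate x on the device and copy it host -> device once. */
    #pragma omp target enter data map(to: x[0:n])

    for (int it = 0; it < iters; it++) {
        /* x is already present on the device, so this map clause only
           reuses the existing copy and triggers no new transfer. */
        #pragma omp target teams distribute parallel for map(tofrom: x[0:n])
        for (int i = 0; i < n; i++)
            x[i] *= 2.0f;
    }

    /* Copy x device -> host once and release the device copy.
       map(delete: x[0:n]) would release it without copying back. */
    #pragma omp target exit data map(from: x[0:n])

    float expected = 1.0f;
    for (int it = 0; it < iters; it++)
        expected *= 2.0f;

    int errors = 0;
    for (int i = 0; i < n; i++)
        if (x[i] != expected)
            errors++;

    free(x);
    return errors;
}
```

Without the enter data and exit data pair, each of the kernels would copy x in and out again, which is the expensive exchange described above.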

After these changes, the optimization of the data movement was very noticeable: the execution time of the jacobi_solver was 10 times lower than that of the serial version.

2. Advanced OpenMP: Performance and 5.0 Features

An interesting talk about the performance of OpenMP was given by Jim Cownie, Senior Principal Engineer at Intel Corporation in the U.K. He shared his knowledge of OpenMP programming and related parallel concepts, as well as best practices.

Day 2- Wednesday 5th June


Nine talks were presented, including a remote presentation by John M. Levesque, Director of the Supercomputing Center of Excellence at Cray Inc., who explained how to read the report provided by the CrayPat tool. I highlight two efforts from industry: Dr. Glover from the Met Office, who is trying to accelerate UM and NEMO, and Oliver, the representative of ARM, who optimized the intranode performance of OpenMP.

On the academic side, the FFLUX optimization by Benjamin Symons from the University of Manchester inspired me to do a comparison with the application I am studying. The OpenMP Parallelisation of Quantum Computing Simulators by Youssef Moawad also impressed me; he did an excellent job in his presentation (including the rainbow :).


OpenMP experts made themselves available for questions. They are working towards OpenMP 5 and compatibility with upcoming compiler versions such as GCC 10, and are collaborating to adapt this interface to different HPC architectures.

Food of Event

We met new HPC fellows during lunchtime: Youssef, a Ph.D. candidate from Egypt, and a Ph.D. student from the University of Manchester. Thanks again to Holly and Andreas H.

The OpenMP community in the U.K.

We did not take a group photo, but we were able to exchange ideas during the breaks.

Special Thanks

Thanks to Dr. Bull for the invitation that let us enhance our skills, and to Professor McIntosh-Smith. It was exciting to meet in person the author of a paper I’ve referenced.

Good event in general! It would have been nice to have a couple of workshops and 45 minutes per talk. The projects were very interesting, but some talks lasted 15 minutes.

About Julita Inca

Systems Engineer (UNAC), Master in Computer Science (PUCP), Master in High Performance Computing from the University of Edinburgh, OPW GNOME 2011, member of the GNOME Foundation since 2012, Fedora Ambassador for Peru since 2012, winner of the Linux Foundation scholarship 2012, experience as a Linux admin at GMD and as an IT Specialist at IBM, with RHCE, RHCSA, AIX 6.1, AIX 7 Administrator and ITILv3 certifications. Academic experience at universities such as PUCP, USIL and UNI. Leader of LinuXatUNI Community, HPC Software Specialist at UKAEA, reviewer of the Technological Magazine of ESPOL-RTE, and volunteer Linux trainer for MINSA Peru... a simple mortal, just like you!