AbstractsComputer Science

A dependency-aware parallel programming model

by Josep Maria Pérez Cáncer




Institution: Universitat Politècnica de Catalunya
Department:
Year: 2015
Record ID: 1129950
Full text PDF: http://hdl.handle.net/10803/286286


Abstract

Designing parallel codes is hard. One of the most important roadblocks to parallel programming is the presence of data dependencies. These restrict parallelism and, in general, to work them around requires complex analysis and leads to convoluted solutions that decrease the quality of the code. This thesis proposes a solution to parallel programming that incorporates data dependencies into the model. The programming model can handle that information and to dynamically find parallelism that otherwise would be hard to find. This approach improves both programmability and parallelism, and thus performance. While this problem has already been solved in OpenMP 4 at the time of this publication, this research begun before the problem was even being considered for OpenMP 3. In fact, some of the contributions of this thesis have had an influence on the approach taken in OpenMP 4. However, the contributions go beyond that and cover aspects that have not been considered yet in OpenMP 4. The approach we propose is based on function-level dependencies across disjoint blocks of contiguous memory. While finding dependencies under those constraints is simple, it is much harder to do so over strided and possibly partially overlapping sets of data. This thesis also proposes a solution to this problem. By doing so, we increase the range of applicability of the original solution and increase the span of applicability of the programming model. OpenMP4 does not currently cover this aspect. Finally, we present a solution to take advantage of the performance characteristics of Non-Uniform Memory Access architectures. Our proposal is at the programming model level and does not require changes in the code. It automatically distributes the data and does not rely on data migration nor replication. Instead, it is based exclusively on scheduling the computations. While this process is automatic, it can be tuned through minor changes in the code that do not require any change in the programming model. Throughout the thesis, we demonstrate the effectiveness of the proposal through benchmarks that are either hard to program using other paradigms or that have different solutions. In most cases, our solutions perform either on par or better than already existing solutions. This includes the implementations available in well-known high-performance parallel libraries.; Dissenyar codis paral·lels es complex. Un dels principals esculls a l'hora de programar aplicacions paral·leles és la presència de dependències. Aquestes constrenyen el paral·lelisme, i en general, per evitar-les es requereix realitzar anàlisis complicades que donen lloc a solucions complexes que redueixen la qualitat del codi. Aquesta tesi proposa una solució a la programació paral·lela que incorpora al model les dependències de dades. El model de programació és capaç d'utilitzar aquesta informació per a trobar paral·lelisme que altrament seria molt difícil de detectar i d'extreure. Aquest enfoc augmenta la programabilitat i el paral·lelisme, i per tant també el rendiment. Tot i que al…