AbstractsComputer Science

Efficient synchronization techniques for shared memory systems

by Nikolaos Kallimanis




Institution: University of Ioannina; Πανεπιστήμιο Ιωαννίνων
Department:
Year: 2013
Keywords: Κατανεμημένος υπολογισμός; Αλγόριθμοι; Κοινόχρηστα αντικείμενα; Καθολικά κοινόχρηστα αντικείμενα; Κοινόχρηστες στοίβες; Κοινόχρηστες ουρές; Τεχνικές συγχρονισμού; Ιεραρχικοί αλγόριθμοι; Distributed computing; Algorithms; Shared objects; Universal objects; Shared stacks; Shared queues; Synchronization techniques; Hierarchical algorithms
Record ID: 1155405
Full text PDF: http://hdl.handle.net/10442/hedi/30106


Abstract

Overcoming the difficulty of concurrent programming has never become more urgent due to the proliferation of multicore machines and the imperative necessity of exploiting their computational power. One way to achieve this is by designing efficient concurrent data structures; common structures, like stacks and queues, are the most widely used inter-thread communication mechanisms. Additionally, synchronization techniques are required to efficiently execute, in a concurrent environment, those parts of modern applications that require significant synchronization. Although the efficient parallelization of these parts is not an easy task, Amdhal's law implies that achieving this is necessary in order to avoid significant reductions in speed-up.In this dissertation three families of highly efficient synchronization algorithms, called RedBlue, Sim and Synch are presented for executing concurrently blocks of code that have originally been programmed to be executed sequentially in asynchronous shared-memory distributed systems.We start by presenting the RedBlue family of adaptive synchronization algorithms that use common base objects (LL/SC or CAS and Read-Write) provided by the majority of the real-world machines.The first of these algorithms achieves better time complexity than all previously presented algorithms. This algorithm uses large LL/SC base objects and it comprises the keystone for the design of the other RedBlue algorithms that use smaller base objects. Specifically, the second algorithm significantly reduces the size of the required base objects. The rest of the algorithms have been designed for large objects.The Sim family of synchronization algorithms achieves at (1) getting better time complexity by using base objects other than LL/SC and read-write (i.e. Swap, AtomicAdd, etc) and (2) competing in terms of performance with the state-of-the-art synchronization algorithms (i.e. high performance spin-locks, etc. Sim is a simple synchronization algorithm with constant step complexity using a Fetch&Add additional to an LL/SC object. Sim has been implemented for a real shared-memory machine architecture. Its practical version, called P-Sim, outperforms several state-of-the-art lock-based and lock-free synchronization algorithms, while being wait-free, i.e. satisfying a stronger progress condition than all the algorithms that it outperforms.We further study blocking implementations of the combining technique with the goal of discovering where their real performance power resides and whether or how performance is impacted by ensuring some desired properties (e.g. fairness in serving requests). This is accomplished by presenting two new blocking implementations of the combining technique; the first (CC-Synch) is highly-efficient in systems that support coherent caches, whereas the second (DSM-Synch) works better in cache-less NUMA machines. In comparison to previous blocking implementations, the new implementations (1) provide bounds on the number of remote memory references (RMRs) that they perform, (2) support a…