Mar 20, 2020

Bachelor thesis

Towards High-Performance Clique Mining

My bachelor thesis is not publicly available.
If you happen to be interested in the thesis, please contact me.

Graph

On listing dense subgraphs fast

Maximal clique enumeration is a well studied problem in the field of data mining. Especially, the popular algorithm by Bron and Kerbosch is deemed to be one of the most useful ones in practice, which is why there were many successful attempts to enhance it further. In this work, we enhance the design of the Bron-Kerbosch (BK) clique enumeration algorithm with several techniques and optimizations related to both the algorithm design and the implementation details. Among others, we apply a new form of vertex ordering in the context of BK in the preprocessing phase to significantly prune the search space, we combine ideas from several different efforts, and we incorporate notions from set algebra to further reduce the total work. We focus on massively parallel shared memory architectures, which enables us to compare our final BK design to state-of-the-art BK implementations. We illustrate significant speedups of up to 7x over most recent work by Das et al. Note here, that we have improved this speedup even further, which is further described in our GraphMineSuite paper.

In addition, we integrate BK into the recently developed GraphMineSuite (GMS), an infrastructure for design, implementation, and evaluation of graph mining algorithms. For this, we use the set-centric formulation of BK to enhance our BK implementation within GMS so that it is highly modular, enabling future users to easily apply different optimizations and test new routines aimed at accelerating BK. For example, the user can straightforwardly try new representations of key data structures, or seamlessly add novel preprocessing routines. Finally, we apply our algorithmic ideas to the problem of parallel graph coloring, securing similar performance gains.