Learning of Configurations of an N-way Model Matching Algorithm

Page content

Context

Software models such as UML-like diagrams are frequently used to describe a program’s structure and behavior in graph-like structure. Identifying the common and specific parts of a set of such model graphs is an essential prerequisite for many development tasks, notably for merging several variants of a model graph into a unified one. To that end, Schultheiss et al. proposed a generic algorithm for n-way model matching which scales for large model graphs. This is achieved by indexing the graph nodes in a multi-dimensional search tree which allows for efficient range queries to find the most suitable matching candidates.

Goal

A major drawback of the approach of Schultheiss et al. is that it relies on the manual definition of the vectorization of model graphs in order to build the multi-dimensional search trees, which has to be handcrafted for every type of model and requires sophisticated domain knowledge. The goal of this thesis is to learn such vectorizations automatically.

Approach

In recent years, graph neural networks (GNNs) emerged as a versatile method for node and/or graph embeddings. In various practical applications, GNNs have shown superior performance when compared to other graph based pattern recognition technologies. The very basic idea of GNNs is to train an ANN on the feature vectors stored as labels in individual nodes in the graph. Thereby, the aim is to learn a holistic vector representation of the complete graph. In the present project we aim at training standard GNNs with different architectures in order to verify whether or not the manually defined vectorization can be automatically learned by means of neural networks operating on graphs.

Required Skills

Good programming skills, particularly working with different APIs Basic statistics and a bit of graph theory and algorithms

Remark

The thesis is at the intersection of software engineering and pattern recognition. It will be supervised by the Software Engineering Group (Prof. Kehrer) that covers the application-oriented perspective, and the Pattern Recognition Group (Dr. Riesen) that contributes its expertise on graph matching and machine learning.

Further Reading

  • Schultheiß, A., Bittner, P. M., Boll, A., Grunske, L., Thüm, T., & Kehrer, T. (2022). RaQuN: a generic and scalable n-way model matching algorithm. Software and Systems Modeling, 1-23.
  • A Gentle Introduction to Graph Neural Networks:
  • Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi: Learning to Represent Programs with Graphs. ICLR 2018
  • Kechi Zhang, Wenhan Wang, Huangzhao Zhang, Ge Li, Zhi Jin: Learning to represent programs with heterogeneous graphs. ICPC 2022: 378-389