In the last few years, thanks to the efforts of two PhD students under my supervision, Zekarias Kefato and Nasrullah Sheikh, I started to explore problems that are completely outside the distributed systems topic: namely, machine learning, with a particular focus on network representation learning. Maybe it is the academic age, but when I say “I started to explore”, it really means that Zekarias and Nasrullah were doing 95% of the work, and my role has been just to make sure that the papers are easy to read.
Network Representation Learning (NRL) is a method to learn a low-dimensional embedding of a graph such that its geometrical properties are preserved. The learned embeddings are used in various downstream machine learning tasks, such as classification and link prediction.
NRL can be performed by using various sources of information in a graph such as network structure, attributes, and cascades. These sources can be used independently or in combination, depending on their availability. Early research focused on using only structural information due to its default availability. Recent results suggest that using additional information may help in learning a better representation. The challenge is how to incorporate different sources of information in the learning process.
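As a rough illustration of how two sources of information can be combined, the sketch below builds node "contexts" from random walks over the network structure and over a bipartite node-attribute graph. It is only a sketch under stated assumptions, not the method of any of the papers listed here: the function names (`random_walks`, `attribute_graph`, `build_contexts`), the toy data, and the parameter values are all hypothetical. In a complete pipeline the resulting walks would typically be fed to a skip-gram style model to obtain the embeddings; that step is not shown.

```python
import random
from collections import defaultdict


def random_walks(adj, starts, walk_len, walks_per_node, rng):
    """Uniform random walks over an adjacency mapping {vertex: [neighbors]}."""
    walks = []
    for start in starts:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                neighbors = adj.get(walk[-1], [])
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def attribute_graph(attrs):
    """Bipartite graph linking each node to its attribute values.

    Attribute vertices are tagged tuples so they cannot clash with node names.
    """
    adj = defaultdict(list)
    for node, values in attrs.items():
        for value in values:
            key = ("attr", value)
            adj[node].append(key)
            adj[key].append(node)
    return adj


def build_contexts(structure, attrs, walk_len=6, walks_per_node=2, seed=0):
    """Return structural walks plus attribute walks (attribute vertices removed)."""
    rng = random.Random(seed)
    nodes = set(structure) | set(attrs)
    structural = random_walks(structure, sorted(structure), walk_len,
                              walks_per_node, rng)
    bipartite = attribute_graph(attrs)
    attributed = [
        [v for v in walk if v in nodes]  # keep only real nodes
        for walk in random_walks(bipartite, sorted(attrs), walk_len,
                                 walks_per_node, rng)
    ]
    return structural + attributed


# Hypothetical toy data: "d" has no links, but shares the attribute "ml" with "a".
structure = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
attrs = {"a": ["ml"], "c": ["db"], "d": ["ml"]}
contexts = build_contexts(structure, attrs)
```

In this toy example the structurally isolated node "d" can only appear next to "a" in the attribute walks, which is the kind of signal that structure alone cannot provide.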
To this end, we worked in two directions:
- Network representation learning on attributed graphs and heterogeneous graphs. GAT2VEC [Computing19] learns node representations from a structural context and an attribute context, obtained from structural and attribute information respectively. HETNET2VEC [SNAMS18b] targets heterogeneous network representation learning: the model preserves the various semantic relationships among nodes when learning a representation.
- Using cascade information for network representation learning [MLG17][LOD18] and virality prediction [SNAMS18a]. In the case of social networks, the underlying network information may not be available due to provider restrictions, but we can observe the diffusion events, which are signals of the underlying network. Using cascades, we can recover the underlying network through network representation learning.
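To give an intuition of why diffusion events are signals of the underlying network, here is a deliberately naive sketch: it counts how often one node becomes active shortly after another across the observed cascades, and treats frequent pairs as candidate edges. This is a simple co-occurrence heuristic chosen for illustration, not the representation learning approach of the cited papers. The function name `score_edges`, the `window` parameter, and the toy cascades are assumptions made for this example.

```python
from collections import Counter


def score_edges(cascades, window):
    """Count how often v is activated within `window` time units after u.

    Each cascade is a list of (node, timestamp) activation events.
    """
    scores = Counter()
    for cascade in cascades:
        events = sorted(cascade, key=lambda event: event[1])
        for i, (u, t_u) in enumerate(events):
            for v, t_v in events[i + 1:]:
                if t_v - t_u > window:
                    break  # later events are even further away in time
                if u != v:
                    scores[(u, v)] += 1
    return scores


# Hypothetical toy cascades over nodes "a", "b", "c".
cascades = [
    [("a", 0), ("b", 1), ("c", 5)],
    [("a", 0), ("b", 2)],
    [("b", 0), ("c", 1)],
]
scores = score_edges(cascades, window=2)
print(scores.most_common(1))  # [(('a', 'b'), 2)]
```

Pairs that are repeatedly activated close in time, such as ("a", "b") here, are more plausible edges of the hidden network than pairs that co-occur rarely or far apart.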
[MAISON17] Zekarias T. Kefato and Alberto Montresor. Personalized influencer detection: Topic and exposure-conformity aware. In Proc. of the International Workshop on Mining Actionable Insights from Social Networks, MAISoN’17. ACM, February 2017. [PDF], [Bibtex].
[MLG17] Zekarias T. Kefato, Nasrullah Sheikh, and Alberto Montresor. Deepinfer: Diffusion network inference through representation learning. In Proc. of the 13th International Workshop on Mining and Learning With Graphs, MLG’17. ACM, August 2017. [PDF], [Bibtex].
[MOD17] Zekarias T. Kefato, Nasrullah Sheikh, and Alberto Montresor. Mineral: Multi-modal network representation learning. In Proc. of the 3rd International Conference on Machine Learning, Optimization and Big Data, MOD’17. ACM, September 2017. [PDF], [Bibtex].
[LOD18] Zekarias T. Kefato, Nasrullah Sheikh, and Alberto Montresor. REFINE: Representation learning from diffusion events. In Proc. of the 4th Conference on Machine Learning, Optimization and Data science, LOD’18. Springer, September 2018. [PDF], [Bibtex].
[SNAMS18a] Zekarias T. Kefato, Nasrullah Sheikh, Leila Bahri, Amira Soliman, Alberto Montresor, and Sarunas Girdzijauskas. CAS2VEC: network-agnostic cascade prediction in online social networks. In Proc. of the 5th International Conference on Social Networks Analysis, Management and Security (SNAMS 2018), pages 72–79. IEEE, October 2018. [PDF], [Bibtex].
[SNAMS18b] Nasrullah Sheikh, Zekarias T. Kefato, and Alberto Montresor. Semi-supervised heterogeneous information network embedding for node classification using 1D-CNN. In Proc. of the 5th International Conference on Social Networks Analysis, Management and Security (SNAMS 2018), pages 177–181. IEEE, October 2018. [PDF], [Bibtex].