Graphlet-based network comparison distances
Supplementary Information for: "Graphlet-based Characterization of Directed Networks"
A. Sarajlic, N. Malod-Dognin, O.N. Yaveroglu and N. Przulj
Corresponding author: Prof. Natasa Przulj, e-mail: natasa [AT] cs.ucl.ac.uk
We provide here the software and Python script that we used to compute the directed network distances presented in our paper.
These include the Directed Graphlet Correlation Distance (DGCD), the Directed Relative Graphlet Frequency Distribution Distance (DRGF) and the Directed Graphlet Degree Distribution Agreement (DGDDA).
- First, all directed networks must be in edge-list format.
- Then, you must compute the directed graphlet degree vector signatures of each network using the provided C++ counter:
This will produce three files:
- my_network.graphletcounts.txt, which summarizes the counts of directed graphlets in the network (e.g., for computing DRGF),
- my_network.signatures.txt, which contains the directed graphlet degree signatures of the nodes, with one signature per line (e.g., for computing DGCD),
- my_network.dictionary.txt, which maps the line numbers of the signature file to the node IDs of the network.
- Finally, to compute DGCD-13 (DGCD using the directed graphlet degrees of 2- to 3-node graphlets), place all networks and their signature files in the same folder and run:
"python Directed_Distances_v2.py network_folder 1 n", where n is the number of allowed parallel threads
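For readers who want to see what a graphlet correlation distance computes, here is a minimal sketch in pure Python. It assumes the usual construction: Spearman correlations between the orbit columns of the node signatures, followed by the Euclidean distance between the upper triangles of the two correlation matrices. The function names are hypothetical and are not taken from Directed_Distances_v2.py. Setting the correlation of a constant column to 0 is also an assumption of this sketch, so the provided script may give different values.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant input (assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(signatures):
    """Spearman correlation between every pair of orbit columns.

    signatures: one list of orbit counts per node, all of equal length.
    """
    columns = [list(col) for col in zip(*signatures)]
    ranked = [average_ranks(col) for col in columns]
    size = len(ranked)
    return [[pearson(ranked[i], ranked[j]) for j in range(size)]
            for i in range(size)]

def correlation_distance(signatures_a, signatures_b):
    """Euclidean distance between the upper triangles of two correlation matrices."""
    corr_a = correlation_matrix(signatures_a)
    corr_b = correlation_matrix(signatures_b)
    if len(corr_a) != len(corr_b):
        raise ValueError("both networks must use the same number of orbits")
    size = len(corr_a)
    return math.sqrt(sum((corr_a[i][j] - corr_b[i][j]) ** 2
                         for i in range(size) for j in range(i + 1, size)))
```

The two networks may have different numbers of nodes, since only the orbit-by-orbit correlation matrices are compared.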
See the Directed_Distances_v2.py script for the full list of available network distances.
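As an illustration of how a distance can be derived from graphlet counts, the sketch below follows the general form of the relative graphlet frequency distance: each count N_i is turned into F_i = -log(N_i / T), where T is the total count over all graphlet types, and the distance is the sum of |F_i(G) - F_i(H)|. The function names are hypothetical, and the inputs are plain lists of counts rather than the contents of a graphletcounts file. Zero counts are rejected because -log(0) is undefined; how the provided script treats them is not covered here.

```python
import math

def log_relative_frequencies(counts):
    """Return F_i = -log(N_i / T) for each graphlet count N_i, with T = sum of counts."""
    if any(c <= 0 for c in counts):
        # Assumption of this sketch: zero or negative counts are not handled.
        raise ValueError("all graphlet counts must be positive")
    total = float(sum(counts))
    return [-math.log(c / total) for c in counts]

def relative_frequency_distance(counts_a, counts_b):
    """Sum of absolute differences between the log relative frequencies."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both networks must list the same graphlet types")
    freq_a = log_relative_frequencies(counts_a)
    freq_b = log_relative_frequencies(counts_b)
    return sum(abs(a - b) for a, b in zip(freq_a, freq_b))
```

Because only relative frequencies enter the formula, two networks whose graphlet counts differ by a constant factor are at distance 0 in this sketch.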