Oleksii Kuchaiev, Aleksandar Stevanovic, Wayne Hayes, Natasa Przulj
GraphCrunch 2 provides a range of analyses that can be performed on data networks. Each type of analysis has its own tab, where the user can set the options for that analysis and load the networks to be analyzed. Before any network can be analyzed, GraphCrunch 2 must load it into its network storage database, which is done with the Load network option on the top toolbar. For convenience, the user can select multiple networks to load at the same time.
A loaded network is copied into the network storage database and registered with the program, at which point it appears in the network pane on the left side of GraphCrunch 2, ready for use. To use a network in any type of analysis, simply drag it from the network pane onto the appropriate field for that analysis.
GraphCrunch 2 provides access to all types of analysis through the tabs in the main window. Each tab represents a different type of analysis, except for Results Plot and Task Manager, which, as their names suggest, present the computed results as plots and monitor the progress of the computations, respectively.
Pairwise data analysis allows the user to compare two sets of data networks against each other. Each set can contain one or more data networks loaded in GraphCrunch. Comparisons are performed on unique pairs between the sets, excluding duplicates. Therefore, to compute pairwise comparisons between all networks in a single set, it is enough to assign the same networks to both sets. Results are presented in a table on the same tab and can be saved to a CSV (comma-separated values) file or copied and pasted into Microsoft Excel or any other spreadsheet application for further analysis.
To run the analysis, first select the sets of networks you wish to compare by dragging them from the network pane into the provided fields on the "Pairwise Data Analysis" tab. Pressing the "Run Analysis" button starts the computation and moves the user to the Task Manager tab, which shows the current progress of the whole operation. During the computation, the result table located just below the input sets is populated with results. To see more rows, the user can enlarge the table by dragging the middle line that separates the data input and results sections.
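The pairing rule described above can be illustrated with a short Python sketch. This is not GraphCrunch code: the function name `unique_pairs` is hypothetical, networks are represented only by their names, and the sketch assumes that a network is never compared with itself and that the pair (A, B) is considered a duplicate of (B, A).

```python
def unique_pairs(set_a, set_b):
    """Return the unordered pairs (a, b), with a from set_a and b from set_b,
    skipping self-pairs and pairs that were already produced in either order.

    Assumption: this mirrors the "unique pairs, excluding duplicates" rule
    described in the text; the actual GraphCrunch logic may differ.
    """
    seen = set()
    pairs = []
    for a in set_a:
        for b in set_b:
            if a == b:
                continue  # assumed: a network is not compared with itself
            key = frozenset((a, b))
            if key in seen:
                continue  # (b, a) was already scheduled as (a, b)
            seen.add(key)
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Assigning the same networks to both sets yields all-against-all pairs.
    networks = ["net1", "net2", "net3"]
    for first, second in unique_pairs(networks, networks):
        print(first, "vs", second)
```

Under these assumptions, placing the same three networks in both sets produces three comparisons rather than nine.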
Unlike Pairwise Data Analysis, Data vs Models Analysis focuses on one network or one set of networks. As the name suggests, this analysis models the loaded data networks using the selected models (and their parameters) and compares them to find the best-fitting model. The user is presented with 7 major models to choose from:
After any of the above models is selected, the Model parameters area changes to reflect that model, letting the user change the model's specific parameters, if it has any. For a detailed description of the role and effect of each parameter, please see the detailed manual.
On the right side of the window is the Models pane, which lists all models scheduled to be generated for each network. This list is initially empty. The user specifies which models should be generated, their parameters, and the number of instances of each model by using the appropriate controls and then clicking the Add button. Similarly, the user can remove models from the list by selecting them and clicking the Remove button.
In this way, the user can generate multiple models and instances with different parameters. Each model is characterized by its type, network indices, and parameters. The indices let the user identify later which parameters a generated model network has, for example when multiple model instances of the same type are computed with different parameters.
As before, clicking "Run comparisons" starts the computation, and the application switches to the Task Manager for an overview of the progress.
For both Pairwise Data Analysis and Data vs Models Analysis, GraphCrunch provides a graphical representation of the results in the Results Plot tab. After either of these analyses completes, the plot is updated with the data. Initially, all available data is shown on the plot, as indicated by the Data series list. The user can manually select or deselect any of the networks or models so that the plot shows only the desired subset of the results. Plots are available for all types of results produced by the analysis, and the selection can be changed using the Plot type drop-down box at the bottom. Finally, the user can save the plot as a PNG image at any time by clicking the Save plot button.
At times it is useful to assess some of the global properties of a network, such as degree distribution, clustering coefficient, and average path length. The Network properties tab provides a tabular view of the resulting clustering coefficient and average path length, together with a plot of the degree distribution.
To compute the properties, simply drag the desired networks from the network pane onto the properties table to start the computation. As each network is processed, both the table and the degree distribution plot are updated. By default the plot uses a linear scale, but two check boxes are provided to set either axis to a logarithmic scale. Finally, the user can zoom in on a particular section of the plot by dragging a selection box over it and zoom out by clicking the right mouse button.
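For readers who want to see what these three properties measure, the following standard-library Python sketch computes them for a small undirected network. It is an illustration only and not GraphCrunch's implementation: all function names are hypothetical, the clustering coefficient is assumed to be the average of the local clustering coefficients over all nodes (nodes of degree below 2 contribute 0), and the average path length is assumed to be taken over reachable node pairs only.

```python
from collections import Counter, deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree k to the number of nodes that have degree k."""
    return dict(Counter(len(nbrs) for nbrs in adj.values()))


def average_clustering(adj):
    """Average of local clustering coefficients (assumed definition)."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over reachable pairs, via BFS from every node."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1  # exclude the source itself
    return total / count if count else 0.0


if __name__ == "__main__":
    # A triangle (1, 2, 3) with one pendant node (4) attached to node 3.
    adj = build_adjacency([(1, 2), (2, 3), (1, 3), (3, 4)])
    print("degree distribution:", degree_distribution(adj))
    print("clustering coefficient:", round(average_clustering(adj), 4))
    print("average path length:", round(average_path_length(adj), 4))
```

The degree distribution returned here is the raw count per degree; plotting it with both axes on a logarithmic scale is the usual way to inspect heavy-tailed distributions.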
GraphCrunch 2 provides the user with an option to align two networks based solely on their topology, using a novel algorithm called GRAAL. In the appropriately named tab, the user can choose two networks to be aligned (by dragging them from the network pane) and specify the Alpha parameter, a value between 0.1 and 1.0 that influences the aggressiveness of the algorithm. Upon successful completion of the algorithm, Node correctness (NC), Edge correctness (EC), and Interaction correctness (IC) are displayed, indicating how many of the nodes, edges, and interactions, respectively, have been aligned correctly. In addition, the full alignment is displayed in the result table below.
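To make two of the reported scores more concrete, here is a minimal Python sketch of how node and edge correctness could be computed from an alignment. It rests on assumptions that the text does not spell out: edge correctness is taken to be the fraction of edges in the first network whose aligned endpoints are also connected in the second network, and node correctness is taken to be the fraction of nodes mapped to their known true counterparts, which requires a reference mapping. The function names and the sample data are hypothetical, and interaction correctness is left out because its definition is not given here.

```python
def edge_correctness(edges_1, edges_2, alignment):
    """Fraction of edges in network 1 that the alignment maps onto edges of network 2.

    alignment is a dict {node_in_network_1: node_in_network_2}.
    Assumed definition; GraphCrunch's exact formula may differ.
    """
    target = {frozenset((a, b)) for a, b in edges_2}
    total = 0
    preserved = 0
    for u, v in edges_1:
        total += 1
        if u in alignment and v in alignment:
            if frozenset((alignment[u], alignment[v])) in target:
                preserved += 1
    return preserved / total if total else 0.0


def node_correctness(alignment, true_mapping):
    """Fraction of nodes aligned to their true counterpart.

    true_mapping is a hypothetical reference mapping, available only when
    the correct node correspondence is known in advance.
    """
    if not true_mapping:
        return 0.0
    correct = sum(1 for node, partner in true_mapping.items()
                  if alignment.get(node) == partner)
    return correct / len(true_mapping)


if __name__ == "__main__":
    edges_1 = [("a", "b"), ("b", "c"), ("a", "c")]
    edges_2 = [("x", "y"), ("y", "z")]
    alignment = {"a": "x", "b": "y", "c": "z"}
    reference = {"a": "x", "b": "y", "c": "w"}
    print("EC:", edge_correctness(edges_1, edges_2, alignment))
    print("NC:", node_correctness(alignment, reference))
```

In this toy example two of the three edges of the first network are preserved by the alignment, and two of the three nodes match the reference mapping.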
The user can also opt to compute signature similarities and cluster the network according to that measure. In the "Signatures" tab, the user can specify the networks for which to compute signature similarities and the file in which to save them.
Afterwards, the user can cluster the network into K clusters, using the computed signature similarities as the similarity measure, by checking the option "I want to run clustering for Network 1", or opt to cluster using a different similarity measure. In the latter case, the user needs to provide a file with the corresponding similarities between nodes in the designated field.
After the computation, results are displayed in the result table below, with the cluster center in the first column and all other nodes of the cluster in the second column.