NSF CAREER: Tools for Analyzing, Modeling, and Comparing Protein-Protein Interaction Networks

 

NSF project award number: 0644424

NSF directorate: CISE

NSF Division: IIS

 

PI: Natasa Przulj

Project start date: January 15, 2007

Expected project end date: December 31, 2011

 

The award abstract:

The project proposes improvements in tools for analyzing and modeling of Protein-Protein Interaction (PPI) networks.  A PPI network is a mathematical representation of the physical interactions amongst proteins in a cell: it is a graph in which nodes represent proteins and edges between the nodes represent possible physical interactions between the corresponding proteins.  Currently available PPI networks are partial and mostly resulting from various high throughput proteomics experiments that attempt to discover as many PPIs as possible in a number of organisms.  The resulting PPI data sets are large including tens of thousands of interactions amongst thousands of proteins even for a unicellular eukaryotic organism such as baker’s yeast.

 

The project introduces new, more sensitive measures of network local structure that are based on enumeration of “graphlets”, small induced subgraphs of large graphs, as well as on distinguishing during the enumeration between the nodes in a graphlet that cannot be mapped to one another by an automorphism (an isomorphism of a graph with itself).  In this way, the degree distribution is generalized into the spectrum of “graphlet degree distributions”.  These new measures produce graphlet-based network “feature vectors” that are further used to heuristically compare synthetic (model based) networks to experimentally determined PPI networks.  The project’s preliminary results indicate that geometric random graphs provide a superior fit to the currently available PPI networks than do other network models including scale-free networks.  The project proposes to build upon these preliminary results towards an efficient PPI network alignment heuristic that would incorporate biological information of the proteins in the networks being aligned, such as protein functional, structural, and amino-acid sequence similarity.  Furthermore, the project proposes to use geometric random graph models of PPI networks as the basis for algorithmic strategies that would optimize the cost of experimentally exploring the entire space of protein-protein interactions in the cell. The use of geometric random graph models and associated graphlet-based network feature vectors can be extended to other network modeling applications, such as epithelial planar cell polarization, chemical informatics, social network analyses, etc.

 

It is expected that the project will lead to better algorithms for various network comparison problems on PPI and other networks.  Exploitation of a network model may significantly reduce the cost of characterizing the interactomes of various organisms.

 

The PI is involved in training high school, undergraduate and graduate students, including two female students.  This work is being included in various UC Irvine courses.  The software will be provided as a free open-source toolkit for other researchers.  Additional information will be available at http://www.ics.uci.edu/~natasha/.

 

Trainees:

 

Oleksii Kuchaiev -- Ph.D. student

Vesna Memisevic -- Ph.D. student

Tijana Milenkovic -- Ph.D. student

Marija Rasajski -- post-doc

Raymond Yu -- undergraduate assistant

 

Former group members:

 

David Hubin -- undergraduate assistant.

Jason Lai -- undergraduate assistant.

Naveen Nathan – undergraduate assistant.

 

Publications:

 

9. Tijana Milenkovic, Ioannis Filippis, Michael Lappe, and Natasa Przulj, Optimized Null Model for Residue Interaction Graphs, submitted.

 

8. N. Przulj and T. Milenkovic, Computatioanl Methods for Analyzing and Modeling Biological Networks, a chapter in “Biological Data Mining”, edited by Jake Chen and Stefano Lonardi,, CRC Press, forthcoming.

 

7. O. Kuchaiev and N. Przulj, Learning the structure of protein-protein interaction networks, Proceedings of the 2009 Pacific Symposium on Biocomputing (PSB), Big Island, Hawaii, January 5-9, 2009.

 

6. C. Guerrero, T. Milenkovic, N. Przulj, P. Kaiser, L. Huang, Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis, PNAS, 105 (36), 13333-13338, 2008.

 

5. Tijana Milenkovic and Natasa Przulj, Uncovering Biological Network Function via Graphlet Degree Signatures, Cancer Informatics, 2008:6 257-273, 2008.

Also: Technical Report No. 08-01, Donald Bren School of Information and Computer Sciences, University of California, Irvine, USA. arXiv:0802.0556v1 [q-bio MN] 5 Feb 2008.

 

4. Desmond J. Higham, Marija Rasajskim, and Natasa Przulj, Fitting a Geometric Graph to a Protein-Protein Interaction Network, Bioinformatics, 24 (8):1093-1099, 2008.

 

3. Tijana Milenkovic, Jason Lai, and Natasa Przulj, GraphCrunch: A Tool for Large Network Analyses, BMC Bioinformatics, 9:70, January 30, 2008.  Highly accessed.

 

2. Natasa Przulj, Geometric Local Structure in Biological Networks, Information Theory Workshop, 2007. ITW '07. IEEE, (2007), p. 402., 10.1109/ITW.2007.4313108 .

 

1. Fereydoun Hormozdiari , Petra Berenbrink, Natasa Przulj, S. Cenk Sahinalp, Not all scale-free networks are born equal: The role of the seed graph in PPI network evolution, PLoS Computational Biology, vol. 3, (2007), p. e118., doi:10.1371/journal.pcbi.0030118.

 

 

The project was funded under the NSF Faculty Early Career Development (CAREER) Program.

 

Page last updated on September 22, 2008.