NSF
project award number: *0644424*

NSF
directorate: CISE

NSF
Division: IIS

PI:
*Natasa
Przulj*

Project
start date: January 15, 2007

Expected
project end date: December 31, 2011

The
project proposes improvements in tools for analyzing and modeling of
Protein-Protein Interaction (PPI) networks.
A PPI network is a mathematical representation of the physical
interactions amongst proteins in a cell: it is a graph in which nodes represent
proteins and edges between the nodes represent possible physical interactions
between the corresponding proteins.
Currently available PPI networks are partial and mostly resulting from
various high throughput proteomics experiments that attempt to discover as many
PPIs as possible in a number of organisms.
The resulting PPI data sets are large including tens of thousands of
interactions amongst thousands of proteins even for a unicellular eukaryotic
organism such as baker’s yeast.

The
project introduces new, more sensitive measures of network local structure that
are based on enumeration of “graphlets”, small induced subgraphs of large
graphs, as well as on distinguishing during the enumeration between the nodes
in a graphlet that cannot be mapped to one another by an automorphism (an
isomorphism of a graph with itself). In
this way, the degree distribution is generalized into the spectrum of “graphlet
degree distributions”. These new
measures produce graphlet-based network “feature vectors” that are further used
to heuristically compare synthetic (model based) networks to experimentally
determined PPI networks. The project’s
preliminary results indicate that geometric random graphs provide a superior
fit to the currently available PPI networks than do other network models including
scale-free networks. The project
proposes to build upon these preliminary results towards an efficient PPI
network alignment heuristic that would incorporate biological information of
the proteins in the networks being aligned, such as protein functional,
structural, and amino-acid sequence similarity.
Furthermore, the project proposes to use geometric random graph models
of PPI networks as the basis for algorithmic strategies that would optimize the
cost of experimentally exploring the entire space of protein-protein
interactions in the cell. The use of geometric random graph models and
associated graphlet-based network feature vectors can be extended to other
network modeling applications, such as epithelial planar cell polarization,
chemical informatics, social network analyses, etc.

It
is expected that the project will lead to better algorithms for various network
comparison problems on PPI and other networks.
Exploitation of a network model may significantly reduce the cost of
characterizing the interactomes of various organisms.

The
PI is involved in training high school, undergraduate and graduate students,
including two female students. This work
is being included in various UC Irvine courses.
The software will be provided as a free open-source toolkit for other
researchers. Additional information will
be available at http://www.ics.uci.edu/~natasha/.

*Oleksii
Kuchaiev* -- Ph.D. student

*Vesna
Memisevic* -- Ph.D. student

*Tijana
Milenkovic* -- Ph.D. student

*Marija
Rasajski* -- post-doc

Raymond
Yu -- undergraduate assistant

David
Hubin -- undergraduate assistant.

Jason
Lai -- undergraduate assistant.

Naveen
Nathan – undergraduate assistant.

**9.** Tijana Milenkovic, Ioannis
Filippis, Michael Lappe, and Natasa Przulj, * Optimized
Null Model for Residue Interaction Graphs*, submitted.

**8. N. Przulj** and T. Milenkovic, * Computatioanl Methods for Analyzing and Modeling Biological Networks*,
a chapter in “Biological Data Mining”, edited by Jake Chen and Stefano
Lonardi,, CRC Press, forthcoming.

**7.** O. Kuchaiev
and **N. Przulj**, *Learning the structure of protein-protein interaction networks*,
Proceedings of the *2009 Pacific Symposium on Biocomputing* (PSB), Big Island,
Hawaii, January 5-9, 2009.

**6.** C.
Guerrero, T. Milenkovic, **N. Przulj**,
P. Kaiser, L. Huang, *Characterization of the proteasome interaction
network using a QTAX-based tag-team strategy and protein interaction network
analysis,* *PNAS*, 105 (36), 13333-13338, 2008.

**5.** Tijana Milenkovic and
Natasa Przulj, *Uncovering Biological Network Function via Graphlet Degree Signatures*,
*Cancer Informatics*, 2008:6 257-273,
2008.

Also:
Technical Report No. 08-01, *arXiv:0802.0556v1*
[q-bio MN] 5 Feb 2008.

**4.** Desmond J. Higham, Marija
Rasajskim, and Natasa Przulj, *Fitting a Geometric Graph to a Protein-Protein Interaction Network*,
*Bioinformatics**,* 24
(8):1093-1099, 2008.

**3.** Tijana Milenkovic, Jason
Lai, and Natasa Przulj, *GraphCrunch: A Tool for Large Network
Analyses*, *BMC Bioinformatics*,** ****9**:70,
January 30, 2008. *Highly accessed*.

**2.** Natasa Przulj, *Geometric
Local Structure in Biological Networks*, *Information
Theory Workshop, 2007. ITW '07. IEEE, (2007)*, p. 402.,
10.1109/ITW.2007.4313108 .

**1.** Fereydoun Hormozdiari ,
Petra Berenbrink, Natasa Przulj, S. Cenk Sahinalp, *Not all scale-free networks are born
equal: The role of the seed graph in PPI network evolution*, *PLoS Computational Biology*,
vol. 3, (2007), p. e118., doi:10.1371/journal.pcbi.0030118.

The
project was funded under the NSF *Faculty Early Career Development (CAREER)
Program*.

*Page last updated on
September 22, 2008.*