The following paper gives a very nice example of a biological network analysis in which many of the nine phases of a network analytic project can be found:

Christine Vogel, Sarah A. Teichmann, and Jose Pereira-Leal

"The Relationship between Domain Duplication and Recombination", J. Mol. Biol. 346(1):355-65, 2005

0) Pose the question: Are domain combinations just determined by random combinations of domains with different abundance or is there a further step of selection?

1) Build the network: Domains are characterized into different superfamilies and two superfamilies are connected by a directed edge if they are directly adjacent in the N->C direction of a gene in at least one gene.

2) Choose subgraph H: ego-network

3) Choose measure: size of ego-network (versatility) vs. external measure, namely the abundance of the node (number of times the domain appears in the genome).

4) Null-model: random combinations of domains while maintaining the abundance of the node

First result: the relationship between versatility and abundance is very stable, independent of how the data set is divided up into different categories. Tested were: division by structural class, phylogeny, protein's function, or the process it contributes to. This result thus shows that the null-model of random combination cannot be rejected.

5) Designing the model: the null-model maintains the abundance of each domain, the number of domains per gene, and their total number per genome.

6) Comparison observation/model: The versatility of the nodes in this model is much higher than in reality, i.e., in the model domains have fewer adjacent domains than in reality.

7) Design of a new model that suits the findings better: here, domains are shuffled and then duplicated until one of the component domains is used up.

Processes were not tested on any of the models, thus steps 8/9 are non-existing.

## No comments:

## Post a Comment