DIVER Help

Sequence file

The input sequence file is an alignment of nucleotide or amino-acid sequences. We accept three sequence file formats: fasta, nexus and phylip. Nexus and phylip files can be in either sequential or interleaved format. Sequence names MUST NOT contain any whitespace characters (tab or space). We recommend that sequence names consist only of letters, digits and underscores. The program will replace any other character with an underscore "_".
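The renaming rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the function name sanitize_name is hypothetical, and treating "alphabetical characters" as the ASCII letters A-Z and a-z is an assumption.

```python
import re

def sanitize_name(name: str) -> str:
    """Replace every character that is not an ASCII letter, digit or
    underscore with an underscore (assumed reading of the rule above)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)

print(sanitize_name("ML1365-1.a"))  # ML1365_1_a
```

Checking names this way before uploading makes it easier to keep the outgroup and group files consistent with the names DIVER will actually use.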

Example of Nexus file in sequential format:

#NEXUS 

BEGIN DATA;
	DIMENSIONS  NTAX=12 NCHAR=150;
	FORMAT DATATYPE=DNA  MISSING=? GAP=- ;
MATRIX

092398_316    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_339    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_315    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_317    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_312    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_889    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACTCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_894    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCCAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACGACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_896    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_916    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_917    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
ML1365_1      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
ML1365_2      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
;
END;

Example of Nexus file in interleaved format:

#NEXUS 

BEGIN DATA;
	DIMENSIONS  NTAX=12 NCHAR=150;
	FORMAT DATATYPE=DNA  MISSING=? GAP=-  INTERLEAVE ;
MATRIX

092398_316    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_339    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_315    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_317    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_312    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_889    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_894    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCCAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_896    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_916    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_917    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
ML1365_1      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
ML1365_2      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA

092398_316    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_339    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_315    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_317    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_312    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_889    AATTAATTGTACAAGACTCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_894    AATTAATTGTACAAGACCCAACGACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_896    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_916    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_917    AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
ML1365_1      AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
ML1365_2      AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
;
END;

Example of phylip file in sequential format:

12 150
092398_316    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_339    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_315    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_317    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
092398_312    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_889    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACTCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_894    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCCAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACGACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_896    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_916    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
082599_917    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
ML1365_1      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
ML1365_2      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACAAATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT

Example of phylip file in interleaved format:

12 150
092398_316    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_339    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_315    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_317    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
092398_312    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_889    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_894    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCCAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_896    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_916    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
082599_917    GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
ML1365_1      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
ML1365_2      GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA

AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACTCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACGACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT

Example of fasta file:

>092398_316    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>092398_339    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>092398_315    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>092398_317    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>092398_312    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>082599_889    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACTCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>082599_894    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCCAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACGACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>082599_896    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>082599_916    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>082599_917    
GAAGAAGAGGTAATAATTAGATCACAGAATTTCACGGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>ML1365_1      
GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
>ML1365_2      
GAAGAAGAGGTAATAATTAGATCACAAAATTTCACAGACAATGCTAAAACCATATTAGTACAGCTGAATGAAACTGTACA
AATTAATTGTACAAGACCCAACAACAATACAAGAAAAAGCATACATATAGCACCAGGGAGAGCATTTTAT
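
A fasta file like the one above can be checked before submission with a short script. The sketch below is not DIVER's own parser; the names read_fasta and check_alignment are hypothetical. It joins sequences that span several lines and verifies that all sequences have the same length, as expected of an alignment.

```python
def read_fasta(text: str) -> dict[str, str]:
    """Return {sequence name: sequence}, joining multi-line sequences."""
    parts: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()  # drop ">" and trailing whitespace
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}

def check_alignment(records: dict[str, str]) -> int:
    """Return the alignment length, or raise if the lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences have different lengths: {sorted(lengths)}")
    return lengths.pop()

example = ">seq_a   \nGAAG\nAAGA\n>seq_b\nGAAGAAGG\n"
records = read_fasta(example)
print(records, check_alignment(records))
```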

Sequence data type

DNA or amino acid. You must select the correct data type for your sequence alignment file.

Outgroup file (optional)

A text file containing a list of outgroup sequence names, one name per line. Each name must match a sequence name in the sequence file. If an outgroup file is provided, DIVER will calculate the Most Recent Common Ancestor (MRCA) and the divergence from the MRCA.

Example:

ML1365_1
ML1365_2
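
The matching requirement can be verified before upload. The following sketch assumes the sequence names are already available as a Python set; the function name read_outgroup is hypothetical and is not part of DIVER.

```python
def read_outgroup(text: str, sequence_names: set[str]) -> list[str]:
    """Read one outgroup name per line and check each against the alignment."""
    names = [line.strip() for line in text.splitlines() if line.strip()]
    missing = [n for n in names if n not in sequence_names]
    if missing:
        raise ValueError(f"outgroup names not in sequence file: {missing}")
    return names

alignment_names = {"092398_316", "082599_889", "ML1365_1", "ML1365_2"}
outgroup = read_outgroup("ML1365_1\nML1365_2\n", alignment_names)
print(outgroup)  # ['ML1365_1', 'ML1365_2']
```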

Group file (optional)

A text file containing two tab-delimited columns: (1) the group name and (2) the name of a sequence in that group, which must match a sequence name in the sequence file. For example, groups could be different sample time points or tissues/compartments. If a group file is provided, DIVER will calculate diversity for each defined group. Otherwise, all sequences are treated as a single default group and diversity is calculated among them.

Example:

092398	092398_316
092398	092398_339
092398	092398_315
092398	092398_317
092398	092398_312
082599	082599_889
082599	082599_894
082599	082599_896
082599	082599_916
082599	082599_917

If you want to calculate the divergence from a specified sequence, you should define that sequence as the MRCA in your group file. Type "MRCA" (not case-sensitive) in the group field (first column) and the name of the sequence in the sequence name field (second column):

MRCA	name_of_sequence

Perform bootstrap and Number of bootstrap replicates

DIVER generates bootstrapped pseudo data sets from the original data set using PhyML, then returns the bootstrap tree with branch lengths and bootstrap values in standard NEWICK format. Note that bootstrap analysis is a time-consuming process. The maximum number of bootstrap replicates is set to 100 because of computational resource limitations.
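
To illustrate what a bootstrapped pseudo data set is, the sketch below resamples alignment columns with replacement, which is the standard non-parametric bootstrap for alignments. It only shows the idea; PhyML performs this step internally, and the function name bootstrap_alignment is hypothetical.

```python
import random

def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Build one pseudo data set by sampling columns with replacement.

    The same sampled columns are used for every sequence, so each
    pseudo data set is still a valid alignment of the original length.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

alignment = {"seq_a": "GAAGAAGA", "seq_b": "GAAGAAGG"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

A tree is then estimated from each replicate, and the bootstrap value of a branch is the number of replicate trees that contain it.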

Compute aLRT

With the bootstrap option off, users can perform an approximate likelihood ratio test (aLRT) [1]. This approach is considerably faster than bootstrapping.

Substitution model

A nucleotide or amino-acid substitution model. DIVER implements a wide range of substitution models via PhyML: GTR (default) [2,3], JC69 [4], K80 [5], F81 [6], HKY85 [7] and TN93 [8] for nucleotide sequences; LG (default) [9], HIVbetween [10], HIVwithin [10], WAG [11], Dayhoff [12], JTT [13], Blosum62 [14], mtREV [15], rtREV [16], cpREV [17], DCMut [18], VT [19], MtArt [20] and MtMAM [21] for amino-acid sequences.

Equilibrium frequencies

Nucleotide or amino-acid frequencies. They can be optimized (default) or empirical.

Optimized:
    Nucleotide sequences: the equilibrium base frequencies are estimated using maximum likelihood.
    Amino-acid sequences: the equilibrium amino-acid frequencies are set to the frequencies defined by the substitution model.
Empirical:
    Nucleotide sequences: the equilibrium base frequencies are estimated by counting the occurrences of the different bases in the alignment.
    Amino-acid sequences: the equilibrium amino-acid frequencies are estimated by counting the occurrences of the different amino acids in the alignment.
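
The empirical option amounts to counting characters. The sketch below shows the idea for nucleotides. Ignoring gaps and ambiguity codes is an assumption made here for simplicity, and the function name empirical_frequencies is hypothetical; PhyML's own handling of such characters may differ.

```python
from collections import Counter

def empirical_frequencies(sequences: list[str], alphabet: str = "ACGT") -> dict[str, float]:
    """Estimate equilibrium frequencies by counting characters.

    Characters outside the alphabet (gaps, ambiguity codes) are
    skipped; this is an assumption of the sketch.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no countable characters in the alignment")
    return {ch: counts[ch] / total for ch in alphabet}

freqs = empirical_frequencies(["AACG", "AT-G"])
print(freqs)
```

Passing the twenty amino-acid letters as the alphabet gives the amino-acid case.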

Transition/transversion ratio

The ratio can be fixed at a positive value (default 4.0) or estimated in the maximum-likelihood framework. The latter makes the program slower. This option applies only to DNA sequences under the HKY85, K80 and TN93 substitution models. The definition of the transition/transversion ratio is the same as in PAML [22]. In PHYLIP, the "transition/transversion rate ratio" is used instead. A value of 4.0 in PhyML roughly corresponds to 2.0 in PHYLIP.
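
The factor of two can be made plausible with a small calculation. The sketch below assumes the HKY85 parameterization, in which the value entered here (kappa) multiplies the rate of each transition, and it assumes that the PHYLIP-style quantity is the expected number of transitions divided by the expected number of transversions. This page gives no formula, so both are stated assumptions, and the function name expected_ts_tv_ratio is hypothetical.

```python
def expected_ts_tv_ratio(kappa: float, freqs: dict[str, float]) -> float:
    """Expected transitions / expected transversions under HKY85-style rates.

    Transitions (A<->G, C<->T) occur at rate kappa * pi_j and
    transversions at rate pi_j, which gives
    kappa * (piA*piG + piC*piT) / ((piA + piG) * (piC + piT)).
    """
    a, c, g, t = (freqs[base] for base in "ACGT")
    purines, pyrimidines = a + g, c + t
    return kappa * (a * g + c * t) / (purines * pyrimidines)

equal = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(expected_ts_tv_ratio(4.0, equal))  # 2.0
```

With equal base frequencies the result is exactly kappa / 2. With unequal frequencies the correspondence is only approximate, which is consistent with the word "roughly" above.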

Proportion of invariable sites

The proportion of invariable sites, i.e., the expected frequency of sites that do not evolve, can be fixed to any value in the 0.0-1.0 range or estimated (default) from the data in the maximum-likelihood framework. The latter makes the program slower.

Number of substitution rate categories

The different categories correspond to different rates of evolution from site to site. A discrete gamma distribution is used to account for variation in substitution rates among sites, and the number of categories that defines this distribution can be supplied by the user. The larger this number, the better the discrete distribution approximates the continuous one. The number of categories is set to 4 by default. In this case, the likelihood of the phylogeny at one site is averaged over four conditional likelihoods corresponding to four rates, and the computation of the likelihood is four times slower than with a single rate. Fewer than four or more than eight categories are not recommended. In the first case, the discrete distribution is a poor approximation of the continuous one. In the second case, the computational burden becomes high, and a higher number of categories is not likely to enhance the accuracy of phylogeny estimation.
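
The averaging step described above can be written out directly. In the sketch below the conditional likelihoods are made-up numbers used purely for illustration, the categories are assumed to be equally probable, and the function names are hypothetical. In a real analysis each conditional likelihood comes from evaluating the tree at one site with the branch lengths scaled by one category's rate.

```python
import math

def site_likelihood(conditional_likelihoods: list[float]) -> float:
    """Average the per-category likelihoods of one site (equal weights)."""
    return sum(conditional_likelihoods) / len(conditional_likelihoods)

def log_likelihood(per_site: list[list[float]]) -> float:
    """Sum the log of the averaged likelihood over all sites."""
    return sum(math.log(site_likelihood(cond)) for cond in per_site)

# Two sites, four rate categories each (illustrative values only).
per_site = [[0.1, 0.2, 0.3, 0.4], [0.02, 0.04, 0.01, 0.01]]
print(site_likelihood(per_site[0]), log_likelihood(per_site))
```

Each extra category adds one more conditional likelihood per site, which is why the cost grows in proportion to the number of categories.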

Gamma distribution parameter

The shape of the gamma distribution determines the range of rate variation across sites. This option is used when there is more than one substitution rate category. Small values, typically in the 0.1-1.0 range, correspond to large variability. The higher the value, the lower the variation of substitution rates among sites. The gamma shape parameter can be fixed by the user or estimated (default) via maximum likelihood.

Type of tree improvement

There are three approaches to estimating tree topologies. The default approach is to use simultaneous nearest-neighbour interchange (NNI) [23]. The second approach relies on subtree pruning and regrafting (SPR) [24]. It generally finds better tree topologies than NNI but is also significantly slower. The third approach, Best of NNI & SPR, simply estimates the phylogeny using both methods and returns the better of the two solutions.

Optimize tree topology, branch lengths and substitution rate parameters

By default, all three are optimised in order to maximise the likelihood. Users can also choose to optimise only some of them, in different combinations.

Your email

Users must provide a valid email address to receive the results. The email contains a link to the analysis results.

Calculate diversity and/or divergence based on tree or pairwise distances

Users can specify whether divergence and diversity are calculated from tree-based (patristic) distances, from genetic distances (not conditioned on a tree topology), or from both.

Distance file

Users can provide their own distance matrix to calculate divergence and diversity via DIVER. DIVER accepts two types of distance arrays (matrix and column). The data must be tab delimited.

Examples of matrix:

lower-triangular:

    	 taxa1	 taxa2	 taxa3	 taxa4
taxa1
taxa2	0.0056
taxa3	0.0027	0.0138	
taxa4	0.0078	0.0023	0.0123

upper-triangular:

     	taxa1	 taxa2	 taxa3	 taxa4
taxa1	     	0.0056	0.0027	0.0078
taxa2	     	      	0.0138	0.0023
taxa3	     	      	      	0.0123
taxa4	     	      	

square:

     	 taxa1	 taxa2	 taxa3	 taxa4
taxa1	0.0000	0.0056	0.0027	0.0078
taxa2	0.0056	0.0000	0.0138	0.0023
taxa3	0.0027	0.0138	0.0000	0.0123
taxa4	0.0078	0.0023	0.0123	0.0000

Example of column:

taxa1	taxa2	0.0056
taxa1	taxa3	0.0027
taxa1	taxa4	0.0078
taxa2	taxa3	0.0138
taxa2	taxa4	0.0023
taxa3	taxa4	0.0123
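
The column format is straightforward to read programmatically. The sketch below parses it and computes the mean pairwise distance among a set of taxa. Using the mean pairwise distance as the summary is an assumption made for illustration, since this page does not give DIVER's exact formula, and both function names are hypothetical.

```python
from itertools import combinations

def read_column_distances(text: str) -> dict[frozenset, float]:
    """Parse 'name<TAB>name<TAB>distance' lines into an unordered-pair lookup."""
    dist = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        first, second, value = line.split("\t")
        dist[frozenset((first.strip(), second.strip()))] = float(value)
    return dist

def mean_pairwise_distance(dist: dict[frozenset, float], names: list[str]) -> float:
    """Average the distances over all pairs of the given taxa."""
    pairs = list(combinations(names, 2))
    if not pairs:
        raise ValueError("need at least two taxa")
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)

column = ("taxa1\ttaxa2\t0.0056\ntaxa1\ttaxa3\t0.0027\ntaxa1\ttaxa4\t0.0078\n"
          "taxa2\ttaxa3\t0.0138\ntaxa2\ttaxa4\t0.0023\ntaxa3\ttaxa4\t0.0123\n")
dist = read_column_distances(column)
print(mean_pairwise_distance(dist, ["taxa1", "taxa2", "taxa3"]))
```

Storing each pair as a frozenset means the lookup works regardless of the order in which the two names appear in the file.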

Layouts of DIVER analysis results

(A) DIVER output interface from which users can view and download results. (B) Estimated maximum likelihood phylogenetic tree viewed through the ATV Java applet. (C) Estimated evolutionary parameters. (D) Reconstructed MRCA sequence. (E) Summarized divergence. (F) Plot of divergence. (G) Summarized diversity. (H) Plot of diversity. (I) Distance matrix. (J) Distance distribution histogram. (K) Summarized distances between different groups.

References

[1] Anisimova M., Gascuel O. Approximate Likelihood-Ratio Test for Branches: A Fast, Accurate, and Powerful Alternative. Systematic Biology, 55:539-552 (2006).
[2] Lanave, C., Preparata, G., Saccone, C. & Serio, G. A new method for calculating evolutionary substitution rates. Journal of Molecular Evolution 20:86-93 (1984).
[3] Tavare, S. Some probabilistic and statistical problems on the analysis of DNA sequences. Lectures on Mathematics in the Life Sciences 17: 57-86 (1986).
[4] Jukes, T. & Cantor, C. Evolution of protein molecules. In Munro, H. (ed.) Mammalian Protein Metabolism, vol. III, chap. 24, 21-132 (Academic Press, New York, 1969).
[5] Kimura, M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16:111-120 (1980).
[6] Felsenstein, J. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17:368-376 (1981).
[7] Hasegawa, M., Kishino, H. & Yano, T. Dating of the Human-Ape splitting by a molecular clock of mitochondrial-DNA. Journal of Molecular Evolution 22:160-174 (1985).
[8] Tamura, K. & Nei, M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution 10:512-526 (1993).
[9] Le, S. & Gascuel, O. An improved general amino-acid replacement matrix. Mol. Biol. Evol. 25:1307-1320 (2008).
[10] Nickle DC, Heath L, Jensen MA, Gilbert PB, Mullins JI, Kosakovsky Pond SL. HIV-specific probabilistic models of protein evolution. PLoS ONE 2:e503 (2007).
[11] Whelan, S. & Goldman, N. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Molecular Biology and Evolution 18:691-699 (2001).
[12] Dayhoff, M., Schwartz, R. & Orcutt, B. A model of evolutionary change in proteins. In Dayhoff, M. (ed.) Atlas of Protein Sequence and Structure, vol. 5, 345-352 (National Biomedical Research Foundation, Washington, D. C., 1978).
[13] Jones, D., Taylor, W. & Thornton, J. The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences (CABIOS) 8:275-282 (1992).
[14] Henikoff, S. & Henikoff, J. Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences of the United States of America (PNAS) 89:10915-10919 (1992).
[15] Adachi, J. & Hasegawa, M. MOLPHY version 2.3. programs for molecular phylogenetics based on maximum likelihood. In Ishiguro, M. et al. (eds.) Computer Science Monographs, 28 (The Institute of Statistical Mathematics, Tokyo, 1996).
[16] Dimmic, M., Rest, J., Mindell, D. & Goldstein, D. rtREV: an amino acid substitution matrix for inference of retrovirus and reverse transcriptase phylogeny. Journal of Molecular Evolution 55:65-73 (2002).
[17] Adachi, J., Waddell, P., Martin, W. & Hasegawa, M. Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA. Journal of Molecular Evolution 50:348-358 (2000).
[18] Kosiol, C. & Goldman, N. Different versions of the Dayhoff rate matrix. Molecular Biology and Evolution 22:193-199 (2004).
[19] Muller, T. & Vingron, M. Modeling amino acid replacement. Journal of Computational Biology 7:761-776 (2000).
[20] Abascal F, Posada D, Zardoya R. MtArt: a new model of amino acid replacement for Arthropoda. Mol Biol Evol. 24:1-5 (2007).
[21] Cao, Y. et al. Conflict among individual mitochondrial proteins in resolving the phylogeny of eutherian orders. Journal of Molecular Evolution 47:307-322 (1998).
[22] Yang Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol. 39:306-14 (1994).
[23] Guindon, S. & Gascuel, O. A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology 52:696-704 (2003).
[24] Hordijk W., Gascuel O. Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood. Bioinformatics 21:4338-4347 (2005).