Instructions for ColumnDist2diverstest.pl

This program reformats a column distance matrix into the input format required by the web tool Two-Sample Tests for Comparing Intra-Individual Genetic Sequence Diversity Between Populations.

Input

  1. Column distance matrix file (a space- or tab-delimited text file whose last three columns must be sequence id1, sequence id2, and the distance between the two sequences).

    Example:

    seq1	seq10	0.01486108
    seq1	seq2	0.00000000
    seq1	seq3	0.00000000
    seq1	seq4	0.00000000
    seq1	seq5	0.00162075
    seq1	seq6	0.01315215
    seq1	seq7	0.01803690
    seq1	seq8	0.01151463
    seq1	seq9	0.01636412
    seq2	seq10	0.01486108
    seq2	seq3	0.00000000
    seq2	seq4	0.00000000
    seq2	seq5	0.00162075
    seq2	seq6	0.01315215
    seq2	seq7	0.01803690
    seq2	seq8	0.01151463
    seq2	seq9	0.01636412
    seq3	seq10	0.01486108
    seq3	seq4	0.00000000
    seq3	seq5	0.00162075
    seq3	seq6	0.01315215
    seq3	seq7	0.01803690
    seq3	seq8	0.01151463
    seq3	seq9	0.01636412
    seq4	seq10	0.01486108
    seq4	seq5	0.00162075
    seq4	seq6	0.01315215
    seq4	seq7	0.01803690
    seq4	seq8	0.01151463
    seq4	seq9	0.01636412
    seq5	seq10	0.01648743
    seq5	seq6	0.01480351
    seq5	seq7	0.01970637
    seq5	seq8	0.01315187
    seq5	seq9	0.01802527
    seq6	seq10	0.00814550
    seq6	seq7	0.01638615
    seq6	seq8	0.00487810
    seq6	seq9	0.00980428
    seq7	seq10	0.01807237
    seq7	seq8	0.01474021
    seq7	seq9	0.01975881
    seq8	seq10	0.00650614
    seq8	seq9	0.00815682
    seq9	seq10	0.01144043
    			
  2. Group file (a space- or tab-delimited text file that defines the relationships among groups, subjects, and sequences. The first three columns must be group id, subject id, and sequence id).

    Example:

    group1	subject1   seq1
    group1	subject1   seq2 
    group1	subject1   seq3 
    group1	subject1   seq4 
    group1	subject1   seq5 
    group2	subject2   seq6
    group2	subject2   seq7
    group2	subject2   seq8
    group2	subject2   seq9
    group2	subject2   seq10
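
The two input layouts described above can be read with a few lines of code. The following is a minimal Python sketch, not the source of ColumnDist2diverstest.pl; the function names parse_distances and parse_groups are hypothetical, and skipping lines with fewer than three fields is an assumption about how malformed lines might be handled.

```python
def parse_distances(lines):
    """Return {frozenset({id1, id2}): distance} from column-distance lines."""
    distances = {}
    for line in lines:
        fields = line.split()  # splits on any run of spaces or tabs
        if len(fields) < 3:
            continue  # assumption: skip blank or malformed lines
        id1, id2, dist = fields[-3:]  # only the LAST three columns are used
        distances[frozenset((id1, id2))] = float(dist)
    return distances


def parse_groups(lines):
    """Return a list of (group, subject, sequence) tuples in file order."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # assumption: skip blank or malformed lines
        group, subject, seq = fields[:3]  # only the FIRST three columns are used
        records.append((group, subject, seq))
    return records


distances = parse_distances([
    "seq1\tseq2\t0.00000000",
    "extra seq1 seq5 0.00162075",  # a leading extra column is ignored
])
groups = parse_groups(["group1\tsubject1   seq1 ", "group2 subject2 seq6"])
print(distances[frozenset(("seq5", "seq1"))])  # lookup works in either order
print(groups[0])
```

Keying each distance by an unordered pair means a lookup succeeds regardless of which sequence id appears first in the file.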
    
    

Output

  1. Distance file in diverstest format.

    Example:

    1	2	1	1	0.00000000
    1	3	1	1	0.00000000
    1	4	1	1	0.00000000
    1	5	1	1	0.00162075
    2	3	1	1	0.00000000
    2	4	1	1	0.00000000
    2	5	1	1	0.00162075
    3	4	1	1	0.00000000
    3	5	1	1	0.00162075
    4	5	1	1	0.00162075
    1	2	2	2	0.01638615
    1	3	2	2	0.00487810
    1	4	2	2	0.00980428
    1	5	2	2	0.00814550
    2	3	2	2	0.01474021
    2	4	2	2	0.01975881
    2	5	2	2	0.01807237
    3	4	2	2	0.00815682
    3	5	2	2	0.00650614
    4	5	2	2	0.01144043        	
            	
  2. Information file that maps the subject and group indices used in the output distance file to the subject and group ids in the input group file.

    Example:

    Subject index	Subject
    1		subject1
    2		subject2
    
    Group index	Group
    1		group1
    2		group2        	
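
Judging from the examples above, the output keeps only pairs of sequences from the same subject, renumbers the sequences 1..n within each subject in group-file order, and replaces subject and group ids with indices. The Python sketch below reproduces that mapping under stated assumptions; it is not the Perl script's implementation. The function name to_diverstest is hypothetical, the column order (sequence index 1, sequence index 2, subject index, group index, distance) is an assumption because the example cannot distinguish the third and fourth columns, and skipping pairs missing from the input is also an assumption.

```python
from itertools import combinations


def to_diverstest(group_records, distances):
    """group_records: (group, subject, sequence) tuples in group-file order.
    distances: {(id1, id2): distance_string} from the column distance file.
    Returns (rows, subject_index, group_index)."""
    subject_index, group_index = {}, {}
    members = {}        # subject -> sequence ids in file order
    subject_group = {}  # subject -> group
    for group, subject, seq in group_records:
        # Indices are assigned 1, 2, ... in order of first appearance.
        group_index.setdefault(group, len(group_index) + 1)
        subject_index.setdefault(subject, len(subject_index) + 1)
        members.setdefault(subject, []).append(seq)
        subject_group[subject] = group

    rows = []
    for subject, seqs in members.items():
        # Sequences are renumbered 1..n within each subject.
        for (i, a), (j, b) in combinations(enumerate(seqs, start=1), 2):
            # The pair may be stored in either order in the input file.
            dist = distances.get((a, b), distances.get((b, a)))
            if dist is None:
                continue  # assumption: skip pairs missing from the input
            rows.append((i, j, subject_index[subject],
                         group_index[subject_group[subject]], dist))
    return rows, subject_index, group_index


records = [("group1", "subject1", "seq1"), ("group1", "subject1", "seq2"),
           ("group2", "subject2", "seq6"), ("group2", "subject2", "seq7"),
           ("group2", "subject2", "seq10")]
dist = {("seq1", "seq2"): "0.00000000", ("seq6", "seq7"): "0.01638615",
        ("seq6", "seq10"): "0.00814550", ("seq7", "seq10"): "0.01807237",
        ("seq1", "seq10"): "0.01486108"}  # between-subject pair, not emitted
rows, subj_idx, grp_idx = to_diverstest(records, dist)
for row in rows:
    print("\t".join(map(str, row)))
```

Distances are kept as strings here so that values such as 0.00000000 are written out exactly as they were read. The returned subj_idx and grp_idx dictionaries hold the same correspondence that the information file records.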
            	

How to run

  1. Download ColumnDist2diverstest.pl to your local computer and make it executable by typing the following command in your working directory:

    chmod +x ColumnDist2diverstest.pl

  2. Create a subdirectory in your working directory to hold the input and output files. In the subdirectory, type the following command:

    perl ../ColumnDist2diverstest.pl InputColumnDistanceFile InputGroupFile OutputDistanceFile OutputInfoFile