Instructions for parseBlastXML_calcFreq.pl
This script parses the xml output file from NCBI's Blastn program and calculates nucleotide frequencies of query sequences at each position of the subject (reference) sequence.
Input
- Blastn output xml file.
- Subject (reference) sequence fasta file.
- Cutoff value (0-1) for the fraction of the query sequence that aligned to the subject sequence. An alignment whose fraction is below the cutoff is considered random. The default value is 0.6.
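To illustrate how the cutoff might behave, here is a minimal Python sketch. It is not the Perl script's actual code. It assumes the fraction is the aligned query span divided by the query length, and that alignments below the cutoff are dropped. The element names are those of the BLAST XML format; the sample fragment, `aligned_fraction`, and `hsps_passing_cutoff` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment using BLAST XML element names; not real blastn output.
SAMPLE_XML = """
<Iteration>
  <Iteration_query-def>read1</Iteration_query-def>
  <Iteration_query-len>100</Iteration_query-len>
  <Iteration_hits>
    <Hit>
      <Hit_hsps>
        <Hsp>
          <Hsp_query-from>11</Hsp_query-from>
          <Hsp_query-to>90</Hsp_query-to>
        </Hsp>
        <Hsp>
          <Hsp_query-from>1</Hsp_query-from>
          <Hsp_query-to>30</Hsp_query-to>
        </Hsp>
      </Hit_hsps>
    </Hit>
  </Iteration_hits>
</Iteration>
"""


def aligned_fraction(query_from, query_to, query_len):
    """Assumed definition: aligned query span (inclusive) over query length."""
    return (abs(query_to - query_from) + 1) / query_len


def hsps_passing_cutoff(xml_text, cutoff=0.6):
    """Return (query_from, query_to, fraction) for alignments at or above the cutoff."""
    iteration = ET.fromstring(xml_text)
    query_len = int(iteration.findtext("Iteration_query-len"))
    kept = []
    for hsp in iteration.iter("Hsp"):
        q_from = int(hsp.findtext("Hsp_query-from"))
        q_to = int(hsp.findtext("Hsp_query-to"))
        fraction = aligned_fraction(q_from, q_to, query_len)
        # Assumption: alignments below the cutoff are treated as random and skipped.
        if fraction >= cutoff:
            kept.append((q_from, q_to, fraction))
    return kept


if __name__ == "__main__":
    for q_from, q_to, fraction in hsps_passing_cutoff(SAMPLE_XML):
        print(q_from, q_to, fraction)
```

With the default cutoff of 0.6, only the first alignment in the fragment (80 of 100 query bases) is kept; the second (30 of 100) is treated as random.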
Output
- Tab delimited table of nucleotide frequencies at each position of subject (reference) sequence.
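The frequency calculation behind such a table can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's implementation. The column layout (`position`, `reference`, `depth`, then one column per base) is hypothetical. Each alignment is given as a 1-based subject start plus the aligned subject and query strings, as found in the `Hsp_hit-from`, `Hsp_hseq`, and `Hsp_qseq` elements. Only plus-strand hits that fit within the reference are handled.

```python
from collections import Counter

BASES = ["A", "C", "G", "T", "-"]  # "-" marks a deletion in the query


def count_bases(ref_len, alignments):
    """Count query characters observed at each subject (reference) position."""
    counts = [Counter() for _ in range(ref_len)]
    for hit_from, hseq, qseq in alignments:
        pos = hit_from - 1  # convert the 1-based subject start to a 0-based index
        for s_char, q_char in zip(hseq, qseq):
            if s_char == "-":
                continue  # insertion in the query: no reference position to credit
            counts[pos][q_char.upper()] += 1
            pos += 1
    return counts


def frequency_table(reference, alignments):
    """Build a tab-delimited table with one row per reference position."""
    counts = count_bases(len(reference), alignments)
    lines = ["\t".join(["position", "reference", "depth"] + BASES)]
    for i, ref_base in enumerate(reference):
        # Depth counts only the characters in BASES; others (e.g. N) are ignored.
        depth = sum(counts[i][b] for b in BASES)
        freqs = [counts[i][b] / depth if depth else 0.0 for b in BASES]
        row = [str(i + 1), ref_base, str(depth)] + [f"{f:.3f}" for f in freqs]
        lines.append("\t".join(row))
    return "\n".join(lines)


if __name__ == "__main__":
    reference = "ACGT"
    alignments = [
        (1, "ACGT", "ACGT"),    # perfect match
        (2, "CGT", "CAT"),      # mismatch at reference position 3
        (1, "AC-GT", "ACTG-"),  # one insertion and one deletion in the query
    ]
    print(frequency_table(reference, alignments))
```

In this toy input, reference position 3 is covered by three alignments, two carrying G and one carrying A, so its row reports frequencies of 0.667 for G and 0.333 for A.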
How to run
- Download parseBlastXML_calcFreq.pl to your local computer, and set it as executable by typing the following command in your working directory:
chmod +x parseBlastXML_calcFreq.pl
- Create a subdirectory in your working directory to hold the input and output files. In the subdirectory, type the following command:
perl ../parseBlastXML_calcFreq.pl blastXmlOutFile referenceFastaFile outputFreqFile alignFractionCutoff