Question: extract chromosome names from blast output
22 months ago, Ric (Australia) wrote:

Hi, I ran blastn with blastall. How can I extract the alignments for each chromosome into a separate file? For example, with 10 chromosomes I would like to get 10 files.

Here is an example of my blastn output file:
BLASTN 2.2.22 [Sep-27-2009]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= tig00022194
         (464,128 letters)

Database: musa_acuminata_v2_pseudochromosome-1-11.fasta.clean 
           11 sequences; 397,008,016 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

chr08                                                                8.185e+04   0.0  
chr05                                                                1.171e+04   0.0  
chr11                                                                1.080e+04   0.0  
chr02                                                                1.077e+04   0.0  
chr09                                                                1.037e+04   0.0  
chr03                                                                1.028e+04   0.0  
chr07                                                                1.002e+04   0.0  
chr06                                                                9932   0.0  
chr10                                                                9874   0.0  
chr01                                                                9846   0.0  
chr04                                                                9787   0.0  

>chr08
          Length = 44889171

 Score = 8.185e+04 bits (41289), Expect = 0.0
 Identities = 41909/42103 (99%), Gaps = 57/42103 (0%)
 Strand = Plus / Minus


Query: 376639   ctctccctctccacctcagagcaggcctggagttttgaggagcgtcgtcgcaaccctgct 376698
                |||| |||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct: 21232377 ctcttcctctccacctcagagcaggcctggagttt-gaggagcgtcgtcgcaaccctgct 21232319


Query: 376699   gtgtggatcattgctagagaggaggacgcttgacctccttcaccttctcctaaggatctg 376758
                ||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 21232318 gtgtggatcatcgctagagaggaggacgcttgacctccttcaccttctcctaaggatctg 21232259


Query: 376759   caaggaaacagggatatacgatctccctaggtaacacaatatactctatacgcagttttg 376818
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 21232258 caaggaaacagggatatacgatctccctaggtaacacaatatactctatacgcagttttg 21232199

Thank you in advance

22 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:
 # Write the alignment block under each ">chrNN" header to its own file.
 seq 1 11 | while read -r L; do
   awk -v C="$L" 'BEGIN { s = sprintf(">chr%02d", int(C)) }
     /^>chr/ { ok = ($0 == s) }  # toggle at each subject header
     ok' input.blast > "input_${L}.blast"
 done
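The same split can probably also be done in a single pass over the report, without hard-coding the number of chromosomes. The following is a minimal sketch under stated assumptions, not a tested replacement for the loop above: `split_by_subject` is a hypothetical helper name, the output names (`PREFIX<subject>.blast`) are my own choice and differ from `input_1.blast` etc., the report is assumed to contain a single query, and a line starting with `Database:` after the alignments is assumed to mark the trailing statistics footer.

```shell
# Hypothetical helper: split a classic BLAST text report into one file per subject.
# Usage: split_by_subject REPORT PREFIX   (writes PREFIX<subject>.blast)
split_by_subject() {
  awk -v prefix="$2" '
    /^>/ {                                  # subject header such as ">chr08"
      if (out != "") close(out)             # release the previous output file
      out = prefix substr($1, 2) ".blast"   # file name taken from the header itself
    }
    /^ *Database:/ { out = "" }             # assumption: a Database line ends the alignments
    out != "" { print > out }               # copy lines while inside a subject block
  ' "$1"
}
```

For the example in the question, `split_by_subject input.blast input_` would be expected to write `input_chr08.blast`, `input_chr05.blast`, and so on, one file per `>chrNN` header found. Because the file name comes from the header, the input is read once rather than once per chromosome.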