conservative position extraction
I have FASTQ-like files with multiple records:
>ID some information
--A-TGTGAC
0100111100
etc.
The 2nd line of each record is a consensus sequence (gaps or nucleotides), and the 3rd line is a per-position conservation score (currently binary).
How can I parse this file and extract the positions that have a score of "1"?
A pure Python solution seems too complicated, and Biopython only handles Phred quality scores.
sequence
consensus
conservative
fastq
parsing
updated 5.0 years ago by JC (13k) • written 5.0 years ago by gatiyatov
Perl (because Python seems complicated):
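use strict;
use warnings;

my $nl = 0;    # line counter within the current 3-line record
my @sq;        # consensus characters of the current record

while (<>) {
    $nl++;
    if ( $nl == 1 ) {
        print;    # header line, printed unchanged
    }
    elsif ( $nl == 2 ) {
        chomp;
        @sq = split( //, $_ );
    }
    elsif ( $nl == 3 ) {
        chomp;
        my @cn = split( //, $_ );
        if ( $#sq == $#cn ) {
            # keep only the consensus characters whose score is 1
            for ( my $i = 0; $i <= $#sq; $i++ ) {
                print $sq[$i] if ( $cn[$i] == 1 );
            }
            print "\n";
        }
        else { die "line2 and line3 have different length\n"; }
        $nl = 0;    # start the next record
    }
}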
run as:
perl getCons.pl < FASTA_IN > FASTA_OUT
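For comparison, here is a minimal Python sketch of the same idea. It is not part of the original script: the function names (`read_records`, `conserved_positions`, `conserved_residues`) are made up for illustration, positions are reported 0-based, and blank lines are skipped, which is an assumption about the input rather than something the question states.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_records(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, consensus, scores) from a file of 3-line records."""
    # Assumption: blank lines carry no information and can be dropped.
    lines = [line.rstrip("\r\n") for line in handle if line.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected 3 lines per record")
    for i in range(0, len(lines), 3):
        header, consensus, scores = lines[i:i + 3]
        if len(consensus) != len(scores):
            raise ValueError(f"line 2 and line 3 differ in length for {header!r}")
        yield header, consensus, scores


def conserved_positions(scores: str) -> List[int]:
    """Return the 0-based indices whose score character is '1'."""
    return [i for i, s in enumerate(scores) if s == "1"]


def conserved_residues(consensus: str, scores: str) -> str:
    """Return the consensus characters (gaps included) at score-'1' positions."""
    return "".join(c for c, s in zip(consensus, scores) if s == "1")


# Demo on the record from the question.
demo = io.StringIO(">ID some information\n--A-TGTGAC\n0100111100\n")
for header, consensus, scores in read_records(demo):
    print(header)
    print(conserved_positions(scores))            # [1, 4, 5, 6, 7]
    print(conserved_residues(consensus, scores))  # -TGTG
```

To process a real file, pass an open file handle instead of the `io.StringIO` demo object, e.g. `with open(path) as fh: ...`. Note that a gap with score 1 is kept in the output, just as in the Perl version.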
5.0 years ago by
JC
13k