2.5 years ago
melissachua90
I ran SPAdes (https://github.com/ablab/spades), which outputs contigs.fasta:
for f in `ls -1 *_1.fq | sed 's/_1.fq//'`;
do spades.py -o . --mismatch-correction -1 $f\_1.fq -2 $f\_2.fq -t 20;
done
Within the same directory, I created and executed a bash script, filter_contigs.sh:
for file in contigs.fasta; do
    grep -F ">" "$file" | sed -e 's/_/ /g' | sort -nrk 6 | awk '$6>=2.0 && $4>=500 {print $0}' |
        sed -e 's/ /_/g' | sed -e 's/>//g' > "$file.txt"
    echo sequences to keep
    wc -l "$file.txt"
    echo running fastagrep.pl
    ../fastagrep.pl -f "$file.txt" "$file" > "HCov.$file"
    echo sequences kept
    grep -c ">" "HCov.$file"
done
When I executed filter_contigs.sh, it failed to keep any contigs. Why?
sequences to keep
0 contigs.fasta.txt
running fastagrep.pl
sequences kept
0
what is the output of
?
Hmm, I didn't realize contigs.fasta was empty. Perhaps I will rerun SPAdes. I think the --mismatch-correction option is not available in my version of SPAdes. Maybe --only-error-correction? I want to both correct and assemble the reads (I removed the previous error correction steps).
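Since the empty contigs.fasta only surfaced after the filter printed zeros, a small guard before filtering may help. The sketch below is one possible check, not part of SPAdes; the function names count_contigs and require_contigs are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from SPAdes or the thread).

# count_contigs FILE: print the number of FASTA header lines (lines starting with ">").
count_contigs() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps "set -e" scripts alive.
    grep -c '^>' "$1" || true
}

# require_contigs FILE: succeed only if FILE exists, is non-empty, and has at least one record.
require_contigs() {
    local fasta=$1
    if [ ! -s "$fasta" ]; then
        echo "error: $fasta is missing or empty; check spades.log" >&2
        return 1
    fi
    if [ "$(count_contigs "$fasta")" -eq 0 ]; then
        echo "error: $fasta has no FASTA headers" >&2
        return 1
    fi
}
```

It could be used as `require_contigs contigs.fasta && ./filter_contigs.sh`, so the filter only runs when there is something to filter.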
There is no switch needed for both error-correction and assembly, as that is a default behavior.
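As a rough sketch of what that looks like for several samples: the loop below only prints one spades.py command per read pair, using just the -o, -1, -2 and -t options already shown in this thread. The per-sample output directory name (<sample>_spades) and the helper name spades_cmd are assumptions for illustration; a separate directory per sample avoids every run writing its contigs.fasta into the same place, as -o . would.

```shell
#!/usr/bin/env bash
# spades_cmd SAMPLE [THREADS]: print the spades.py command for one paired-end sample.
# The "<sample>_spades" output directory is a hypothetical naming choice.
spades_cmd() {
    local sample=$1
    local threads=${2:-20}
    printf 'spades.py -o %s_spades -1 %s_1.fq -2 %s_2.fq -t %s\n' \
        "$sample" "$sample" "$sample" "$threads"
}

# Dry run: print one command per *_1.fq file in the current directory.
for r1 in *_1.fq; do
    [ -e "$r1" ] || continue      # skip when the glob matches nothing
    spades_cmd "${r1%_1.fq}"      # strip the _1.fq suffix to get the sample name
done
```

Once the printed commands look right, they can be run by piping the output to `sh`.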
But if I use --only-error-correction, doesn't that mean it won't perform the assembly? If I run spades.py -o . -1 read_1 -2 read_2 -t 20, it would perform both error correction and assembly, right?
A word of caution: it's been a while since I've used SPAdes, so maybe this has changed, but I'm pretty sure the "coverage" value in the header of SPAdes contigs is not sequence coverage; it is k-mer coverage.
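With that caveat in mind, here is a minimal sketch of the same length/coverage filter done in a single awk pass, without the external fastagrep.pl. It assumes headers of the form >NODE_<id>_length_<len>_cov_<cov>, which is what the $4/$6 fields in the original script imply; the function name filter_contigs is hypothetical, and the cov threshold applies to whatever SPAdes writes in that field (k-mer coverage, per the comment above).

```shell
#!/usr/bin/env bash
# filter_contigs FASTA [MIN_LEN] [MIN_COV]: print records whose header passes both thresholds.
# Defaults (500 and 2.0) mirror the values used in filter_contigs.sh.
filter_contigs() {
    awk -v min_len="${2:-500}" -v min_cov="${3:-2.0}" '
        /^>/ {
            # Assumed header layout: >NODE_<id>_length_<len>_cov_<cov>
            n = split(substr($0, 2), f, "_")
            keep = (n >= 6 && f[4] + 0 >= min_len + 0 && f[6] + 0 >= min_cov + 0)
        }
        keep { print }   # prints the header and every sequence line that follows it
    ' "$1"
}
```

For example, `filter_contigs contigs.fasta 500 2.0 > HCov.contigs.fasta` would write the kept records, and `grep -c '^>' HCov.contigs.fasta` would count them.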