Question: SRA: fastq-dump gives different number of sequences
2.2 years ago
jeetsahu10 wrote:

I downloaded reads using fastq-dump with the split-files option and the SRR ID of a paired-end run, but the split files contain different numbers of reads. As I understand it, since these are paired-end reads, both files should contain the same number of sequences.

$ fastq-dump -I --split-files SRR390728

$ grep -c '>' SRR7716545_1.fastq


$ grep -c '>' SRR7716545_2.fastq


Please correct me if I am wrong.

written 2.2 years ago by jeetsahu10
2.2 years ago
ATpoint44k wrote:

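Both files have the same number of reads. You have to grep for '^@', because @ is the FASTQ header prefix; > is the FASTA header prefix. In a FASTQ file, > only shows up by chance as a quality character, so counting it gives arbitrary numbers that differ between the two files.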

ls *.fastq | parallel "echo {} && grep -c '^@' {}"
written 2.2 years ago by ATpoint44k
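
One caveat on counting header lines: @ is also a valid quality character in the common Phred+33 encoding, so a quality line can start with @ and grep -c '^@' may overcount. Dividing the line count by four avoids that. Below is a minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence or quality lines); count_reads is a hypothetical helper name, not part of the SRA toolkit.

```shell
# Count reads in a FASTQ file as (number of lines) / 4.
# Assumption: every record is exactly four lines (header, sequence, '+', quality).
# count_reads is a hypothetical helper name used for illustration only.
count_reads() {
    # In awk's END block, NR holds the total number of lines read.
    awk 'END { print NR / 4 }' "$1"
}

# Report the read count for every FASTQ file in the current directory.
for f in *.fastq; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    printf '%s\t%s\n' "$f" "$(count_reads "$f")"
done
```

If the two files of a pair report the same number, the mates are at least equal in count. A non-integer result suggests a truncated or non-standard file.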

Thanks, I grepped for the wrong symbol. One quick question: does fastq-dump give the latest dataset used for the assembly? If yes, how can I get the older datasets?

written 2.2 years ago by jeetsahu10

fastq-dump gives you the FASTQ for the SRR accession you pass it. I have no detailed knowledge about your SRR.

written 2.2 years ago by ATpoint44k

