Question: SRA: fastq-dump gives different number of sequences
10 months ago, jeetsahu wrote:

I downloaded reads with fastq-dump, using the --split-files option and the SRR accession of a paired-end run. However, the split files appear to contain different numbers of reads. As I understand it, since these are paired-end reads, both files should contain the same number of sequences.

$ fastq-dump -I --split-files SRR390728

$ grep -c '>' SRR7716545_1.fastq

$ grep -c '>' SRR7716545_2.fastq


Please correct me if I am wrong.

10 months ago, ATpoint wrote:

Both files have the same number of reads. You have to grep for '^@', because @ is the FASTQ header prefix; > is the FASTA header prefix.

ls *.fastq | parallel "echo {} && grep -c '^@' {}"
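One caveat with counting '^@' lines: @ is also a valid quality character, so a quality line can begin with @ and inflate the count. A minimal sketch of an alternative, assuming each record occupies exactly four lines (which is not guaranteed for every FASTQ file), is to divide the line count by four. The function name count_reads is made up for this example.

```shell
# Count reads in a FASTQ file as (number of lines) / 4.
# Assumption: every record is exactly four lines (header, sequence, '+', quality),
# i.e. sequences and qualities are not wrapped across lines.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Report the read count for every .fastq file in the current directory.
for f in *.fastq; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    printf '%s\t%s\n' "$f" "$(count_reads "$f")"
done
```

For a paired-end run, the two split files should then report the same number.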

Thanks, I grepped for the wrong symbol. One quick question: does fastq-dump give the latest dataset used for the assembly? If so, how can I get older datasets?

(reply written 10 months ago by jeetsahu)

fastq-dump produces the FASTQ for whichever SRR accession you give it. I have no detailed knowledge of your SRR.

(reply written 10 months ago by ATpoint)

Hello jeetsahu,

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.


(reply written 10 months ago by finswimmer)