Question: SRA: fastq-dump gives different number of sequences
21 months ago, jeetsahu10 wrote:

I downloaded reads with fastq-dump, using the --split-files option and the SRR ID of a paired-end run. However, the split files contain different numbers of reads. As I understand it, since these are paired-end reads, both files should contain the same number of sequences.

$fastq-dump -I --split-files SRR7716545

$grep -c '>' SRR7716545_1.fastq

694067

$grep -c '>' SRR7716545_2.fastq

1026976

Please correct me if I am wrong.

sequence sra • 611 views
modified 17 months ago by Biostar ♦♦ 20 • written 21 months ago by jeetsahu10
21 months ago, ATpoint36k (Germany) wrote:

Both files have the same number of reads. You have to grep for '^@', because @ is the FASTQ header prefix; > is the FASTA header prefix.

ls *.fastq | parallel "echo {} && grep -c '^@' {}"
SRR7716545_1.fastq
5644111
SRR7716545_2.fastq
5644111
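
One caveat: @ is also a valid quality character, so a quality line can start with @ and grep -c '^@' may overcount on some files. A minimal sketch of a more robust count, assuming standard 4-line FASTQ records (sequence and quality lines not wrapped), divides the number of lines by four. The helper name count_reads is hypothetical, and the file names are the ones from this thread.

```shell
# Count FASTQ records as (number of lines) / 4.
# Assumption: every record is exactly four lines (header, sequence, '+', quality).
count_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Report the read count for each mate file, if present.
for f in SRR7716545_1.fastq SRR7716545_2.fastq; do
    if [ -f "$f" ]; then
        echo "$f: $(count_reads "$f")"
    fi
done
```

For a paired-end run the two numbers should match; if they do not, the files are worth inspecting before any downstream analysis.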
written 21 months ago by ATpoint36k

Thanks, I grepped for the wrong symbol. One quick question: does fastq-dump give the latest dataset used for an assembly? If so, how can I get older datasets?

written 21 months ago by jeetsahu10

fastq-dump returns the FASTQ for the SRR accession you give it. I have no detailed knowledge about your SRR.

written 21 months ago by ATpoint36k

Hello jeetsahu,

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

modified 21 months ago • written 21 months ago by finswimmer13k