Question: Number of reads in the downloaded fastq file
3 months ago by
kmkdesilva90
United States
kmkdesilva90 wrote:

Hi,

I am trying to download some data from SRA. I used fasterq-dump. This is the command I used:

    fasterq-dump --split-files --split-spot -O /path/fastq SRR3045676

I wanted to check whether I had downloaded all the reads for the accession. When I ran vdb-dump, it showed that there are 166,306,903 sequence reads under this accession:

    vdb-dump --info SRR3045676
    SEQ:166,306,903

The output of the fasterq-dump command said it had read 332,613,806 (166,306,903 x 2) reads, but only 331,487,754 (165,743,877 x 2) were written:

    spots read : 166,306,903
    reads read : 332,613,806
    reads written : 331,487,754

But when I used the following command to count the reads in the downloaded file (R1), it gave a number (165,180,851) that is lower than 165,743,877:

    echo $(zcat SRR2102500_R1.fastq.gz | wc -l)/4 | bc >> /path/readCount.txt
    165,180,851
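For reference, the counting step above can be wrapped in a small function that also checks whether the line count is a multiple of four. This is only a sketch: the function name count_fastq_reads is hypothetical, and it assumes every record occupies exactly four lines (no wrapped sequence or quality lines).

```shell
# Hypothetical helper: count records in a gzipped FASTQ file.
# Assumes four lines per record (header, sequence, '+', quality).
count_fastq_reads() {
    local fastq_gz="$1"
    local lines
    # gzip -cd decompresses to stdout, like zcat
    lines=$(gzip -cd -- "$fastq_gz" | wc -l)
    # A line count that is not a multiple of 4 hints at a truncated
    # or malformed file, so warn on stderr
    if [ $(( lines % 4 )) -ne 0 ]; then
        echo "warning: $fastq_gz has $lines lines, not a multiple of 4" >&2
    fi
    echo $(( lines / 4 ))
}
```

It could be called as count_fastq_reads SRR2102500_R1.fastq.gz, with the same call on the R2 file to confirm that both mates hold the same number of records.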

Can someone please explain why the output says that fewer reads were written than were read, and why even fewer reads are found in the downloaded fastq file? I tried downloading this accession twice, and both attempts gave the same results. I also downloaded a few other accessions, and their final fastq files had exactly the number of sequences reported by the vdb-dump --info command.
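To make the size of each gap explicit, the figures quoted above can be compared with plain shell arithmetic. This is a minimal sketch using only the numbers reported in this post. The variable names are made up for illustration, and dividing by two assumes the unwritten reads are split evenly between R1 and R2, which the post does not confirm.

```shell
# Figures copied from the fasterq-dump report and the wc -l count above
reads_read=332613806
reads_written=331487754
r1_counted=165180851

# Reads that were read but not written (both mates combined)
not_written=$(( reads_read - reads_written ))

# Per-mate share, ASSUMING an even split between R1 and R2
not_written_per_mate=$(( not_written / 2 ))

# Reads expected in R1 according to "reads written", and the shortfall
# relative to what wc -l actually found in the R1 file
written_per_mate=$(( reads_written / 2 ))
missing_from_r1=$(( written_per_mate - r1_counted ))

echo "read but not written (total):     $not_written"
echo "read but not written (per mate):  $not_written_per_mate"
echo "expected in R1 but not counted:   $missing_from_r1"
```

With these inputs, the per-mate gap between reads read and reads written is 563,026, and the gap between the reported R1 total and the counted R1 total is also 563,026. The arithmetic alone does not say why the two gaps match.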
