Number of reads in the downloaded fastq file
3.5 years ago
Kash ▴ 110

Hi,

I am trying to download some data from SRA with fasterq-dump. This is the command I used:

    fasterq-dump --split-files --split-spot -O /path/fastq SRR3045676

I wanted to check whether I had downloaded all the reads for the accession. When I used vdb-dump, it showed 166,306,903 sequence reads under this accession:

    vdb-dump --info SRR3045676
    SEQ:166,306,903

The output of the fasterq-dump command said it had read 332,613,806 (166,306,903 x 2) reads, but only 331,487,754 (165,743,877 x 2) were written:

    spots read : 166,306,903
    reads read : 332,613,806
    reads written : 331,487,754
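As a quick sanity check on the numbers above, the gap between reads read and reads written can be worked out directly. This is only arithmetic on the three figures reported by fasterq-dump; the variable names are made up for illustration.

```shell
# Figures copied from the fasterq-dump summary in the post
spots_read=166306903
reads_read=332613806
reads_written=331487754

# Two reads per spot, matching the "x 2" in the post
expected_reads=$((spots_read * 2))

# Reads that were read from the archive but never written to a file
not_written=$((reads_read - reads_written))

echo "expected reads:    $expected_reads"
echo "reads not written: $not_written"
```

The first value matches the reported "reads read", so the shortfall of 1,126,052 reads arises between reading and writing, not between the archive and the reader.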

But when I used the following command to count the reads in the downloaded file (R1), it gave 165,180,851, which is even less than 165,743,877:

    echo $(zcat SRR2102500_R1.fastq.gz | wc -l)/4 | bc >> /path/readCount.txt
    165,180,851
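A slightly more defensive version of the same count is sketched below. It assumes the usual four-lines-per-record FASTQ layout (no wrapped sequence lines) and warns if the line count is not a multiple of 4, which would point to a truncated or malformed file. The function name `count_fastq_reads` and the demo file are hypothetical.

```shell
# Count records in a gzip-compressed FASTQ file.
# Assumes exactly four lines per record.
count_fastq_reads() {
    # $1: path to the .fastq.gz file
    n_lines=$(gzip -cd "$1" | wc -l)
    n_lines=$((n_lines + 0))   # normalise padding added by some wc versions
    if [ $((n_lines % 4)) -ne 0 ]; then
        echo "warning: $1 has $n_lines lines, not a multiple of 4" >&2
    fi
    echo $((n_lines / 4))
}

# Tiny demonstration on a made-up two-record file
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "$demo"
count_fastq_reads "$demo"
rm -f "$demo"
```

Running the function on both mate files and comparing the two counts would show whether the missing reads are spread evenly or concentrated in one file.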

Can someone please explain why the output says fewer reads were written than were read, and why even fewer reads are found in the downloaded fastq file? I tried downloading this accession twice and got the same results both times. I downloaded a few other accessions, and their final fastq files contained exactly the number of sequences reported by vdb-dump --info.
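The three counts can also be compared against each other. The sketch below assumes that the counted R1 file really belongs to this run (the file name in the counting command carries a different accession, SRR2102500) and that fasterq-dump wrote exactly two files; neither is confirmed in the post, and the variable names are made up.

```shell
# Figures copied from the post
spots=166306903          # vdb-dump --info / "spots read"
reads_written=331487754  # fasterq-dump "reads written"
r1_reads=165180851       # counted in the R1 file

# How far the R1 file falls short of one read per spot
r1_shortfall=$((spots - r1_reads))

# Reads left over for a second file, if only two files were written
remaining_reads=$((reads_written - r1_reads))

echo "R1 shortfall vs spots:      $r1_shortfall"
echo "reads left for second file: $remaining_reads"
```

Under those assumptions, the R1 shortfall equals the full gap between reads read and reads written (1,126,052), and the remainder equals the spot count, so the numbers are at least consistent with all unwritten reads being first-mate reads. The arithmetic alone does not say why those reads were skipped.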

SRA read count fasterq-dump vdb-dump --info • 1.1k views