Question: sra to fastq
vimlakany wrote 15 months ago:

What does an SRA file look like, and how is it converted into two FASTQ files for paired-end data?

rna-seq

ProTip: You can avoid using sratoolkit altogether. Search EBI-ENA with the SRA accession number to download FASTQ files directly (caveat: this may not work for SRA submissions from the last day or two, but ENA will eventually catch up).

Comment by genomax, 15 months ago
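
The ENA lookup suggested above can also be scripted. The sketch below only builds a query URL and does not contact the network. It assumes the ENA Portal API `filereport` endpoint and the `fastq_ftp` / `fastq_bytes` field names, none of which appear in this thread, so check them against the current ENA documentation. The helper names `ena_filereport_url` and `split_file_list` are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: ENA Portal API "filereport" endpoint; verify against ENA docs.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def ena_filereport_url(accession):
    """Build a URL asking ENA which FASTQ files it holds for one run."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "fastq_ftp,fastq_bytes",  # assumed field names
        "format": "tsv",
    }
    # safe="," keeps the field list readable instead of encoding it as %2C
    return ENA_FILEREPORT + "?" + urlencode(params, safe=",")


def split_file_list(field):
    """A multi-file field is assumed to be ';'-separated (one entry per mate)."""
    return [item for item in field.split(";") if item]


print(ena_filereport_url("SRR2393592"))
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return a small table from which the FASTQ download locations can be read, assuming the run is already mirrored at ENA.
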
genomax (United States) wrote 15 months ago:

An SRA file is binary (not human-readable), but it looks just like any other file in a directory listing :-).

Use this guide for the SRA Toolkit from NCBI, which is what you would use to convert an SRA file to FASTQ.


Can you explain how SRA is converted into FASTQ, that is, the algorithm used in the conversion? How does fastq-dump decide to write two FASTQ files for paired-end data and a single FASTQ file for single-end data?

Reply by vimlakany, 15 months ago
b.nota (Netherlands) wrote 15 months ago:

You don't need to download the SRA file yourself; you can use fastq-dump from the SRA Toolkit (as @genomax2 already mentioned).

If you have the toolkit installed, you only have to give the accession of the run you want. The data will be downloaded and written as a FASTQ file.

For example, in a Linux terminal:

~/sratoolkit/bin/fastq-dump SRR2393592

Does SRR2393592_1.fastq represent reads from the forward strand and SRR2393592_2.fastq reads from the reverse strand? If not, how are they split from the SRA file? Also, if the SRA file is 6.9GB and the generated FASTQ file is 50.8GB, how is it being processed?

Reply by vimlakany, 15 months ago

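Almost: SRR2393592_1.fastq holds the first read of each pair and SRR2393592_2.fastq the second read (its mate), sequenced from the opposite end of the same fragment, so the split is by mate rather than by genomic strand. As for the size, SRA files are binary and compressed, so the FASTQ text written by fastq-dump is much larger. Think of this as similar to using tar or gzip to compress files.

If you are interested in the implementation, the source code for the SRA software and utilities is available on this page.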

Reply by genomax, 15 months ago
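
To make the splitting question above more concrete, here is a conceptual sketch, not the actual fastq-dump implementation. It assumes that each SRA "spot" stores the bases of all its reads concatenated, together with the length of each read, and it imitates the idea behind `--split-3`: spots with two reads go to the `_1` and `_2` outputs, spots with a single read go to a third output. The function names and the toy spots are made up for illustration.

```python
def fastq_record(name, seq, qual):
    """Format one read as a four-line FASTQ record."""
    return f"@{name}\n{seq}\n+\n{qual}\n"


def split_spot(bases, quals, read_lengths):
    """Cut a concatenated spot into its reads using the stored lengths."""
    reads, start = [], 0
    for length in read_lengths:
        end = start + length
        reads.append((bases[start:end], quals[start:end]))
        start = end
    return reads


def split3(spots):
    """Route reads into three buckets, in the spirit of --split-3."""
    out = {"_1": [], "_2": [], "single": []}
    for name, bases, quals, read_lengths in spots:
        # Ignore zero-length reads so they do not count as a mate
        reads = [r for r in split_spot(bases, quals, read_lengths) if r[0]]
        if len(reads) == 2:
            out["_1"].append(fastq_record(name, *reads[0]))
            out["_2"].append(fastq_record(name, *reads[1]))
        elif len(reads) == 1:
            out["single"].append(fastq_record(name, *reads[0]))
    return out


# Toy data: one paired spot (4 + 4 bases) and one single-read spot
spots = [
    ("spot1", "ACGTTTGA", "IIIIHHHH", [4, 4]),
    ("spot2", "GGCC", "IIII", [4]),
]
result = split3(spots)
print("".join(result["_1"]), "".join(result["_2"]), "".join(result["single"]))
```

The point of the sketch is that the pairing information is already stored with each spot, so the converter does not have to guess which reads belong together.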

For an SRA file of 2.2GB, the FASTQ file generated with fastq-dump was 10.8GB, but the FASTQ file for the same sample at EBI was only 2.7GB. Why is that?

Reply by vimlakany, 15 months ago

What was the exact command used? Which SRA# are you looking at?

Reply by genomax, 15 months ago

The command used to convert SRA to FASTQ was fastq-dump --split-3 ERR738423.sra. This run is single-end data. The SRA file is 2.2GB, the FASTQ file produced by fastq-dump is 10.2GB, and the FASTQ file in ENA is 7GB. Why is there such a large difference in size?

Reply by vimlakany, 15 months ago
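
The thread does not resolve the size question. One likely contributor is compression: fastq-dump writes plain-text FASTQ unless asked to compress (for example with its --gzip option), whereas ENA distributes gzip-compressed .fastq.gz files. Whether that explains the exact numbers quoted above is an assumption. The sketch below uses made-up random reads (the helper `toy_fastq` is hypothetical) purely to show how much smaller FASTQ text becomes under gzip.

```python
import gzip
import random


def toy_fastq(n_reads, read_len, seed=0):
    """Generate synthetic FASTQ text: random bases, constant quality."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    records = []
    for i in range(n_reads):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        records.append(f"@read{i}\n{seq}\n+\n{'I' * read_len}\n")
    return "".join(records).encode("ascii")


raw = toy_fastq(2000, 100)
packed = gzip.compress(raw)

print("uncompressed bytes:", len(raw))
print("gzip bytes:        ", len(packed))
print("ratio:             ", round(len(raw) / len(packed), 1))
```

Real reads and real quality strings compress differently from this toy data, so the ratio printed here should not be read as a prediction for any particular run.
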
chen (OpenGene) wrote 15 months ago:

Use the fastq-dump --split-3 command from the SRA Toolkit.

ThePresident wrote 15 months ago:

For paired-end data, use: fastq-dump --split-3 SRR2393592

The easiest way to install the SRA Toolkit is from the brew package. Follow this link to install brew first; once that's done, simply run brew install sratoolkit in a terminal.


brew, apt-get, or yum will install an old version of the SRA Toolkit; I suggest downloading the source and compiling it.

Reply by chen, 15 months ago
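
For anyone scripting the fastq-dump --split-3 call recommended in the answers above, a minimal sketch follows. It only assembles and prints the command line and does not run it. The helper `fastq_dump_command` is hypothetical; the `--outdir` and `--gzip` options are fastq-dump options that were not mentioned in this thread, so confirm them with `fastq-dump --help` for your installed version.

```python
import shlex
import shutil


def fastq_dump_command(accession, outdir=".", gzip_output=False, exe="fastq-dump"):
    """Assemble a fastq-dump command line as an argument list."""
    cmd = [exe, "--split-3", "--outdir", outdir]
    if gzip_output:
        cmd.append("--gzip")  # write .fastq.gz instead of plain text
    cmd.append(accession)
    return cmd


cmd = fastq_dump_command("SRR2393592", outdir="fastq", gzip_output=True)
print(shlex.join(cmd))

# Check whether the tool is installed before trying to run anything
if shutil.which(cmd[0]) is None:
    print("fastq-dump not found on PATH; install the SRA Toolkit first")

# To actually run it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if an output directory contains spaces.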