Question: How to deal with SRA files that generate three fastq files?
5 months ago by
fanglujing30
China/xiamen
fanglujing30 wrote:

Hi, I downloaded an SRA file from NCBI, SRR4242282.sra, and got three fastq files after using fastq-dump to extract the reads from it. Command: fastq-dump --split-3 --gzip SRR4242282.sra. I don't understand this result and haven't come across it before. Any suggestions would be appreciated.

fastq-dump fastq sra • 273 views
modified 5 months ago by t.kuilman750 • written 5 months ago by fanglujing30

It does look like the submitter may have submitted index sequences in a separate file, since the corresponding ENA entry also shows three fastq files. Examine the files to see which one contains the index sequences. It should be readily apparent because those reads are short.
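
One way to examine the files is to print the read count and read-length range of each one. This is only a rough sketch: fastq_summary is a made-up helper name (not part of sra-tools), and the file names assume the gzipped output of the SRR4242282 run from the question.

```shell
# Summarise a gzipped fastq file: number of reads, shortest and longest read.
# fastq_summary is a hypothetical helper name, not part of sra-tools.
fastq_summary() {
    gzip -dc "$1" | awk '
        NR % 4 == 2 {                 # line 2 of every 4-line record is the sequence
            len = length($0)
            if (n == 0 || len < min) min = len
            if (len > max) max = len
            n++
        }
        END { printf "%d reads, length %d-%d\n", n, min, max }'
}

# Assumed file names, following the SRR4242282 run from the question.
for f in SRR4242282*.fastq.gz; do
    [ -e "$f" ] || continue           # skip if the glob matched nothing
    printf '%s: ' "$f"
    fastq_summary "$f"
done
```

A file whose reads are all much shorter than those in the other two would be the likely candidate for an index file.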

Edit: I will leave this here in case other submitters have done this.

OP: Please confirm if t.kuilman's explanation is applicable in your case.

modified 5 months ago • written 5 months ago by genomax64k

I have checked the fastq content and I think t.kuilman's suggestion works in this situation. Thanks for the reply.

written 5 months ago by fanglujing30
5 months ago by
t.kuilman750
Netherlands
t.kuilman750 wrote:

Please see my previous post: this happens because BOTH paired and unpaired reads are included in these sra files. Using the --split-files option does not work here, since it would produce fastq files that are incomplete. What you did is correct; simply use the files ending in _1 and _2. The remaining file contains the unpaired reads and can be discarded.
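
A quick sanity check before discarding the third file is to confirm that the _1 and _2 files hold the same number of reads. This is a minimal sketch, assuming the output names that fastq-dump --split-3 --gzip gives for this run; count_reads is a made-up helper, not part of sra-tools.

```shell
# count_reads is a hypothetical helper: a fastq record is 4 lines,
# so the number of reads is the line count divided by 4.
count_reads() {
    gzip -dc "$1" | awk 'END { printf "%d\n", NR / 4 }'
}

# Assumed output names from: fastq-dump --split-3 --gzip SRR4242282.sra
r1=SRR4242282_1.fastq.gz
r2=SRR4242282_2.fastq.gz

if [ -e "$r1" ] && [ -e "$r2" ]; then
    n1=$(count_reads "$r1")
    n2=$(count_reads "$r2")
    if [ "$n1" -eq "$n2" ]; then
        echo "paired files in sync: $n1 reads each"
    else
        echo "read counts differ: $n1 vs $n2"
    fi
fi
```

Equal counts are a necessary condition only; they do not prove that every read sits next to its mate.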

modified 5 months ago • written 5 months ago by t.kuilman750

Thanks for your reply; it helps me a lot.

written 5 months ago by fanglujing30
5 months ago by
Santosh Anand4.7k
Santosh Anand4.7k wrote:

Either use only the _1 and _2 files, or use the --split-files option instead of --split-3.

See the manual/help page:

  --split-files                    Dump each read into separate file.Files 
                                   will receive suffix corresponding to read 
                                   number 
  --split-3                        Legacy 3-file splitting for mate-pairs: 
                                   First biological reads satisfying dumping 
                                   conditions are placed in files *_1.fastq and 
                                   *_2.fastq If only one biological read is 
                                   present it is placed in *.fastq Biological 
                                   reads and above are ignored.
written 5 months ago by Santosh Anand4.7k