Question: How can I submit paired-end SRA data to NCBI?
2.4 years ago, MAPK wrote:

Hi all, I tried to submit paired-end FASTQ files (R1 and R2) for one sample to the NCBI SRA database. I followed these steps:

1) Created a BioProject by following the link: https://submit.ncbi.nlm.nih.gov/subs/bioproject/SUB4422178/submitter I filled in everything and submitted it to get the SAMN and PRJNA IDs, then selected FTP upload.

2) Then I went to https://submit.ncbi.nlm.nih.gov/subs/sra/. First I moved both the R1 and R2 files to a separate directory and changed into that directory. Then, in the terminal, I typed:

ftp -i
open ftp-private.ncbi.nlm.nih.gov

Then, at the prompt, I entered the username and password given on the page from step 2:

Username: subftp
Password: w*******

Then I changed to the account folder given on the page from step 2:

cd uploads/amyname@gmail.com_00YmVxw2

3) Created a new directory as shown below:

mkdir rhizophagus
cd rhizophagus

4) Then transferred both FASTQ files to the NCBI directory by typing: mput *

After transfer was complete, I typed ls to see all files that have been transferred.
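For anyone who prefers to script steps 2 to 4 rather than type them into an interactive ftp session, here is a minimal sketch using Python's standard ftplib. The function names and the idea of treating a failed mkd as "folder already exists" are assumptions made for illustration; the host name, the subftp user, and the folder names are the ones quoted in the steps above, and the password is whatever the submission page shows.

```python
from ftplib import FTP, error_perm
from pathlib import Path


def connect(host, user, password):
    """Open an FTP connection and log in (the scripted form of `open` plus the login prompt)."""
    ftp = FTP(host)
    ftp.login(user=user, passwd=password)
    return ftp


def upload_fastq_files(ftp, account_dir, subdir, local_paths):
    """Upload files into account_dir/subdir on a logged-in connection.

    Returns the remote listing of subdir after the upload.
    """
    ftp.cwd(account_dir)              # the uploads/... folder from the submission page
    try:
        ftp.mkd(subdir)               # like `mkdir rhizophagus`
    except error_perm:
        pass                          # assumption: the folder already exists
    ftp.cwd(subdir)
    for path in map(Path, local_paths):
        with path.open("rb") as handle:
            # Binary transfer, so the bytes arrive unchanged.
            ftp.storbinary(f"STOR {path.name}", handle)
    return ftp.nlst()                 # like typing `ls` after `mput *`
```

A call would look roughly like `upload_fastq_files(connect("ftp-private.ncbi.nlm.nih.gov", "subftp", password), account_dir, "rhizophagus", [r1_path, r2_path])`, where `password`, `account_dir`, `r1_path` and `r2_path` are placeholders you fill in yourself.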

5) Once the uploaded files were available, I went back to https://submit.ncbi.nlm.nih.gov/subs/sra/, selected the upload folder, and submitted it.

Both of these files are now online here https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP158305. However, when I downloaded them using prefetch --option-file sratest.txt and extracted them with fastq-dump --split-files SRP158305, I got two FASTQ files of very different sizes: one is 12.2 gb and the other is only 303.3 mb. The original FASTQ files (R1 and R2) are 10.9 gb each. I am not sure how the data should have been submitted, and I would really appreciate it if somebody could help me figure out where it went wrong. Thanks in advance for your help.
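One check that can be run locally before uploading is whether R1 and R2 contain the same number of records, since a paired submission expects one mate per spot in each file. The sketch below is only an illustration: the function names are made up here, and it assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines). It reads plain or gzip-compressed files.

```python
import gzip
from pathlib import Path


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly four lines per record."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4 != 0:
        raise ValueError(f"{path} has {line_count} lines, not a multiple of 4")
    return line_count // 4


def check_pair(r1_path, r2_path):
    """Return the shared record count, or raise if R1 and R2 disagree."""
    n1 = count_fastq_records(r1_path)
    n2 = count_fastq_records(r2_path)
    if n1 != n2:
        raise ValueError(f"R1 has {n1} records but R2 has {n2}")
    return n1
```

Matching counts do not prove the upload will be processed correctly, but a mismatch would point to a problem in the local files rather than in the submission.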


If you go to the link above, the two files appear to be of similar size.

Did you upload them uncompressed? Perhaps SRA has already converted them (to .sra) and/or compressed them further.

written 2.4 years ago by GenoMax

@genomax Yes, I did not compress them. I just uploaded the two FASTQ files. Do I have to compress them and combine them into one compressed file before uploading?

written 2.4 years ago by MAPK

And what's the procedure to remove/update already submitted files? Is it possible to remove them and resubmit?

written 2.4 years ago by MAPK

Running fastq-dump --split-files SRR7716298 does reproduce the asymmetrically sized files you posted above. I suggest that you email SRA support to let them know what is happening and ask them to reset your submission so you can re-upload the data.

written 2.4 years ago by GenoMax

Thanks. So when I submit it again, should I just compress both R1 and R2 and make one compressed file and submit?

written 2.4 years ago by MAPK

To clarify further, this is what I see when I run fastq-dump --split-files SRR7716298:

Rejected 41571327 READS because READLEN < 1
Read 42671006 spots for SRR7716298
Written 42671006 spots for SRR7716298
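
As a rough way to read these numbers, the sketch below pulls the three counters out of the messages and subtracts the rejected reads from the spot count. The function name and the interpretation are assumptions: it supposes that every rejected read is the empty second read of a spot, which the log itself does not state.

```python
import re


def summarize_fastq_dump_log(log_text):
    """Extract the counters from fastq-dump messages like the ones quoted above."""
    patterns = {
        "rejected_reads": r"Rejected (\d+) READS because READLEN < 1",
        "spots_read": r"Read (\d+) spots",
        "spots_written": r"Written (\d+) spots",
    }
    stats = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        if match is None:
            raise ValueError(f"no line matching {pattern!r}")
        stats[key] = int(match.group(1))
    # Assumption: each rejected read is the zero-length second read of one spot.
    stats["spots_with_both_reads"] = stats["spots_read"] - stats["rejected_reads"]
    return stats


log = """Rejected 41571327 READS because READLEN < 1
Read 42671006 spots for SRR7716298
Written 42671006 spots for SRR7716298"""

print(summarize_fastq_dump_log(log)["spots_with_both_reads"])  # 1099679
```

Under that assumption only 1099679 of 42671006 spots, roughly 2.6 percent, carry a second read, which would be in line with one extracted file being far smaller than the other.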
written 2.4 years ago by GenoMax

What does that mean? Could you please clarify?

written 2.4 years ago by MAPK

Either a file or the SRA record must have become corrupt. I assume this is original raw data?

written 2.4 years ago by GenoMax

That's right, these are raw data. I have emailed SRA support to reset it. So when I re-upload the files, should I just compress both files and submit as one compressed file?

written 2.4 years ago by MAPK

Compress and submit them as a pair.

written 2.4 years ago by GenoMax

Sorry, I am still confused: should I submit one compressed file (containing both R1 and R2) or two individually compressed files?

written 2.4 years ago by MAPK

gzip each file separately and submit as two files
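
A minimal sketch of that step in Python, for cases where it needs to be scripted: each input file gets its own .gz next to it, and nothing is merged. The function name and the example file names are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gzip_separately(paths):
    """Write one .gz file next to each input file and return the new paths."""
    outputs = []
    for path in map(Path, paths):
        target = path.with_name(path.name + ".gz")
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            # Streams in chunks, so large FASTQ files are not loaded into memory.
            shutil.copyfileobj(src, dst)
        outputs.append(target)
    return outputs
```

For example, `gzip_separately(["sample_R1.fastq", "sample_R2.fastq"])` would produce `sample_R1.fastq.gz` and `sample_R2.fastq.gz`, leaving the originals in place.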

written 2.4 years ago by piet

Thanks, I will give it a try.

written 2.4 years ago by MAPK