Submitting short reads to SRA
9.1 years ago
kristjan ▴ 170

Many journals require all sequences to be deposited in a public database, so I wanted to submit my data to ENA or GenBank, but I have run into some problems. The sequencing was done in 2011 on an Illumina HiSeq 2000 and targets the 16S V6 region. After barcode and primer removal, the average read length is about 80 bp. My first problem is that GenBank only accepts reads longer than 200 bp and ENA only reads longer than 100 bp, so do I have to find another database that accepts reads shorter than 100 bp?

My second question: do you have to submit the raw reads, or the reads after quality checking? ENA requests FASTQ files, but after denoising I only have FASTA files and am unable to merge them with the quality file. It makes more sense to me to upload the denoised data rather than the raw reads, since there is not much useful information in the low-quality reads.

SRA database-submission • 3.4k views
9.1 years ago
Ido Tamir 5.2k

The data you submit should be demultiplexed but otherwise unaltered, so you can submit to ENA (I am fairly sure they also accept reads as short as 36 bp, for example). The idea of the archives is that other scientists can reproduce your analysis, or run their own, from the raw data.
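
To make "demultiplexed but unaltered" concrete, here is a minimal Python sketch of one way it could be done. It rests on assumptions that are not from this thread: the barcode sits at the very start of each read and is matched exactly, and the barcode-to-sample mapping and sample names are hypothetical. Many Illumina runs carry the barcode in a separate index read instead, in which case the sequencer software usually does this step for you. The point is only that each read is sorted into a per-sample bin with its sequence and quality string left exactly as they were.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# One FASTQ record: header line, sequence, separator line, quality string.
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records (assumes sequences are not line-wrapped)."""
    record: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines between records
        record.append(line)
        if len(record) == 4:
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in {header!r}")
            yield header, seq, plus, qual
            record = []
    if record:
        raise ValueError("truncated FASTQ: last record is incomplete")


def demultiplex(
    records: Iterable[FastqRecord],
    barcodes: Dict[str, str],  # hypothetical mapping: barcode -> sample name
) -> Dict[str, List[FastqRecord]]:
    """Bin reads by leading barcode; reads are not trimmed or filtered."""
    bins: Dict[str, List[FastqRecord]] = defaultdict(list)
    for rec in records:
        seq = rec[1]
        sample = next(
            (name for bc, name in barcodes.items() if seq.startswith(bc)),
            "unassigned",  # reads with no matching barcode are kept, not dropped
        )
        bins[sample].append(rec)
    return dict(bins)


def write_fastq(records: Iterable[FastqRecord], path: str) -> None:
    """Write records back out as plain 4-line FASTQ."""
    with open(path, "w") as handle:
        for rec in records:
            handle.write("\n".join(rec) + "\n")
```

You would then call write_fastq once per sample and submit those files. Whether the archive expects the barcode and primer to remain in the reads is worth checking in their submission guidelines; this sketch simply leaves every read untouched.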


I totally agree. Demultiplex. Do not clean. Submit to SRA. GenBank is the wrong database for Illumina reads.

