3.2 years ago
guillaume.rbt ▴ 990

Hi all,

When I download files for a specific study (for example https://www.ebi.ac.uk/ena/data/view/PRJEB23709), each sample offers a choice between "FASTQ file" and "Submitted File".

I was wondering what the difference is between those two files (the "submitted" one being slightly bigger than the corresponding "FASTQ" one).

Thanks

ENA EBI fastq • 2.9k views
If I remember correctly, you can also upload (un)aligned BAM files, which ENA will then convert back to FASTQ, I think. Like ATpoint, though, I suggest always going for the FASTQ version.

In this specific case the Submitted File appears to contain the actual sample name. If you get the ENA FASTQ files, you may need to keep track of metadata for the sample names. So in this case I suggest that you download the ENA and Submitted files for one sample, compare them (they should be identical), and then probably get the Submitted files instead.

I've checked and they are indeed the same files. The difference in size was due only to the modified read names in the "FASTQ" files.
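
For anyone who wants to repeat this check, here is a minimal Python sketch of one way to do it. It is not the script used above; it assumes plain 4-line FASTQ records (optionally gzipped) with the reads in the same order in both files, and the function names are made up for illustration. It compares only the sequence and quality lines, so renamed reads do not count as a difference.

```python
import gzip
from itertools import zip_longest


def open_maybe_gzip(path):
    """Open a FASTQ file as text, handling a .gz extension transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_records(handle):
    """Yield (sequence, quality) per 4-line record, ignoring the name lines."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line; it may repeat the read name
        qual = handle.readline().rstrip("\r\n")
        yield seq, qual


def same_reads(path_a, path_b):
    """Return True if both files hold the same reads in the same order.

    Read names are ignored; a different number of records counts as a mismatch.
    """
    with open_maybe_gzip(path_a) as a, open_maybe_gzip(path_b) as b:
        for rec_a, rec_b in zip_longest(read_records(a), read_records(b)):
            if rec_a != rec_b:
                return False
    return True
```

Usage would be something like `same_reads("ena.fastq.gz", "submitted.fastq.gz")`, where both file names are placeholders for the two downloads of the same sample.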

Hi, I am having trouble choosing the correct FASTQ file, since there are two entries for one sample accession (ERS1042158): https://www.ebi.ac.uk/ena/data/view/ERS1042158

Can anyone please explain why there are two entries for one sample and what the difference could be? The file sizes are also slightly different (87 GB vs 84 GB).

Hi AISHA,

Would you mind posing this as a new question (rather than adding it here)? This way we try to keep the questions/answers logically structured.

thx

Okay. I am going to post it as a new question.

3.2 years ago
ATpoint 64k

I do not know what Submitted File is. Use the FASTQ file. There are ways to speed up the download; see my tutorial "Fast download of FASTQ files from the European Nucleotide Archive (ENA)".
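
If it helps to see both sets of links side by side before downloading, here is a rough Python sketch that asks the ENA Portal API file report for a study and splits out the FASTQ and submitted file URLs per run. It is separate from the tutorial mentioned above and does not speed anything up. The endpoint and the field names (`fastq_ftp`, `submitted_ftp`) are what I understand the file report to expose, so treat them as assumptions and check against the current ENA documentation; the function names are made up.

```python
import csv
import io
import urllib.parse
import urllib.request

# Assumed ENA Portal API endpoint and field names; verify before relying on them.
FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"
FIELDS = "run_accession,fastq_ftp,submitted_ftp"


def filereport_url(accession):
    """Build the file report URL for a study, sample, or run accession."""
    query = urllib.parse.urlencode(
        {
            "accession": accession,
            "result": "read_run",
            "fields": FIELDS,
            "format": "tsv",
        }
    )
    return f"{FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Turn the TSV report into one dict per run with lists of file URLs."""
    rows = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        rows.append(
            {
                "run": row["run_accession"],
                # Multiple files for one run are separated by ';'
                "fastq": [u for u in row["fastq_ftp"].split(";") if u],
                "submitted": [u for u in row["submitted_ftp"].split(";") if u],
            }
        )
    return rows


def fetch_filereport(accession):
    """Download and parse the report (requires network access)."""
    with urllib.request.urlopen(filereport_url(accession)) as resp:
        return parse_filereport(resp.read().decode("utf-8"))
```

Calling `fetch_filereport("PRJEB23709")` should then give one entry per run, which also makes it easy to keep the run accession next to the submitted file name if the latter carries the sample name.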

OK! (Thanks for the tutorial, it is very useful.)