EGA (ENA) database won't accept FASTQ files
1
0
5 weeks ago
heskett ▴ 90

Hi folks,

I keep getting this error despite uploading two files per library for paired-end FASTQ:

In run, alias:"ena-RUN-TAB-24-05-2022-00:14:25:828-37227", accession:"". In run(ena-RUN-TAB-24-05-2022-00:14:25:828-37227) Paired reads must contain two Fastq files


This is an example of the spreadsheet. I give the library the same name on two lines, one for each file of the pair (_1 and _2). I am sure they expect a different format, but I can't figure it out, and I could not find any examples in the documentation. Thanks so much.

GM12878_clone4  PRJEB52794  Illumina NovaSeq 6000   gm12878_clone4_early_rt GENOMIC RANDOM PCR  WGS PAIRED  gm12878_clone4_early_rt_1.fastq.gz
GM12878_clone4  PRJEB52794  Illumina NovaSeq 6000   gm12878_clone4_early_rt GENOMIC RANDOM PCR  WGS PAIRED  gm12878_clone4_early_rt_2.fastq.gz

ENA EGA fastq
1
I usually work via the Webin interface rather than with uploaded spreadsheets, but OK.

Can you check/confirm a few things?

• Are you sure the files are in correct FASTQ format? (Check, for instance, the head of both of them.)
• Are they 'in sync', meaning the first read in file 1 is the counterpart of the first read in file 2, and so on?
• You may need to provide the two files of a paired-end run on the same line but in different columns (as I said, I work in the Webin interface, and that is roughly how it looks there).

And rest assured, they definitely accept FASTQ files :)
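
The 'in sync' check above can be sketched in a few lines of Python. This is only a rough illustration, not an ENA tool: it assumes plain four-line FASTQ records (no wrapped sequence lines) and treats everything up to the first whitespace in the header, minus a trailing /1 or /2, as the read ID. The function names read_ids and first_mismatch are made up for this example.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield one read ID per record, assuming four lines per FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only header lines carry the read ID
            if not line.startswith("@"):
                raise ValueError(f"{path}: record {i // 4 + 1} has no '@' header")
            parts = line[1:].split()
            name = parts[0] if parts else ""
            # Older naming schemes mark the mate with a /1 or /2 suffix.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(path1, path2, limit=None):
    """Return the 1-based record number where the IDs differ, else None.

    A file that ends early also counts as a mismatch. Pass `limit` to
    inspect only the first `limit` records of large files.
    """
    pairs = zip_longest(read_ids(path1), read_ids(path2))
    for n, (id1, id2) in enumerate(pairs, start=1):
        if id1 != id2:
            return n
        if limit is not None and n >= limit:
            break
    return None
```

Calling first_mismatch("gm12878_clone4_early_rt_1.fastq.gz", "gm12878_clone4_early_rt_2.fastq.gz", limit=100000) would give None if the first 100,000 records line up.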

0
Thanks very much! Yes, I found the correct template that allows two files on one line.

0
You may be best off contacting the EGA help desk for this.

0
Thanks. I have tried, but it looks like their response time is very long. I have a journal editor asking me to upload ASAP, so I was hoping someone might know the format better. I have spent time answering other people's questions and just hope someone may be able to identify the issue easily and give a tip or two.

1
5 weeks ago

What happens if you use only one line per pair (just a wild guess)? Like so:

GM12878_clone4  PRJEB52794  Illumina NovaSeq 6000   gm12878_clone4_early_rt GENOMIC RANDOM PCR  WGS PAIRED  gm12878_clone4_early_rt_1.fastq.gz gm12878_clone4_early_rt_2.fastq.gz
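
If the spreadsheet already exists in the two-lines-per-library form, the reshaping could be scripted. Below is a minimal sketch, assuming the rows are tab-separated, that the file name is the last field, and that all other fields are identical within a pair. The function name collapse_pairs and the column split in the usage lines are my own reading of the example row, not something taken from an ENA template, so check the real template's headers before using the output.

```python
def collapse_pairs(rows):
    """Merge rows that differ only in their last field (the file name).

    `rows` is a list of lists of strings. Each group of identical metadata
    must have exactly two file names; they are written side by side, sorted
    so that _1 comes before _2.
    """
    groups = {}
    for row in rows:
        *meta, filename = row
        groups.setdefault(tuple(meta), []).append(filename)

    merged = []
    for meta, files in groups.items():
        if len(files) != 2:
            raise ValueError(f"expected 2 files for {meta[0]!r}, got {len(files)}")
        merged.append([*meta, *sorted(files)])
    return merged


# Usage with the row from the question (field boundaries are assumed).
meta = ["GM12878_clone4", "PRJEB52794", "Illumina NovaSeq 6000",
        "gm12878_clone4_early_rt", "GENOMIC", "RANDOM PCR", "WGS", "PAIRED"]
rows = [meta + ["gm12878_clone4_early_rt_1.fastq.gz"],
        meta + ["gm12878_clone4_early_rt_2.fastq.gz"]]
for row in collapse_pairs(rows):
    print("\t".join(row))
```

Grouping on every field except the last means a typo in any metadata column leaves a file unpaired, which raises an error instead of silently producing a bad row.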

1
Indeed, I'm also thinking in that direction.

0
This ended up being correct. I accidentally downloaded the template that didn't have the extra columns for adding multiple FASTQ files to one line. Thanks, everyone.