ENA fastq file shown as single-end but has 3 fastq files: X.fastq.gz X_1.fastq.gz X_2.fastq.gz
24 months ago
Mr Locuace ▴ 110

Hello, I am downloading some data from this project from ENA:

https://www.ebi.ac.uk/ena/data/search?query=PRJNA494719

There are cases where I do not understand the layout. For instance, this run is listed as single-end (the library layout says "SINGLE"): https://www.ebi.ac.uk/ena/data/view/SRX4809217

but it has three files instead, namely SRR7976417.fastq.gz, SRR7976417_1.fastq.gz and SRR7976417_2.fastq.gz.

On the other hand, this run: https://www.ebi.ac.uk/ena/data/view/SRX4809200

is listed as paired-end (it says "PAIRED") but has only a single file.

For most of the other data in this project, SINGLE runs have a single file and PAIRED runs have three files (I guess read 1, read 2 and orphan reads). For instance:

https://www.ebi.ac.uk/ena/data/view/SRX4809208

and

https://www.ebi.ac.uk/ena/data/view/SRX4809204

How can I tell whether the problematic files are really single-end or paired-end?

Thanks very much

fastq paired-end single-end ENA • 771 views
24 months ago

If there are three files, they are paired. It is not unusual for the layout metadata from SRA to be wrong.
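As a rough illustration of that rule, here is a minimal Python sketch that guesses the layout from the file names of one run. It assumes the naming pattern seen in this thread (`X.fastq.gz`, `X_1.fastq.gz`, `X_2.fastq.gz`); the function name `layout_from_files` and the returned labels are made up for the example and are not part of any ENA tool.

```python
import re

# Assumed naming pattern: <run>.fastq[.gz] or <run>_1 / <run>_2 .fastq[.gz]
FASTQ_NAME = re.compile(r"[^_]+?(?:_(?P<mate>[12]))?\.fastq(?:\.gz)?")


def layout_from_files(filenames):
    """Guess the layout of one run from its FASTQ file names (hypothetical helper)."""
    mates = set()
    for name in filenames:
        match = FASTQ_NAME.fullmatch(name)
        if match:
            # "1", "2", or None for a file without a mate suffix
            mates.add(match.group("mate"))
    if {"1", "2"} <= mates:
        # Both mate files exist; an extra unsuffixed file most likely holds orphans
        return "paired, with unpaired reads" if None in mates else "paired"
    if mates == {None}:
        # One unsuffixed file: single-end or a merged file, names alone cannot tell
        return "single file"
    return "unclear"


print(layout_from_files(
    ["SRR7976417.fastq.gz", "SRR7976417_1.fastq.gz", "SRR7976417_2.fastq.gz"]
))
```

A "single file" result says nothing about whether the reads inside are single-end or merged pairs; that needs a look at the read names.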

Thank you, @Sean Davis! But what about the single file that says "PAIRED"? How can I tell whether it is actually single-end or a merge of the read 1 and read 2 files of a paired-end run?

Check the read names. If it is a merge, the same read name should appear twice (probably in adjacent records); otherwise the read names should be unique.
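One way to run that check is sketched below in Python. It assumes plain four-line FASTQ records (no wrapped sequence lines) and that the read name is the first whitespace-separated token of the header, with an optional `/1` or `/2` suffix. The helper names `normalize`, `read_names` and `guess_layout` are invented for this example.

```python
import gzip
from itertools import islice


def normalize(header):
    """Return the read name from a header line, without '@' or a /1, /2 suffix."""
    tokens = header[1:].split()
    name = tokens[0] if tokens else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def read_names(path, limit=100_000):
    """Yield up to `limit` read names (limit=None scans the whole file)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        headers = islice(handle, 0, None, 4)  # assumes 4 lines per record
        for header in islice(headers, limit):
            yield normalize(header)


def guess_layout(names):
    """Classify a sequence of read names (labels are only a heuristic)."""
    names = list(names)
    if not names:
        return "empty"
    # Count records whose name equals the name of the record right after it
    adjacent = sum(1 for a, b in zip(names[0::2], names[1::2]) if a == b)
    if len(names) % 2 == 0 and adjacent == len(names) // 2:
        return "interleaved paired-end"
    if len(set(names)) == len(names):
        return "single-end"
    return "unclear"  # duplicated names that are not in adjacent pairs


# Usage, with the file name from the question:
# print(guess_layout(read_names("SRR7976417.fastq.gz")))
```

If the merge was done by concatenating the whole read 1 file and then the whole read 2 file, the duplicates will not be adjacent and may lie beyond the default `limit`, so scanning the full file with `limit=None` is safer in that case.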

Thank you very much, @ATpoint!
