ENA fastq file shown as single-end but has 2 fastq files: X.fastq.gz and Y.fastq.gz
10 weeks ago
Aaliya • 0

Hello, I am downloading some data from this project from ENA: https://www.ebi.ac.uk/ena/browser/view/SRX13384934

The data is listed as single-end, but there are still two fastq files (they share the same experiment accession number).

I tried to confirm this on NCBI SRA as well: https://www.ncbi.nlm.nih.gov/sra/?term=SRX13384934 . NCBI SRA also lists it as single-end, but two fastq files are given there too.

I am assuming that the two files come from the same sample, which was run twice for confirmation. Is that right?

I have run FastQC and Trimmomatic on these files, and they do differ in sequence length. Should I opt for the one with the better quality?

I need to pick one of these files to run Kallisto and then DESeq2.

Thank you in advance!

fastq single-end ENA • 441 views
9 weeks ago

It's probably the same library run twice, in which case, trim the longer one to be the same length as the shorter, then use both.


I cannot use both. To perform Kallisto quantification, I need a single file.


@swbarnes is correct. If you look at the data access tab for these entries in SRA (one example: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR17204972&display=data-access ), you will see L001 and L002 in the original file names. This means the same sample was run on two lanes.

You can cat file1.gz file2.gz > one_file.gz and use the single file as input.

Both samples were run as 75 bp. You will get a range of read lengths after trimming the data. That is normal.
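To make the cat step concrete, here is a minimal sketch using two tiny made-up FASTQ files. The file names (lane1.fastq.gz, lane2.fastq.gz, merged.fastq.gz) and the reads are placeholders, not the real ENA files. It relies on the fact that concatenated gzip files form a valid multi-member gzip stream, and it checks that the merged file holds the reads from both lanes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Two tiny stand-in FASTQ files (hypothetical lane 1 and lane 2 of one sample).
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > lane1.fastq
printf '@r3\nGGCA\n+\nIIII\n' > lane2.fastq
gzip lane1.fastq lane2.fastq

# Concatenating gzip files yields a valid multi-member gzip stream.
cat lane1.fastq.gz lane2.fastq.gz > merged.fastq.gz

# Check integrity and confirm that no reads were lost (4 lines per read).
gzip -t merged.fastq.gz
lane1_reads=$(( $(gzip -dc lane1.fastq.gz | wc -l) / 4 ))
lane2_reads=$(( $(gzip -dc lane2.fastq.gz | wc -l) / 4 ))
merged_reads=$(( $(gzip -dc merged.fastq.gz | wc -l) / 4 ))
echo "lane1=$lane1_reads lane2=$lane2_reads merged=$merged_reads"

# The merged file could then go to kallisto in single-end mode. The index name,
# output directory, and fragment length/sd values below are assumptions:
# kallisto quant -i index.idx -o out --single -l 200 -s 20 merged.fastq.gz
```

On the real data, the same read-count comparison is a cheap sanity check before quantification. As far as I know, kallisto in single-end mode also accepts several FASTQ files listed one after another, so merging is a convenience rather than a strict requirement.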


I understood what you are implying, thank you
