Question: ebi.uk RNAseq data
clizama0 wrote:

Hey,

I was downloading some paired-end data files from ebi.ac.uk and ran into a problem. In general there are just two files per sample (for paired-end data). This RNA-seq project has 4 files per sample; I think they split the data into 4 files, two for each of the usual files.

https://www.ebi.ac.uk/ena/data/view/PRJNA378234

But I'm not sure which files I have to merge.

Example: these two codes belong to the same sample, but each code has two files.

  • SRR5314617 (paired): File 1, File 2

  • SRR5314618 (paired): File 1, File 2

So my question is: do I have to merge File 1 with File 1 and File 2 with File 2, or File 1 with File 2 within code 1 and likewise within code 2? I'm not sure exactly how EBI organizes it.

Thanks

rna-seq • modified 12 months ago by Devon Ryan • written 12 months ago by clizama
Devon Ryan88k wrote:

You'll see that one column is labeled "FTP" and the other "Galaxy", so you have a single set of paired-end files. You want the FTP column unless you're using Galaxy.

written 12 months ago by Devon Ryan

Hi,

Just checking the FTP column: the codes SRR5314617 and SRR5314618 both correspond to the same sample (RNA-seq liver macrophage 1m old liver1mWT3), and each has a single set of paired-end files. So I'm a little confused about whether they split the data into two sets of paired-end files, and whether I have to merge file 1 with file 2 and file 2 with file..etc.

Thanks

written 12 months ago by clizama

If you are certain the two datasets represent an identical sample, then you could merge the resulting alignments. There is no need to do the merge at the read stage. In case you decide the merging was not appropriate, you would need to back up just one step instead of starting over. By keeping them separate you can also check whether there is any kind of batch effect (different runs, libraries, etc.).

modified 12 months ago • written 12 months ago by genomax
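
A minimal sketch of merging at the alignment stage, assuming each run has already been aligned separately and coordinate-sorted into its own BAM file. The file names are hypothetical, and the helper names `merge_command` and `merge_runs` are made up for illustration; the only external call is the basic `samtools merge <out.bam> <in1.bam> <in2.bam>` form.

```python
import shutil
import subprocess


def merge_command(output_bam, input_bams):
    """Build a `samtools merge` call: output file first, then the inputs."""
    if len(input_bams) < 2:
        raise ValueError("need at least two BAM files to merge")
    return ["samtools", "merge", output_bam, *input_bams]


def merge_runs(output_bam, input_bams):
    """Run the merge. Requires samtools on PATH and sorted input BAMs."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    subprocess.run(merge_command(output_bam, input_bams), check=True)


# Hypothetical per-run alignments for one sample:
# merge_runs("liver1mWT3.bam", ["SRR5314617.bam", "SRR5314618.bam"])
```

Keeping the per-run BAM files around means that, if the merge turns out to be inappropriate, only this one step has to be undone.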

It's possible these are replicates of the same sample. I checked the BioProject at NIH: the experiment has 15 SRA files, but the EBI website has 28.

written 12 months ago by clizama

That is odd indeed. Download one or two of the duplicated sample files and see whether the same data was uploaded twice by mistake.

written 12 months ago by genomax
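
One way to do that check, as a rough sketch: hash the sequence and quality lines of the first reads in each file and compare the results. It assumes plain 4-line FASTQ records, optionally gzipped, and it ignores the header lines because read names usually carry the run accession and would differ even for identical data. The function names and the `max_reads` cutoff are illustrative choices, not anything from the thread.

```python
import gzip
import hashlib
from itertools import islice


def fastq_digest(path, max_reads=100_000):
    """Return (reads seen, MD5 of sequence+quality lines) for the first reads."""
    opener = gzip.open if str(path).endswith(".gz") else open
    digest = hashlib.md5()
    n_reads = 0
    with opener(path, "rt") as handle:
        while n_reads < max_reads:
            record = list(islice(handle, 4))  # header, sequence, '+', quality
            if len(record) < 4:
                break  # end of file or truncated record
            digest.update(record[1].encode())
            digest.update(record[3].encode())
            n_reads += 1
    return n_reads, digest.hexdigest()


def looks_duplicated(path_a, path_b, max_reads=100_000):
    """True if both files hold the same leading reads in the same order."""
    return fastq_digest(path_a, max_reads) == fastq_digest(path_b, max_reads)


# Hypothetical local file names:
# looks_duplicated("SRR5314617_1.fastq.gz", "SRR5314618_1.fastq.gz")
```

A match on the first reads suggests a duplicate upload; a mismatch is consistent with two separate sequencing runs.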

Ah, on GEO it's clearer that those are two runs of the same samples. That happens on occasion, though one hopes they're runs from the same library prep.

written 12 months ago by Devon Ryan
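
To see which runs belong to which sample without clicking through the web page, a tab-separated run table can be grouped by sample. The sketch below makes several assumptions: that the ENA file report endpoint in `REPORT_URL` and the field names `run_accession`, `sample_alias`, and `fastq_ftp` are available as written, that `fastq_ftp` lists files separated by semicolons, and that `sample_alias` is the column that identifies the sample in this project. Check those against the actual report before relying on it.

```python
import csv
import io
import urllib.request
from collections import defaultdict

# Assumed ENA file report URL for this project; verify before use.
REPORT_URL = (
    "https://www.ebi.ac.uk/ena/portal/api/filereport"
    "?accession=PRJNA378234&result=read_run"
    "&fields=run_accession,sample_alias,fastq_ftp&format=tsv"
)


def fetch_report(url=REPORT_URL):
    """Download the run table as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode()


def runs_by_sample(tsv_text, sample_field="sample_alias"):
    """Map each sample to a list of (run accession, FASTQ files) pairs."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        files = [f for f in row["fastq_ftp"].split(";") if f]
        groups[row[sample_field]].append((row["run_accession"], files))
    return dict(groups)


# for sample, runs in runs_by_sample(fetch_report()).items():
#     print(sample, [run for run, _ in runs])
```

Samples that list two runs, each with its own pair of files, are the ones sequenced twice; each run is then aligned as its own paired-end set.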