Question: ebi.uk RNAseq data
clizama0 wrote:

Hey,

I was downloading some paired-end data files from ebi.ac.uk and ran into a problem. In general there are just two files per sample (for paired-end data). This RNA-seq project has 4 files per sample; I think they split the data into 4 files, two for each accession code.

https://www.ebi.ac.uk/ena/data/view/PRJNA378234

But I'm not sure which files I have to merge.

Example: these two codes belong to the same sample, but each code has two files (paired):

  • SRR5314617 File 1 File 2

  • SRR5314618 File 1 File 2

So my question is: do I have to merge file 1 with file 1 and file 2 with file 2? Or file 1 with file 2 within code 1, and the same for code 2? I'm not sure exactly how EBI organizes it.

Thanks

rna-seq • 153 views
ADD COMMENTlink modified 3 months ago by Devon Ryan79k • written 3 months ago by clizama0
Devon Ryan79k wrote:

You'll see that one column is labeled "FTP" and the other "Galaxy", so you have a single set of paired-end files. You want the FTP column unless you're using Galaxy.

ADD COMMENTlink written 3 months ago by Devon Ryan79k

Hi,

Just checking the FTP column: the codes SRR5314617 and SRR5314618 both correspond to the same sample (RNA-seq liver macrophage 1m old liver1mWT3), and each has a single set of paired-end files. So I'm a little confused about whether they split the data into two sets of paired-end files, and whether I have to merge file 1 with file 2 and file 2 with file..etc.

Thanks

ADD REPLYlink written 3 months ago by clizama0

If you are certain the two datasets represent an identical sample then you could merge the resulting alignments. There is no need to do the merge at the read stage. In case you decide the merging was not appropriate, you would need to back up just one step instead of starting over. You can also estimate if there is any kind of batch effect (different runs/libraries etc) by keeping them separate.
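A minimal sketch of merging at the alignment stage, assuming each run has already been aligned into its own coordinate-sorted BAM and that `samtools` is installed. The BAM file names and the helpers `build_merge_command` and `merge_sample` are hypothetical; only the run accessions and the sample name come from this thread.

```python
import subprocess
from typing import Dict, List

# Hypothetical per-run alignments: one coordinate-sorted BAM per run accession.
RUN_BAMS: Dict[str, List[str]] = {
    "liver1mWT3": ["SRR5314617.sorted.bam", "SRR5314618.sorted.bam"],
}


def build_merge_command(sample: str, run_bams: List[str]) -> List[str]:
    """Return the `samtools merge` command combining one sample's run-level BAMs."""
    if len(run_bams) < 2:
        raise ValueError(f"{sample}: need at least two BAMs to merge")
    # General form: samtools merge <out.bam> <in1.bam> <in2.bam> ...
    return ["samtools", "merge", f"{sample}.merged.bam", *run_bams]


def merge_sample(sample: str, run_bams: List[str]) -> None:
    """Run the merge; raises CalledProcessError if samtools exits with an error."""
    subprocess.run(build_merge_command(sample, run_bams), check=True)
```

Because the per-run BAMs are left untouched, undoing the merge only means deleting the merged file, and the separate runs stay available for a batch-effect check.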

ADD REPLYlink modified 3 months ago • written 3 months ago by genomax48k

It's possible these are replicates for the same sample. I checked the BioProject at NIH: the experiment has 15 SRA files, but the EBI website has 28.

ADD REPLYlink written 3 months ago by clizama0

That is odd indeed. Download one or two of the duplicate sample files and see whether the same data was uploaded twice by mistake.
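One way to run that check, sketched under the assumption that the downloads are plain or gzip-compressed FASTQ files on local disk. The helpers `content_md5` and `same_content` are hypothetical names. The hash is taken over the decompressed content, because two `.gz` files can differ byte for byte (for example in their header timestamps) while holding identical reads.

```python
import gzip
import hashlib


def content_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """MD5 of a file's content, decompressing first if the name ends in .gz."""
    opener = gzip.open if str(path).endswith(".gz") else open
    digest = hashlib.md5()
    with opener(path, "rb") as handle:
        # Read in chunks so large FASTQ files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True if both files hold identical (decompressed) content."""
    return content_md5(path_a) == content_md5(path_b)
```

If file 1 of one run matches file 1 of the other run, the data was most likely uploaded twice; if the hashes differ, the runs hold different reads.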

ADD REPLYlink written 3 months ago by genomax48k

Ah, on GEO it's clearer that those are two runs of the same samples. That happens on occasion, though one hopes they're runs from the same library prep.
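To list which samples were sequenced in more than one run, something like the following could work. It is only a sketch: it assumes you have exported a tab-separated run table with `run_accession` and `sample_alias` columns (the column names in your export may differ), and the `EXAMPLE` table contains just the two runs mentioned in this thread.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Illustrative table with only the two runs discussed above.
EXAMPLE = (
    "run_accession\tsample_alias\n"
    "SRR5314617\tliver1mWT3\n"
    "SRR5314618\tliver1mWT3\n"
)


def runs_per_sample(tsv_text: str) -> Dict[str, List[str]]:
    """Group run accessions by sample, keeping the order they appear in the table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in reader:
        groups[row["sample_alias"]].append(row["run_accession"])
    return dict(groups)


def multi_run_samples(groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only the samples that have more than one run."""
    return {sample: runs for sample, runs in groups.items() if len(runs) > 1}
```

Applied to the full run table, `multi_run_samples` shows which samples have several runs that need to be combined.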

ADD REPLYlink written 3 months ago by Devon Ryan79k