Integrated analysis of RNA-seq datasets
8 months ago
Petesview ▴ 10

Hi Bioinformaticians,

I want to perform an integrated analysis of two RNA-seq datasets. Each dataset has two conditions, and both datasets are Illumina short reads. However, the datasets have different read lengths and were sequenced on different Illumina sequencers. Let's say I have this scenario:

Dataset A = condition 1 & condition 2
Dataset B = condition 3 & condition 4

In this case, suppose I look for genes upregulated in condition 1 of dataset A, and then also want to check whether the same genes are upregulated when condition 1 is compared against condition 3. Is this technically advisable? If so, could someone please point me to an appropriate algorithm or technique for doing this? Thank you.

RNA-seq • 668 views

You'll need to account for batch effects. Search this forum and see the limma and DESeq2 vignettes. Did you have a control (baseline) in both datasets?
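
As a rough illustration of why a shared baseline matters, here is a minimal Python sketch using only NumPy. It builds a toy design matrix (intercept, condition, dataset) and checks whether it has full column rank. The sample layout, the two replicates per condition, and the helper names `design_matrix` and `is_estimable` are assumptions made for illustration; this is not what limma or DESeq2 run internally, although both will complain about a design matrix that is not full rank.

```python
import numpy as np

def design_matrix(batches, conditions):
    """Treatment-coded design: intercept, condition dummies, batch dummies.

    The first level (alphabetically) of each factor is the reference.
    Hypothetical helper for illustration only.
    """
    cond_levels = sorted(set(conditions))
    batch_levels = sorted(set(batches))
    cols = [np.ones(len(conditions))]  # intercept
    for level in cond_levels[1:]:      # one dummy per non-reference condition
        cols.append(np.array([c == level for c in conditions], dtype=float))
    for level in batch_levels[1:]:     # one dummy per non-reference batch
        cols.append(np.array([b == level for b in batches], dtype=float))
    return np.column_stack(cols)

def is_estimable(design):
    """True if every coefficient can be estimated (full column rank)."""
    return bool(np.linalg.matrix_rank(design) == design.shape[1])

# Layout from the question: conditions 1-2 only in dataset A, 3-4 only in B.
# Two replicates per condition is an assumption.
batches = ["A"] * 4 + ["B"] * 4
nested = design_matrix(
    batches, ["c1", "c1", "c2", "c2", "c3", "c3", "c4", "c4"]
)

# Hypothetical alternative: condition 1 was also profiled in dataset B.
shared = design_matrix(
    batches, ["c1", "c1", "c2", "c2", "c1", "c1", "c4", "c4"]
)

print("nested design estimable:", is_estimable(nested))
print("shared-baseline design estimable:", is_estimable(shared))
```

In the nested layout the dataset column is the sum of the condition 3 and condition 4 columns, so a condition 1 vs condition 3 difference cannot be separated from the dataset A vs dataset B difference. With a condition present in both datasets, the dataset effect becomes estimable.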


Thanks for your reply. I don't have a control condition that is shared between the two datasets, so each condition appears in only one dataset. I can see that having a shared control would be preferable, but unfortunately that isn't the case here, so I'm not sure whether integrating them is feasible.


If these are two different datasets you may have more confounding factors than just read length and different sequencers.

For example, if these were generated by different labs (or just by different people), the sample preparation or cell culture conditions may differ between datasets. So before you start looking at batch correction methods, you should first look at how the samples were handled before sequencing and make sure you are comparing like-for-like. Otherwise the analyses may not be technically sound.


Thanks for your reply. That is the case for me: the datasets were generated by different labs. The conditions themselves are the ones I want, but there are slight differences in sample preparation and sequencing, since one dataset was sequenced on an Illumina NextSeq whilst the other was sequenced on a HiSeq. On a related note, if I'm not mistaken, the analyses of TCGA patients done by cBioPortal were batch corrected. The RNA-seq reads from these patients must have been generated across multiple experiments, and maybe even on different sequencers?
