Question: Does the read length of RNA-seq data affect the results?
chaudharyc61 (India) wrote 12 days ago:

Hello everyone

As the title says: does the read length of RNA-seq data affect the results? I have wild-type data that is 75 bp paired-end, while the mutant data is 150 bp paired-end.

After mapping, does that affect the DEGs?

Thank you, Chandan Kumar

rna-seq deseq2 next-gen • 146 views
modified 12 days ago by ponganta • written 12 days ago by chaudharyc61

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0697-y

As noted by @ATpoint, you should use comparable read lengths within a single analysis as a starting point.

modified 12 days ago • written 12 days ago by GenoMax

@OP, this is actually a general principle. If you compare groups in a statistical framework, you must make sure that the only difference between them is the biological effect you want to test. Anything else that is specific to one group is a confounder.

modified 12 days ago • written 12 days ago by ATpoint

ATpoint wrote 12 days ago:

I would anticipate that the impact would be minor on a global scale, but individual genes might be affected. Longer reads improve alignment, and false alignments could be reduced since longer reads are more likely to map uniquely. To avoid mappability bias, I would probably trim all data to a constant length, for example with seqtk, and then remap.

The fact that the two groups differ in how they were sequenced implies that they might have been produced at different time points. Is that the case? If so, the experiment would be confounded; hopefully the confounding effect does not mask any meaningful biological effects. Can you elaborate?

modified 12 days ago • written 12 days ago by ATpoint
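The trimming step suggested in this answer could look roughly like the following. This is a minimal sketch in plain Python rather than seqtk itself; the function name `trim_fastq` and the 75-base target in the usage note are assumptions for illustration. It truncates each read and its quality string to a fixed length from the 5' end, which is what bringing the 150 bp data down to the 75 bp of the other group would require.

```python
from typing import Iterable, Iterator


def trim_fastq(lines: Iterable[str], max_len: int) -> Iterator[str]:
    """Yield FASTQ lines with each read truncated to at most max_len bases.

    Bases are kept from the 5' end; the quality string is truncated to match.
    Reads already shorter than max_len are passed through unchanged.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")

    record = []
    for line in lines:
        stripped = line.rstrip("\r\n")
        if not stripped and not record:
            continue  # tolerate blank lines between records
        record.append(stripped)
        if len(record) == 4:
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in {header!r}")
            yield header
            yield seq[:max_len]
            yield plus
            yield qual[:max_len]
            record = []

    if record:
        raise ValueError("input ended in the middle of a FASTQ record")
```

With an open file handle `fh`, something like `"\n".join(trim_fastq(fh, 75))` would give the trimmed records. For paired-end data both mate files would need the same treatment before remapping, so that the pairs stay in sync.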

Yes, the two groups were sequenced at different time points.

written 12 days ago by chaudharyc61

ponganta wrote 12 days ago:

What kind of analyses do you want to conduct? How do you quantify (mapping or quasi-mapping), and what kind of reference do you use? Do you want to compare WT and mutant under certain conditions?

To add to @ATpoint and @GenoMax: if you want to find DEGs between WT and mutant, you might see a pretty hefty batch effect. Make sure to investigate such effects via clustering and PCA of the samples before running the DGE analysis.

written 12 days ago by ponganta
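As a rough illustration of the sample-level PCA mentioned in this answer, here is a minimal sketch in plain Python. The function name `first_pc` and the toy count table are hypothetical; the counts are only log-transformed, with no library-size normalisation, so this is a stand-in for the variance-stabilised PCA one would more likely run in a DESeq2 workflow. It computes the PC1 score of each sample and the fraction of variance that PC1 explains, using power iteration on the sample-by-sample Gram matrix.

```python
import math


def first_pc(counts, iterations=200):
    """Return (PC1 score per sample, fraction of variance explained by PC1).

    counts: list of genes, each a list of per-sample read counts.
    """
    n = len(counts[0])

    # log2(count + 1), then centre each gene across samples.
    centred = []
    for gene in counts:
        logged = [math.log2(c + 1.0) for c in gene]
        mean = sum(logged) / n
        centred.append([v - mean for v in logged])

    # Sample-by-sample Gram matrix; its top eigenvector gives the PC1 scores.
    gram = [[sum(g[i] * g[j] for g in centred) for j in range(n)]
            for i in range(n)]
    total = sum(gram[i][i] for i in range(n))
    if total == 0:
        raise ValueError("no variation between samples")

    def apply(vec):
        return [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]

    # Power iteration from an arbitrary, non-constant start vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        new = apply(vec)
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            raise ValueError("start vector lies in the null space")
        vec = [v / norm for v in new]

    eigenvalue = sum(v * w for v, w in zip(vec, apply(vec)))
    scores = [math.sqrt(eigenvalue) * v for v in vec]
    return scores, eigenvalue / total


# Hypothetical toy table: 4 genes x 4 samples (WT, WT, mutant, mutant).
toy_counts = [
    [10, 12, 100, 110],
    [50, 48, 52, 49],
    [200, 210, 20, 22],
    [5, 6, 5, 7],
]
pc1_scores, pc1_fraction = first_pc(toy_counts)
```

In the toy table the two WT samples get PC1 scores of one sign and the two mutant samples the other. As the replies in this thread point out, when batch and condition coincide such a separation cannot be attributed to either one.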

The OP states that the read length is entirely confounded with the biological condition. Thus, you won't be able to identify it as a batch effect on a PCA.

written 12 days ago by i.sudbery

Unfortunately, the OP also states that the two libraries were constructed in different experiments, hence the likely batch effect I mentioned. Sorry for my imprecise wording! Maybe ComBat will be of use here? But to @chaudharyc61: I doubt that you can successfully conduct DGE analyses in this situation. Look out for batch effects using a PCA. If you find that PC1 explains most of the variation and clearly separates WT and mutant into two groups, this would be indicative of a batch effect from comparing different experiments (i.e. different libraries made by different people at different times with different technology).

modified 11 days ago • written 11 days ago by ponganta

If group is confounded with batch, you cannot correct for it. If the groups separate, this can be due to biology, batch, or both; there is no way to tell.

modified 11 days ago • written 11 days ago by ATpoint

I concur. When group is 100% confounded with batch, it is mathematically impossible to correct for it, irrespective of how fancy the tool you use is.

written 11 days ago by i.sudbery
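The "mathematically impossible" point can be made concrete with a small sketch. Assuming a linear model with an intercept, a group term, and a batch term (the helper names `design_matrix` and `matrix_rank` and the labels "mutant" and "B" are hypothetical), the design matrix loses a rank when batch coincides exactly with group. The group and batch columns are then identical, so no tool can estimate their effects separately.

```python
from fractions import Fraction


def design_matrix(groups, batches):
    """Rows of [intercept, group indicator, batch indicator], one per sample.

    Assumes two groups (reference level: anything but "mutant") and
    two batches (reference level: anything but "B").
    """
    return [[1, int(g == "mutant"), int(b == "B")]
            for g, b in zip(groups, batches)]


def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


groups = ["WT", "WT", "mutant", "mutant"]

# Situation in this thread: every WT sample in one batch, every mutant in the other.
confounded = design_matrix(groups, ["A", "A", "B", "B"])

# Hypothetical balanced design: each batch contains both groups.
balanced = design_matrix(groups, ["A", "B", "A", "B"])

print(matrix_rank(confounded), "of 3 columns are independent (confounded)")
print(matrix_rank(balanced), "of 3 columns are independent (balanced)")
```

Only in the balanced layout does the matrix have full column rank, which is why batch could be modelled there but not in the OP's experiment.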