GC content and length bias in RNA-seq data
6.1 years ago

Hi, I have seven RNA-seq samples in two groups (control, treated) that were sequenced by two different companies (controls = BGI, treated = Novogene). I used NOISeq for quality control of the count data, and the results recommended normalizing for GC content and gene length. However, I have read some papers and found that GC content and length normalization can introduce bias when comparing gene expression between two groups. So I would like to use normalization factors instead of size factors in DESeq2 (as described in the vignette "Differential analysis of count data - the DESeq2 package").

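# Placeholder factors as in the DESeq2 vignette: random values, not real GC/length offsets
normFactors <- matrix(runif(nrow(dds) * ncol(dds), 0.5, 1.5),
                      ncol = ncol(dds), nrow = nrow(dds),
                      dimnames = list(1:nrow(dds), 1:ncol(dds)))
# Scale each gene's factors so that their geometric mean across samples is 1
normFactors <- normFactors / exp(rowMeans(log(normFactors)))
normalizationFactors(dds) <- normFactors
# sizeFactors(dds) <- normFactors  # dropped: size factors are one value per sample, not a gene-by-sample matrix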

Is this the correct way to normalize and to supply the normalization factors to DESeq2? Thank you in advance. Best regards, Parvin
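For readers following the snippet, here is a minimal base-R sketch of what its scaling line does. The 3-gene by 4-sample matrix and its values are made up for illustration (they are not real GC or length offsets), and `rowGeoMeans`, `scaled` and `scaledGeoMeans` are hypothetical names. Dividing each row by its geometric mean leaves factors whose per-gene geometric mean is 1, which is the scaling applied before the matrix is handed to `normalizationFactors()`.

```r
# Toy gene-by-sample matrix of positive factors (hypothetical values)
set.seed(1)
normFactors <- matrix(runif(3 * 4, 0.5, 1.5), nrow = 3, ncol = 4,
                      dimnames = list(paste0("gene", 1:3), paste0("sample", 1:4)))

# Geometric mean of each row (gene) across samples
rowGeoMeans <- exp(rowMeans(log(normFactors)))

# Divide every row by its own geometric mean; the vector is recycled down
# the columns, so element [i, j] is divided by rowGeoMeans[i]
scaled <- normFactors / rowGeoMeans

# After scaling, each gene's geometric mean is 1 (up to floating-point error)
scaledGeoMeans <- exp(rowMeans(log(scaled)))
print(round(scaledGeoMeans, 10))
```

The scaling only fixes the overall level of each gene's factors. Whether the factors themselves are meaningful depends entirely on where they come from, and random draws like the ones above carry no GC or length information.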

rna-seq • 1.6k views

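It is not valid to compare groups whose data were generated differently, in your case by different companies. All of your controls come from one company and all of your treated samples from the other, so the groups are confounded: technical differences between the two providers cannot be separated from the biological effect you want to measure.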
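The confounding can be checked directly from the sample sheet. Below is a minimal sketch in base R, assuming a split of 3 controls and 4 treated samples (the post only says seven samples in total, so the split and the column names `condition` and `company` are assumptions). If the design matrix containing both terms has a rank lower than its number of columns, the company effect and the treatment effect cannot be estimated separately.

```r
# Hypothetical sample sheet: 3 controls from BGI, 4 treated from Novogene
coldata <- data.frame(
  condition = factor(c(rep("control", 3), rep("treat", 4))),
  company   = factor(c(rep("BGI", 3), rep("Novogene", 4)))
)

# Cross-tabulation: each condition occurs with exactly one company
print(table(coldata$condition, coldata$company))

# Design matrix with both terms: intercept, company, condition
design <- model.matrix(~ company + condition, data = coldata)

# Compare the number of coefficients with the rank of the design matrix
designRank <- qr(design)$rank
cat("columns:", ncol(design), "rank:", designRank, "\n")
```

A rank below the column count means the company and condition columns carry the same information. As far as I know, DESeq2 stops with a "model matrix is not full rank" error when given such a design, so no normalization choice can rescue the comparison.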


In addition to what Wouter has said, it looks like you are making your situation overly complex. You have two datasets from two different platforms. In general, such datasets can be analysed together with batch/platform included as a covariate in the design, but this only works when batch is not completely confounded with the condition of interest, and in your design it is. Even when it is possible, you must exercise caution: I have done this in the past, but we eventually decided not to proceed for fear of criticism.
