Question: Prefiltering gene lists and DESeq
jjrin wrote (8 months ago):

Hello, I am running DESeq on my RNA-Seq data. However, I have filtered each condition individually, based on the replicates within that condition. I have 5 conditions with 5 replicates each, and I filtered each condition on the row means of its replicates (genes with a mean of 5 or more reads are kept). Now that I am running DESeq, it does not let me create a DESeq data set or make any comparisons.

I receive the error below every time. It is likely because my conditions no longer contain the same number of genes after the prefiltering.

longer object length is not a multiple of shorter object length
Error in Ops.factor(a$V1, l[[1]]$V1) :
  level sets of factors are different

I have been using DESeqDataSetFromHTSeqCount from the DESeq2 R package, with each replicate in a separate text file containing gene names and counts.

Is there any way for me to remove the genes that are not shared between all of the conditions, or to do the comparison without regard to genes that are only present in certain conditions? Thank you!
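As a rough illustration of what "keeping only the shared genes" could look like, here is a minimal base R sketch. The table names and toy values are hypothetical; the V1/V2 column names assume each file was read with read.table(file, header = FALSE), which matches the a$V1 in the error above. It only shows the mechanics of intersecting gene lists, and is not a statement about whether adjusted counts are suitable input for DESeq2.

```r
# Toy stand-ins for per-replicate count tables (gene id in V1, count in V2).
# In practice these would come from read.table() on each count file.
rep_tables <- list(
  ctrl_1  = data.frame(V1 = c("geneA", "geneB", "geneC", "geneD"),
                       V2 = c(10, 0, 25, 7)),
  ctrl_2  = data.frame(V1 = c("geneA", "geneC", "geneD"),
                       V2 = c(12, 30, 5)),
  treat_1 = data.frame(V1 = c("geneA", "geneB", "geneC"),
                       V2 = c(50, 3, 8))
)

# Genes that appear in every table.
shared_genes <- Reduce(
  intersect,
  lapply(rep_tables, function(tab) as.character(tab$V1))
)

# One genes x samples matrix, with rows in the same order for every sample.
count_matrix <- sapply(rep_tables, function(tab) {
  tab$V2[match(shared_genes, as.character(tab$V1))]
})
rownames(count_matrix) <- shared_genes

print(count_matrix)
```

A matrix built this way has identical rows for every sample, which is what a single count matrix for all samples requires.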

rna-seq deseq R • 393 views
h.mon wrote (8 months ago):

For DESeq2, you don't need pre-filtering; it only helps by speeding things up (just a tiny bit, in my experience) and reducing memory usage. From the manual:

Note that more strict filtering to increase power is automatically applied via independent filtering on the mean of normalized counts within the results function.
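If a light pre-filter is still wanted, one way to avoid mismatched gene lists is to apply a single filter across all samples at once rather than per condition. The sketch below uses base R on a hypothetical toy matrix; the threshold of 10 total reads is an arbitrary illustration, not a recommendation from this thread.

```r
# Hypothetical toy count matrix: rows are genes, columns are samples.
count_mat <- matrix(
  c( 0,  1,  0,  2,   # geneA: low in every sample
    20,  0,  0,  1,   # geneB: expressed in one sample only
    15, 18, 22, 30),  # geneC: expressed in every sample
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("ctrl_1", "ctrl_2", "treat_1", "treat_2"))
)

# One filter over all samples together, so every sample keeps the same rows.
keep <- rowSums(count_mat) >= 10
filtered <- count_mat[keep, , drop = FALSE]

print(rownames(filtered))
```

Because the filter is computed on the full matrix, a gene is either kept for all samples or dropped for all samples, so the sample columns never end up with different lengths.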


Thanks for your reply!

The problem is that my gene lists are already prefiltered, because I did batch correction on them (using svaseq), which requires prefiltering. I batch corrected them in subsets, one treatment with one master control group, so all of my gene lists have different lengths. Is there any way that I can continue to use these new gene lists with adjusted counts even though they are different lengths?

Reply written 8 months ago by jjrin

"I batch corrected them in subsets"

This strikes me as the opposite of how batch correction should be performed.

In general, for known batch effects, you can account for them by including them in the model; DESeq2 can do this.
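A minimal sketch of what including a batch term in the model might look like, using the simulated data set that ships with DESeq2. The batch labels here are hypothetical and assigned by hand purely for illustration; with real data they would come from the sample sheet.

```r
library(DESeq2)

# Simulated example data from DESeq2: 1000 genes, 12 samples,
# with a two-level 'condition' factor (A, B) in colData.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical batch labels, alternating so that batch is not
# confounded with condition.
dds$batch <- factor(rep(c("b1", "b2"), times = 6))

# Put the batch term in the design; the variable of interest goes last.
design(dds) <- ~ batch + condition

# Fit the model on all samples together, then test condition B vs A
# while adjusting for batch.
dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "B", "A"))

print(head(res))
```

The point of the design choice is that all samples and all genes stay in one data set, and the batch adjustment happens inside the model rather than by editing the counts in subsets. For surrogate variables estimated with svaseq, a commonly used pattern is similar: add them as columns of colData and as terms in the design.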

Reply written 8 months ago by h.mon