Will transcripts of different biotypes summarized under the same gene ID be a problem in DESeq2?
2.2 years ago
Zhaoming


I was using DESeq2 to analyze transcript quantification data generated from Salmon.

I generated a tx2gene file that associates each transcript with its gene ID. The result suggested that in our Msx1 KO mice, there was only a ~70% reduction of the gene that was knocked out. After taking a closer look, I realized that there are two transcripts under this gene ID: one is a protein-coding transcript, which was almost depleted in our KO group, whereas the other, a retained-intron transcript, did not change much.

Although the summarized gene-level quantification still showed a substantial amount of Msx1 expression (~30%) in the knockout group, the remaining signal came entirely from non-coding transcripts, which will not provide the same biological function as the protein-coding ones.
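
To make the arithmetic concrete, here is a minimal Python sketch with made-up numbers. The transcript labels and abundances are hypothetical, picked only to mirror the ~70%/~30% split described above, and the coding transcript is set to exactly zero in the KO for simplicity. It assumes gene-level summarization simply adds up the transcripts listed under one gene ID.

```python
# Hypothetical per-transcript abundances (arbitrary units). These are
# illustrative values only, not measurements from the actual experiment.
wild_type = {"Msx1_protein_coding": 70.0, "Msx1_retained_intron": 30.0}
knockout = {"Msx1_protein_coding": 0.0, "Msx1_retained_intron": 30.0}


def gene_level(abundances):
    """Sum all transcript abundances assigned to one gene ID."""
    return sum(abundances.values())


def percent_reduction(before, after):
    """Percent decrease from `before` to `after`."""
    return 100.0 * (before - after) / before


# Reduction seen after collapsing both transcripts into one gene-level value.
gene_reduction = percent_reduction(gene_level(wild_type), gene_level(knockout))

# Reduction seen for each transcript on its own.
coding_reduction = percent_reduction(
    wild_type["Msx1_protein_coding"], knockout["Msx1_protein_coding"]
)
intron_reduction = percent_reduction(
    wild_type["Msx1_retained_intron"], knockout["Msx1_retained_intron"]
)

print(f"gene-level reduction:      {gene_reduction:.0f}%")
print(f"protein-coding reduction:  {coding_reduction:.0f}%")
print(f"retained-intron reduction: {intron_reduction:.0f}%")
```

With these toy values the gene-level number reports a 70% reduction even though the protein-coding transcript is fully gone, which is the pattern I am seeing.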

I am wondering if this would be a concern for the quantification of this gene, and genes that produce both coding and non-coding transcripts in general.

Any insight or suggestion will be greatly appreciated!

Thanks, Zhaoming

