Gene vs Transcript Level DEG Analysis
3.6 years ago
asumani ▴ 70

Hi all,

I use Ballgown v2.20.0 for DEG analysis after HISAT and StringTie. I ran both gene-level and transcript-level analyses with the stattest function. At the gene level, I got only 50 genes with qval < 0.05. At the transcript level, I got 1127 transcripts with qval < 0.05. I want to perform a gene-level analysis so that I can compare against another set of expression results that was analyzed at the gene level.

Since I want to capture as many genes as possible, I kept only the most highly expressed isoform to represent each gene among the 1127 transcripts. That gave me 1042 'gene names'. I know this sounds like a crude simplification.

Do you have any idea why I get so few genes in the gene-level analysis? What is the main biological concern with keeping only the most highly expressed isoform in the transcript-level analysis?

Thanks in advance,

ballgown RNA-Seq • 1.2k views
3.6 years ago
ATpoint 82k

Aggregate transcripts to the gene level with tximport, then use DESeq2 or edgeR, both of which have extensive documentation. Gene-level counts should be built from the sum of all isoforms of a gene, not from the most highly expressed isoform alone. tximport takes care of that.
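To make the difference concrete, here is a minimal sketch in plain Python. It is not tximport (an R package that does more than this); it only illustrates summing isoforms versus keeping the top isoform. The transcript names, gene names, counts, and the helper functions `sum_isoforms` and `top_isoform` are all made up for illustration.

```python
# Hypothetical transcript-level counts: transcript -> [sample1, sample2].
tx_counts = {
    "tx1": [100, 20],
    "tx2": [10, 90],
    "tx3": [50, 55],
}

# Hypothetical transcript-to-gene mapping.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}


def sum_isoforms(tx_counts, tx2gene):
    """Gene-level counts as the per-sample sum over all isoforms of a gene."""
    gene_counts = {}
    for tx, counts in tx_counts.items():
        gene = tx2gene[tx]
        current = gene_counts.get(gene, [0] * len(counts))
        gene_counts[gene] = [a + b for a, b in zip(current, counts)]
    return gene_counts


def top_isoform(tx_counts, tx2gene):
    """Gene-level counts taken from the isoform with the highest total count."""
    best = {}
    for tx, counts in tx_counts.items():
        gene = tx2gene[tx]
        if gene not in best or sum(counts) > sum(best[gene]):
            best[gene] = list(counts)
    return best


gene_sum = sum_isoforms(tx_counts, tx2gene)
gene_top = top_isoform(tx_counts, tx2gene)

print("sum of isoforms:", gene_sum)
print("top isoform only:", gene_top)
```

In this toy example the two isoforms of geneA swap dominance between the samples. The summed counts for geneA are the same in both samples, while the top-isoform counts suggest a large drop. That kind of isoform switching is one way the two summaries can disagree.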


Thanks a lot! I used tximport followed by limma/voom, since my other data set was already analyzed with limma. The limma/voom option is also described in the tximport manual, so I think that is all right.


Yes, using limma/voom is perfectly valid.
