Hi all,
I use Ballgown v2.20.0 for DEG analysis after HISAT and StringTie. I ran both gene-level and transcript-level analyses with the stattest function.
At the gene level, I got only 50 genes with qval < 0.05. At the transcript level, I got 1127 transcripts with qval < 0.05. I want to work at the gene level so that I can compare against another expression data set that was analyzed at the gene level.
Since I want to capture as many genes as possible, I kept only the most highly expressed isoform to represent each gene among the 1127 transcripts. That gave me 1042 'gene names'. I know this sounds like a crude simplification.
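To make the selection step concrete, here is a minimal sketch in Python of what "keep the most highly expressed significant isoform per gene" amounts to. The field names (gene_id, transcript_id, mean_fpkm, qval) and the toy values are hypothetical, chosen only for illustration; they are not Ballgown's actual output columns, and this is not the exact code that was run.

```python
from typing import Dict, List


def top_isoform_per_gene(rows: List[dict], q_cutoff: float = 0.05) -> Dict[str, dict]:
    """For each gene, keep the most highly expressed transcript
    among those with qval < q_cutoff."""
    best: Dict[str, dict] = {}
    for row in rows:
        # Skip transcripts that are not significant at the chosen cutoff.
        if row["qval"] >= q_cutoff:
            continue
        current = best.get(row["gene_id"])
        # Keep this transcript if it is the first, or the most expressed so far.
        if current is None or row["mean_fpkm"] > current["mean_fpkm"]:
            best[row["gene_id"]] = row
    return best


# Toy transcript-level results (hypothetical values).
rows = [
    {"gene_id": "geneA", "transcript_id": "tA.1", "mean_fpkm": 12.0, "qval": 0.01},
    {"gene_id": "geneA", "transcript_id": "tA.2", "mean_fpkm": 30.5, "qval": 0.03},
    {"gene_id": "geneA", "transcript_id": "tA.3", "mean_fpkm": 80.0, "qval": 0.40},
    {"gene_id": "geneB", "transcript_id": "tB.1", "mean_fpkm": 5.0, "qval": 0.002},
    {"gene_id": "geneC", "transcript_id": "tC.1", "mean_fpkm": 50.0, "qval": 0.20},
]

selected = top_isoform_per_gene(rows)
for gene, row in sorted(selected.items()):
    print(gene, row["transcript_id"])
```

In the toy data, geneA's most abundant isoform overall (tA.3) is not significant, so the representative is the top isoform among the significant ones only (tA.2). geneC drops out entirely because none of its transcripts pass the cutoff.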
Do you have any idea why I get so few genes in the gene-level analysis? What is the main biological concern with keeping only the most highly expressed isoform from the transcript-level analysis?
Thanks in advance,
Thanks a lot! I used tximport followed by limma/voom, since my other data set was already analyzed with limma. The limma/voom option is also described in the tximport manual, so I think that is all right.
Yes, using limma/voom is perfectly valid.