RNA-Seq Reads Count
18 months ago
PBC ▴ 10

Hi Everyone

I have a question about the steps that come before a differential gene expression (DEG) analysis with DESeq2. I have counts for 2 conditions, each with 3 replicates. After I ran the DEG analysis, I noticed that there are duplicated genes in the final result, because I used the gene id as the identifier when I did the counting with HTSeq. So it seems that the counts were made for different transcripts of the same gene.

For instance

    Gene ID        Count        Gene Name

    A               10            KDR
    B               12            KDR

I think I should sum these counts, since they come from the same gene, but I am not sure about this. I would then have a read count of 22 for KDR in the file, and I would run the analysis on that single count instead of running the DEG analysis for A and for B separately.
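To make the idea concrete, here is a minimal sketch of summing counts by gene name, using only the two rows from the example above. The function name `sum_counts_by_name` is made up for illustration, and whether collapsing is appropriate at all is exactly what is being asked here. On a real table the same sum would have to be done per sample, on the raw counts.

```python
from collections import defaultdict

# Toy rows from the example above: (gene_id, count, gene_name).
rows = [
    ("A", 10, "KDR"),
    ("B", 12, "KDR"),
]

def sum_counts_by_name(rows):
    """Add up the counts of all gene IDs that share a gene name.

    Hypothetical helper for illustration only.
    """
    totals = defaultdict(int)
    for gene_id, count, gene_name in rows:
        totals[gene_name] += count
    return dict(totals)

collapsed = sum_counts_by_name(rows)
print(collapsed)  # {'KDR': 22}
```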

My question is: should I sum the counts from different transcripts that fall in the same gene before running the DEG analysis? If not, why?

I tried to find an answer on the forums, but I have not found one so far. Sorry if this is a basic question; I have only just started doing this type of analysis.

Thanks in advance for your support.

HTSeq RNA-SEQ DEG reads DESeq2 • 983 views

Is this a problem for a significant number of genes? Do you have gene counts or transcript counts? I don't think HTSeq can generate transcript counts. If two different gene IDs have the same gene name (which I know happens occasionally in Ensembl), I'd stick with the unique IDs all the way through, until the very end of the analysis.
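One way to check how many genes are affected is to list the gene names that map to more than one gene ID. A minimal sketch, assuming you already have an ID-to-name mapping in memory; the helper `names_with_multiple_ids` and the third entry (`C`, `TP53`) are invented for the example, only A, B and KDR come from the question.

```python
from collections import defaultdict

def names_with_multiple_ids(id_to_name):
    """Return {gene_name: [gene_ids]} for names shared by several IDs.

    Hypothetical helper for illustration only.
    """
    ids_by_name = defaultdict(set)
    for gene_id, gene_name in id_to_name.items():
        ids_by_name[gene_name].add(gene_id)
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }

# A and B are from the question; C/TP53 is an invented non-duplicated entry.
id_to_name = {"A": "KDR", "B": "KDR", "C": "TP53"}

duplicates = names_with_multiple_ids(id_to_name)
print(duplicates)  # {'KDR': ['A', 'B']}
```

If the resulting dictionary is small compared with the number of rows in the count table, the duplicated names can simply be reviewed by hand at the end of the analysis.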


Hi!

Thank you for your reply.

I think I could get a different result, since the DEG analysis is done per gene ID row. For now, I get different DEG results for A and for B. I think this is a problem for the DEG analysis of this particular gene, since I do not have many duplicates in my count table.

As for HTSeq, I think this tool generates counts for transcripts, since I have different gene IDs that fall in the same gene. I used these parameters to do the counting:

    htseq-count -t exon --mode union --stranded no -i gene_id "$file" "$gtf" > "${name}_count.txt"

Thanks
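With `-i gene_id`, htseq-count groups all `exon` features that share the same `gene_id` attribute, so each row of the output corresponds to one `gene_id` in the GTF. To see which gene name each of those IDs carries, the attribute column of the GTF can be parsed. A minimal sketch, assuming the annotation has a `gene_name` attribute (not every GTF does); the two GTF lines below, including the contig name, coordinates and transcript IDs, are made up, and `parse_attributes` is a hypothetical helper.

```python
import re

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')

def parse_attributes(gtf_line):
    """Return the attributes in column 9 of one GTF line as a dict."""
    fields = gtf_line.rstrip("\n").split("\t")
    return dict(ATTR_RE.findall(fields[8]))

# Invented GTF lines: only the IDs A, B and the name KDR come from the thread.
gtf_lines = [
    'contig1\tsrc\texon\t100\t200\t.\t+\t.\t'
    'gene_id "A"; transcript_id "A.1"; gene_name "KDR";',
    'contig1\tsrc\texon\t300\t400\t.\t+\t.\t'
    'gene_id "B"; transcript_id "B.1"; gene_name "KDR";',
]

id_to_name = {}
for line in gtf_lines:
    attrs = parse_attributes(line)
    # Fall back to an empty name if the annotation lacks gene_name.
    id_to_name[attrs["gene_id"]] = attrs.get("gene_name", "")

print(id_to_name)  # {'A': 'KDR', 'B': 'KDR'}
```

If A and B turn out to be distinct `gene_id` values in the annotation, the rows in the count table are gene-level counts for two different annotated genes that happen to share a name, not two transcripts of one gene.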
