How to calculate allele specific expression from RNAseq data
8.0 years ago
juncheng ▴ 200

As far as I know, to calculate allele-specific expression (ASE), people take advantage of known SNPs: the ratio of reads supporting each allele at a SNP gives the ASE.

The question is: what if there is more than one known SNP on the same transcript (or gene?) and the ratios are likely to differ? How do we treat this? Do we take a naive average, or how exactly is it done?

Best.

Jun

RNA-Seq • 3.9k views

Thanks. Here is an example:

POS       AlleleFreq   REF    ALT   DP   Gene
6245603   0.36         T/16   C/9   25   RPL22
6245623   0.28         T/18   C/7   25   RPL22
6246324   0.16         A/16   G/3   19   RPL22
6246348   0.35         T/13   C/7   20   RPL22


AlleleFreq is calculated by dividing the ALT count by the total depth (DP). As you can see, the ratio is not identical across the same transcript. How would you calculate the allele frequency then?

By the way, the data are from single-cell RNA-seq of a cancer sample.
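As a quick check, the AlleleFreq column can be reproduced from the counts in the table as the ALT count divided by the total depth at each site. The sketch below assumes DP is simply REF + ALT (true for all four rows); the variable and function names are illustrative only.

```python
# Per-site read counts taken from the table: (position, ref_count, alt_count).
sites = [
    (6245603, 16, 9),
    (6245623, 18, 7),
    (6246324, 16, 3),
    (6246348, 13, 7),
]


def alt_fraction(ref_count, alt_count):
    """Fraction of reads at one site that support the ALT allele."""
    depth = ref_count + alt_count  # assumption: DP = REF + ALT
    return alt_count / depth


# Map each position to its ALT fraction.
fractions = {pos: alt_fraction(ref, alt) for pos, ref, alt in sites}

for pos, frac in fractions.items():
    print(pos, round(frac, 2))
```

The per-site values differ (roughly 0.16 to 0.36) because each site has only 19 to 25 reads, so some spread between sites is expected from sampling alone.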


There's no need to calculate allele frequency. You split the read counts covering each informative SNP according to the allele they support and then count accordingly.


Sorry for using the term "allele frequency" here. What I show as "AlleleFreq" is the read count from one allele divided by the total read count at that locus.

I added the read count for each base. Could you give me an example of how to calculate allele-specific expression in this case?


Assuming that those positions are the only ones that represent that gene, the counts would be 26 for one allele and 63 for the other. You test per-gene, not per-position.
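A minimal sketch of the per-gene calculation, using the four RPL22 sites from the example. The summation follows the comment above; the choice of an exact two-sided binomial test against a 50:50 null is an assumption made here for illustration (the thread does not name a test), and it treats every counted read as independent.

```python
from math import comb

# (ref_count, alt_count) for the four informative SNPs in RPL22.
sites = [(16, 9), (18, 7), (16, 3), (13, 7)]

# Pool the counts over all informative SNPs in the gene.
ref_total = sum(ref for ref, _ in sites)
alt_total = sum(alt for _, alt in sites)


def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k (assumed test, not from the thread).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


p_value = binomial_two_sided(alt_total, ref_total + alt_total)
print(ref_total, alt_total, p_value)
```

The pooled totals are 63 and 26, matching the numbers quoted above. A single test is then run on the gene rather than one test per position.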


Thanks, I now understand what you mean. But isn't there a haplotype problem? The two haplotypes could be TTAT/CCGC, TTGC/CCAT, TCGT/CTAC, ...?


Unfortunately I don't know what your question is. I'm assuming that the reference allele describes one haplotype. If that's not the case, then calculate things accordingly.
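To illustrate what "calculate things accordingly" could look like, here is a sketch that sums reads per haplotype once the phase is known. It assumes phasing information is available from somewhere else (it cannot be read off the table), and the function name `haplotype_totals` is hypothetical.

```python
# Per-site allele counts from the example table, in position order.
site_counts = [
    {"T": 16, "C": 9},
    {"T": 18, "C": 7},
    {"A": 16, "G": 3},
    {"T": 13, "C": 7},
]


def haplotype_totals(hap1, hap2, counts=site_counts):
    """Sum reads per haplotype.

    hap1 and hap2 are strings giving the allele each haplotype
    carries at every site, e.g. "TTAT" and "CCGC".
    """
    if not (len(hap1) == len(hap2) == len(counts)):
        raise ValueError("need exactly one allele per site in each haplotype")
    total1 = sum(site[allele] for site, allele in zip(counts, hap1))
    total2 = sum(site[allele] for site, allele in zip(counts, hap2))
    return total1, total2


# The three candidate phasings mentioned in the question.
for hap1, hap2 in [("TTAT", "CCGC"), ("TTGC", "CCAT"), ("TCGT", "CTAC")]:
    print(hap1, hap2, haplotype_totals(hap1, hap2))
```

With these counts the three phasings give 63/26, 44/45 and 39/50, so the apparent imbalance depends strongly on which alleles sit on the same haplotype.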


Thanks again, you helped me a lot.

8.0 years ago

Just sum the counts over each informative SNP in the transcript. You'll note that this is identical to how differential coverage of a transcript is dealt with.


Thanks Devon, I have an update below.