GATK ASEReadCounter: Downstream analysis for identifying allele specific expression
4.4 years ago
komal.rathi ★ 3.9k

Hi everyone,

I am trying to find allele specific expression from RNA-seq data. I have the results from GATK ASEReadCounter and the output looks like this:

contig  position    variantID   refAllele   altAllele   refCount    altCount    totalCount  lowMAPQDepth    lowBaseQDepth   rawDepth    otherBases  improperPairs
chr2    38021   rs113895774 C   T   0   1   1   0   0   1   0   0
chr2    217334  rs6709534   G   C   0   3   3   0   0   3   0   0
chr2    218386  rs9213  G   A   84  101 185 0   0   185 0   0
chr2    220889  rs3828165   G   A   9   7   16  0   0   16  0   0
chr2    221560  rs60484953  G   A   11  11  22  0   0   22  0   0
chr2    221981  rs3791224   C   T   3   4   7   0   0   7   0   0
chr2    222336  rs3791223   T   C   3   8   11  0   0   11  0   0
chr2    224086  rs1474053   T   A   3   2   5   0   0   5   0   0
chr2    224919  rs2290911   A   G   55  16  71  0   0   71  0   0

I was under the impression that this tool implements some test(s) to find statistically significant sites. But if I am not wrong, it only counts the ref and alt alleles at each site from the RNA-seq reads, more or less like bam-readcount or samtools mpileup followed by counting bases.

The ASEReadCounter manual says that its output can be used as input to mamba, but I am not sure about that: the input format for mamba requires an additional field, EXON_INFO (a variant annotation label). Alternatively, I am thinking of using the refCount, altCount and totalCount columns to perform a chi-square test to determine whether there is allelic imbalance.

I would like to get suggestions on how to analyze this output or what downstream methods/statistical tests to use. Any help would be much appreciated.


RNA-Seq allele-specific expression • 4.0k views
@komal.rathi How did you analyze it in the end? I can't download mamba.

I ended up doing a chi-square test on the output of ASEReadCounter.
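For anyone wanting to try the same thing, here is a minimal sketch of a per-site chi-square goodness-of-fit test against a 50:50 ref/alt split. It is not the exact code used for the analysis above; the function name `chisq_ase_pvalue` is made up for illustration, and the 50:50 null is an assumption that ignores reference mapping bias. It uses only the Python standard library, relying on the fact that for 1 degree of freedom the chi-square survival function equals erfc(sqrt(x / 2)).

```python
import math


def chisq_ase_pvalue(ref_count, alt_count):
    """Chi-square goodness-of-fit p-value (1 df) against a 50:50 ref/alt split.

    Hypothetical helper for illustration; assumes no reference mapping bias.
    """
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("site has no ref or alt reads")
    expected = total / 2.0
    stat = ((ref_count - expected) ** 2 + (alt_count - expected) ** 2) / expected
    # For 1 degree of freedom, P(chi2 >= x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# refCount / altCount pairs taken from the example output in the question.
sites = {
    "rs9213": (84, 101),
    "rs60484953": (11, 11),
    "rs2290911": (55, 16),
}

for variant_id, (ref, alt) in sites.items():
    print(variant_id, ref, alt, f"{chisq_ase_pvalue(ref, alt):.3g}")
```

The chi-square approximation gets unreliable at sites with only a handful of reads, so low-coverage sites are probably better filtered out or handled with an exact test.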

4.3 years ago
prasundutta87 ▴ 580

ASEReadCounter is doing what it says it's doing: just counting the ref and alt reads at each site.

A chi-square test or a two-sided binomial test is fine for determining locus-based ASE. For gene-based ASE there are other methods; you can get more information from this paper-
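As a rough illustration of the locus-based option, a two-sided exact binomial test against p = 0.5 can be sketched with the standard library alone. The names `binom_ase_pvalue` and `MIN_TOTAL` are hypothetical, and the coverage cutoff of 10 is an arbitrary assumption, not a recommended value. Because the null distribution is symmetric at p = 0.5, the two-sided p-value is twice the smaller tail, capped at 1.

```python
from math import comb

# Arbitrary illustrative cutoff (an assumption): skip sites with too few reads.
MIN_TOTAL = 10


def binom_ase_pvalue(ref_count, alt_count):
    """Two-sided exact binomial p-value for H0: P(ref) = 0.5.

    Hypothetical helper for illustration; assumes no reference mapping bias.
    """
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("site has no ref or alt reads")
    k = min(ref_count, alt_count)
    # P(X <= k) under Binomial(total, 0.5); exact integer arithmetic until the division.
    lower_tail = sum(comb(total, i) for i in range(k + 1)) / 2 ** total
    # Symmetric null, so double the smaller tail and cap at 1.
    return min(1.0, 2.0 * lower_tail)


# (variantID, refCount, altCount) rows taken from the example output in the question.
rows = [
    ("rs113895774", 0, 1),
    ("rs3791223", 3, 8),
    ("rs60484953", 11, 11),
    ("rs2290911", 55, 16),
]

for variant_id, ref, alt in rows:
    if ref + alt < MIN_TOTAL:
        continue  # too few reads to say anything useful
    print(variant_id, ref, alt, f"{binom_ase_pvalue(ref, alt):.3g}")
```

When many sites are tested this way, the p-values would normally need a multiple-testing correction before calling any site significant.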

2.9 years ago

Late to the party on this post, but as @prasundutta87 alludes to, ASEReadCounter is doing exactly what it was intended to do. The best guide I've seen to performing the statistics around ASE is Mike Love's guide, and it's worth looking at this thread for Aaron Lun's answer too.


