Question

Allele specific expression (ASE) using TPM (transcripts per million) values

2


7.9 years ago

kirannbishwa01 ★ 1.6k

I aligned the RNA-seq reads to a diploid (hybrid) genome and calculated TPM (transcripts per million) values for my samples using EMASE, so the TPM values are reported for each gene_id and haplotype. I want to analyze ASE variation within the samples. My thought is that differential expression (DE) approaches would be fine here, but the analysis should focus on the difference between haplotypes within each sample and also check whether the ASE difference for any given gene is consistent across samples.

I have come across edgeR, DESeq, DESeq2, and kallisto/sleuth, but I am wondering if someone could suggest which of these tools would work best with my data.

Note: I posted the same question on Google Groups just to expedite the analysis. If this violates the question-posting policy, please let me know.

Thanks, - Bishwa K.

transcripts per million RNAseq ASE • 2.4k views

updated 7.9 years ago by Sandeep ▴ 260 • written 7.9 years ago by kirannbishwa01 ★ 1.6k

GouthamAtla · Answer 1 · 2016-06-10

0


7.9 years ago

Sandeep ▴ 260

There is a nice tool available that has already been published:

ASEQ: fast allele-specific studies from next-generation sequencing data

You can use it on your aligned data file.

Hope this helps.

7.9 years ago by Sandeep ▴ 260

0


I already have TPM values calculated using EMASE. Previously I wanted to use ASE-TIGAR for the ASE analysis, but I had to switch because it would not accept my scaffolds. Now that I have TPM values, I would like some opinions on the statistical analysis. I have explored edgeR and DESeq2, but these tools are designed for differential expression. I want something simple to start with that is specific to TPM values calculated for two haplotypes within a sample. I know such data are usually modeled with a Poisson model with overdispersion or with negative binomial regression. I am looking for worked examples of ASE analysis to stay on the right track; so far I have found none.

Here is the structure of my data:

gene_id_locus          strand  gene_name  gene_of_Int  gene.erc.M   gene.erc.S   gene.erc.T  gene.tpm.M   gene.tpm.S   gene.tpm.T
Al_scaffold_0001_1000  +                               3.44195E-11  55           55          4.09867E-12  6.563692475  6.563692475
Al_scaffold_0001_1004  -                               1.62587E-05  184.9999837  185         7.79528E-07  8.86988221   8.869882989
Al_scaffold_0001_1015  +                               2.015114379  4930.984886  4933        0.201724233  493.6191982  493.8209224
Al_scaffold_0001_1024  +                               0            0            0           0            0            0
Al_scaffold_0001_1030  +                               2            29           31          1.457537529  21.13429417  22.5918317
Al_scaffold_0001_1039  -       ATNAT8                  3.9          22.1         26          0.140147839  0.79417109   0.934318929
Al_scaffold_0001_1041  -                               0            0            0           0            0            0
Al_scaffold_0001_1044  -                               712.7205414  314.2794586  1027        24.00223976  10.62981371  34.63205347
Al_scaffold_0001_1048  -                               774.4874591  482.5125409  1257        119.5809891  74.50001447  194.0810036
Al_scaffold_0001_1061  +                               0            0            0           0            0            0
Al_scaffold_0001_1062  +       PHS1                    193.4487519  198.5512481  392         9.02171979   9.347412647  18.36913244
Al_scaffold_0001_1063  +                               0            0            0           0            0            0
Al_scaffold_0001_1066  +                               0            0            0           0            0            0

updated 7.9 years ago by GouthamAtla 12k • written 7.9 years ago by kirannbishwa01 ★ 1.6k
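For the "something simple to start with" asked for above, one possible first pass is a per-gene exact binomial test of the two haplotype read counts against a 50:50 expectation. The sketch below is not EMASE's method or any tool's API; it is a stdlib-only illustration that makes several assumptions. It uses the gene.erc.M and gene.erc.S columns rather than TPM, because TPM values are normalized and are not counts. It rounds the fractional counts to integers, which is a simplification. And it ignores overdispersion, so the p-values are likely to be too optimistic for highly expressed genes. The function name `binom_two_sided` and the `rows` list are hypothetical; the three rows are copied from the table in the comment.

```python
from math import comb

def binom_two_sided(k, n):
    """Exact two-sided binomial test of k successes in n trials against p = 0.5."""
    if n == 0:
        return 1.0  # no reads on either haplotype, so no evidence either way
    m = min(k, n - k)  # the distribution is symmetric at p = 0.5
    tail = sum(comb(n, i) for i in range(m + 1))  # exact integer tail count
    return min(1.0, 2 * tail / 2 ** n)

# (gene_id, gene.erc.M, gene.erc.S) copied from the table in the comment
rows = [
    ("Al_scaffold_0001_1030", 2, 29),
    ("Al_scaffold_0001_1044", 712.7205414, 314.2794586),
    ("Al_scaffold_0001_1062", 193.4487519, 198.5512481),
]

results = {}
for gene, erc_m, erc_s in rows:
    # Assumption: round the fractional estimated counts to whole reads
    m_reads, s_reads = round(erc_m), round(erc_s)
    total = m_reads + s_reads
    ratio = m_reads / total if total else float("nan")  # share of reads from M
    results[gene] = (ratio, binom_two_sided(m_reads, total))
    print(f"{gene}\tM_ratio={ratio:.3f}\tp={results[gene][1]:.3g}")
```

With many genes the p-values would still need a multiple-testing correction (for example Benjamini-Hochberg). To check consistency across samples, or to handle overdispersion, a beta-binomial or negative binomial model with sample as a factor would be the next step.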