Hello,
I am attempting to target an alternative transcript of a gene and generate an expression-level read count for this one specific gene. I have BAM files already aligned to hg19. For the gene, I have the genomic location and the entire sequence. I have attempted the following:
I created a BED file with the start and end coordinates of each alternative splice variant and the number of exons each one has. Running bedtools coverageBed on this file gives identical read counts for every transcript, because the transcripts overlap so heavily in chromosomal location.
I have also looked into MISO and its GFF alternative-event annotation format. This seems like it may work, though I do not understand how to create such a file. I have looked through the premade GFF annotation files provided by MISO, but when I searched for the specific variant of my gene, it showed up in multiple entries, none of which matched the chromosomal location of the gene I am looking for. I would really appreciate some help!
Do you want raw counts, or will normalized expression values do? If the latter is fine, you can use Cuffquant and Cuffnorm to get isoform-level expression for a gene. They will not give you raw read counts, but they will give normalized expression values and FPKM values. If you need raw counts, DEXSeq, which counts reads per exon rather than per whole gene, may be the solution to your problem.
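To make the exon-level idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the gene name `GENE_X`, the transcript IDs, the coordinates, and the helper names `gtf_exon_lines` and `isoform_specific_segments` are all hypothetical. It writes one GTF `exon` record per exon of each isoform (the kind of per-transcript exon structure that GTF-based tools read) and then lists the exonic segments that belong to exactly one isoform.

```python
# Hypothetical gene with two overlapping isoforms. All names and coordinates
# are made up for illustration (1-based, inclusive, as in GTF).
CHROM, STRAND, GENE_ID = "chr1", "+", "GENE_X"
TRANSCRIPTS = {
    "GENE_X.1": [(1000, 1200), (1500, 1700), (2000, 2300)],
    "GENE_X.2": [(1000, 1200), (2000, 2300)],  # skips the middle exon
}


def gtf_exon_lines(chrom, strand, gene_id, transcripts):
    """Return one tab-separated GTF 'exon' record per exon of each transcript."""
    lines = []
    for tx_id, exons in sorted(transcripts.items()):
        for start, end in sorted(exons):
            attrs = f'gene_id "{gene_id}"; transcript_id "{tx_id}";'
            fields = [chrom, "custom", "exon", str(start), str(end),
                      ".", strand, ".", attrs]
            lines.append("\t".join(fields))
    return lines


def isoform_specific_segments(transcripts):
    """Split exons at every exon boundary and keep the segments that are
    covered by exactly one transcript."""
    # Breakpoints in half-open coordinates: exon start and (exon end + 1).
    cuts = sorted({p for exons in transcripts.values()
                   for s, e in exons for p in (s, e + 1)})
    unique = {tx: [] for tx in transcripts}
    for left, right in zip(cuts, cuts[1:]):
        owners = [tx for tx, exons in transcripts.items()
                  if any(s <= left and right - 1 <= e for s, e in exons)]
        if len(owners) == 1:
            unique[owners[0]].append((left, right - 1))
    return unique


if __name__ == "__main__":
    print("\n".join(gtf_exon_lines(CHROM, STRAND, GENE_ID, TRANSCRIPTS)))
    print(isoform_specific_segments(TRANSCRIPTS))
```

In this toy case only the middle exon is specific to one isoform. The exon-skipping isoform has no exonic segment of its own and can only be told apart by reads spanning its splice junction, which is one reason counting over the whole gene span cannot separate the two.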
The issue I am running into, which also applies to Cuffquant and Cuffnorm, is how to create an annotation file, either BED or GFF, because these file formats only require the chromosomal location of the gene. In my case, the different alternative splice variants of this gene all overlap, so with only that information the read counts for all the alternative splice variants will be identical. Is there a way around this?