A lab member used a TSS annotation for all genes to analyze enrichment of Pol II, as well as other factors such as those found in the 7SK snRNP (KAP1, Hexim1), and generated heatmaps and metagene plots from it. I am now taking over the bioinformatics portion of a new, similar project and have a couple of questions that I hope someone can help with.
1.) My project requires a similar analysis, except that I will be looking at factor enrichment at Exon - Intron / Intron - Exon junctions. The problem is that my lab member does not remember where he got the TSSAnnotationForAllGenes.txt file he used to annotate peaks. All he knows is that it was a text file he found online. I have spent hours searching for an equivalent Exon Start Site annotation text file, but to no avail.
That said, I assume this information (if it exists) is available in the UCSC Genome Browser database. What I am unsure about is exactly how to identify and extract Exon - Intron junction information from UCSC. I have tried the Table Browser with the clade/genome/assembly for my model (Human, hg19), the Genes and Gene Predictions group, and the RefSeq track. For Region, I entered the position of the BRCA1 gene, and I set the output format to BED so that I can annotate peaks and generate my heatmaps and metagene plots.
EDIT: After clarification with my PI, what I want is a metagene plot of regions centered (x axis = 0) on exon-intron boundaries. BRCA1 is just a test gene; I want to make sure the approach works before telling my PI that it can be done.
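To make the goal concrete, here is a minimal sketch of how the junction positions could be derived, assuming the Table Browser BED output is in 12-column (BED12) format with the blockSizes and blockStarts fields filled in. The function name `bed12_junctions` and the example record are hypothetical and made up for illustration; they are not real annotation. Strand is taken into account so that "exon-intron" always means the boundary where the transcript leaves an exon.

```python
def bed12_junctions(line):
    """Return (chrom, position, strand, kind) for every internal exon boundary.

    Assumes a tab-separated BED12 record. Positions are 0-based genomic
    coordinates of the boundary between an exon and the adjacent intron.
    """
    f = line.rstrip("\n").split("\t")
    chrom, tx_start, strand = f[0], int(f[1]), f[5]
    # blockSizes / blockStarts are comma-separated, often with a trailing comma
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    starts = [int(x) for x in f[11].rstrip(",").split(",")]
    # Exons in genomic order as half-open (start, end) intervals
    exons = [(tx_start + s, tx_start + s + z) for s, z in zip(starts, sizes)]

    junctions = []
    for i in range(len(exons) - 1):
        left = exons[i][1]       # genomic end of exon i, where the intron begins
        right = exons[i + 1][0]  # genomic start of exon i+1, where the intron ends
        if strand == "+":
            junctions.append((chrom, left, strand, "exon-intron"))
            junctions.append((chrom, right, strand, "intron-exon"))
        else:
            # On the minus strand the transcript runs right to left,
            # so the roles of the two boundaries are swapped.
            junctions.append((chrom, left, strand, "intron-exon"))
            junctions.append((chrom, right, strand, "exon-intron"))
    return junctions


# Hypothetical three-exon transcript, for illustration only
example = "chr1\t100\t1000\ttx1\t0\t+\t100\t1000\t0\t3\t100,200,100,\t0,300,800,"
for junction in bed12_junctions(example):
    print(*junction, sep="\t")
```

Each position could then be extended by a fixed flank on both sides to give the windows that the heatmap or metagene tool centers on, with minus-strand windows flipped so that x = 0 always reads in the direction of transcription.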
Is this the correct way to go about it? I have attached the first heatmap I generated, but at least to me it makes little sense. Maybe someone can shed some light on this.