Essentially I am trying to re-create a database with three classes of exons.
1) Constitutively Spliced Exons
2) Alternatively Retained Exons
3) Alternatively Skipped Exons
This has been done before: I am attempting to re-create the database generated in this paper: http://www.sciencedirect.com/science/article/pii/S0092867415002639 (see "Annotation of Exons and Introns" in the Supplemental Methods).
However, I have no idea how to go about doing this. I have read all of their methodology, including the supplemental methods, and it still isn't apparent to me. The concept is easy; I just don't know how to apply it or what software to use. They list no software, and they don't mention any custom scripting.
My PI has even gone as far as to e-mail the authors, but they have not replied.
I was hoping someone here could help me out with a few ideas on how to proceed with building this database. Please keep in mind that I don't have much programming experience, so designing my own custom script is probably out of the question (slightly altering an existing one might be doable). I do have some background in terminal and Unix usage, and I am familiar with a variety of commonly used software (bedtools, bedops, HTSeq, R, etc.).
My assumption right now is that I need to download the latest GENCODE annotation file (though I'm not sure which one that would be, since there are so many options and formats on the GENCODE site), then somehow overlap the transcripts of each gene exon by exon, and finally classify each exon into one of my three classes. As you can see, I'm lost at every step.
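For concreteness, the overlap step I have in mind might look roughly like the Python sketch below. It rests on my own assumptions and is not the paper's method: the function name `classify_exons` is made up, it assumes a GTF file with `gene_id` and `transcript_id` attributes on each `exon` line, and it uses a naive rule where an exon is "constitutive" only if it appears with identical coordinates in every annotated transcript of its gene. It only separates constitutive from alternative exons; splitting the alternative ones into retained and skipped would need a further criterion that I don't know.

```python
import re
from collections import defaultdict

def classify_exons(gtf_lines):
    """Label each exon 'constitutive' or 'alternative' within its gene.

    Assumed rule (not taken from the paper): an exon is constitutive if
    every annotated transcript of the gene contains it with the same
    coordinates; otherwise it is alternative.
    """
    transcripts = defaultdict(set)  # gene_id -> transcript_ids seen
    exon_usage = defaultdict(set)   # exon key -> transcript_ids containing it

    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are of interest
        # Attribute column looks like: gene_id "X"; transcript_id "Y"; ...
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        gene = attrs.get("gene_id")
        tx = attrs.get("transcript_id")
        if gene is None or tx is None:
            continue
        transcripts[gene].add(tx)
        key = (gene, fields[0], int(fields[3]), int(fields[4]), fields[6])
        exon_usage[key].add(tx)

    labels = {}
    for key, txs in exon_usage.items():
        gene = key[0]
        # Present in all transcripts of the gene -> constitutive
        labels[key] = "constitutive" if txs == transcripts[gene] else "alternative"
    return labels
```

With a real annotation this would presumably be called on an open file handle, e.g. `classify_exons(open("annotation.gtf"))`, where the file name is a placeholder. Note that under this rule every exon of a single-transcript gene comes out as constitutive, so such genes may need to be filtered separately.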
Thanks in advance to anyone who can point me in the right direction.