I often need to check whether a mapped region overlaps known regions of the genome. To do this, I use a gene set built from merged transcripts from UCSC, Ensembl, RefSeq, GENCODE, and Vegagene.
Usually this works just fine, but now I am looking for atypical transcript types such as siRNAs, lincRNAs, and small RNAs in general. I'm not sure the annotations above are comprehensive enough for these.
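For context, the overlap check I describe above can be sketched roughly as follows. This is a minimal illustration, not my actual pipeline: the gene names and coordinates are made up, and I am assuming BED-style 0-based, half-open intervals. It uses a plain linear scan per chromosome, which is fine for small sets; for genome-scale annotations an interval tree or a tool such as bedtools intersect would be more appropriate.

```python
from collections import defaultdict

def build_index(genes):
    """Group (chrom, start, end, name) records by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end, name in genes:
        index[chrom].append((start, end, name))
    for intervals in index.values():
        intervals.sort()
    return index

def overlapping_genes(index, chrom, start, end):
    """Return names of genes overlapping the half-open region [start, end)."""
    # Two half-open intervals overlap when each starts before the other ends.
    return [name for g_start, g_end, name in index.get(chrom, [])
            if g_start < end and start < g_end]

# Hypothetical annotation records (illustrative only, not real coordinates).
genes = [
    ("chr1", 100, 500, "geneA"),
    ("chr1", 450, 900, "geneB"),
    ("chr2", 100, 300, "lincRNA_X"),
]
index = build_index(genes)
print(overlapping_genes(index, "chr1", 480, 490))  # ['geneA', 'geneB']
```

Whatever sources end up in the gene set, they would all need to be converted to the same coordinate convention and genome build before a check like this is meaningful.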
My questions to you are:
Can we (as a community) create a list of resources/websites where annotations for these genes can be obtained?
How do you create comprehensive gene sets?
Here is a working list: