I am trying to retrieve the 3'UTRs for the list of gene ids from this dataset:
I have managed to get the sequences for about 1300 (out of ~20000) with BioMart, using DBASS5 Gene Name as the filter.
I have also used the Table Browser from UCSC. Some of the ids (~3000, not sure which) are not compatible with the RefSeq gene ids from their repository. The ids returned are in the following format:
hg19refGeneNM_032291 range=chr1:67208779-67210768 5'pad=0 3'pad=0 strand=+ repeatMasking=none
Since I do not know which ones are not valid, I cannot map them back to my gene set.
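For reference, a minimal Python sketch of how the RefSeq accession and coordinates could be pulled out of a header like the one above, so that the returned sequences can be compared against the ids that were submitted. The names `parse_ucsc_header` and `HEADER_RE` are hypothetical, and the regular expression only assumes the header layout shown in this post.

```python
import re

# Assumed layout: "<prefix><accession> range=<chrom>:<start>-<end> ... strand=<+|->"
# The accession is matched as a RefSeq transcript id (NM_ or NR_ followed by digits).
HEADER_RE = re.compile(
    r"(?P<acc>N[MR]_\d+)\s+range=(?P<chrom>\S+?):(?P<start>\d+)-(?P<end>\d+)"
    r".*?strand=(?P<strand>[+-])"
)

def parse_ucsc_header(header):
    """Return accession and coordinates from one header line, or None if it does not match."""
    m = HEADER_RE.search(header)
    if m is None:
        return None
    return {
        "accession": m.group("acc"),
        "chrom": m.group("chrom"),
        "start": int(m.group("start")),
        "end": int(m.group("end")),
        "strand": m.group("strand"),
    }

example = ("hg19refGeneNM_032291 range=chr1:67208779-67210768 "
           "5'pad=0 3'pad=0 strand=+ repeatMasking=none")
record = parse_ucsc_header(example)

# Accessions that were submitted but never came back can then be found with a set difference.
submitted = {"NM_032291", "NM_000000"}  # placeholder ids for illustration only
returned = {record["accession"]}
missing = submitted - returned
```

Collecting the accession from every returned header and taking the set difference against the submitted list would show which ids were not matched.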
How can I get the complete list of 3'UTRs for this gene list?