I'm working with some human breakpoint data:
    Chr.L  Pos.L     Strand.L  Chr.H  Pos.H     Strand.H
    18     19092052  +         18     30289323  +
I would like to know where the breakpoints generally occur, e.g. whether they join two exons, introns, UTRs, etc.
I have tried querying Ensembl via BioMart in R using:
    attributes = c("transcript_biotype"), filters = c("chromosomal_region")
When I use the first position, 18:19092052:19092052:1, it returns some transcripts that are out of range (e.g. 18822203-19035091), but it also seems to return the correct transcript, whose transcript start and end values overlap the input, so I can work with that.
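For reference, this is roughly how the out-of-range hits could be dropped afterwards. It is only a minimal sketch in base R, assuming the query result is a data frame with `transcript_start` and `transcript_end` columns (so those would need to be added to `attributes`). The `hits` table is a hypothetical placeholder, not real query output: the first row reuses the out-of-range coordinates mentioned above, and the second row is made up.

```r
# Hypothetical result table; in practice this would come from the BioMart
# query with transcript_start and transcript_end among the attributes.
hits <- data.frame(
  transcript_biotype = c("protein_coding", "protein_coding"),
  transcript_start   = c(18822203, 19000000),  # second row is invented
  transcript_end     = c(19035091, 19100000),
  stringsAsFactors   = FALSE
)

# Keep only the transcripts whose span contains the breakpoint position.
overlaps_position <- function(df, pos) {
  df[df$transcript_start <= pos & df$transcript_end >= pos, , drop = FALSE]
}

breakpoint <- 19092052
kept <- overlaps_position(hits, breakpoint)
print(kept)
```

With these placeholder values only the second row is kept, because the first transcript ends before the breakpoint.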
However, for the second position, 18:30289323:30289323:1, it does not return anything. Does this mean the position lies in noncoding DNA? Or is this happening because I am querying Ensembl Genes? I can live with that too, but I'd just like it confirmed.
Otherwise, is there a better way to do this? Perhaps with an SNP annotation tool such as ANNOVAR?