4.5 years ago
EpiExplorer ▴ 90
I have a list of lncRNAs of interest and their coordinates in BED format. I am interested in identifying the protein-coding genes that lie within 300 kb upstream and 300 kb downstream of each of those lncRNAs. The lncRNA genome annotation is GENCODE 27. I would like to know how I can find the upstream/downstream genes. The easiest way I can think of is the Table Browser in the UCSC Genome Browser.
But I am not sure how. Please help with this.
Thanks for your time.
Not a solution, but a direction. You will need the coordinates of the protein-coding genes in BED format.
Step 1: extend the coordinates of each lncRNA by 300 kb upstream and downstream. You may use awk, the shell, or a simple Perl one-liner for that. Write the result out in standard BED format.
Step 2: use bedtools intersect to find the regions that overlap with the BED file of protein-coding genes.
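To make the logic of the two steps concrete, here is a minimal pure-Python sketch. It is not the awk or bedtools commands themselves, only an illustration of what they would compute. The function names and the toy records are hypothetical, the flank is clamped at 0 but not at chromosome ends (that would need a chromosome-sizes file), and strand is ignored, so "upstream" and "downstream" are not distinguished.

```python
# Sketch: protein-coding genes within 300 kb of each lncRNA.
# Records are BED-like tuples: (chrom, start, end, name).

WINDOW = 300_000  # flank size taken from the question


def flank_window(start, end, window=WINDOW):
    """Step 1: extend an interval by `window` bp on both sides (start clamped at 0)."""
    return max(0, start - window), end + window


def genes_near_lncrnas(lncrnas, genes, window=WINDOW):
    """Step 2: pair each lncRNA with the genes overlapping its extended window.

    BED intervals are 0-based and half-open, so two intervals overlap
    when each one starts before the other ends.
    """
    hits = []
    for l_chrom, l_start, l_end, l_name in lncrnas:
        w_start, w_end = flank_window(l_start, l_end, window)
        for g_chrom, g_start, g_end, g_name in genes:
            if g_chrom == l_chrom and g_start < w_end and g_end > w_start:
                hits.append((l_name, g_name))
    return hits


# Hypothetical toy data, not real annotation.
lncrnas = [("chr1", 1_000_000, 1_005_000, "lncA")]
genes = [
    ("chr1", 650_000, 720_000, "geneUp"),        # ends 280 kb before lncA
    ("chr1", 1_290_000, 1_350_000, "geneDown"),  # starts 285 kb after lncA
    ("chr1", 2_000_000, 2_050_000, "geneFar"),   # almost 1 Mb away
    ("chr2", 1_000_000, 1_010_000, "geneOtherChrom"),
]

print(genes_near_lncrnas(lncrnas, genes))
# [('lncA', 'geneUp'), ('lncA', 'geneDown')]
```

The nested loop is fine for a short list of lncRNAs; for a full annotation, bedtools will be much faster than this sketch.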
That's how I would approach it. There may be other, faster and better ways.
Go through this post. It might help.