Question: Obtaining A FASTA With Conserved Regions Near A Gene In UCSC Genome Browser
8.2 years ago
Anima Mundi
wrote:


I would like to obtain, from the UCSC Genome Browser, a FASTA file with the 5 kb upstream of the promoter plus the intron regions of a gene of interest. How can I obtain only the sequences conserved among different species (those proposed by the tool), instead of generating a FASTA with the whole 5 kb upstream plus the whole intron regions? Thanks in advance.

8.2 years ago
San Francisco
Treylathe wrote:

Someone will propose a more elegant coding solution, but in the meantime if you just want to do this 'quick and dirty', I'd suggest the Table Browser intersection function or Galaxy join function. I can give you a more complete step-by-step, but it might have to wait till I'm at my computer and not my iPad :D.

Edited to add:

I can't promise much more than this at the moment (I got busy and have travel coming up :D), but here's the basic outline that I think will work. I'm sure there is a more elegant solution.

1. Use the Table Browser and choose UCSC Known Genes (use whatever gene position or IDs you have as the position). Select custom track as the output, choose "introns" or "upstream" (5000 bp, i.e. 5 kb), and create the track.

2. Create a second track: choose a conservation track in the comparative genomics group. There are many to choose from, so look at the documentation to determine which one fits (across vertebrates, mammals, etc.). Filter for a conservation value you prefer (again, look at the documentation) and create a second custom track.

3. In the Table Browser, choose the first custom track, intersect it with the second custom track, and get sequence as the output.

That should get you a FASTA record of only those regions that overlap elements passing the conservation score you chose. I'm sure there must be a more elegant way (and a way to get very specific regions) than this, but that's the 'quick and dirty' method.
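For anyone who would rather script it, here is a minimal Python sketch of the same idea, not a replacement for the Table Browser. It assumes you already have exon coordinates for one transcript and a list of conserved elements on the same chromosome, both as 0-based, half-open intervals (BED style). The function names, the `chrTest` label, and the toy coordinates and sequence are all made up for illustration. Note that this clips each region to the overlapping bases; depending on the intersection option you pick, the Table Browser may return whole overlapping records instead.

```python
from typing import List, Tuple

# 0-based, half-open [start, end) interval, as in BED files.
Interval = Tuple[int, int]


def upstream_and_introns(exons: List[Interval], strand: str,
                         flank: int = 5000) -> List[Interval]:
    """Return the upstream flank plus the introns of one transcript."""
    exons = sorted(exons)
    tx_start, tx_end = exons[0][0], exons[-1][1]
    if strand == "+":
        # Upstream of a plus-strand gene lies before the transcript start.
        upstream = (max(0, tx_start - flank), tx_start)
    else:
        # Upstream of a minus-strand gene lies after the transcript end.
        upstream = (tx_end, tx_end + flank)
    # Introns are the gaps between consecutive exons.
    introns = [(left_end, right_start)
               for (_, left_end), (right_start, _) in zip(exons, exons[1:])
               if right_start > left_end]
    return sorted([upstream] + introns)


def intersect(regions: List[Interval],
              conserved: List[Interval]) -> List[Interval]:
    """Clip each region to the bases that overlap a conserved element."""
    clipped = []
    for r_start, r_end in regions:
        for c_start, c_end in conserved:
            start, end = max(r_start, c_start), min(r_end, c_end)
            if start < end:  # keep only non-empty overlaps
                clipped.append((start, end))
    return sorted(clipped)


def to_fasta(chrom: str, intervals: List[Interval], sequence: str) -> str:
    """Format intervals as FASTA; `sequence` must start at position 0."""
    records = []
    for start, end in intervals:
        # Headers use 1-based, inclusive coordinates, as the browser shows.
        records.append(f">{chrom}:{start + 1}-{end}\n{sequence[start:end]}")
    return "\n".join(records)


if __name__ == "__main__":
    # Toy data only: a two-exon gene and two hypothetical conserved elements.
    toy_exons = [(100, 120), (150, 170)]
    toy_conserved = [(90, 130), (145, 200)]
    toy_sequence = "ACGT" * 60  # stand-in for the real chromosome sequence

    regions = upstream_and_introns(toy_exons, "+", flank=50)
    hits = intersect(regions, toy_conserved)
    print(to_fasta("chrTest", hits, toy_sequence))
```

With real data you would replace the toy lists with intervals parsed from the two custom tracks (exported as BED) and the toy sequence with the chromosome sequence. The nested loop is fine for one gene; for genome-wide work a dedicated interval tool would be a better choice.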

I would be glad to read and try your step-by-step tutorial. Thanks.


8.2 years ago
Anima Mundi
wrote:

This is helpful, thanks again.

