Aligning full-length primary transcripts to a reference genome
9.4 years ago
mmaselko • 0

My goal is to retrieve the promoter regions of a few hundred genes. I have the primary transcripts of these genes in a FASTA file and the organism's genome in another FASTA file.

How can I align the transcripts to the reference genome and then retrieve the 200 nt of genomic sequence immediately upstream of each transcript?

Thanks for the help!

alignment • 2.5k views
9.4 years ago
Vivek ★ 2.7k

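You could use BLAT to align the transcripts to the genome. Once you have the alignment coordinates, create a BED file covering the 200 nt upstream of each alignment and use getfasta from the BEDTools suite to retrieve the promoter sequence from the reference genome. Keep in mind that "upstream" depends on the strand: for a transcript that aligns to the plus strand it is the 200 nt before the alignment start, while for one on the minus strand it is the 200 nt after the alignment end, and the extracted sequence should be reverse-complemented (the -s option of getfasta does this when the BED file has a strand column).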

http://bedtools.readthedocs.org/en/latest/content/tools/getfasta.html
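
To make the coordinate step concrete, here is a minimal Python sketch of how the upstream BED records could be built from BLAT output. It assumes the alignments are in BLAT's default PSL format (tab-separated, 21 columns, header lines already removed, e.g. with -noHead) and that you have already kept one alignment per transcript. The function names and the example record are made up for illustration; they are not part of BLAT or BEDTools.

```python
def upstream_interval(t_start, t_end, strand, chrom_size, flank=200):
    """Return the (start, end) of the flank upstream of an alignment.

    Coordinates are 0-based, half-open (the convention used by PSL and BED)
    and always refer to the plus strand of the genome.
    """
    if strand == "+":
        # Transcript runs left to right: promoter lies before the start.
        return max(0, t_start - flank), t_start
    if strand == "-":
        # Transcript runs right to left: promoter lies after the end.
        return t_end, min(chrom_size, t_end + flank)
    raise ValueError(f"unexpected strand: {strand!r}")


def psl_line_to_bed(line, flank=200):
    """Convert one PSL alignment line into a 6-column BED line for the flank."""
    fields = line.rstrip("\n").split("\t")
    strand, q_name, t_name = fields[8], fields[9], fields[13]
    t_size, t_start, t_end = int(fields[14]), int(fields[15]), int(fields[16])
    start, end = upstream_interval(t_start, t_end, strand, t_size, flank)
    # BED6: chrom, start, end, name, score, strand
    return "\t".join([t_name, str(start), str(end), q_name, "0", strand])


if __name__ == "__main__":
    # Hypothetical minus-strand alignment of transcript "tx1" on "chr2".
    example = "\t".join([
        "1500", "0", "0", "0", "0", "0", "3", "2500", "-", "tx1", "1500",
        "0", "1500", "chr2", "100000", "4000", "8000", "4",
        "400,400,400,300,", "0,400,800,1200,", "4000,5000,6000,7700,",
    ])
    print(psl_line_to_bed(example))
```

Writing one such line per transcript to a file (say upstream.bed, a placeholder name) gives the input for something like `bedtools getfasta -fi genome.fa -bed upstream.bed -s -name`, where -s makes minus-strand promoters come out reverse-complemented. Flanks are clipped at chromosome ends, so a few sequences may be shorter than 200 nt.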

9.4 years ago
Prakki Rama ★ 2.7k

You can also use GMAP to align the transcripts to the genome. Check this tutorial for more information.
