Simple google search gave me the following:
But how can I do it programmically in Python and find all human tRNA-sequences using Bio.Entrez?
I’ve read biopython Cookbook. I found the following example for some Arabidopsis thaliana chromosomes.
17.2.2 Annotated Chromosomes Continuing from the previous example, let’s also show the tRNA genes. We’ll get their locations by parsing the GenBank files for the five Arabidopsis thaliana chromosomes. You’ll need to download these files from the NCBI FTP site ftp://ftp.ncbi.nlm.nih.gov/genomes/Arabidopsis_thaliana, and preserve the subdirectory names or edit the paths below:
from reportlab.lib.units import cm from Bio import SeqIO from Bio.Graphics import BasicChromosome entries = [("Chr I", "CHR_I/NC_003070.gbk"), ("Chr II", "CHR_II/NC_003071.gbk"), ("Chr III", "CHR_III/NC_003074.gbk"), ("Chr IV", "CHR_IV/NC_003075.gbk"), ("Chr V", "CHR_V/NC_003076.gbk")] max_len = 30432563 #Could compute this telomere_length = 1000000 #For illustration chr_diagram = BasicChromosome.Organism() chr_diagram.page_size = (29.7*cm, 21*cm) #A4 landscape for index, (name, filename) in enumerate(entries): record = SeqIO.read(filename,"genbank") length = len(record) features = [f for f in record.features if f.type=="tRNA"] #Record an Artemis style integer color in the feature's qualifiers, #1 = Black, 2 = Red, 3 = Green, 4 = blue, 5 =cyan, 6 = purple for f in features: f.qualifiers["color"] = [index+2] cur_chromosome = BasicChromosome.Chromosome(name) #Set the scale to the MAXIMUM length plus the two telomeres in bp, #want the same scale used on all five chromosomes so they can be #compared to each other cur_chromosome.scale_num = max_len + 2 * telomere_length #Add an opening telomere start = BasicChromosome.TelomereSegment() start.scale = telomere_length cur_chromosome.add(start) #Add a body - again using bp as the scale length here. body = BasicChromosome.AnnotatedChromosomeSegment(length, features) body.scale = length cur_chromosome.add(body) #Add a closing telomere end = BasicChromosome.TelomereSegment(inverted=True) end.scale = telomere_length cur_chromosome.add(end) #This chromosome is done chr_diagram.add(cur_chromosome) chr_diagram.draw("tRNA_chrom.pdf", "Arabidopsis thaliana")
It might warn you about the labels being too close together - have a look at the forward strand (right hand side) of Chr I, but it should create a colorful PDF file.
Is it a good way for human genome? And what about Bio.Entrez?