9.6 years ago
pvltpost
I would like to ask how to get DNA coordinates (start and end) and the corresponding DNA sequence from a UniProt protein ID and an amino acid sequence, using a MySQL query.
Thank you in advance!
What have you tried? This can probably be done via UCSC.
Yes, I have tried MySQL queries against UCSC. The only way I found to get known peptides is through knownGenePep.seq. But I don't know how to get DNA coordinates for them, and I also have smaller peptides that are not in knownGenePep.
I tried pyucsc and I think it is a good and easy way. But I have a problem with how to use the fastinterval.Genome class: do I need to download *.fa FASTA files to do this:
, or can I query the database without data files on my computer? How must test_genome be defined?
Does anybody use pyucsc?
You should be able to get coordinates from the knownGene table.
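To make the knownGene suggestion concrete, here is a minimal sketch of a query and a row parser. It assumes UCSC's public MySQL server (host genome-mysql.soe.ucsc.edu, user "genome", no password) and the table and column names as I remember them from the hg19 schema (knownGene joined to kgXref on the UniProt accession in spID); please check them against the assembly you use. `QUERY`, `parse_blob` and `parse_row` are names made up for this example, and the `%s` placeholder is meant to be filled in by whatever MySQL driver you run it with.

```python
# Sketch only: table/column names are assumed from the hg19 schema.
QUERY = """
SELECT kg.name, kg.chrom, kg.strand, kg.cdsStart, kg.cdsEnd,
       kg.exonStarts, kg.exonEnds
FROM knownGene AS kg
JOIN kgXref AS x ON x.kgID = kg.name
WHERE x.spID = %s
"""


def parse_blob(blob):
    """Turn a UCSC comma-separated blob such as b'100,200,' into a list of ints."""
    if isinstance(blob, bytes):
        blob = blob.decode("ascii")
    return [int(value) for value in blob.split(",") if value]


def parse_row(row):
    """Convert one result row of QUERY into a dictionary with integer coordinates."""
    name, chrom, strand, cds_start, cds_end, exon_starts, exon_ends = row
    return {
        "name": name,
        "chrom": chrom,
        "strand": strand,
        "cds_start": int(cds_start),
        "cds_end": int(cds_end),
        "exon_starts": parse_blob(exon_starts),
        "exon_ends": parse_blob(exon_ends),
    }
```

Note that UCSC tables store starts as 0-based and ends as exclusive, so a block (100, 130) covers 30 bases.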
BTW, you could also just use BioMart and download the results (assuming they don't have direct SQL access). Here's an example query to get all coordinates of human genes with UniProt IDs. You can also get the sequences via BioMart, though I don't think it'll add the UniProt IDs to the header in the FASTA file.
I have a list of peptide sequences and the UniProt IDs of the proteins that contain them. But my peptides are much shorter (10-20 amino acids) than the ones in knownGenePep. And I need coordinates for my peptides, not for genes.
What is a "DNA coordinate"? cDNA? mRNA? Genomic DNA?
I mean genomic DNA.
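For short peptides and genomic coordinates, one possible approach is to find the peptide's offset inside the full protein, convert it to an offset in the coding sequence (3 nucleotides per residue), and walk that offset through the coding parts of the exons. The sketch below does this under several assumptions: coordinates follow the UCSC convention (0-based start, exclusive end), the protein sequence is exactly the translation of the transcript whose exons you pass in (a UniProt sequence may differ from a given UCSC isoform), and only the first occurrence of the peptide is mapped. The function names are invented for this example.

```python
def cds_blocks(cds_start, cds_end, exon_starts, exon_ends):
    """Clip each exon to the coding region and drop exons that are fully UTR."""
    blocks = []
    for start, end in zip(exon_starts, exon_ends):
        start, end = max(start, cds_start), min(end, cds_end)
        if start < end:
            blocks.append((start, end))
    return blocks


def peptide_to_genomic(protein, peptide, strand, cds_start, cds_end,
                       exon_starts, exon_ends):
    """Return the genomic (start, end) blocks that encode the peptide.

    A peptide spanning an exon junction yields more than one block.
    An empty list means the peptide was not found in the protein.
    """
    aa_offset = protein.find(peptide)
    if aa_offset < 0:
        return []
    # Offsets in coding nucleotides, counted from the first base of the start codon.
    nt_from = 3 * aa_offset
    nt_to = 3 * (aa_offset + len(peptide))
    blocks = cds_blocks(cds_start, cds_end, exon_starts, exon_ends)
    total = sum(end - start for start, end in blocks)
    if strand == "-":
        # On the minus strand the coding sequence is read from right to left,
        # so mirror the interval before walking the blocks left to right.
        nt_from, nt_to = total - nt_to, total - nt_from
    result = []
    consumed = 0  # coding nucleotides covered by the blocks seen so far
    for start, end in blocks:
        length = end - start
        lo = max(nt_from, consumed)
        hi = min(nt_to, consumed + length)
        if lo < hi:
            result.append((start + lo - consumed, start + hi - consumed))
        consumed += length
    return result
```

With the returned blocks you can fetch the DNA itself from a local FASTA/2bit file or a sequence service; for minus-strand genes the fetched sequence has to be reverse-complemented to read as the coding strand.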