Downloading CDS FASTA Sequences From Gene IDs
8.7 years ago
lin.barnum ▴ 230

I have a list of gene IDs obtained from the UCSC Genome Browser, like this:

CG6854
CG8119
CG8359
CG9437
CpipJ_CPIJ001450
CpipJ_CPIJ002577
CpipJ_CPIJ011605
CpipJ_CPIJ011632
CpipJ_CPIJ016978
GA11162
GA11800
GA12610

All of these are from the insect group. I would like to obtain the CDS for each of these genes in FASTA format, without introns. I can do this one gene at a time, but I have 195 of them, so I am hoping there is a way to automate it. Any ideas on how this can be done would be appreciated.

fasta ucsc gene • 2.8k views
8.7 years ago
viv_bio ▴ 50

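If you want to automate it, install Python and Biopython, then open Python and run:

from Bio import Entrez

Entrez.email = "you@example.com"  # NCBI asks for a contact address; use your own
gene_ids = ["CG6854", "CG8119", "CG8359", "CG9437", "CpipJ_CPIJ001450"]

with open("output_file.fasta", "w") as out:
    for gene_id in gene_ids:
        # Gene symbols are not NCBI record IDs, so look up mRNA records first
        term = gene_id + " AND biomol_mrna[PROP]"
        search = Entrez.read(Entrez.esearch(db="nucleotide", term=term))
        if search["IdList"]:
            # fasta_cds_na returns the annotated coding sequence (no introns) as FASTA
            handle = Entrez.efetch(db="nucleotide", id=search["IdList"], rettype="fasta_cds_na", retmode="text")
            out.write(handle.read())

Replace the IDs in gene_ids with your full list, and the sequences will be written to output_file.fasta. A search by gene symbol may return no hit, or more than one record, depending on how the gene is annotated at NCBI, so check the output against your list.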
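With 195 IDs, it may be easier to keep them in a text file than to paste them into the script. The following is a minimal sketch, assuming a file with one ID per line; the helper names `read_gene_ids` and `batches` are made up for illustration and are not part of Biopython. It only prepares the ID list and does not contact NCBI.

```python
from itertools import islice


def read_gene_ids(path):
    """Return the unique, non-empty IDs from a one-ID-per-line file, in file order."""
    seen = {}
    with open(path) as fh:
        for line in fh:
            gene_id = line.strip()
            if gene_id:
                # dict keys keep insertion order, so duplicates are dropped
                # without reordering the list
                seen.setdefault(gene_id, None)
    return list(seen)


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```

For example, `for chunk in batches(read_gene_ids("gene_ids.txt"), 20):` would hand the IDs over in groups of 20 (the file name and batch size here are arbitrary), which keeps each request small and makes it easy to pause between requests.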
