How to download constitutive coding exon sequences for a list of 200 genes in the mouse genome?
7.9 years ago
spring_tan • 0

Hi all,

I am a complete newbie in the world of bioinformatics. I am a molecular biologist by training and spend 95% of my time at the bench. I am trying to learn how to download the exon sequences that are expressed in all tissues for a list of 200 mouse genes and save them into a single FASTA file. I then want to scan these sequences for certain motifs in batch mode by feeding the FASTA file into an online program.

So far, I have been trying to use the Entrez.efetch function in Biopython to fetch the information for a single gene, and then use a loop to download all the sequences from NCBI. Could someone tell me whether I am heading in the right direction? I am stuck and cannot get it to work, so any help would be appreciated. If this is the wrong approach, could you point me in the right direction?

Many thanks in advance.

sequence genome gene • 1.3k views

Have a look at this post:

A: Online Resources For Mouse Research

The post is 4 years old, but many links are still alive.

There are two directories in NCBI for two different mouse strains:

ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Mus_musculus/latest_assembly_versions/

Recent mouse mRNA and protein data from NCBI can be found here:

ftp://ftp.ncbi.nlm.nih.gov/refseq/M_musculus/mRNA_Prot/

You will also have to learn how to use the UCSC Genome Browser:

https://genome.ucsc.edu/goldenpath/help/hgTracksHelp.html

7.9 years ago

So you already have your list of mouse genes? Using Biopython and Entrez.efetch sounds like an okay way to go. Have you seen the Biopython tutorial and cookbook? http://biopython.org/DIST/docs/tutorial/Tutorial.html

What exactly doesn't work?
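
For reference, here is a minimal sketch of the kind of loop described in the question. It assumes you already have a list of RefSeq mRNA accessions, one per line; the function names, the batch size and the pause are illustrative choices, not anything prescribed by Biopython or NCBI. Note that it downloads whole transcript sequences as FASTA. It does not work out which exons are constitutive, which would need the exon annotation of every transcript of each gene.

```python
import time


def read_ids(lines):
    """Return unique, non-empty identifiers in first-seen order."""
    seen = {}
    for line in lines:
        ident = line.strip()
        # Skip blank lines and comment lines starting with '#'.
        if ident and not ident.startswith("#"):
            seen.setdefault(ident, None)
    return list(seen)


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fetch_fasta(accessions, out_path, email, batch_size=50, pause=0.4):
    """Download FASTA records for the accessions into a single file."""
    # Imported here so the helpers above can be used without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    with open(out_path, "w") as out:
        for batch in chunked(accessions, batch_size):
            # One request per batch instead of one request per gene.
            handle = Entrez.efetch(db="nucleotide", id=",".join(batch),
                                   rettype="fasta", retmode="text")
            out.write(handle.read())
            handle.close()
            time.sleep(pause)  # be polite to the NCBI servers
```

You would call it along the lines of `fetch_fasta(read_ids(open("accessions.txt")), "sequences.fasta", email="you@example.org")`, where `accessions.txt` and the email address are placeholders for your own file and contact address.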
