Which sequence should I choose?
21 months ago
David • 0

Hi all,

I've collected sequences from 5 species for 10 different genes. My method was to find the gene RefSeq numbers from my reference genome (Drosophila melanogaster) and type these into the search bar of the genome browser for the other species (other Drosophila species).

This has returned more than one sequence for each gene per species (e.g. if I'm looking for the gene HDAC4 in Drosophila simulans, it returns 3-4 sequences instead of the expected 1) which makes me wonder, which sequence should I pick? Is there an optimal method for doing this, or do you have any advice?

I'd really appreciate any help on this one!

Best wishes,

David

UCSC browser genome sequence alignment • 1.2k views
21 months ago
GenoMax 142k

Are you doing these searches at flybase.org?

I see only one sequence (from RefSeq) if I do the search like this at NCBI: https://www.ncbi.nlm.nih.gov/search/all/?term=HDAC4+%5BGENE%5D+AND+Drosophila+simulans+%5BORGN%5D

Drosophila melanogaster on the other hand has 8 known RefSeq transcripts: https://www.ncbi.nlm.nih.gov/search/all/?term=HDAC4+[GENE]+AND+Drosophila+melanogaster+[ORGN]

Are you looking for the sequence of the gene or of a transcript? If you want the sequence of the gene, then you will need to click on the gene database link to get to that page. From there, look for a FASTA link in the genomic regions section.

D. simulans does not seem to have a similar gene entry. For D. melanogaster: https://www.ncbi.nlm.nih.gov/gene/?term=HDAC4%20%5BGENE%5D%20AND%20Drosophila%20melanogaster%20%5BORGN%5D


Hi GenoMax,

Thanks for the reply!

HDAC4 was a random gene from the top of my head, apologies for the confusion.

I'm looking for the gene, so when I search for it using the melanogaster RefSeq number in the UCSC genome browser, I get multiple genes on different chromosomes in, for instance, simulans.

I tried NCBI and the gene database but it seemed to only export the coding regions, and I'm interested in the full sequence. I also want to take 1000 bases upstream and downstream of the gene, which I don't think you can do in the gene database?

Anyway, yeah, this is a bit far from my original question, which is: if you have multiple sequences for one gene, how do you decide which to choose? Is there an example of someone doing this? I haven't been able to find anything.


Unfortunately NCBI and Ensembl only carry the melanogaster genome. So you are going to be limited to UCSC or flybase.org (no longer free I think) for this.

I think your best bet is to grab the genes you need from melanogaster and then identify homologous regions from genome files you can download from UCSC: https://hgdownload.soe.ucsc.edu/downloads.html
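Once a candidate region has been located in a downloaded genome FASTA, pulling it out with 1000 bases on either side could look roughly like the sketch below. This is only an illustration: `read_fasta` and `extract_with_flanks` are hypothetical helper names, coordinates are assumed to be 0-based and half-open, and the toy contig stands in for a real genome file.

```python
# Hypothetical helpers: a sketch, not a tool mentioned in this thread.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse an iterable of FASTA lines into {sequence_name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def extract_with_flanks(seqs, contig, start, end, strand="+", flank=1000):
    """Return (lo, hi, sequence) for [start, end) widened by `flank` bases.

    Assumes 0-based, half-open coordinates. The window is clamped to the
    contig ends, and minus-strand regions are reverse-complemented.
    """
    seq = seqs[contig]
    lo = max(0, start - flank)
    hi = min(len(seq), end + flank)
    region = seq[lo:hi]
    if strand == "-":
        region = region.translate(_COMPLEMENT)[::-1]
    return lo, hi, region


# Toy data standing in for a real genome FASTA file.
toy = [">contigA toy example", "AACCGT", "TTGACA"]
seqs = read_fasta(toy)
lo, hi, region = extract_with_flanks(seqs, "contigA", 4, 8, strand="+", flank=2)
```

With a real file, `read_fasta(open(path))` would work the same way, although loading a whole genome into memory like this is only sensible for small assemblies such as Drosophila.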


Yeah, this is what I've already done. My question was: if you have multiple sequences for one gene in the same species and in different locations (i.e. different chromosomes), how do you decide which to choose?


You will have to analyze them carefully with sequence alignments to work out which are real orthologs and which are paralogs. Other Drosophila genomes are probably nowhere near as complete as melanogaster, and that can prove a challenge.
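One common way to semi-automate that check is a reciprocal best hit test: search the melanogaster gene against the other genome, then search the top hit back against melanogaster, and keep the pair only if each is the other's best match. This heuristic is not something proposed in this thread, and the sketch below assumes the alignment hits have already been reduced to `(query, subject, score)` tuples; all IDs and scores are made up.

```python
# Sketch of a reciprocal-best-hit filter; IDs and scores are hypothetical.

def best_hits(hits):
    """Map each query to its single highest-scoring subject."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Keep query -> subject pairs where the subject's best hit is the query."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


# melanogaster genes searched against simulans loci
forward = [
    ("mel_geneA", "sim_locus1", 950.0),
    ("mel_geneA", "sim_locus2", 400.0),
    ("mel_geneB", "sim_locus2", 300.0),
]
# simulans loci searched back against melanogaster genes
reverse = [
    ("sim_locus1", "mel_geneA", 940.0),
    ("sim_locus2", "mel_geneC", 500.0),
    ("sim_locus2", "mel_geneA", 390.0),
]
pairs = reciprocal_best_hits(forward, reverse)
```

Here `sim_locus2` also matches `mel_geneA`, but its own best match is a different gene, so it is treated as a likely paralog and dropped. Hits that fail the test still deserve a manual look, especially in fragmented assemblies.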

What is the ultimate goal here? Has this analysis not been done by other fly people over the years?


OK, that makes sense. This is what I thought, but I wasn't sure if there was a more automated method. Yes, you're right, but they seem to align fairly well.

We're looking at conservation in promoter and intronic regions of a particular set of genes regulated by a particular DNA-binding protein. The data are supplementary to some expression observations we've made. I hope this makes sense.


If they align well then that should make your job a bit easier. Go for hits on the longest contigs, since those are likely to be of better quality than hits to small contigs.
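The "prefer the longest contig" rule is easy to apply mechanically once the hits and contig lengths are in hand. A minimal sketch, assuming each hit is a dict with `contig`, `start` and `end` keys and that contig lengths come from the assembly; the field names, contig names and numbers are all illustrative:

```python
# Sketch only: field names and values below are hypothetical.

def rank_hits_by_contig_length(hits, contig_lengths):
    """Sort hits so those on the longest contigs come first.

    Ties on contig length are broken by the length of the aligned interval.
    """
    return sorted(
        hits,
        key=lambda h: (contig_lengths[h["contig"]], h["end"] - h["start"]),
        reverse=True,
    )


contig_lengths = {"chr2L": 23_000_000, "scaffold_123": 15_000}
hits = [
    {"contig": "scaffold_123", "start": 2_000, "end": 9_500},
    {"contig": "chr2L", "start": 1_200_000, "end": 1_207_000},
]
ranked = rank_hits_by_contig_length(hits, contig_lengths)
preferred = ranked[0]
```

Contig length is only a rough proxy for assembly quality, so it is worth confirming the preferred hit against the alignments rather than trusting the ranking alone.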


Ok will do, thanks for the help!
