Question: How to find a gene from genome shotgun reads
4.8 years ago by qiyunzhu130 (United States) wrote:

I am trying to extract a gene marker, say COII (cytochrome oxidase subunit II; e.g., NCBI: JQ319797), from a sequenced but unassembled insect genome, Musca domestica (house fly; NCBI: AQPM00000000). I downloaded all the scaffolds, created a local BLAST database, and ran blastn with a COII homolog as the query against this database. The result includes multiple hits, each of which looks like a real COII, but they differ slightly from one another in sequence. I thought one genome should contain only one version of a gene, but the result seems to be a mixture. Is this an artifact of the incompleteness of the genome sequencing project, did I do something wrong, or, since this is a mitochondrial gene, are there supposed to be different versions within one insect? And how should I correctly get this gene marker from the genome? Thank you!

sequencing assembly • 1.5k views
written 4.8 years ago by qiyunzhu130
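
The search described in the question can be reproduced with the standard BLAST+ commands `makeblastdb -in scaffolds.fasta -dbtype nucl -out musca_db` followed by `blastn -query coii.fasta -db musca_db -outfmt 6 -out hits.tsv`. The file and database names here are placeholders, not the ones used in the original post. Below is a minimal Python sketch, assuming the default 12-column tabular output, for listing the hits so that the competing COII-like copies can be compared side by side. The function names and the `min_length` filter are illustrative choices.

```python
from collections import namedtuple

# The 12 default columns of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
Hit = namedtuple("Hit", FIELDS)


def parse_hits(lines):
    """Parse tab-separated BLAST hits, skipping blank and comment lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        c = line.split("\t")
        hits.append(Hit(c[0], c[1], float(c[2]), int(c[3]), int(c[4]),
                        int(c[5]), int(c[6]), int(c[7]), int(c[8]),
                        int(c[9]), float(c[10]), float(c[11])))
    return hits


def rank_hits(hits, min_length=0):
    """Keep hits at least min_length long, best bit score first."""
    kept = [h for h in hits if h.length >= min_length]
    return sorted(kept, key=lambda h: h.bitscore, reverse=True)


if __name__ == "__main__":
    # "hits.tsv" is a hypothetical file produced by the blastn call above.
    with open("hits.tsv") as handle:
        for h in rank_hits(parse_hits(handle), min_length=500):
            print(h.sseqid, h.pident, h.length, h.sstart, h.send)
```

Printing the scaffold name and coordinates for each hit makes it easier to see whether the variants sit on different scaffolds or on the same one.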

Were the read counts for the different variants comparable? Also, I would suggest changing the title of the question.

written 4.8 years ago by oganm60

Yes, they are comparable. They align well, with several to a dozen polymorphisms and sometimes short gaps. What would you suggest for the new title?

written 4.8 years ago by qiyunzhu130
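
The "several to a dozen polymorphisms, and sometimes short gaps" between variants can be tallied directly from a pairwise alignment. Here is a small Python sketch, assuming the two hit sequences have already been aligned to the same length with `-` as the gap character. The function name and the returned keys are illustrative only.

```python
def count_differences(aln_a, aln_b):
    """Count mismatches and gaps between two aligned sequences.

    Both strings must have the same length and use '-' for gaps.
    A gap run is a stretch of consecutive gapped columns, regardless
    of which sequence carries the gap.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    mismatches = 0
    gap_columns = 0
    gap_runs = 0
    in_gap = False
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        if a == "-" or b == "-":
            gap_columns += 1
            if not in_gap:
                gap_runs += 1
            in_gap = True
        else:
            in_gap = False
            if a != b:
                mismatches += 1
    return {"mismatches": mismatches,
            "gap_columns": gap_columns,
            "gap_runs": gap_runs}


if __name__ == "__main__":
    # Toy sequences for illustration, not real COII data.
    print(count_differences("ACGT-ACGTA", "ACCTTAC--A"))
```

Running this on every pair of hits gives a quick table of how far apart the variants are.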

First, pick out the mitochondrial reads, try to assemble them, and annotate the resulting mitochondrial genome with a mitochondrial annotation server such as MITOS or DOGMA; it will annotate the COII gene.

Or just design primers specific to this gene in insects and sequence the gene.

written 4.8 years ago by cvu130

Thanks for your idea! However, I also need some nuclear genomes. And the genomic data does not indicate which reads are from mitochondria. Doing sequencing (you mean experimentally, right?) is not an option for me now, as we don't have those insect samples.

written 4.8 years ago by qiyunzhu130

Simply map your reads against the closest reference mitochondrial genome; this is how you can pick out the mitochondrial reads.

written 4.8 years ago by cvu130
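
The thread does not name a read mapper, so the following is only one possible sketch of the mapping step, assuming `bwa` and `samtools` are installed and that the reads are single-end FASTQ. The Python helper below only builds the shell commands and does not run them. All file names and the `prefix` argument are hypothetical.

```python
import shlex


def mito_read_commands(reference, reads, prefix="mito"):
    """Return shell commands that pull out reads mapping to a reference.

    reference: FASTA of the closest available mitochondrial genome.
    reads:     FASTQ of shotgun reads (single-end assumed here).
    """
    ref = shlex.quote(reference)
    rd = shlex.quote(reads)
    bam = shlex.quote(prefix + ".mapped.bam")
    fastq = shlex.quote(prefix + ".reads.fastq")
    return [
        # Build the BWA index for the reference.
        f"bwa index {ref}",
        # Map reads; -F 4 drops unmapped reads, -b writes BAM.
        f"bwa mem {ref} {rd} | samtools view -b -F 4 -o {bam} -",
        # Convert the mapped reads back to FASTQ for assembly.
        f"samtools fastq {bam} > {fastq}",
    ]


if __name__ == "__main__":
    for cmd in mito_read_commands("mito_ref.fasta", "reads.fastq"):
        print(cmd)
```

The resulting FASTQ could then go to an assembler and on to MITOS or DOGMA for annotation, as suggested above.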