Question: Analyzing genetic context/gene synteny of hundreds of sequences.
3.0 years ago, pawlowac wrote:

Hi everyone,


I'm trying to analyse the genetic context in which my gene is found across hundreds of genomes. I have the sequence for 5 kb upstream and downstream of my gene. I have tried Mauve, but it doesn't seem to handle this number of sequences at once.

My thought process is as follows:

1) Identify conserved fragments of DNA (coding or non-coding) within the sequences

2) Group together the sequences that share those fragments

3) Use Mauve to analyze a smaller number of more similar sequences


I'm not quite sure how to tackle steps 1 and 2. Using a global-alignment program (MAFFT) doesn't work here because I run out of memory (I have 8 GB). Does anyone have a suggestion?

written 3.0 years ago by pawlowac • modified 2.9 years ago by Biostar

How about identifying all RefSeq genomes that have the same gene and retrieving the annotations within ±5 kb in those genomes? This wouldn't be computationally demanding and would probably be relatively easy to achieve with, e.g., BLAST against refseq_genomic and then some Entrez Direct magic.
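
As a rough illustration of the coordinate step in this suggestion, the sketch below turns one line of BLAST tabular output (the default 12 columns of `-outfmt 6`) into a ±5 kb window on the subject sequence. The function name `flank_window` and the example hit are hypothetical. The upper bound is not clamped to the subject length, because the default columns do not include it.

```python
def flank_window(hit_line, flank=5000):
    """Parse one BLAST tabular (-outfmt 6) line into a flanking window.

    Returns (subject_id, window_start, window_stop, strand) in 1-based
    subject coordinates. A minus-strand hit is reported by BLAST with
    sstart > send, so the strand is inferred from their order.
    """
    fields = hit_line.rstrip("\n").split("\t")
    subject_id = fields[1]
    sstart, send = int(fields[8]), int(fields[9])
    strand = "+" if sstart <= send else "-"
    low, high = min(sstart, send), max(sstart, send)
    # Clamp the lower bound at position 1; the upper bound may run past
    # the end of the subject and should be trimmed when fetching.
    return subject_id, max(1, low - flank), high + flank, strand


# Hypothetical hit: query q1 matches subject_A on the minus strand.
example = "q1\tsubject_A\t98.5\t900\t13\t0\t1\t900\t12000\t11101\t0.0\t1500"
print(flank_window(example))
```

The resulting window can then be used to fetch the matching slice of each genome record together with its annotations.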

written 3.0 years ago by 5heikki • modified 3.0 years ago

I've used an eBot (efetch) Perl script to download all genomes associated with my protein GI numbers. Then, using Biopython, I was able to extract the annotations for ±5 kb around my gene of interest. Do you have a suggestion for automatically comparing the sequences?
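
Once the annotations are extracted, one simple automatic comparison is to reduce each neighborhood to its ordered list of gene labels, orient it relative to the target gene, and group the genomes that share the same order. The sketch below assumes the features have already been pulled out as `(start, strand, label)` tuples. That layout, the function names, and the gene labels in the example are all hypothetical.

```python
from collections import defaultdict


def oriented_gene_order(features, target):
    """Return gene labels in genomic order, oriented so the target is on the + strand.

    features: list of (start, strand, label) tuples, with strand +1 or -1.
    target:   label of the gene of interest; it must be present in features.
    """
    ordered = sorted(features, key=lambda f: f[0])
    strands = [strand for _, strand, label in ordered if label == target]
    if not strands:
        raise ValueError(f"target {target!r} not found in features")
    labels = [label for _, _, label in ordered]
    # Flip the neighborhood when the target lies on the minus strand.
    if strands[0] == -1:
        labels.reverse()
    return tuple(labels)


def group_by_context(neighborhoods, target):
    """Map each distinct oriented gene order to the sorted genome IDs showing it."""
    groups = defaultdict(list)
    for genome_id, features in neighborhoods.items():
        groups[oriented_gene_order(features, target)].append(genome_id)
    return {order: sorted(ids) for order, ids in groups.items()}


# Hypothetical neighborhoods: g2 is the same context as g1 on the other strand.
neighborhoods = {
    "g1": [(100, 1, "abcA"), (1200, 1, "target"), (2500, -1, "xyzB")],
    "g2": [(50, 1, "xyzB"), (900, -1, "target"), (2100, -1, "abcA")],
    "g3": [(10, 1, "target"), (1500, 1, "xyzB")],
}
print(group_by_context(neighborhoods, "target"))
```

This only compares gene order and ignores the relative strand of the neighbors as well as sequence divergence, so it is a first pass for sorting genomes into context types rather than a full synteny analysis.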

written 3.0 years ago by pawlowac

What do you hope to achieve from comparing the sequences that you did not find out from comparing the annotations?

written 3.0 years ago by 5heikki

I hope to identify potential sites of recombination, compare the sequence identity of the surrounding genes, and compare the average mutation rate of the region surrounding the target genes with the average mutation rate of the target genes themselves.
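
As a crude stand-in for the last of these comparisons, percent identity can be computed separately for the target gene and for its flanks from a pairwise alignment of two regions. The sketch below assumes the two sequences are already aligned (equal length, gaps written as `-`) and that the gene occupies known alignment columns. The function names and the toy alignment are hypothetical, and percent identity is only a proxy for divergence, not an estimate of mutation rate.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def gene_vs_flank_identity(a, b, gene_start, gene_end):
    """Return (gene_identity, flank_identity) for two aligned sequences.

    gene_start and gene_end are 0-based, half-open alignment columns.
    The flank is everything outside those columns, upstream and downstream.
    """
    gene = percent_identity(a[gene_start:gene_end], b[gene_start:gene_end])
    flank = percent_identity(a[:gene_start] + a[gene_end:],
                             b[:gene_start] + b[gene_end:])
    return gene, flank


# Toy alignment: the "gene" is columns 4-7, the rest is flank.
gene_id, flank_id = gene_vs_flank_identity("ACGTACGTAC", "ACGAACGTTC", 4, 8)
print(f"gene: {gene_id:.1f}%  flank: {flank_id:.1f}%")
```

A flank that is clearly more or less similar than the gene it surrounds is the kind of discordance that might point to recombination, although a dedicated recombination-detection method would be needed to support that.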

written 3.0 years ago by pawlowac

You don't say whether you are looking at the population level, at a comparison of closely related species, or at comparisons across a wider range of taxa.

written 2.9 years ago by h.mon