Extract sequences from genomes based on homology with a query sequence
9.8 years ago

I have a sequence that has homologues in several species, along with a score for each homologue. For example, this is a record from the GFF file:

4592637 Beutenbergia_cavernae_DSM_12333 TILL    70731   70780   .   0   .   clst_id=429;SubjectOrganism=Thermofilum_pendens_Hrk_5;SubjectScore=0.343373493975904;SubjectOrganism=Ignicoccus_hospitalis_KIN4_I;SubjectScore=0.323293172690763;SubjectOrganism=Burkholderia_pseudomallei_MSHR346;SubjectScore=0.343373493975904;SubjectOrganism=Burkholderia_mallei_SAVP1;SubjectScore=0.343373493975904;SubjectOrganism=Enterobacter_638;SubjectScore=0.343373493975904;SubjectOrganism=Rickettsia_felis_URRWXCal2;SubjectScore=0.343373493975904;SubjectOrganism=Gemmatimonas_aurantiaca_T_27;SubjectScore=0.343373493975904;SubjectOrganism=Streptomyces_coelicolor;SubjectScore=0.363453815261044;SubjectOrganism=Beutenbergia_cavernae_DSM_12333;SubjectScore=1;SubjectOrganism=Kocuria_rhizophila_DC2201;SubjectScore=0.343373493975904;SubjectOrganism=Rhodococcus_jostii_RHA1;SubjectScore=0.383534136546185;SubjectOrganism=Symbiobacterium_thermophilum_IAM14863;SubjectScore=0.363453815261044;

where:

  • 4592637 => NAPP (Nucleic Acid Phylogenetic Profiling database) ID of the sequence (not a GenBank ID)
  • Beutenbergia_cavernae_DSM_12333 => species name of the sequence
  • TILL => type of sequence
  • 70731 .. 70780 => start and end of the sequence
  • clst_id=429 => ID of the cluster this sequence belongs to
  • SubjectOrganism => name of a species in which the sequence has a homologue
  • SubjectScore => score of the homologue in that species (blastn score)
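The record layout described above can be read with plain Python. This is only a sketch: it assumes the columns are whitespace-separated, that the attribute column contains no spaces, and that every SubjectOrganism entry is immediately followed by its SubjectScore, as in the example record. The function name parse_napp_record is made up for illustration.

```python
def parse_napp_record(line):
    """Split one NAPP GFF-like line into its columns and its homologue list."""
    fields = line.split()
    if len(fields) < 9:
        raise ValueError("expected 9 whitespace-separated columns")
    napp_id, organism, seq_type = fields[0], fields[1], fields[2]
    start, end = int(fields[3]), int(fields[4])

    cluster_id = None
    homologues = []   # list of (organism, score) pairs, in file order
    pending = None    # last SubjectOrganism seen, waiting for its score
    for item in fields[8].strip(";").split(";"):
        key, _, value = item.partition("=")
        if key == "clst_id":
            cluster_id = value
        elif key == "SubjectOrganism":
            pending = value
        elif key == "SubjectScore" and pending is not None:
            homologues.append((pending, float(value)))
            pending = None

    return {
        "id": napp_id,
        "organism": organism,
        "type": seq_type,
        "start": start,
        "end": end,
        "cluster_id": cluster_id,
        "homologues": homologues,
    }


# Shortened version of the record from the question.
example = (
    "4592637 Beutenbergia_cavernae_DSM_12333 TILL 70731 70780 . 0 . "
    "clst_id=429;SubjectOrganism=Thermofilum_pendens_Hrk_5;"
    "SubjectScore=0.343373493975904;"
    "SubjectOrganism=Rhodococcus_jostii_RHA1;SubjectScore=0.383534136546185;"
)
record = parse_napp_record(example)
print(record["organism"], record["start"], record["end"])
for subject, score in record["homologues"]:
    print(subject, score)
```

Because the SubjectOrganism and SubjectScore keys repeat, a plain dictionary of attributes would keep only the last pair, which is why the sketch collects them into a list instead.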

I want to extract, from each SubjectOrganism, the sequence that is similar to my sequence (4592637).

How can I extract the homologous sequence from a genome using Biopython?

genome biopython extract sequence homologues • 1.7k views

Your file does not contain information on where the sequences align in the other organisms. So how would one extract that?


I want to extract the sequence with the highest score...
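Picking the highest-scoring organism from the record is the easy part, and can be sketched as below. It assumes the SubjectOrganism/SubjectScore pairs are already available as (organism, score) tuples, and that the source organism itself (which scores 1 in the example record) should be skipped. The function name best_homologue is made up for illustration. Note that this only yields an organism name, not a sequence or coordinates.

```python
def best_homologue(homologues, source_organism):
    """Return the (organism, score) pair with the highest score,
    ignoring the organism the query sequence comes from."""
    candidates = [
        (organism, score)
        for organism, score in homologues
        if organism != source_organism
    ]
    if not candidates:
        return None
    # On ties, max() keeps the first pair encountered.
    return max(candidates, key=lambda pair: pair[1])


# A few of the pairs from the record in the question.
pairs = [
    ("Thermofilum_pendens_Hrk_5", 0.343373493975904),
    ("Streptomyces_coelicolor", 0.363453815261044),
    ("Beutenbergia_cavernae_DSM_12333", 1.0),
    ("Rhodococcus_jostii_RHA1", 0.383534136546185),
]
best = best_homologue(pairs, "Beutenbergia_cavernae_DSM_12333")
print(best)
```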


As I said, your file does not contain the alignment itself, so you cannot extract the sequence: it is simply not present in the file.

It is possible that what you want is actually something else but you just call it the "sequence".


I am trying to BLAST the sequence against the entire genome of the organism and choose the hit with the highest score.
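A rough sketch of that approach, under several assumptions: NCBI BLAST+ is installed and blastn is on the PATH, the query and the subject genome are both available as FASTA files, the default tabular output (-outfmt 6) is used, and "highest score" means the highest bit score (the last column). All function names here are made up for illustration.

```python
import subprocess


def run_blastn(query_fasta, genome_fasta):
    """Run blastn (query vs. subject genome) and return tabular output."""
    result = subprocess.run(
        ["blastn", "-query", query_fasta, "-subject", genome_fasta,
         "-outfmt", "6"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def best_hit(tabular_text):
    """Return the hit with the highest bit score, or None if there is none.

    Default -outfmt 6 columns: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore.
    """
    best = None
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        hit = {
            "sseqid": cols[1],
            "sstart": int(cols[8]),
            "send": int(cols[9]),
            "bitscore": float(cols[11]),
        }
        if best is None or hit["bitscore"] > best["bitscore"]:
            best = hit
    return best


def extract_hit(subject_seq, sstart, send):
    """Slice the hit out of the subject sequence (1-based, inclusive).

    When sstart > send the hit is on the minus strand, so the slice is
    reverse-complemented.
    """
    if sstart <= send:
        return subject_seq[sstart - 1:send]
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return subject_seq[send - 1:sstart].translate(complement)[::-1]


# Toy example with hand-written tabular output instead of a real BLAST run.
toy_output = (
    "q1\tchr1\t90.0\t10\t1\t0\t1\t10\t5\t14\t1e-3\t20.0\n"
    "q1\tchr1\t100.0\t10\t0\t0\t1\t10\t30\t21\t1e-5\t30.5\n"
)
toy_genome = "A" * 20 + "ACGTACGTAC" + "G" * 10
hit = best_hit(toy_output)
print(hit["sseqid"], extract_hit(toy_genome, hit["sstart"], hit["send"]))
```

The subject genome can be loaded with Biopython, for example SeqIO.to_dict(SeqIO.parse("genome.fasta", "fasta")), and then indexed by the sseqid of the best hit. The same extract_hit slicing also works for pulling the query itself out of its source genome using the start and end columns of the GFF record (70731 and 70780 in the example).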
