Question: Extracting homologous sequences from genomes for a given sequence
4.4 years ago by
halima.loulou0 wrote:

I have a sequence that has homologues in several species, together with a score for each homologue. For example, this is a record from the GFF file:

4592637 Beutenbergia_cavernae_DSM_12333 TILL    70731   70780   .   0   .   clst_id=429;SubjectOrganism=Thermofilum_pendens_Hrk_5;SubjectScore=0.343373493975904;SubjectOrganism=Ignicoccus_hospitalis_KIN4_I;SubjectScore=0.323293172690763;SubjectOrganism=Burkholderia_pseudomallei_MSHR346;SubjectScore=0.343373493975904;SubjectOrganism=Burkholderia_mallei_SAVP1;SubjectScore=0.343373493975904;SubjectOrganism=Enterobacter_638;SubjectScore=0.343373493975904;SubjectOrganism=Rickettsia_felis_URRWXCal2;SubjectScore=0.343373493975904;SubjectOrganism=Gemmatimonas_aurantiaca_T_27;SubjectScore=0.343373493975904;SubjectOrganism=Streptomyces_coelicolor;SubjectScore=0.363453815261044;SubjectOrganism=Beutenbergia_cavernae_DSM_12333;SubjectScore=1;SubjectOrganism=Kocuria_rhizophila_DC2201;SubjectScore=0.343373493975904;SubjectOrganism=Rhodococcus_jostii_RHA1;SubjectScore=0.383534136546185;SubjectOrganism=Symbiobacterium_thermophilum_IAM14863;SubjectScore=0.363453815261044;




==>4592637 => NAPP (Nucleic Acid Phylogenetic Profiling database) ID of the sequence (not a GenBank ID)

==>Beutenbergia_cavernae_DSM_12333 => species the sequence comes from

==>TILL => type of sequence

==>70731 .. 70780 => start and end of the sequence

==>clst_id=429 => ID of the cluster this sequence belongs to

==>SubjectOrganism => name of a species in which the sequence has a homologue

==>SubjectScore => score of the sequence's homologue in that species (BLASTN score)
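The field layout described above can be read with plain Python. The following is only a minimal sketch: it assumes nine whitespace-separated columns with the attributes in the last one, and that each SubjectOrganism entry is immediately followed by its SubjectScore. The function names `parse_napp_record` and `best_other_organism` are made up for illustration, and the example record is a shortened copy of the one above.

```python
def parse_napp_record(line):
    """Split one record into its fields and a list of (organism, score) pairs."""
    # Columns may be separated by tabs or spaces; the 9th column holds the attributes.
    fields = line.split(None, 8)
    seq_id, organism, seq_type, start, end = fields[:5]
    cluster_id = None
    homologues = []   # kept as a list because the keys repeat
    pending = None    # organism waiting for its score
    for item in fields[8].strip().split(";"):
        key, _, value = item.partition("=")
        if key == "clst_id":
            cluster_id = value
        elif key == "SubjectOrganism":
            pending = value
        elif key == "SubjectScore" and pending is not None:
            homologues.append((pending, float(value)))
            pending = None
    return {
        "id": seq_id,
        "organism": organism,
        "type": seq_type,
        "start": int(start),
        "end": int(end),
        "cluster_id": cluster_id,
        "homologues": homologues,
    }


def best_other_organism(record):
    """Return the best-scoring (organism, score) pair, skipping the self hit."""
    others = [h for h in record["homologues"] if h[0] != record["organism"]]
    return max(others, key=lambda h: h[1]) if others else None


# Shortened version of the record shown above (assumption: same column order).
example = (
    "4592637 Beutenbergia_cavernae_DSM_12333 TILL 70731 70780 . 0 . "
    "clst_id=429;SubjectOrganism=Streptomyces_coelicolor;"
    "SubjectScore=0.363453815261044;"
    "SubjectOrganism=Beutenbergia_cavernae_DSM_12333;SubjectScore=1;"
    "SubjectOrganism=Rhodococcus_jostii_RHA1;SubjectScore=0.383534136546185;"
)
record = parse_napp_record(example)
print(record["id"], record["start"], record["end"], best_other_organism(record))
```

This only tells you which organism scores best. It does not give the position of the homologue in that organism's genome, because the record does not contain it.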


I want to extract, from each SubjectOrganism, the sequence that is similar to my sequence (4592637).

How can I extract the sequence from a genome in which my sequence has a homologue, using Biopython?






modified 4.4 years ago • written 4.4 years ago by halima.loulou0

Your file does not contain information on where the sequences align in the other organisms. So how would one extract that?

written 4.4 years ago by Istvan Albert ♦♦ 78k

I want to extract the sequence with the highest score...

written 4.4 years ago by halima.loulou0

Like I said, your file does not contain information on what the alignment is, so you cannot extract the sequence: the sequence is not present in the file.

It is possible that what you want is actually something else and you just call it the "sequence".

written 4.4 years ago by Istvan Albert ♦♦ 78k

I am trying to BLAST the sequence against the entire genome of each organism and choose the hit that has the highest score.

written 4.4 years ago by halima.loulou0
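The approach in the last comment could be sketched roughly as follows. This is not a tested pipeline. It assumes BLAST was run separately with tabular output (`-outfmt 6`, whose default columns end in sstart, send, evalue, bitscore), that the subject genome is a single sequence already loaded as a plain string (for instance with Biopython's SeqIO), and that "highest score" means highest bit score. The helper names `best_hit` and `extract_hit` are hypothetical.

```python
# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def best_hit(blast_tabular_text):
    """Return the hit with the highest bit score from BLAST tabular output."""
    best = None
    for line in blast_tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        hit = {
            "subject": cols[1],
            "sstart": int(cols[8]),
            "send": int(cols[9]),
            "bitscore": float(cols[11]),
        }
        if best is None or hit["bitscore"] > best["bitscore"]:
            best = hit
    return best


def extract_hit(genome, hit):
    """Slice the aligned subject region out of the genome string."""
    start, end = sorted((hit["sstart"], hit["send"]))
    region = genome[start - 1:end]  # BLAST coordinates are 1-based and inclusive
    if hit["sstart"] > hit["send"]:
        # Hit is on the minus strand: return the reverse complement.
        region = region.translate(COMPLEMENT)[::-1]
    return region
```

The coordinates needed for the extraction come from the BLAST output (sstart and send), not from the GFF record, which is why the record alone is not enough.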