How to identify the corresponding gene of a short sequence of a genome?
4.4 years ago
Kumar ▴ 120

I have extracted a short sequence with a signature pattern from a genome. The short sequence is either a promoter or a regulator of a particular gene. I know the coordinates of the short sequence within the genome, but how can I identify or predict the gene that it corresponds to?

Please see the picture illustrating the question.

[Image: picture illustration]

fasta gene genome sequence • 887 views

As I understand this, you have the known position and/or sequence of a small genomic feature (promoter or whatever). You want to know the sequence of the gene adjacent to it?

MultiGeneBlast would work for this potentially. If you do a profile search for the promoter sequence, you can find the sequences in its environment. If you already have an example of the gene you're looking for, you can include that in the profile and that will keep the search constrained.


Thank you @Joe, I will try MultiGeneBlast as well.


I am not sure I understand the question. If you know the genomic coordinates of the sequence and the genome is annotated then you can just look up the gene(s) whose coordinates overlap with those of your sequence. If the genome is not annotated, you could try annotating the region around your sequence or simply find homologous regions that are annotated and infer your gene of interest from there.
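For an annotated genome, that lookup could be sketched in Python roughly as follows. This is only an illustration: it assumes a GFF3 annotation with 1-based inclusive coordinates, the function names are made up here, and the two-gene annotation is a hypothetical example rather than real data.

```python
import io

def read_gff_features(handle, feature_type="gene"):
    """Yield (seqid, start, end, strand, attributes) for GFF3 rows of one feature type."""
    for line in handle:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, directives and anything that is not a 9-column feature row
        # (for example an embedded FASTA section).
        if line.startswith("#") or len(cols) < 9:
            continue
        if cols[2] == feature_type:
            yield cols[0], int(cols[3]), int(cols[4]), cols[6], cols[8]

def overlapping_features(features, seqid, start, end):
    """Return features on seqid that share at least one base with [start, end]."""
    return [f for f in features if f[0] == seqid and f[1] <= end and f[2] >= start]

# Hypothetical two-gene annotation, for illustration only.
EXAMPLE_GFF = (
    "##gff-version 3\n"
    "contig1\texample\tgene\t100\t500\t.\t+\t.\tID=geneA\n"
    "contig1\texample\tgene\t800\t1200\t.\t-\t.\tID=geneB\n"
)

genes = list(read_gff_features(io.StringIO(EXAMPLE_GFF)))
hits = overlapping_features(genes, "contig1", 450, 520)
print([h[4] for h in hits])
```

The feature type to filter on depends on how the annotation was produced, so "gene" is just a default here; some annotation files only contain CDS rows.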


@Jean-Karim Heriche, your suggestion is a possible solution to my question. However, I have to do the same for multiple short sequences, so manual screening would be tedious. Could you please suggest a way to automate the process?


The details depend on what's available to you. If the annotated genome is present in Ensembl, then you could write a script using the Ensembl Perl API. If you have BED/GFF files, you could use bedtools closest or bedtools intersect.
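To show the idea behind a closest-feature search over many short sequences, here is a small pure-Python sketch. It is a stand-in written for illustration, not bedtools itself: the names (Feature, parse_gff, closest_feature), the distance convention and the example records are all assumptions, and the default feature type "CDS" may need changing for a given annotation.

```python
from collections import namedtuple

Feature = namedtuple("Feature", "seqid start end strand name")

def parse_gff(lines, feature_type="CDS"):
    """Collect features of one type from GFF3 lines (1-based, inclusive coordinates)."""
    features = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) < 9 or cols[2] != feature_type:
            continue
        features.append(Feature(cols[0], int(cols[3]), int(cols[4]), cols[6], cols[8]))
    return features

def distance(query, feature):
    """Bases between two intervals (assumed convention); 0 when they overlap."""
    if query.start <= feature.end and feature.start <= query.end:
        return 0
    if feature.start > query.end:
        return feature.start - query.end
    return query.start - feature.end

def closest_feature(query, features):
    """Return (feature, distance) for the nearest feature on the same sequence, or None."""
    candidates = [(distance(query, f), f) for f in features if f.seqid == query.seqid]
    if not candidates:
        return None
    dist, best = min(candidates, key=lambda pair: pair[0])
    return best, dist

# Hypothetical annotation and query intervals, for illustration only.
gff_lines = [
    "contig1\texample\tCDS\t100\t500\t.\t+\t0\tID=cdsA",
    "contig1\texample\tCDS\t800\t1200\t.\t-\t0\tID=cdsB",
]
queries = [
    Feature("contig1", 60, 90, ".", "motif1"),
    Feature("contig1", 700, 730, ".", "motif2"),
]

annotation = parse_gff(gff_lines)
for q in queries:
    hit = closest_feature(q, annotation)
    if hit is not None:
        feature, dist = hit
        print(q.name, feature.name, feature.strand, dist, sep="\t")
```

The nearest feature is not necessarily the gene that a promoter or regulator acts on, so it is probably worth keeping the strand in the output and checking that the short sequence lies upstream of the reported gene.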


@Jean-Karim Heriche, yes, I have Prokka-annotated files (including GFF) for the genome concerned. I will try bedtools. Thank you.
