Extracting Transcription Factor Binding Sites Out Of Gff File
11.2 years ago
x.talebzadeh ▴ 20

Hi everybody,

I have a small problem in my research. I have a file in GFF format that contains the coordinates of binding sites for some transcription factors, and I wonder whether I can use these coordinates as binding sites and map them to the genome. Moreover, some records are on the reverse strand, and I don't know whether I should convert their coordinates to forward-strand coordinates or just use them directly on the forward strand. Here are some records from the file:

chr1    MOTEVOC_cage_181208    TF_binding_site_cage_181208    865485    865498    0.988754905    +    .    TF_binding_site_cage_181208 SP1-232644 ;ALIAS SP1 ;

chr1    MOTEVOC_cage_181208    TF_binding_site_cage_181208    865545    865558    0.976246809    -    .    TF_binding_site_cage_181208 SP1-232645 ;ALIAS SP1 ;

chr1    MOTEVOC_cage_181208    TF_binding_site_cage_181208    884554    884567    0.999270603    -    .    TF_binding_site_cage_181208 SP1-239587 ;ALIAS SP1 ;

chr1    MOTEVOC_cage_181208    TF_binding_site_cage_181208    944651    944664    0.998479822    +    .    TF_binding_site_cage_181208 SP1-232005 ;ALIAS SP1 ;

Thank you so much for your help. ;)

gff • 2.8k views

You said they were 'coordinates of binding sites'. What do you mean when you say 'can I use these coordinates as binding sites?' You seem to essentially be asking if you can use binding sites as binding sites, and the answer to that would be yes, tautologically. If they are coordinates, then they are already mapped to the genome, since a coordinate tells you exactly where in the genome your feature is. I think you need to be clearer about what you want.


Thank you so much for the reply. To be more precise: I have the file, but I'm not sure whether these are really the binding sites, and I don't know how to treat the reverse-strand coordinates. With other file formats such as BED, I didn't need to convert the coordinates, but I'm not sure whether that is also the case for GFF files. Thank you again.


Sorry. Maybe I'm being dense. I don't get what you want to do. The accuracy of the binding sites depends on what program you used to make the GFF file. I don't think moving sites from the reverse to the forward strand makes much sense. GFF is already a good enough human-readable format if all you want to know is where your sites are. So, I am assuming you want to do something with the GFF file, like put it in another program or display it in a browser. If that's the case, you should name the program you want to use, what format you need or maybe be a little more specific about what you are trying to do.
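On the GFF versus BED point: in both formats the start and end are counted along the forward strand, and the strand column only records which strand the feature sits on, so there is nothing to flip. The real difference is that GFF is 1-based and inclusive while BED is 0-based and half-open. A minimal Python sketch of the conversion, assuming whitespace-separated columns as in the records above; the function name `gff_to_bed` and the choice of the second attribute token (e.g. `SP1-232644`) as the BED name are illustrative assumptions, not part of any tool:

```python
def gff_to_bed(line):
    """Convert one GFF record to a BED-style tuple.

    Assumption: columns are whitespace-separated and the ninth column
    looks like 'TF_binding_site_cage_181208 SP1-232644 ;ALIAS SP1 ;'.
    """
    # Split into at most 9 fields so the attribute column stays in one piece.
    fields = line.rstrip("\n").split(None, 8)
    if len(fields) < 8:
        raise ValueError("expected at least 8 GFF columns")
    seqid, _source, _feature, start, end, score, strand = fields[:7]
    attributes = fields[8] if len(fields) > 8 else ""
    tokens = attributes.split()
    # Hypothetical naming rule: use the second attribute token as the site name.
    name = tokens[1] if len(tokens) > 1 else "."
    # GFF start is 1-based inclusive; BED start is 0-based. The end is unchanged.
    # The strand is copied as-is: coordinates are forward-strand in both formats.
    return (seqid, int(start) - 1, int(end), name, score, strand)


record = ("chr1    MOTEVOC_cage_181208    TF_binding_site_cage_181208    "
          "865545    865558    0.976246809    -    .    "
          "TF_binding_site_cage_181208 SP1-232645 ;ALIAS SP1 ;")
print(gff_to_bed(record))
```

Note that the reverse-strand record keeps the same interval; only the start shifts by one because of the coordinate convention.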


Which genome?

11.2 years ago

It is not entirely clear what you are looking for, but here is a solution to what I think you may be after:

Use a tool such as fastaFromBed from the bedtools package (it accepts GFF as well as BED input). Run with the -s option, it extracts each sequence on the proper strand, i.e. reverse-complemented for records on the minus strand. You can then remap these sequences to another genome with an aligner. The resulting BAM file can be turned into a BED file via the bamToBed command of bedtools.
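To make "on the proper strand" concrete, here is a minimal Python sketch of strand-aware extraction. It is not how bedtools is implemented; the function names and the toy sequence are made up for illustration, and it assumes the chromosome is already loaded as a plain string and that coordinates are GFF-style (1-based, inclusive, always counted along the forward strand):

```python
# Translation table for complementing DNA, preserving case and N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_site(chrom_seq, gff_start, gff_end, strand):
    """Return the sequence of a site as read on its own strand.

    The slice is always taken on the forward strand; a '-' record is
    reverse-complemented afterwards, never re-indexed.
    """
    forward = chrom_seq[gff_start - 1:gff_end]
    return reverse_complement(forward) if strand == "-" else forward


toy_chrom = "AACCGGTTAC"  # hypothetical 10 bp "chromosome"
print(extract_site(toy_chrom, 3, 6, "+"))  # CCGG
print(extract_site(toy_chrom, 2, 5, "-"))  # CGGT
```

The second call slices ACCG from the forward strand and returns its reverse complement, which is what a stranded extraction should give for a minus-strand record.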
