Finding specific subsets of genes related to known transcription factors
6.6 years ago

Hi there,

This is my first time using this platform, so I hope this reaches many people.

I have just received DNA sequencing results from an experiment with several conditions. I compared those conditions and obtained the Transcription Factor Binding Sites (TFBSs) that were enriched in each of them.

My field is reproduction (focusing on embryo development and implantation). For each TF (associated with the TFBSs), I want to extract the genes it regulates that are related to functions in embryo development and implantation. I think filtering by GO terms may be the best way to do this. In that case, for each TF, I would also like a parameter showing the contribution of reproduction-related GO terms to the total GO terms for that TF.

I've been reading about a method called PASTAA that may help, but I don't really know how it works or whether it would be useful for my problem.

Thanks in advance!


By TFBS, do you mean you have a motif matrix, or the regions in the genome (i.e. the coordinates) where a factor bound?


You say you have enriched TFBSs. To make sure we can answer your question, and to avoid confusion over terminology and interpretation, can you please tell us how you got these "TFBSs"? It is more likely that you actually have a set of enriched motifs. If you do have the sites already, then for each site you only need to look for the nearest downstream gene within the same TAD.


To get the TFBS sequences, we simply sequenced the genomic regions in our target samples. Afterwards, we ran a comparative analysis of the TFBS regions using the Ensembl database as the source.

We did think about looking at the genes surrounding the TFBSs, but this seems quite imprecise: as you know, some TFs show trans-acting activity and may regulate promoters quite far upstream or downstream of their binding position.

Thanks again!

6.5 years ago

If you have the TF binding sites in UCSC BED format, you can use BEDOPS convert2bed to convert gene annotations to BED, bedops to make a file of gene promoters (say, a region 500 nt upstream of the gene TSS), and bedmap to associate binding sites with the promoters of genes.

Get some gene annotations and write them to a BED-formatted file:

$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_27/gencode.v27.basic.annotation.gff3.gz \
    | gunzip --stdout - \
    | awk '$3 == "gene"' \
    | convert2bed -i gff - \
    > genes.bed

Or replace this step with whatever annotation source you prefer, for your reference genome.

Make a file of promoter regions from the genes. The reverse-strand records are re-sorted because they are re-keyed on the gene end, which need not follow the original start order:

$ awk -v OFS="\t" '($6 == "+") { print $1, $2, ($2+1), $4; }' genes.bed | bedops --range -500:0 --everything - > promoters.for.bed
$ awk -v OFS="\t" '($6 == "-") { print $1, ($3 - 1), $3, $4; }' genes.bed | sort-bed - | bedops --range 0:500 --everything - > promoters.rev.bed
$ bedops --everything promoters.for.bed promoters.rev.bed > promoters.bed
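
If it helps to see what the strand handling does, here is a small toy sketch in plain awk, with no BEDOPS needed. The three genes and their coordinates are made up, the `toy.*` file names are placeholders, and the sketch clamps negative starts to 0 so the output stays valid BED. It is meant only to illustrate the window arithmetic (the TSS base plus 500 nt upstream of it on the gene's own strand), not to replace the commands above.

```shell
# Toy BED6 genes (made-up coordinates): chrom, start, end, id, score, strand
printf 'chr1\t200\t900\tgeneC\t.\t+\n'   >  toy.genes.bed
printf 'chr1\t1000\t5000\tgeneA\t.\t+\n' >> toy.genes.bed
printf 'chr1\t2000\t8000\tgeneB\t.\t-\n' >> toy.genes.bed

# Strand-aware promoter window: the TSS base plus `up` nt upstream of it.
# Forward strand: TSS is the gene start, upstream is towards lower coordinates.
# Reverse strand: TSS is the gene end, upstream is towards higher coordinates.
awk -v OFS='\t' -v up=500 '
  $6 == "+" { s = $2 - up; if (s < 0) s = 0; print $1, s, $2 + 1, $4 }
  $6 == "-" { print $1, $3 - 1, $3 + up, $4 }
' toy.genes.bed | sort -k1,1 -k2,2n -k3,3n > toy.promoters.bed

cat toy.promoters.bed
```

Changing `up` gives a different promoter definition; 500 nt is only the example value used in this answer.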

Sort your BED-formatted TF binding sites:

$ sort-bed tfbs.unsorted.bed > tfbs.bed

Map binding sites to gene promoters:

$ bedmap --echo --echo-map --delim '\t' promoters.bed tfbs.bed > answer.bed

Each line of the file answer.bed contains a promoter region, its associated gene ID, and any TF binding sites that overlap the gene's promoter.
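
As a rough follow-up for the GO question, the sketch below turns a file shaped like answer.bed into TF-to-gene pairs and then computes, per TF, the fraction of GO annotations on its target genes that fall in a reproduction-related list. Several things here are assumptions and not part of the steps above: that column 4 of your TFBS file holds the TF name, that multiple overlapping sites are joined by `;` (the bedmap default), and that you have a two-column gene-to-GO table from an annotation source of your choice. All `toy.*` files and their contents are made up, and the GO IDs in the list are placeholders to be swapped for the terms you care about.

```shell
# Toy stand-in for answer.bed: promoter fields, a tab, then overlapping
# sites (tab-delimited fields, multiple sites joined by ';').
printf 'chr1\t500\t1001\tgeneA\tchr1\t600\t620\tTF1;chr1\t700\t720\tTF2\n' >  toy.answer.bed
printf 'chr1\t7999\t8500\tgeneB\tchr1\t8000\t8020\tTF1\n'                  >> toy.answer.bed
printf 'chr2\t100\t601\tgeneD\t\n'                                         >> toy.answer.bed

# Step 1: one "TF <tab> gene" line per overlap (assumes TF name in site column 4).
awk -F'\t' -v OFS='\t' '
{
  gene = $4
  rest = ""
  for (i = 5; i <= NF; i++) rest = rest (i > 5 ? "\t" : "") $i
  n = split(rest, site, ";")
  for (j = 1; j <= n; j++) {
    split(site[j], f, "\t")
    if (f[4] != "") print f[4], gene
  }
}' toy.answer.bed | sort -u > toy.tf2gene.tsv

# Hypothetical gene-to-GO table and placeholder list of terms of interest.
printf 'geneA\tGO:0007566\ngeneA\tGO:0006355\ngeneB\tGO:0006355\ngeneB\tGO:0008150\n' > toy.gene2go.tsv
printf 'GO:0007566\nGO:0009790\n' > toy.repro.go

# Step 2: per TF, count GO annotations on target genes and how many are in the list.
awk -F'\t' -v OFS='\t' '
  FILENAME == ARGV[1] { repro[$1] = 1; next }
  FILENAME == ARGV[2] { go[$1] = go[$1] " " $2; next }   # gene -> its GO terms
  {
    n = split(go[$2], t, " ")
    for (k = 1; k <= n; k++) { total[$1]++; if (t[k] in repro) hit[$1]++ }
  }
  END { for (tf in total) printf "%s\t%d\t%d\t%.3f\n", tf, hit[tf], total[tf], hit[tf] / total[tf] }
' toy.repro.go toy.gene2go.tsv toy.tf2gene.tsv | sort > toy.tf_go_fraction.tsv

cat toy.tf_go_fraction.tsv
```

The output columns are TF, reproduction-related annotations, total annotations, and their ratio. Note that this counts gene-term annotations, so a term shared by two target genes is counted twice; counting distinct terms per TF instead is an equally reasonable choice, and a proper enrichment test would also need a background gene set.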


OK, I am going for that, thanks a lot for the help!

6.5 years ago

Hi!

I actually have the regions on the genome where certain transcription factors bind, thanks!
