Question: Target genes of position-specific transcription factors
13 months ago, zaibunnisa.t wrote:

I have a list of transcription factor motif matches with the following information:

chrom  position  rs_ids       cosmic_id     motif_id  motif_alt_id  matched_sequence
chr3   71435739  rs201008547  COSN16951339  MA0528.1  ZNF263        GAGGGAGGAAGGGACGGAGGG

I want to know which target genes are regulated by these transcription factors. Can anyone suggest how I can do this, and what kind of tools I should use?

I have already tried previously described tools and online browsers, but they return thousands of target genes for a single transcription factor. I want target genes that are related more specifically to the binding position.

Thanks in advance!


I have already tried previously described tools and online browsers, but they return thousands of target genes for a single transcription factor. I want target genes that are related more specifically to the binding position.

Some transcription factors do, literally, have many thousands of targets. Look up oestrogen ('estrogen' in US English) receptor α (alpha) and Myc, for example. Keep in mind that a transcription factor doesn't 'know' what its targets are; it binds wherever the physicochemical conditions allow, which is determined by the DNA sequence motif at the target and by the DNA-binding domain of the transcription factor. Where binding is sufficiently strong, it may exert its effects; where binding is weak, the effect may be weaker or non-existent. Also, the target regions have to be accessible for binding to occur: different regions of chromatin will be 'open' (accessible) in different tissues. Chromatin accessibility can be gauged by ATAC-seq.

Using the programs that you have already tried, you should be able to order the targets by some sort of score and/or decide whether tissue-specific differences may be at play.
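As a rough sketch of the ordering step, assume the predictions have been exported to a tab-delimited file with a numeric score in column 5, where higher means stronger. The column choice and the helper name rank_by_score are assumptions, not the output format of any particular tool:

```shell
# Hypothetical helper: keep the N highest-scoring records.
# $1 = number of records to keep; tab-delimited records are read from stdin.
# Assumes the score is in column 5 and that larger values are better.
rank_by_score() {
  sort -t "$(printf '\t')" -k5,5nr | head -n "$1"
}

# Assumed usage (targets.tsv is a placeholder file name):
# rank_by_score 100 < targets.tsv > top_targets.tsv
```

If the tool reports p-values instead, the sort direction would need to be reversed.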

Reply written 13 months ago by Kevin Blighe
13 months ago, Alex Reynolds wrote:

It's a bit of work, maybe, but perhaps the following set operations could guide some investigation.

Your example TF is MA0528.1, which is a JASPAR identifier.

For your genome of interest, you could run that genome's sequence through FIMO to call binding sites of JASPAR TF models at some p-value threshold, say 1e-4 or 1e-5. Say the resulting sites, converted to sorted BED, are in a file called tbfs.jaspar.1e-5.bed.

Given a set of whole-genome binding sites, you can then filter that set using the proximal promoters of all genes of interest (genes.bed). These could be GENCODE genes in GFF format, converted to BED via gff2bed or a similar approach.
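GENCODE annotation files also contain transcript, exon, and CDS records, so it may help to keep only gene-level records before converting. A minimal sketch, assuming GFF3 input with the feature type in column 3; the helper name gene_records and the file names are placeholders:

```shell
# Hypothetical helper: keep only gene-level records from a GFF3 stream.
# Comment and pragma lines (starting with "#") are dropped.
gene_records() {
  awk -F'\t' '!/^#/ && $3 == "gene"'
}

# Assumed usage (gff2bed ships with BEDOPS and reads from stdin):
# gene_records < annotation.gff3 | gff2bed > genes.bed
```

Filtering first means genes.bed has one interval per gene, so the promoter step works from gene starts rather than from every exon.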

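Proximal promoters could be defined as the 1 kb region upstream of each gene's TSS. The following pads each gene by 1 kb on its upstream side, according to strand (note that the padded intervals still include the gene body):

$ bedops -u <( awk '($6=="+")' genes.bed | bedops --range -1000:0 --everything - ) <( awk '($6=="-")' genes.bed | bedops --range 0:1000 --everything - ) > promoters.bed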
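If only the 1 kb upstream window is wanted, without the gene body, the intervals can also be computed with awk alone. This is a sketch under assumptions: genes.bed is tab-delimited, six-column BED with the strand in column 6, and the helper name promoters_1kb is hypothetical:

```shell
# Hypothetical helper: emit the 1 kb window upstream of each gene's TSS.
# Reads six-column BED (0-based, half-open) on stdin.
promoters_1kb() {
  awk -v FS='\t' -v OFS='\t' -v w=1000 '
    # Forward strand: the TSS is the start, so the window is [start-w, start),
    # clamped at 0 for genes close to the chromosome start.
    $6 == "+" { s = $2 - w; if (s < 0) s = 0; print $1, s, $2, $4, $5, $6 }
    # Reverse strand: the TSS is the last base, so the window is [end, end+w).
    $6 == "-" { print $1, $3, $3 + w, $4, $5, $6 }
  ' | sort -k1,1 -k2,2n
}

# Assumed usage:
# promoters_1kb < genes.bed > promoters.bed
```

The final sort keeps the output ordered by chromosome and start; running the result through sort-bed is a safe extra step before passing it to bedops.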

Then filter the whole-genome TFBS set:

$ bedops --element-of 1 tbfs.jaspar.1e-5.bed promoters.bed > tbfs.jaspar.1e-5.subset.bed

Then grep this subset for MA0528.1:

$ grep -wF 'MA0528.1' tbfs.jaspar.1e-5.subset.bed > MA0528.1.hits.bed

and map these hits back to the genes:

$ bedmap --range 1000 --echo --skip-unmapped genes.bed MA0528.1.hits.bed > answer.bed
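To narrow the hits to those covering one specific position, such as the chr3:71435739 record in the question, the hits file could be filtered before mapping. A sketch, assuming the position is 1-based while the BED intervals are 0-based and half-open; the helper name hits_at_position is hypothetical:

```shell
# Hypothetical helper: keep BED records that contain a given position.
# $1 = chromosome, $2 = 1-based position; BED records are read from stdin.
# A 1-based position p lies in a 0-based half-open [start, end) when
# start < p and p <= end.
hits_at_position() {
  awk -v FS='\t' -v c="$1" -v p="$2" '$1 == c && $2 < p && p <= $3'
}

# Assumed usage:
# hits_at_position chr3 71435739 < MA0528.1.hits.bed > MA0528.1.hits.at_variant.bed
```

The filtered file can then replace MA0528.1.hits.bed in the bedmap step.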

You might also overlap answer.bed with TF-specific ChIP-seq peaks, as experimental evidence that the TF of interest actually binds those gene promoters in vivo.


Alex Reynolds, thanks for your reply. I downloaded the file "gencode.v29.chr_patch_hapl_scaff.annotation.gff3" from GENCODE and converted it to BED format:

chr1    1320455 1320529 exon:ENST00000435064.5:3        .       -       HAVANA  exon    .       ID=exon:ENST00000435064.5:3;Parent=ENST00000435064.5;gene_id=ENSG00000127054.20;transcript_id=ENST00000435064.5;gene_type=protein_coding;gene_name=INTS11;transcript_type=protein_coding;transcript_name=INTS11-208;exon_number=3;exon_id=ENSE00003666435.1;level=2;protein_id=ENSP00000413493.1;transcript_support_level=1;tag=basic,appris_principal_1,CCDS;ccdsid=CCDS21.1;havana_gene=OTTHUMG00000003330.13;havana_transcript=OTTHUMT00000009360.2

When I run the bedops command:

bedops -u <( awk ($6="+") gencode.v29.chr_patch_hapl_scaff.annotation.bed | bedops --range -1000:0 - ) <( awk ($6="-") gencode.v29.chr_patch_hapl_scaff.annotation.bed | bedops --range 0:1000 - ) > promoters.bed

it gives me the following error:

-bash: command substitution: line 15: syntax error near unexpected token `$6="+"'
-bash: command substitution: line 15: ` awk ($6="+") gencode.v29.chr_patch_hapl_scaff.annotation.bed | bedops --range -1000:0 - )'

Could you check whether there is a problem with the BED format?

Reply written 13 months ago by zaibunnisa.t

Sorry, try adding single quotes around the awk condition, and use == so that it is a comparison rather than an assignment:

bedops -u <( awk '($6=="+")' gencode.v29.chr_patch_hapl_scaff.annotation.bed | ... )
Reply written 13 months ago by Alex Reynolds

It again gives an error:

bedops -u <( awk '($6="+")' gencode.v29.chr_patch_hapl_scaff.annotation.bed | bedops --range -1000:0 - ) <( awk '($6="-")' gencode.v29.chr_patch_hapl_scaff.annotation.bed | bedops --range 0:1000 - ) > promotor.bed
    May use bedops --help for more help.

    Error: Bad Input
    No operation argument given.
    May use bedops --help for more help.

    Error: Bad Input
    No operation argument given.
Reply written 12 months ago by zaibunnisa.t