Question

SNP in Transcription factor binding site analysis

2

9.5 years ago

Floris Brenk ★ 1.0k

Hi all,

I have about 100 sequences around different transcription start sites, each containing a SNP, and I would like to know whether each SNP affects a transcription factor binding site. These SNPs are already known to influence the expression of these genes, so showing an effect on a binding site would support the idea that they are the actual causal variants.

My first guess was to put both alleles into LASAGNA 2.0 and use the JASPAR CORE matrices ("all vertebrates").

>seq_reference
CCATCTTGCGTCGCTCTTGCTTGAAGGCCG
>seq_alternative = higher expression
CCATCTTGCGTCGCTGTTGCTTGAAGGCCG

output:

seq_reference
Name                 Sequence         Position (0-based)   Strand   Score   p-value   E-value
TFAP2A (MA0003.1)    GCCTTCAAG        19                   -        7.54    0.00085   0.0187
PBX1 (MA0070.1)      CCTTCAAGCAAG     15                   -        7.58    0.00065   0.0123
Pax6 (MA0069.1)      TTCAAGCAAGAGCG   11                   -        10.85   5.0E-5    0.00085

seq_alternative
Name                 Sequence         Position (0-based)   Strand   Score   p-value   E-value
BRCA1 (MA0133.1)     GCAACAG          13                   -        6       0.001     0.0240
TFAP2A (MA0003.1)    GCCTTCAAG        19                   -        7.54    0.00085   0.0187
Pax6 (MA0069.1)      TTCAAGCAACAGCG   11                   -        9.78    0.0002    0.0034

But I don't see much difference between the two alleles, and I don't really understand how to interpret the output. Does anyone know whether this is the right way to do it, or have other ideas? Or is the approach fine and this is just a bad example?
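One way to read the two result tables side by side is to locate the SNP and then check which reported sites actually span it. The sketch below uses only the sequences and hits shown above. It assumes the reported position is the 0-based start of the site on the forward strand, which is consistent with the sequences listed for the minus-strand hits. The function and variable names are illustrative, not part of LASAGNA.

```python
ref = "CCATCTTGCGTCGCTCTTGCTTGAAGGCCG"
alt = "CCATCTTGCGTCGCTGTTGCTTGAAGGCCG"

# 0-based positions where the two alleles differ (a single SNP is expected).
snp_positions = [i for i, (r, a) in enumerate(zip(ref, alt)) if r != a]
snp = snp_positions[0]

# Hits copied from the output above: name -> (start, site length, score).
ref_hits = {
    "TFAP2A": (19, 9, 7.54),
    "PBX1": (15, 12, 7.58),
    "Pax6": (11, 14, 10.85),
}
alt_hits = {
    "BRCA1": (13, 7, 6.0),
    "TFAP2A": (19, 9, 7.54),
    "Pax6": (11, 14, 9.78),
}


def overlaps(start, length, pos):
    """True if a site starting at `start` with `length` bases covers `pos`."""
    return start <= pos < start + length


def compare(ref_hits, alt_hits, pos):
    """Classify each motif hit by what happens to it at the SNP position."""
    report = {}
    for name in sorted(set(ref_hits) | set(alt_hits)):
        r, a = ref_hits.get(name), alt_hits.get(name)
        start, length, _ = r or a
        if not overlaps(start, length, pos):
            report[name] = "not overlapping SNP"
        elif r and a:
            report[name] = f"score change {a[2] - r[2]:+.2f}"
        elif r:
            report[name] = "lost in alternative"
        else:
            report[name] = "gained in alternative"
    return report


report = compare(ref_hits, alt_hits, snp)
print("SNP at position", snp, ":", ref[snp], ">", alt[snp])
for name, status in report.items():
    print(name, "->", status)
```

Read this way, the TFAP2A hit is identical in both alleles simply because it does not cover the SNP, while the other three hits do cover it and differ between alleles. Whether a lost, gained, or weakened hit is biologically meaningful is a separate question that the scan alone cannot answer.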

tfbs snp • 5.5k views

ADD COMMENT • link updated 2.2 years ago by Ram 43k • written 9.5 years ago by Floris Brenk ★ 1.0k

score 1 · Answer 1 · 2014-11-07

1

9.5 years ago

Denise CS ★ 5.2k

You can use the Ensembl VEP to see whether your SNPs map to regulatory regions in the human genome. The Ensembl Regulation team has annotated regulatory regions based on ChIP-Seq data (for TFs and histone marks) and DNaseI-Seq. They have also incorporated the data from JASPAR, so when you enter your SNPs into VEP you can see whether they fall in regions of the genome where Ensembl regulatory features and motif features have been annotated.

ADD COMMENT • link 9.5 years ago by Denise CS ★ 5.2k

0

Thanks for your reply! Can I then also see whether the binding site is disrupted, so that this SNP can be identified as the causal variant?
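A common way to put a number on "disrupted" is to score both alleles against a motif matrix and compare the best score over the windows that cover the SNP. The sketch below is only an illustration of that idea: the count matrix is made up (it is not a JASPAR matrix), and the pseudocount, the uniform background, and the function names are assumptions chosen for the example.

```python
import math

ref = "CCATCTTGCGTCGCTCTTGCTTGAAGGCCG"
alt = "CCATCTTGCGTCGCTGTTGCTTGAAGGCCG"
snp = 15  # 0-based position where the two alleles differ

# HYPOTHETICAL count matrix, one list entry per motif position.
# A real analysis would load a matrix from JASPAR instead.
COUNTS = {
    "A": [1, 1, 0, 1],
    "C": [8, 0, 1, 0],
    "G": [0, 1, 8, 1],
    "T": [1, 8, 1, 8],
}


def log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert counts to a log2-odds matrix: one dict per motif position."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def revcomp(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def best_over_snp(seq, pwm, pos):
    """Best (score, start, strand) among windows that cover `pos`."""
    width = len(pwm)
    first = max(0, pos - width + 1)
    last = min(pos, len(seq) - width)
    hits = []
    for start in range(first, last + 1):
        window = seq[start:start + width]
        for strand, site in (("+", window), ("-", revcomp(window))):
            score = sum(pwm[i][base] for i, base in enumerate(site))
            hits.append((score, start, strand))
    return max(hits)


pwm = log_odds(COUNTS)
ref_best = best_over_snp(ref, pwm, snp)
alt_best = best_over_snp(alt, pwm, snp)
print("reference best hit:  ", ref_best)
print("alternative best hit:", alt_best)
print("score difference (alt - ref):", round(alt_best[0] - ref_best[0], 2))
```

A large score difference for a given matrix suggests the SNP changes how well the sequence matches that motif. On its own it does not show that the factor binds there in vivo or that the SNP is causal.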

ADD REPLY • link 9.5 years ago by Floris Brenk ★ 1.0k

Ram · Answer 2 · 2014-11-07

1

9.5 years ago

Ming Tommy Tang ★ 3.9k

You can have a look at this website: http://regulome.stanford.edu/

It will tell you whether the SNP disrupts the TF binding or not.

ADD COMMENT • link updated 2.2 years ago by Ram 43k • written 9.5 years ago by Ming Tommy Tang ★ 3.9k

0

Wow, this site looks very interesting, thanks! For interpreting the scores, I assume that the lower the score, the more likely the SNP is "damaging". What would be the difference between matched TF motif, any motif, and TF binding?

ADD REPLY • link updated 2.4 years ago by Ram 43k • written 9.5 years ago by Floris Brenk ★ 1.0k

0

Examples:

score

1F -> rs4763879
2A -> rs907611
3A -> rs6451493
4 -> rs1738074
5 -> rs5029939
6 -> rs7665090

ADD REPLY • link updated 2.4 years ago by Ram 43k • written 9.5 years ago by Floris Brenk ★ 1.0k

0

from the link:

Score     Supporting data
1a        eQTL + TF binding + matched TF motif + matched DNase Footprint + DNase peak
1b        eQTL + TF binding + any motif + DNase Footprint + DNase peak
1c        eQTL + TF binding + matched TF motif + DNase peak
1d        eQTL + TF binding + any motif + DNase peak
1e        eQTL + TF binding + matched TF motif
1f        eQTL + TF binding / DNase peak
2a        TF binding + matched TF motif + matched DNase Footprint + DNase peak
2b        TF binding + any motif + DNase Footprint + DNase peak
2c        TF binding + matched TF motif + DNase peak
3a        TF binding + any motif + DNase peak
3b        TF binding + matched TF motif
4         TF binding + DNase peak
5         TF binding or DNase peak
6         other

The lower the score, the more supporting evidence there is that the SNP lies in a functional regulatory site, and so the more likely it is to disrupt binding. You may want to read their paper to get a better idea: http://www.ncbi.nlm.nih.gov/pubmed/22955989
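To make the table easier to apply to the example SNPs listed earlier in this thread, here is a small lookup sketch. The score descriptions and rsID/score pairs are copied from the posts above; the dictionary and function names are just illustrative, and the lowercase normalisation is an assumption to reconcile "1F" with "1f".

```python
# Score -> supporting data, copied from the table above.
REGULOME_SCORES = {
    "1a": "eQTL + TF binding + matched TF motif + matched DNase Footprint + DNase peak",
    "1b": "eQTL + TF binding + any motif + DNase Footprint + DNase peak",
    "1c": "eQTL + TF binding + matched TF motif + DNase peak",
    "1d": "eQTL + TF binding + any motif + DNase peak",
    "1e": "eQTL + TF binding + matched TF motif",
    "1f": "eQTL + TF binding / DNase peak",
    "2a": "TF binding + matched TF motif + matched DNase Footprint + DNase peak",
    "2b": "TF binding + any motif + DNase Footprint + DNase peak",
    "2c": "TF binding + matched TF motif + DNase peak",
    "3a": "TF binding + any motif + DNase peak",
    "3b": "TF binding + matched TF motif",
    "4": "TF binding + DNase peak",
    "5": "TF binding or DNase peak",
    "6": "other",
}

# Example SNPs and their scores as posted earlier in the thread.
EXAMPLES = {
    "rs4763879": "1F",
    "rs907611": "2A",
    "rs6451493": "3A",
    "rs1738074": "4",
    "rs5029939": "5",
    "rs7665090": "6",
}


def describe(score):
    """Return the supporting data for a score, ignoring letter case."""
    return REGULOME_SCORES[score.strip().lower()]


for rsid, score in EXAMPLES.items():
    print(f"{rsid}\t{score}\t{describe(score)}")
```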

ADD REPLY • link updated 2.4 years ago by Ram 43k • written 9.5 years ago by Ming Tommy Tang ★ 3.9k