Question: Prioritizing TFBS SNPs
9.5 years ago by
Jiny20 wrote:

I selected 147 functional SNPs in a set of genes using Genomatix and then analyzed their polymorphic status. Of these, 47 were polymorphic and located in a TFBS (transcription factor binding site). Can anyone suggest bioinformatics methods for prioritizing these polymorphic SNPs, so that I can reduce the number of SNPs for further high-throughput genotyping?

binding snp transcription • 2.4k views
ADD COMMENTlink modified 9 months ago by Biostar ♦♦ 20 • written 9.5 years ago by Jiny20

What if we already have a TFBS (ChIP-Seq) dataset? Can I use GATK?

ADD REPLYlink written 9.2 years ago by Curiosity120
9.5 years ago by
Casey Bergman18k
Athens, GA, USA
Casey Bergman18k wrote:

Montgomery et al., in "A survey of genomic properties for the detection of regulatory polymorphisms", report that "distance to transcription start site, local repetitive content, sequence conservation, minor and derived allele frequencies, and presence of a CpG island" have discriminatory potential for identifying regulatory SNPs (rSNPs).

ADD COMMENTlink written 9.5 years ago by Casey Bergman18k
9.5 years ago by
Dataminer2.7k wrote:

Have you tried MAPPER? It might solve your problem to an extent; in my case it did.

ADD COMMENTlink modified 9.5 years ago by Casey Bergman18k • written 9.5 years ago by Dataminer2.7k
9.5 years ago by
Boston, MA USA
Larry_Parnell16k wrote:

MAPPER is our tool of choice as well, since it uses both TRANSFAC and JASPAR motifs. Here's how we've analyzed SNPs with MAPPER:

Take a 41-bp segment of the genome with your SNP at position 21, i.e. 20 bp of genomic sequence on either side of the SNP. I use 20 because the biggest models MAPPER uses are about 15 bp. Copy this sequence, append it to the end of your 41-bp segment, and place an N between the two concatenated sequences (I use the N as a spacer or punctuation mark). Put allele 1 at position 21 and allele 2 at position 63. You now have a sequence of 83 bp in the following format:

(20 bp of genome, or bases 1-20)-allele 1-(next 20 bp of genome, or bases 22-41)-N-(20 bp of genome, or 1-20)-allele 2-(next 20 bp of genome, or 22-41)

In this manner I can assay one sequence to cover both alleles. Other approaches will work as well - e.g. two queries each with a different allele. Do as you wish.
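For anyone who wants to script this, here is a minimal sketch of the construction described above, in Python. The function name and the example flanking sequences are made up for illustration; only the layout (20 bp, allele 1, 20 bp, N, 20 bp, allele 2, 20 bp) comes from the description.

```python
def build_two_allele_query(left_flank, allele1, allele2, right_flank,
                           flank=20, spacer="N"):
    """Concatenate the allele-1 and allele-2 versions of a SNP-centred
    segment, separated by a spacer base (hypothetical helper)."""
    if len(left_flank) != flank or len(right_flank) != flank:
        raise ValueError(f"expected {flank} bp of flanking sequence per side")
    if len(allele1) != 1 or len(allele2) != 1:
        raise ValueError("alleles must be single bases")
    segment1 = left_flank + allele1 + right_flank   # bases 1-41
    segment2 = left_flank + allele2 + right_flank   # bases 43-83
    return segment1 + spacer + segment2             # spacer is base 42


# Made-up flanking sequences, purely for illustration.
left = "ACGTACGTACGTACGTACGT"
right = "TGCATGCATGCATGCATGCA"
query = build_two_allele_query(left, "A", "G", left_flank=left,
                               right_flank=right) if False else \
    build_two_allele_query(left, "A", "G", right)

# 1-based positions 21 and 63 are Python indices 20 and 62.
print(len(query), query[20], query[41], query[62])
```

If you prefer two separate queries (one per allele), the same helper logic applies: just submit the two 41-bp segments individually instead of joining them.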

Run MAPPER and save your results. I filter the results by score and E-value to retain only the most likely predictions.

I then look for allele-specific binding of transcription factors that are relevant to the phenotypes we're following. This last point means that I delete predictions for plant and invertebrate TFs. I am also not interested in the many TFs that have no role in our research topics (e.g. obesity, diabetes). For me, the predictions by MAPPER must encompass the positions where the SNP alleles are in the query sequence: positions 21 and 63.
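The filtering and overlap check can be sketched roughly as follows. This assumes you have already parsed the saved MAPPER results into a list of dicts; the field names (`factor`, `start`, `end`, `score`, `evalue`), the thresholds, and the example hits are all hypothetical and are not MAPPER's actual output format. Coordinates are taken to be 1-based and inclusive.

```python
# 1-based positions of the two alleles in the 83-bp query.
SNP_POSITIONS = {1: 21, 2: 63}


def allele_specific_factors(hits, min_score, max_evalue):
    """Return factors predicted over one allele's position but not the other.

    `hits` is a list of dicts with hypothetical keys:
    factor, start, end (1-based, inclusive), score, evalue.
    """
    factors_by_allele = {1: set(), 2: set()}
    for hit in hits:
        # Keep only the most likely predictions.
        if hit["score"] < min_score or hit["evalue"] > max_evalue:
            continue
        # The prediction must encompass the SNP position itself.
        for allele, pos in SNP_POSITIONS.items():
            if hit["start"] <= pos <= hit["end"]:
                factors_by_allele[allele].add(hit["factor"])
    return {
        "allele1_only": factors_by_allele[1] - factors_by_allele[2],
        "allele2_only": factors_by_allele[2] - factors_by_allele[1],
    }


# Made-up hits with placeholder factor names and arbitrary thresholds.
example_hits = [
    {"factor": "TF_A", "start": 15, "end": 26, "score": 5.0, "evalue": 0.5},
    {"factor": "TF_A", "start": 57, "end": 68, "score": 4.8, "evalue": 0.6},
    {"factor": "TF_B", "start": 18, "end": 27, "score": 6.1, "evalue": 0.2},
    {"factor": "TF_C", "start": 60, "end": 70, "score": 1.0, "evalue": 9.0},
    {"factor": "TF_D", "start": 1, "end": 10, "score": 7.0, "evalue": 0.1},
]
result = allele_specific_factors(example_hits, min_score=3.0, max_evalue=1.0)
print(result)
```

Comparing by factor name alone is a simplification; removing TFs that are irrelevant to your organism or phenotype is still a manual step.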

I can highly recommend this approach as it has given us many good associations, even several that show interactions with components of the environment that drive activation of the TFs predicted by MAPPER.

ADD COMMENTlink written 9.5 years ago by Larry_Parnell16k

Thanks Larry for providing such a descriptive answer :) , I could have also done that, but I wanted Jiny to work her/his way out with the tool. @Jiny: Larry has provided you the exact way to walk on a path.

ADD REPLYlink written 9.5 years ago by Dataminer2.7k
7.4 years ago by
United States
mulin0424.li120 wrote:

By combining genetic and epigenetic features from the recent ENCODE project, a tool named GWAS3D can help quite a lot with prioritizing regulatory SNPs. Please visit this site:

ADD COMMENTlink written 7.4 years ago by mulin0424.li120