Hi everyone! I am new to analyzing SNP data. I have some SNP IDs, such as rs937395. As a first step, I want to find the promoters, enhancers, TFBSs and genes affected by this SNP. That is, given the SNP's location on the chromosome, how can I find the promoters, enhancers, TFBSs and genes that encompass this location? As a second step, how can I find the genes regulated by those promoters, enhancers and TFBSs? I want to construct a regulatory network from the SNP to the gene I am interested in. Thank you very much! Please excuse my poor English.
You can use BEDOPS tools to query BED files that contain promoter, enhancer, TFBS and gene annotations, against a BED-formatted file that shows the position(s) of your SNP(s).
First, you could get a full list of SNPs into BED format.
Let's say you are using genome build hg19:
$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'SELECT chrom, chromStart, chromEnd, name FROM snp141Common' | tail -n +2 | sort-bed - > snp141Common.bed
You might filter this for your SNP of interest (e.g., rs937395), matching the ID as a whole word so that longer IDs containing it are not picked up:
$ grep -wF 'rs937395' snp141Common.bed > rs937395.bed
If you have a text file of SNP IDs of interest, you could filter on matches with entries in that file:
$ grep -wFf snps_of_interest.txt snp141Common.bed > snps_of_interest.bed
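When filtering by rs ID, keep in mind that `grep -F` without `-w` matches substrings, so a search for rs937395 would also return any longer ID that begins with the same digits. Here is a minimal sketch on a toy BED file; the coordinates and the two extra IDs are made up purely for illustration:

```shell
# Toy BED file: coordinates and the second and third IDs are invented.
printf 'chr1\t100\t101\trs937395\nchr1\t200\t201\trs9373951\nchr2\t300\t301\trs12345\n' > toy_snps.bed

# Substring match: also returns the longer ID rs9373951.
grep -F 'rs937395' toy_snps.bed > loose.bed

# Whole-word match: returns only rs937395.
grep -wF 'rs937395' toy_snps.bed > strict.bed

# Compare how many rows each filter kept.
wc -l < loose.bed
wc -l < strict.bed
```

On a real SNP table the difference can be several unwanted rows per query, so the whole-word form is the safer default.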
Next, you might grab annotations of interest.
As an example, let's grab GENCODE v19 records, filter them for genes, and convert the result to BED with the gtf2bed conversion tool:
$ wget -O - ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz \
    | gunzip -c \
    | grep -w "gene" \
    | gtf2bed \
    > gencode.v19.genes.bed
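If it helps to see what the conversion amounts to: GTF coordinates are 1-based and inclusive, while BED is 0-based and half-open, so the start is shifted down by one and the end is left alone. The following is only a rough awk sketch of that idea on a single toy record. It is not a replacement for gtf2bed (which, among other things, emits more columns and sorts its output), and the file names are just placeholders:

```shell
# One toy GTF gene record (tab-separated, 1-based inclusive coordinates).
printf 'chr1\tHAVANA\tgene\t11869\t14412\t.\t+\t.\tgene_id "ENSG00000223972.4"; gene_name "DDX11L1";\n' > toy.gtf

# Keep rows whose feature column is exactly "gene", shift the start to
# 0-based, and pull the gene_id value out of the attribute column.
awk -F'\t' -v OFS='\t' '$3 == "gene" {
    match($9, /gene_id "[^"]+"/)
    # Skip the 9-character prefix (gene_id ") and drop the closing quote.
    id = substr($9, RSTART + 9, RLENGTH - 10)
    print $1, $4 - 1, $5, id, ".", $7
}' toy.gtf > toy.genes.bed

cat toy.genes.bed
```

Testing the feature column directly, as done here, is a slightly stricter filter than `grep -w "gene"`, which looks for the word anywhere on the line.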
To demonstrate a query, we can use the bedmap tool to look at a roughly 1 kb window around your SNP(s) of interest (500 bp on either side), looking for any GENCODE v19 gene IDs that overlap that window.
For instance, around rs937395:
$ bedmap --range 500 --echo --echo-map-id-uniq rs937395.bed gencode.v19.genes.bed > answer.bed
Or around all SNPs of interest:
$ bedmap --range 500 --echo --echo-map-id-uniq snps_of_interest.bed gencode.v19.genes.bed > answer.bed
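To make the windowed lookup less of a black box, here is a rough awk sketch of the same idea on toy data: pad each SNP by 500 bp on both sides and report the IDs of genes that overlap the padded interval by at least one base. All coordinates and names below are invented, and this is only an approximation of what bedmap does (it does not deduplicate or sort IDs, and it will not scale the way bedmap does on sorted input):

```shell
# Toy inputs: every coordinate and name here is invented for illustration.
printf 'chr1\t1000\t1001\trsTOY1\n' > toy_snp.bed
printf 'chr1\t200\t600\tgeneA\nchr1\t1400\t2000\tgeneB\nchr1\t1600\t2500\tgeneC\n' > toy_genes.bed

# First pass loads the genes; second pass pads each SNP by `pad` bp and
# collects the IDs of genes overlapping the padded half-open interval.
awk -F'\t' -v pad=500 '
NR == FNR { chr[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; id[NR] = $4; n = NR; next }
{
    lo = $2 - pad; hi = $3 + pad; hits = ""
    for (i = 1; i <= n; i++)
        if (chr[i] == $1 && s[i] < hi && e[i] > lo)
            hits = hits (hits == "" ? "" : ";") id[i]
    print $0 "|" hits
}' toy_genes.bed toy_snp.bed > toy_answer.txt

cat toy_answer.txt
```

In this toy case geneA and geneB fall inside the padded window and geneC does not, so the SNP line is followed by `|geneA;geneB`, which mirrors the `|`-delimited layout that bedmap writes with `--echo` plus an ID operation.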
Basically, you repeat and adjust this procedure depending on the window of interest, SNPs of interest, and target annotations of interest.