EPIC array - find probes that are in promoters
5.2 years ago
veronico ▴ 70

Hi everyone,

I found some DMPs (differentially methylated positions) in a few samples and now have a list of probes. However, these probes probably come from different parts of the genome, and I am only interested in the ones located in promoters. Does anyone have a recommendation on how to select those? I have the probe coordinates from the manifest provided by Illumina, but I am stuck on how to proceed from here.

Thank you!

methylation • 3.3k views

Get a reference annotation (GTF) file for your organism and extract the transcription start site (TSS) of every gene. As a proxy for promoters, one might use something like the 200 bp upstream of the TSS. Be sure to respect the strand: if a gene is on the top strand, its start coordinate is the actual start site; if it is on the bottom strand, its end coordinate is the actual gene start. From there, simply intersect your promoters and DMPs. Make sure that the DMP coordinates and the reference annotation are based on the same genome assembly, e.g. hg19 for both in human.
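The steps above can be sketched in base R. This is only a minimal illustration under stated assumptions: the gene and probe tables are toy data with made-up names and positions, and the 200 bp window is just the proxy mentioned above, not a fixed definition of a promoter.

```r
# Toy gene table (hypothetical values, for illustration only); in practice
# these would come from a GTF for the same assembly as the probes.
genes <- data.frame(
  gene   = c("geneA", "geneB"),
  chr    = c("chr1", "chr1"),
  start  = c(1000, 5000),
  end    = c(2000, 6000),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# Toy probe table (hypothetical probe IDs and positions).
probes <- data.frame(
  probe = c("cgX1", "cgX2", "cgX3"),
  chr   = c("chr1", "chr1", "chr1"),
  pos   = c(900, 6100, 3000),
  stringsAsFactors = FALSE
)

upstream <- 200  # assumed window size; adjust to your own promoter definition

# Strand-aware TSS: gene start on "+", gene end on "-".
tss <- ifelse(genes$strand == "+", genes$start, genes$end)

# Promoter window: the `upstream` bases before the TSS, including the TSS.
# "Before" means lower coordinates on "+" and higher coordinates on "-".
genes$prom_start <- ifelse(genes$strand == "+", tss - upstream, tss)
genes$prom_end   <- ifelse(genes$strand == "+", tss, tss + upstream)

# Intersect: pair every probe with every gene on the same chromosome,
# then keep pairs where the probe position falls inside the window.
hits <- merge(probes, genes, by = "chr")
hits <- hits[hits$pos >= hits$prom_start & hits$pos <= hits$prom_end,
             c("probe", "gene", "pos", "strand")]
```

The all-against-all merge is fine for a short DMP list but scales poorly. For a full array, the GenomicRanges package (promoters() and findOverlaps()) would likely be a better fit.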


Thank you for your advice! It is really helpful!


Would you also know how to categorize the probes by the region in which they are found (body, 5'UTR, intron, etc.)? The manifest has a column that describes this, but it is a bit confusing. For example, one probe's region is given as "5'UTR;TSS200;TSS200;Body".


I like to use annotatr to get some basic annotations (http://bioconductor.org/packages/release/bioc/vignettes/annotatr/inst/doc/annotatr-vignette.html#annotationhub-annotations). But yes, annotations are not mutually exclusive. If something is in a 5'UTR, it is also in the gene body, it is in an exon, and it is close to the TSS. You'll have to decide which categories you need. I think the most basic classification is 3'UTR, 5'UTR, exon, intron, intergenic, and promoter.
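One way to handle a multi-valued entry such as the one quoted in the question is to split it, drop the repeats, and then choose a single label by a priority order. The sketch below uses base R only. The priority order and the helper name pick_group are assumptions made for illustration, not something defined by Illumina or annotatr. The repeats appear to arise because the manifest lists one entry per annotated transcript.

```r
# Example string taken from the question above.
region <- "5'UTR;TSS200;TSS200;Body"

# Split on ";" and drop repeated categories.
groups <- unique(strsplit(region, ";", fixed = TRUE)[[1]])

# Hypothetical priority order, assumed for illustration; reorder it to
# suit your own question.
priority <- c("TSS200", "TSS1500", "5'UTR", "1stExon", "Body", "3'UTR")

# Return the highest-priority category present in `x`, or NA if none match.
pick_group <- function(x, priority) {
  parts <- unique(strsplit(x, ";", fixed = TRUE)[[1]])
  found <- priority[priority %in% parts]
  if (length(found) == 0) NA_character_ else found[1]
}

label <- pick_group(region, priority)
```

Here groups holds the three distinct categories and label is the single one chosen under the assumed order. Probes with an empty entry get NA, which you might treat as intergenic.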

5.2 years ago

I am guessing that you are using annotations from one of these sources (or a program that uses that as the dependency)?

IlluminaHumanMethylationEPICanno.ilm10b4.hg19

IlluminaHumanMethylationEPICanno.ilm10b2.hg19

You can create a data frame to view those annotations using a command similar to the following:

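library(minfi)  # provides getAnnotation()
library(IlluminaHumanMethylationEPICanno.ilm10b2.hg19)
data("IlluminaHumanMethylationEPICanno.ilm10b2.hg19")
annoObj = getAnnotation(IlluminaHumanMethylationEPICanno.ilm10b2.hg19)  # one row per probe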
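
Once the annotation is in a table, promoter probes could be selected from its UCSC_RefGene_Group column. The following is a rough sketch: the helper name is_promoter_probe is made up, the small anno table is a toy stand-in with hypothetical probe IDs, and treating TSS200 and TSS1500 as "promoter" is an assumption rather than an official definition.

```r
# Flag entries whose semicolon-separated groups include a promoter-proximal
# category. The default categories are an assumption; change them as needed.
is_promoter_probe <- function(group_column,
                              promoter_groups = c("TSS200", "TSS1500")) {
  vapply(strsplit(as.character(group_column), ";", fixed = TRUE),
         function(parts) any(parts %in% promoter_groups),
         logical(1))
}

# Toy stand-in for the annotation table (hypothetical probe IDs).
anno <- data.frame(
  Name = c("cgX1", "cgX2", "cgX3"),
  UCSC_RefGene_Group = c("5'UTR;TSS200;TSS200;Body", "Body", ""),
  stringsAsFactors = FALSE
)

# Names of the probes annotated to at least one promoter-proximal category.
promoter_probes <- anno$Name[is_promoter_probe(anno$UCSC_RefGene_Group)]
```

With the real annotation object, the same function could be applied to annoObj$UCSC_RefGene_Group, and the resulting probe names intersected with your DMP list.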

Yes, I am using "IlluminaHumanMethylationEPICanno.ilm10b4.hg19".


Thank you!!!!!!!!! This is really helpful!
