Peak Calling on a single gene?
15 months ago
Maycon • 0

I have a dataset of human (cancer cell line) ChIP-seq runs for treatment and control groups, two replicates each, covering histone methylation and acetylation marks (8 experiments in total). I want to call peaks on this data, but I only really care about quantifying the signal at a single gene (human CXCL1) across the different marks and between the treatment and control groups. So I was wondering: does it make sense to map the reads (using BWA) to just the gene sequence before peak calling, or should I map them at least to the entire chromosome? Do I lose any information by mapping to the short single-gene sequence rather than to the entire chromosome or genome? Computationally it would be better to map them to the gene sequence instead of an entire genome/chromosome, so if that is OK, I would prefer to do so.

chip peak illumina calling chip-seq • 458 views

"Computationally it would be better to map them to the gene sequence instead of an entire genome/chromosome"

Using a reduced reference is not advisable when the data come from the whole genome. Aligners will try their best to place every read, so reads that did not originate from this region will likely be aligned to it anyway and can mess up your results.
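To make the alternative concrete, here is a minimal sketch of counting reads over one region after aligning to the whole genome. It reads plain SAM text and keeps only confidently placed primary alignments. The region coordinates, the `count_reads` helper and the MAPQ cutoff of 30 are assumptions for illustration, not values from this thread; in practice a dedicated tool would do this on the BAM file.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def ref_span(cigar):
    """Number of reference bases covered by an alignment (M, D, N, =, X)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def count_reads(sam_lines, chrom, start, end, min_mapq=30):
    """Count primary alignments overlapping chrom:start-end (1-based, inclusive).

    min_mapq=30 is an assumed cutoff, chosen only for illustration.
    """
    n = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, mapq, cigar = int(f[1]), f[2], int(f[3]), int(f[4]), f[5]
        if flag & 0x4 or cigar == "*":  # unmapped read
            continue
        if flag & 0x900:  # secondary (0x100) or supplementary (0x800)
            continue
        if rname != chrom or mapq < min_mapq:
            continue
        read_end = pos + ref_span(cigar) - 1
        if pos <= end and read_end >= start:
            n += 1
    return n


# Made-up alignments and a hypothetical region, NOT the real CXCL1 locus.
sam = [
    "@SQ\tSN:chr4\tLN:190214555",
    "r1\t0\tchr4\t1000\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr4\t1500\t0\t50M\t*\t0\t0\t*\t*",   # MAPQ 0: ambiguous placement
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",           # unmapped
    "r4\t16\tchr1\t1500\t60\t50M\t*\t0\t0\t*\t*",  # other chromosome
]
print(count_reads(sam, "chr4", 1040, 2000))
```

With the full genome as reference, reads from elsewhere either map to their true origin or receive a low mapping quality, so they can be excluded instead of being forced onto the gene.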

15 months ago
ATpoint 82k

Peak callers like MACS need a genome-wide distribution of reads, iirc, to properly build their background models so that the p-values are reliable. I would do the standard analysis against the full genome and then just filter for the region you want post hoc, after peak calling.
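The post hoc filtering step could look roughly like the sketch below. It assumes the peak caller wrote a tab-separated BED-like file (for example the 10-column narrowPeak layout) with 0-based, end-exclusive coordinates. The `Peak` type, the helper names and the region coordinates are hypothetical placeholders; look up the real CXCL1 coordinates for your genome build before using anything like this.

```python
from typing import Iterable, List, NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int          # 0-based, BED-style
    end: int            # exclusive
    fields: List[str]   # original columns, kept for writing back out


def parse_peaks(lines: Iterable[str]) -> List[Peak]:
    """Parse BED/narrowPeak text, skipping blank, comment and track lines."""
    peaks = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        cols = line.split("\t")
        peaks.append(Peak(cols[0], int(cols[1]), int(cols[2]), cols))
    return peaks


def peaks_in_region(peaks, chrom, start, end, flank=0):
    """Return peaks overlapping chrom:start-end, optionally widened by flank bp."""
    lo = max(0, start - flank)
    hi = end + flank
    return [p for p in peaks if p.chrom == chrom and p.start < hi and p.end > lo]


# Hypothetical region, NOT the real CXCL1 locus.
REGION = ("chr4", 1_000_000, 1_002_000)

# Made-up peaks in narrowPeak column order.
example = [
    "chr4\t999500\t1000100\tpeak_1\t250\t.\t8.1\t30.2\t27.5\t300",
    "chr4\t1500000\t1500400\tpeak_2\t90\t.\t4.0\t12.0\t10.1\t150",
    "chr1\t1000500\t1000900\tpeak_3\t120\t.\t5.2\t15.3\t13.0\t200",
]
hits = peaks_in_region(parse_peaks(example), *REGION)
for p in hits:
    print("\t".join(p.fields))
```

A `flank` of a few kilobases may be worth considering, since histone marks relevant to a gene often sit at the promoter or nearby regulatory elements rather than inside the gene body.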
