Question: ChIP-seq data analysis problems
gravatar for H.Chloe
14 months ago by
H.Chloe20 wrote:

Dear all.

I am new to Bioinformatics and tools for analysis.
I finished ChIP-seq data analysis in r according to "ChIP-seq analysis basics(Aleksandra P ekowska, Simon Anders)", but there is a problems in result graph.
In my visualization result, figures in peaks of input data for ChIP-seq are too high.
Results from ChIP-seq analysis basics shows low figures less than 10, but mine is more than average 100.

Here is my code.

input <- import.bed("chip.input.bed")
file1 <- import.bed("chip.wt.bed")

prepareCHIPseq = function(reads){
frag.len = median(estimate.mean.fraglen(reads), na.rm = T)
cat(paste0("Median fragment size for this library is ", round(frag.len)))
reads.extended = resize(reads, width = frag.len)

input = prepareCHIPseq(input)
file1 = prepareCHIPseq(file1)

genome <- BSgenome.Dmelanogaster.UCSC.dm6
si = seqinfo(genome)
si[c("chr2L", "chr2R")]

binsize = 200
bins = tileGenome(si[c("chr2R"),], tilewidth = binsize, 
     = T)
BinChIPseq = function(reads, bins){
                        mcols(bins)$score = countOverlaps(bins, reads)

input.200bins = BinChIPseq(input, bins)
file1.200bins = BinChIPseq(file1, bins)

mart = useMart(biomart = "ENSEMBL_MART_ENSEMBL",
            dataset = "dmelanogaster_gene_ensembl")
fm = Gviz:::.getBMFeatureMap()
fm["symbol"] = "external_gene_name"

g_start = 23945000
g_end = 23975000

bm = BiomartGeneRegionTrack(chromosome = "chr2R", genome = "dm6",
                         start = g_start, end = g_end, biomart = mart, 
                         size = 4, name = "RefSeq", utr5 = "red3", utr3 = "red3",
                         protein_coding = "black", col.line = NULL, cex = 7, 
                         featureMap = fm) # no filter option
AT = GenomeAxisTrack()

input.track = DataTrack(input.200bins,
                    strand="*", genome="dm6", col.histogram='gray',
                    fill.histogram='black', name="Input", col.axis="black",
                    cex.axis=0.4, ylim=c(0,150))
file1.track = DataTrack(file1.200bins,
                    strand="*", genome="dm6", col.histogram='steelblue',
                    fill.histogram='black', name="h3k9me3", col.axis='steelblue',
                    cex.axis=0.4, ylim=c(0,150))

peaks.file1 = import.bed("chip.peak_file1.bed")
peaks1.track = AnnotationTrack(peaks.file1,
                           genome="dm6", name='Peaks File. 1',

plotTracks(c(input.track, file1.track, peaks1.track, bm, AT),
                  from=g_start, to=g_end,
                  transcriptAnnotation="symbol", window="auto",
                  type="histogram", cex.title=0.7, fontsize=10 )

I am not sure original data(original data of input was bam file, but I changed bam file to bed file using bamtobed option in bedtools.) have quality problems or code above made this result.
Could anyone tell me what the problem of my input is?
Plus, BiomartGeneRegionTrack in Gviz had filter=list("with_ox_refseq_mrna"=TRUE), but "with_ox_refseq_mrna" is gone in latest version.
Then is it okay to remove this ChIP-seq data analysis?

Thanks in advance!

chip-seq R • 365 views
ADD COMMENTlink modified 14 months ago • written 14 months ago by H.Chloe20

You short try and put in some pictures to illustrate the problem. Currently, it is hard to figure out what the exact problem is.

ADD REPLYlink written 14 months ago by ATpoint38k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1647 users visited in the last hour