Question

ChIP-seq data analysis problems

1

Entering edit mode

4.8 years ago

H.Chloe ▴ 20

Dear all.

I am new to Bioinformatics and tools for analysis.
I finished ChIP-seq data analysis in r according to "ChIP-seq analysis basics(Aleksandra P ekowska, Simon Anders)", but there is a problems in result graph.
In my visualization result, figures in peaks of input data for ChIP-seq are too high.
Results from ChIP-seq analysis basics shows low figures less than 10, but mine is more than average 100.

Here is my code.

input <- import.bed("chip.input.bed")
file1 <- import.bed("chip.wt.bed")

library(chipseq)
prepareCHIPseq = function(reads){
frag.len = median(estimate.mean.fraglen(reads), na.rm = T)
cat(paste0("Median fragment size for this library is ", round(frag.len)))
reads.extended = resize(reads, width = frag.len)
return(trim(reads.extended))
}

input = prepareCHIPseq(input)
file1 = prepareCHIPseq(file1)

library(BSgenome)
library(BSgenome.Dmelanogaster.UCSC.dm6)
library(cluster.datasets)
genome <- BSgenome.Dmelanogaster.UCSC.dm6
si = seqinfo(genome)
si[c("chr2L", "chr2R")]

binsize = 200
bins = tileGenome(si[c("chr2R"),], tilewidth = binsize, 
              cut.last.tile.in.chrom = T)
BinChIPseq = function(reads, bins){
                        mcols(bins)$score = countOverlaps(bins, reads)
                        return(bins)
}

input.200bins = BinChIPseq(input, bins)
file1.200bins = BinChIPseq(file1, bins)

mart = useMart(biomart = "ENSEMBL_MART_ENSEMBL",
            dataset = "dmelanogaster_gene_ensembl")
fm = Gviz:::.getBMFeatureMap()
fm["symbol"] = "external_gene_name"

g_start = 23945000
g_end = 23975000

bm = BiomartGeneRegionTrack(chromosome = "chr2R", genome = "dm6",
                         start = g_start, end = g_end, biomart = mart, 
                         size = 4, name = "RefSeq", utr5 = "red3", utr3 = "red3",
                         protein_coding = "black", col.line = NULL, cex = 7, 
                         featureMap = fm) # no filter option
AT = GenomeAxisTrack()

input.track = DataTrack(input.200bins,
                    strand="*", genome="dm6", col.histogram='gray',
                    fill.histogram='black', name="Input", col.axis="black",
                    cex.axis=0.4, ylim=c(0,150))
file1.track = DataTrack(file1.200bins,
                    strand="*", genome="dm6", col.histogram='steelblue',
                    fill.histogram='black', name="h3k9me3", col.axis='steelblue',
                    cex.axis=0.4, ylim=c(0,150))

peaks.file1 = import.bed("chip.peak_file1.bed")
peaks1.track = AnnotationTrack(peaks.file1,
                           genome="dm6", name='Peaks File. 1',
                           chromosome='chr2R',
                           shape='box',fill='blue3',size=2)

plotTracks(c(input.track, file1.track, peaks1.track, bm, AT),
                  from=g_start, to=g_end,
                  transcriptAnnotation="symbol", window="auto",
                  type="histogram", cex.title=0.7, fontsize=10 )

I am not sure original data(original data of input was bam file, but I changed bam file to bed file using bamtobed option in bedtools.) have quality problems or code above made this result.
Could anyone tell me what the problem of my input is?
Plus, BiomartGeneRegionTrack in Gviz had filter=list("with_ox_refseq_mrna"=TRUE), but "with_ox_refseq_mrna" is gone in latest version.
Then is it okay to remove this ChIP-seq data analysis?

Thanks in advance!

ChIP-Seq r • 911 views

ADD COMMENT • link 4.8 years ago by H.Chloe ▴ 20

0

Entering edit mode

You short try and put in some pictures to illustrate the problem. Currently, it is hard to figure out what the exact problem is.

ADD REPLY • link 4.8 years ago by ATpoint 81k