Question: Converting .narrowPeak file to GRanges for use in Bioconductor ChIPseeker
0
gravatar for dally
3.7 years ago by
dally180
United States
dally180 wrote:

I have a .narrowPeak file of Pol IIthat has called peaks using MACS2 and has annotated the peaks using HOMER. I am attempting to visualize the peaks using the covplot command in the Bioconductor package "ChIPseeker". I've looked at the user manual and seen that the 'readpeakfile' command is what I'm looking for (I think) but i've been unable to get it to work. Could anyone help me on how the command line would look so that I can accomplish what I'm trying to do?

chipseeker chip-seq R • 4.2k views
ADD COMMENTlink modified 23 months ago by zhangly81130 • written 3.7 years ago by dally180

Could you give examples of how you've attempted to get it to work? i.e. post the specific series of commands you attempted (avoiding unwanted personal information).
From the manual, it looks like you can either import a peak file using readPeakFile (I suspect you need to specify as="GRanges"), or you can directly specify the peak file when invoking the covplot command.

Are you sure that your path to the peak file is correct?

The tricky bit is gettting the syntax right for specifying peakfile (as seen on page 24 in the readPeakFile section).

ADD REPLYlink written 3.7 years ago by Joseph Pearson430

require(ChIPseeker)

require(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

covplot(home/carlos/data/Pol-II-Chip.MACS2_peaks.narrowPeak, weightCol ="score", xlab = "Chromosome Size (bp)", ylab =" ", title = "Pol-II Peaks over Chromosomes", chrs = NULL, xlim = NULL)

 

So i'm pretty sure my problem is that I need to specify the narrowPeak file as an object which I am unsure how to do.

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by dally180

And what's the error that you get?


It looks like the parameters you are setting are correct (and in fact, are mostly the defaults, so you don't even need to set them). I'm wondering if it's the way you are specifying the path to the file is correct. In the manual, the examples use the "system.file" command, maybe try that and see what the exact format of the file path that is saved.

You might just need quotes around your file path?

Edit: I think that if you specified the path to your .narrowPeak file using file.path, that should work. Could be wrong.

system.file (as used in the ChIPseeker examples) seems to just return a character vector of the path.

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by Joseph Pearson430

Error is: "Error in covplot("home/carlos/data/Pol-II-Chip.MACS2_peaks.narrowPeak", : peak should be a GRanges object or a peak file ...

>

I'm pretty sure it's the way i'm specifying the path file ... i've tried setting 

peak <- system.file("home/data/Pol-II-Chip.MACS2_peaks.narrowPeak") but that doesn't seem to be working either so i'm probably using it wrong.

 

EDIT: Tried using path.file("home/data/Pol-II-Chip.MACS2_peaks.narrowPeak") and then running the script again but received the same error except now it says:

Error in covplot(peak, weightCol = "score", xlab = "Chromosome size (bp)", : peak should be a GRanges object or a peak file ...

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by dally180

I actually got it to work, but now whenever I attempt 

 

covplotpeakfile.gr, weightCol ="score", xlab = "Chromosome Size (bp)", ylab =" ", title = "Pol-II Peaks over Chromosomes", chrs = NULL, xlim = NULL)

 

I get the following:

Error in .noramrg_shift_or_weight(weight, "weight", x) : 'weight' must be a numeric vector, a single string, or a list-like object

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by dally180

I suspect that the weightCol parameter is supposed to specify the name of your GRanges metadata column, or the column name of your file.

1) Do you have a column in your file with the title "score"?

2) Have you tried importing your MACS data as a GenomicRanges, then running covplot with that object?

ADD REPLYlink written 3.7 years ago by Joseph Pearson430
3
gravatar for zhangly811
23 months ago by
zhangly81130
zhangly81130 wrote:

Install this package:

source("https://bioconductor.org/biocLite.R")
biocLite("rtracklayer")
library(rtracklayer)

And then load narrowPeak and the import command will do the conversion:

file_narrowPeak = "/Users/xxx/mypeak.narrowPeak" 
extraCols_narrowPeak <- c(signalValue = "numeric", pValue = "numeric",
                          qValue = "numeric", peak = "integer")

gr_narrowPeak <- import(file_narrowPeak, format = "BED",
                        extraCols = extraCols_narrowPeak)

The solution is adopted from https://charlesjb.github.io/How_to_import_narrowPeak/

ADD COMMENTlink modified 23 months ago • written 23 months ago by zhangly81130
0
gravatar for dally
3.7 years ago by
dally180
United States
dally180 wrote:

1) I have no column in my file with the title "score", but I do have one in my annotated .txt file with a column "Score" but for some reason the readPeakFile to convert to GRanges doesn't seem to work on .txt files.

 

2) I am running the covplot with only peakfile.gr) so lets see what happens. I will edit this once it's finished running.

EDIT: I ran the covplot data but without a weightCol all the peaks were of identical heights.

ADD COMMENTlink modified 3.7 years ago • written 3.7 years ago by dally180

Probably the simplest thing (in the short run) is to read in your .narrowPeak file as a data.frame, then construct a GenomicRanges object from it, with the score column specified as metadata. Then, run covplot on your GRanges, and specify the weight column by name.

ADD REPLYlink written 3.7 years ago by Joseph Pearson430

can't seem to construct the Genomic Ranges Object. I'm pretty bad at syntax in general.

 

I'm trying to follow this guide: http://davetang.org/muse/2013/01/02/iranges-and-genomicranges/

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by dally180
1

As a larf, imported a bed file (tab-delimited) with some additional columns into a data.frame.

My recommendation is to put a header line into your file, then import a data.frame as follows:

Because my bed file had headers, I put "header=TRUE"

peaks<- read.table("midEnriched.Full.txt",header=TRUE)

This makes a data.frame, where each column (variable) has a different name, which can be seen here when I type

> str(peaks)
'data.frame':	167 obs. of  8 variables:
 $ chrom : Factor w/ 5 levels "chr2L","chr2R",..: 5 3 3 5 3 3 1 4 3 4 ...
 $ left  : int  2690162 40353 26610 8190040 33472 1355033 18370795 8896863 12663057 8893104 ...
 $ right : int  2690879 40727 27041 8190186 33774 1355708 18371139 8897318 12663550 8893766 ...
 $ logFC : num  -11.9 -9.84 -9.7 -9.54 -9.24 ...
 $ logCPM: num  7.3 5.26 5.12 4.96 4.67 ...
 $ PValue: num  6.76e-21 1.11e-12 5.78e-12 2.06e-11 2.80e-09 ...
 $ FDR   : num  1.35e-16 7.42e-09 2.89e-08 8.23e-08 6.23e-06 ...
 $ peakID: Factor w/ 167 levels "sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_10041",..: 45 112 108 53 110 120 73 12 143 11 ...

To make a GRanges, I typed

myRanges<- GRanges(seqnames=peaks$chrom,ranges=IRanges(peaks$left,peaks$right),strand=NULL,peaks$logFC,peaks$logCPM,peaks$PValue,peaks$FDR,peaks$peakID)

Where I used the column name from each data frame variable (e.g. chromosomes are peaks$chrom , because that's what my column header name was). This generated a GRanges object, This is what it looks like:

> myRanges
GRanges object with 167 ranges and 5 metadata columns:
        seqnames               ranges strand   | peaks.logFC peaks.logCPM peaks.PValue  peaks.FDR
           <Rle>            <IRanges>  <Rle>   |   <numeric>    <numeric>    <numeric>  <numeric>
    [1]     chrX   [2690162, 2690879]      *   |  -11.902233     7.301598     6.76e-21   1.35e-16
    [2]    chr3L   [  40353,   40727]      *   |   -9.844267     5.255242     1.11e-12   7.42e-09
    [3]    chr3L   [  26610,   27041]      *   |   -9.702037     5.122502     5.78e-12   2.89e-08
    [4]     chrX   [8190040, 8190186]      *   |   -9.541360     4.958998     2.06e-11   8.23e-08
    [5]    chr3L   [  33472,   33774]      *   |   -9.239363     4.673488     2.80e-09   6.23e-06
    ...      ...                  ...    ... ...         ...          ...          ...        ...
  [163]    chr2R [20778828, 20779622]      *   |   -1.626363     6.466925  0.002774428 0.04910492
  [164]    chr2L [ 8842609,  8843061]      *   |   -1.802049     5.857315  0.002782288 0.04920049
  [165]    chr3R [ 6318490,  6318881]      *   |   -1.844687     5.630014  0.002795912 0.04931061
  [166]     chrX [ 8530879,  8531472]      *   |   -1.685482     6.171483  0.002822427 0.04969062
  [167]    chr2R [15682073, 15682610]      *   |   -2.013013     5.481337  0.002844105 0.04997572
                                         peaks.peakID
                                             <factor>
    [1] sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_13262
    [2]  sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_5843
    [3]  sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_5838
    [4] sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_13905
    [5]  sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_5840
    ...                                           ...
  [163]  sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_5747
  [164]  sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_1345
  [165] sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_10066
  [166] sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_13947
  [167]  sorted_68hr_sim_FAIRE_POOLED_MACS2_peak_5020
  -------
  seqinfo: 5 sequences from an unspecified genome; no seqlengths

 

 

ADD REPLYlink written 3.7 years ago by Joseph Pearson430
0
gravatar for Guangchuang Yu
3.7 years ago by
Guangchuang Yu2.2k
China/Guangzhou/Southern Medical University
Guangchuang Yu2.2k wrote:

If you looking at the source code from covplot:

    if (is(peak, "GRanges")) {
        peak.gr <- peak
    }
    else if (file.exists(peak)) {
        peak.gr <- readPeakFile(peak, as = "GRanges")
    }
    else {
        stop("peak should be a GRanges object or a peak file...")
    }

 

You will found that the file you passed is not correct, so that it reach 'stop' to throw error.

 

Make sure you are using the correct path and correct file name.

ADD COMMENTlink written 3.7 years ago by Guangchuang Yu2.2k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1466 users visited in the last hour