Hello all,
I'm having a bit of trouble with bioconductor. I have several different GRange objects (peak files from MACS, specifically) that I'd like to merge by performing a left join. Examples files look like this:
    > head(c8.grange)
  GRanges object with 6 ranges and 0 metadata columns:
                                 seqnames               ranges strand
                                    <Rle>            <IRanges>  <Rle>
  Grouped_C8_NoExt_WithD_cond1_1        1 [ 2215700,  2216098]      *
  Grouped_C8_NoExt_WithD_cond1_2        1 [ 2309602,  2309838]      *
  Grouped_C8_NoExt_WithD_cond1_3        1 [ 9958346,  9958709]      *
  Grouped_C8_NoExt_WithD_cond1_4        1 [18401454, 18401797]      *
  Grouped_C8_NoExt_WithD_cond1_5        1 [22603688, 22603920]      *
  Grouped_C8_NoExt_WithD_cond1_6        1 [23866394, 23866612]      *
  -------
  seqinfo: 29 sequences from an unspecified genome; no seqlengths
> head(v0.grange)
GRanges object with 6 ranges and 2 metadata columns:
                        seqnames         ranges strand |     score      fold
                           <Rle>      <IRanges>  <Rle> | <integer> <numeric>
  V0_Peaks_NoExt_peak_1        1 [11160, 11408]      * |        91   4.31469
  V0_Peaks_NoExt_peak_2        1 [13468, 13668]      * |        64   3.71956
  V0_Peaks_NoExt_peak_3        1 [16855, 17058]      * |        42   3.12443
  V0_Peaks_NoExt_peak_4        1 [27463, 27918]      * |       983  16.81243
  V0_Peaks_NoExt_peak_5        1 [28820, 29068]      * |        58   3.57078
  V0_Peaks_NoExt_peak_6        1 [36493, 37032]      * |       546  11.45626
  -------
  seqinfo: 457 sequences from an unspecified genome; no seqlengths
I'd like to:
- Find where the peaks overlap, and subsequently:
 - Merge based on that overlap.
 
So far:
> c8.v0<-findOverlapsOfPeaks(c8.grange,v0.grange,minoverlap = 200)
> overlaps<-c8.v0$peaklist[["c8.grange///v0.grange"]]
> overlaps
GRanges object with 475 ranges and 1 metadata column:
          seqnames               ranges strand |                                                                        peakNames
             <Rle>            <IRanges>  <Rle> |                                                                  <CharacterList>
    [1]          1 [ 2215669,  2216098]      * |      v0.grange__V0_Peaks_NoExt_peak_98,c8.grange__Grouped_C8_NoExt_WithD_cond1_1
    [2]          1 [18401238, 18402379]      * |    v0.grange__V0_Peaks_NoExt_peak_698a,c8.grange__Grouped_C8_NoExt_WithD_cond1_4
    [3]          1 [22603511, 22603993]      * |     v0.grange__V0_Peaks_NoExt_peak_784,c8.grange__Grouped_C8_NoExt_WithD_cond1_5
    [4]          1 [23866184, 23866789]      * |     v0.grange__V0_Peaks_NoExt_peak_816,c8.grange__Grouped_C8_NoExt_WithD_cond1_6
    [5]          1 [25970340, 25971358]      * |    v0.grange__V0_Peaks_NoExt_peak_887a,c8.grange__Grouped_C8_NoExt_WithD_cond1_7
    ...        ...                  ...    ... .                                                                              ...
  [471] KN150112.1       [ 1834,  2755]      * | v0.grange__V0_Peaks_NoExt_peak_42395,c8.grange__Grouped_C8_NoExt_WithD_cond1_607
  [472] KN150127.1       [26653, 27288]      * | v0.grange__V0_Peaks_NoExt_peak_42417,c8.grange__Grouped_C8_NoExt_WithD_cond1_608
  [473] KN150342.1       [24017, 24364]      * | c8.grange__Grouped_C8_NoExt_WithD_cond1_609,v0.grange__V0_Peaks_NoExt_peak_42582
  [474] KN150342.1       [24739, 25115]      * | c8.grange__Grouped_C8_NoExt_WithD_cond1_610,v0.grange__V0_Peaks_NoExt_peak_42583
  [475] KN150342.1       [25567, 25951]      * | c8.grange__Grouped_C8_NoExt_WithD_cond1_611,v0.grange__V0_Peaks_NoExt_peak_42584
  -------
  seqinfo: 457 sequences from an unspecified genome; no seqlengths
What I'm attempting to do is join several different files (v0,v1,v2,v3, etc.) to this singular base file (c8) to have a way to compare the peak scores for overlapping peaks from different samples. Outside of cutting up the peakNames string to try and do a merge that way, does anyone know of a base function within ChIPpeakAnno that can do this for me? Thanks!
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
 [1] dplyr_0.5.0          purrr_0.2.2          readr_1.0.0          tidyr_0.6.0          tibble_1.2           ggplot2_2.1.0        tidyverse_1.0.0      ChIPpeakAnno_3.6.5  
 [9] VennDiagram_1.6.17   futile.logger_1.4.3  GenomicRanges_1.24.3 GenomeInfoDb_1.8.7   Biostrings_2.40.2    XVector_0.12.1       IRanges_2.6.1        S4Vectors_0.10.3    
[17] BiocGenerics_0.18.0  BiocInstaller_1.22.3 gplots_3.0.1         edgeR_3.14.0         limma_3.28.21       
loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7                   lattice_0.20-34               GO.db_3.3.0                   Rsamtools_1.24.0              gtools_3.5.0                  assertthat_0.1               
 [7] digest_0.6.10                 mime_0.5                      R6_2.1.3                      plyr_1.8.4                    futile.options_1.0.0          RSQLite_1.0.0                
[13] httr_1.2.1                    zlibbioc_1.18.0               GenomicFeatures_1.24.5        lazyeval_0.2.0                gdata_2.17.0                  Matrix_1.2-7.1               
[19] splines_3.3.1                 BiocParallel_1.6.6            AnnotationHub_2.4.2           RCurl_1.95-4.8                biomaRt_2.28.0                munsell_0.4.3                
[25] shiny_0.14                    httpuv_1.3.3                  rtracklayer_1.32.2            multtest_2.28.0               htmltools_0.3.5               SummarizedExperiment_1.2.3   
[31] interactiveDisplayBase_1.10.3 idr_1.2                       matrixStats_0.50.2            XML_3.98-1.4                  GenomicAlignments_1.8.4       MASS_7.3-45                  
[37] bitops_1.0-6                  RBGL_1.48.1                   xtable_1.8-2                  gtable_0.2.0                  DBI_0.5-1                     magrittr_1.5                 
[43] scales_0.4.0                  graph_1.50.0                  KernSmooth_2.23-15            seqinr_3.3-1                  lambda.r_1.1.9                ensembldb_1.4.7              
[49] tools_3.3.1                   ade4_1.7-4                    BSgenome_1.40.1               Biobase_2.32.0                survival_2.39-5               AnnotationDbi_1.34.4         
[55] colorspace_1.2-6              regioneR_1.4.2                caTools_1.17.1                memoise_1.0.0
                    
                
                
You know how to find overlaps, so write a function to do the merging (since it's not entirely clear how you want that done).
mergeByOverlaps exists but it won't allow you to select for any particular type of overlap.
As mentionned by Devon Ryan if you want more specific merging a function build around findOverlaps will be the best option.