Long to wide segment file to plot copy number gain/loss frequency
3.7 years ago
yesitsjess • 0

I'd like to visualise some copy number segment data using copynumber::plotFreq or GenVisR::cnFreq.

I have a file with segment calls from a pipeline which is basically normalisation -> CBS -> GISTIC. Here's how it looks:

> head(seg, 5)
  sample chromosome    start      end nprobes   segmean
1    CL1          1    12412  1523277     257 0.9872586
2    CL1          1  1523278  1687515      68 1.1767436
3    CL1          1  1687516 20015742    1776 0.9631951
4    CL1          1 20015743 23470985     390 0.5212690
5    CL1          1 23470986 65964697    3506 0.9709709

The file has multiple samples and segment coordinates are not consistent across samples.

To use plotFreq and cnFreq I need the file to look more like this:

chromosome median.bp CL1  CL2  CL3  CL4  ...
1          767845    0.98 1.20 5.23 0.92 ...
1          1605396   1.17 #    #    #    ...

Any tips on how to go about this?

I've looked at the documentation for GenomicRanges::disjoin and GenomicRanges::reduce and I think they come close to doing what I want, but I can't work out how to apply them. Any help would be greatly appreciated.

next-gen sequencing copynumber segment CBS • 1.1k views
3.7 years ago

I have done this before, also for copy number data, and there are a few ways to do it. Logically, all that you need to do is:

  1. create a unique list of genomic regions
  2. obtain the segmean from each sample that overlaps each unique region

The main design choice is how much of each unique region a sample's segment must overlap before its segmean is assigned to that region.
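One way to make that choice explicit is to compute, for each region, the fraction of its length covered by a candidate segment and compare it to a cutoff. The base R sketch below is only an illustration: the helper name overlap_fraction, the 0.5 threshold, and the toy coordinates are all assumptions, and coordinates are taken to be 1-based and inclusive.

```r
# Hypothetical helper (not from any package): fraction of a region
# [region_start, region_end] covered by a segment [seg_start, seg_end].
# Assumes 1-based, inclusive coordinates on the same chromosome.
overlap_fraction <- function(region_start, region_end, seg_start, seg_end) {
  overlap_bp <- pmax(0, pmin(region_end, seg_end) - pmax(region_start, seg_start) + 1)
  overlap_bp / (region_end - region_start + 1)
}

# Assumed cutoff: a segment must cover at least half of the region
min_overlap <- 0.5

# One region (1001-2000) tested against three toy segments
frac <- overlap_fraction(1001, 2000,
                         seg_start = c(1, 1501, 1901),
                         seg_end   = c(1200, 3000, 5000))
keep <- frac >= min_overlap
print(data.frame(frac = frac, keep = keep))
```

A stricter cutoff leaves more regions without a value for some samples, while a looser one can assign a segmean to a region the segment only partly covers.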

The above can be done with GenomicRanges, standard R loops, or even in shell scripting via AWK, BEDTools, or something else.
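As a rough illustration of the two steps in plain R, here is a minimal sketch. It assumes the column names shown in the question (sample, chromosome, start, end, segmean), 1-based inclusive coordinates, and that segments within one sample do not overlap each other. The function name seg_to_wide and the toy data are made up for the example. Step 1 cuts each chromosome at every segment boundary, which is similar in spirit to what GenomicRanges::disjoin produces. Because a region built this way is never split by a segment boundary, each one lies either entirely inside a sample's segment or entirely outside it, so step 2 can use simple containment.

```r
# Toy long-format segments (values are illustrative, not real data)
seg <- data.frame(
  sample     = c("CL1", "CL1", "CL2", "CL2"),
  chromosome = c(1, 1, 1, 1),
  start      = c(1, 101, 1, 51),
  end        = c(100, 200, 50, 200),
  segmean    = c(0.98, 1.17, 1.20, 0.95)
)

# Hypothetical helper: long segment table -> wide region-by-sample table
seg_to_wide <- function(seg) {
  samples <- unique(as.character(seg$sample))
  per_chrom <- lapply(split(seg, seg$chromosome), function(chr_seg) {
    # Step 1: every segment start, and every position just past a segment
    # end, opens a new region; together they give disjoint unique regions
    breaks <- sort(unique(c(chr_seg$start, chr_seg$end + 1)))
    regions <- data.frame(
      chromosome = chr_seg$chromosome[1],
      start      = head(breaks, -1),
      end        = tail(breaks, -1) - 1
    )
    regions$median.bp <- floor((regions$start + regions$end) / 2)
    # Step 2: per sample, take the segmean of the segment containing each
    # region; NA where the sample has no segment covering it
    for (s in samples) {
      sample_seg <- chr_seg[chr_seg$sample == s, ]
      regions[[s]] <- vapply(seq_len(nrow(regions)), function(i) {
        hit <- which(sample_seg$start <= regions$start[i] &
                     sample_seg$end >= regions$end[i])
        if (length(hit) == 0) NA_real_ else sample_seg$segmean[hit[1]]
      }, numeric(1))
    }
    regions
  })
  wide <- do.call(rbind, per_chrom)
  rownames(wide) <- NULL
  wide[, c("chromosome", "median.bp", samples)]
}

wide <- seg_to_wide(seg)
print(wide)
```

Gaps that no sample covers come out as rows of NA and can be dropped afterwards. The nested loop is simple rather than fast, so for many samples a GenomicRanges-based overlap query is likely the better fit. Whether the resulting table matches the exact input layout expected by plotFreq or cnFreq should be checked against their documentation.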


