Question: Grange object for hi-c
0
gravatar for Siegfried
4 months ago by
Siegfried10
Siegfried10 wrote:

I would like to ask how to establish a grange object with double seqnames columns as well as start and end columns for strore interaction data?

R genome • 96 views
ADD COMMENTlink modified 4 months ago by benformatics1.6k • written 4 months ago by Siegfried10

Please be more specific by giving example data and more context. What files do you have and how is the output supposed to look like?

ADD REPLYlink written 4 months ago by ATpoint36k
0
gravatar for benformatics
4 months ago by
benformatics1.6k
ETH Zurich
benformatics1.6k wrote:

Assuming you are using R then you could use GenomicAlignments.

Example:

## import your reads
gr <- readGAlignmentPairs('your_hic_reads.bam')

## look at them
head(gr)
GAlignmentPairs object with 99420 pairs, strandMode=1, and 0 metadata columns:
            seqnames strand   :      ranges  --            ranges
               <Rle>  <Rle>   :   <IRanges>  --         <IRanges>
        [1]    chr2      *   :   5035-5184  --       32376-32457
        [2]     <NA>      *   :   5089-5122  --   2700963-2701112

## here you see that seqnames can either be a single value or <NA>. If it's a single value then the two reads are on the same chromosome, if it's NA then they are on different chromosomes

## let's subset and look at the different chrom
granges(gr[2],on.discordant.seqnames='split')
  [1]    chr1       55089-5122       +
  [2]    chr2       2700963-2701112       -

At this point you have would need to split the reads manually. The GRanges object cannot support two seqnames side-by-side (i.e. in "columns"), you could however convert each pairs of reads into a GRanges [see below] object and then merge all reads into a giant GRangesList. For the seqnames() call to return the two different chromosomes, you need to convert the alignment object to GRanges with on.discordant.seqnames='split'. Then we build a final data.frame to your specifications [note we no longer have GRanges]. Here is an example but it's not IMO the best way to do it - so I won't include suggest it.

## pull example discordant read
b <- gr[2]

## duplicate the read and then split them the two reads (R1 and R2)
bb <- granges(c(b,b),on.discordant.seqnames='split')

## remember which read is which
c('read1','read2') -> bb$read

## merge into giant data.frame
b.df <- cbind(data.frame(subset(bb,read == 'read1')),data.frame(subset(bb,read == 'read2')))

## check it
head(b.df)
  seqnames start  end width strand  read seqnames   start     end width strand
1    chr1  5089 5122    34      + read1    chr2 2700963 2701112   150      -
2    chr1  5089 5122    34      + read1    chr2 2700963 2701112   150      -
   read
1 read2
2 read2
ADD COMMENTlink modified 4 months ago • written 4 months ago by benformatics1.6k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 839 users visited in the last hour