Question: How to find Significant Trans- Interactions in HiC data
3.9 years ago by
komal.rathi3.4k
Children's Hospital of Philadelphia, Philadelphia, PA

Hi everyone,

I recently got some HiC data from GSE65126 (SRA experiment SRX851861) for mouse liver and mapped it to mm10. I am fairly new to handling HiC data, so please bear with me if I am not technically correct. I extracted the FASTQ files from the SRA file and used them as input to HiCUP, which uses Bowtie2 for alignment. HiCUP does not run Bowtie2 in paired-end mode; instead, it starts two independent Bowtie2 jobs (one for each file) and then pairs reads where both reads map uniquely to the genome. This produces a SAM/BAM file in which the two reads of a pair are on adjacent lines. The output BAM looks like this:

SRR1771322.4393742    163    chr1    3003916    42    75M    chr6    143502841    0    AACAGGGGTATGTCCCAGACACTGTGTAGCTTCTGCCTGCCCCAGAAGATGTGTCACTTCCTCAGTCTGCTTGTT    B@@FFFFF:DHDAFHJAHHGHIFIEHGIGIGGG@FHIGGIJIJIEG;FCGGIJIJIIJIJIIIAHEHGIIHHHC?    AS:i:0    XN:i:0    XM:i:0    XO:i:0    XG:i:0    NM:i:0    MD:Z:75    YT:Z:UU    CT:Z:TRANS

SRR1771322.4393742    83    chr6    143502841    42    32M    chr1    3003916    0    AAGCTTCATTTTGTGACTCGGAACACTTTCAG    IIIIIIIIIIHF?CA;<F<FHHHHDDDFF@@?    AS:i:0    XN:i:0    XM:i:0    XO:i:0    XG:i:0    NM:i:0    MD:Z:32    YT:Z:UU    CT:Z:TRANS

SRR1771322.178    115    chr6    15801748    42    75M    =    15512512    0    ATTTGCCTCATTATCCTGTAAAACTGTTTAACCAAGAGGCTTGTCTTATGCTTGAATATATCTTGCTATGATTTG    CGEHAEF=@<<EGCF?<DGIHHGHEHGCIIIHFEF?DCGHECBGHC<FA+FFC:HHAHGHEEHD?HBDD?DD@??    AS:i:-4    XN:i:0    XM:i:1    XO:i:0    XG:i:0    NM:i:1    MD:Z:46G28    YT:Z:UU    CT:Z:FAR

SRR1771322.178    179    chr6    15512512    42    64M    =    15801748    0    AAGCTTATACTAAACAAATCATCCAACAATGCCAACAAGAATATATATATATATTATGTAATAT    D?*DHHGGIIGGGIHFCC3EF<HBIIGGHABGHEJJHICEGGGIJJHEGBB?F>BDAEFFF@@@    AS:i:0    XN:i:0    XM:i:0    XO:i:0    XG:i:0    NM:i:0    MD:Z:64    YT:Z:UU    CT:Z:FAR

So the BAM file contains both cis and trans interactions. An '=' sign in the seventh column (RNEXT) means that the mate is on the same chromosome, i.e. cis. The last column (the CT:Z tag, e.g. CT:Z:TRANS or CT:Z:FAR) also states explicitly whether the pair is a trans interaction or a far/close cis interaction.
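To make the column layout above concrete, here is a minimal Python sketch that parses one record and classifies the pair as cis or trans. The function names `parse_sam_line` and `interaction_type` are made up for illustration, and the code assumes whitespace-separated fields with no spaces inside any field, as in the records shown above.

```python
def parse_sam_line(line):
    """Split one SAM alignment line into the fields used in this post."""
    # SAM is tab-delimited; splitting on any whitespace also handles the
    # space-separated rendering shown above (assumes no spaces inside fields).
    fields = line.split()
    # Optional tags have the form TAG:TYPE:VALUE, e.g. CT:Z:TRANS
    tags = {}
    for item in fields[11:]:
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return {
        "name": fields[0],
        "flag": int(fields[1]),
        "chrom": fields[2],        # column 3, RNAME
        "pos": int(fields[3]),     # column 4, POS (leftmost mapped base)
        "mate_chrom": fields[6],   # column 7, RNEXT ('=' means same chromosome)
        "mate_pos": int(fields[7]),
        "tags": tags,
    }


def interaction_type(rec):
    """Return 'cis' if both ends are on the same chromosome, else 'trans'."""
    mate = rec["mate_chrom"]
    return "cis" if mate in ("=", rec["chrom"]) else "trans"
```

For the first record shown above this would give `trans`, matching its `CT:Z:TRANS` tag, which is available as `rec["tags"]["CT"]`.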

Our lab is specifically interested in a region on chr2 that spans ~100 kb. We want to know which regions on other chromosomes (i.e. trans or inter-chromosomal interactions) interact with that particular region on chr2. I already have the data in the above format, where the third column (RNAME) is chr2 and the seventh column (RNEXT) is some other chromosome.
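The filtering step described here could be sketched roughly as follows. The region coordinates are placeholders (the actual interval is not given in this post), and `trans_partners` is a hypothetical helper name. The sketch relies on each record carrying its mate's chromosome and position, so only the read that falls inside the chr2 region needs to be inspected.

```python
# Placeholder coordinates: the real ~100 kb region is not stated in the post.
REGION_CHROM = "chr2"
REGION_START = 50_000_000
REGION_END = 50_100_000


def trans_partners(sam_lines, chrom=REGION_CHROM,
                   start=REGION_START, end=REGION_END):
    """Yield (partner_chrom, partner_pos) for trans pairs anchored in the region."""
    for line in sam_lines:
        # Skip SAM header lines and blank lines
        if line.startswith("@") or not line.strip():
            continue
        f = line.split()
        rname, pos, rnext, pnext = f[2], int(f[3]), f[6], int(f[7])
        # Keep only reads whose leftmost mapped base lies inside the region
        if rname != chrom or not (start <= pos <= end):
            continue
        # Drop cis pairs: '=' or the same chromosome name
        if rnext in ("=", chrom):
            continue
        yield rnext, pnext
```

The input can be any iterable of SAM text lines, for example the output of `samtools view` read from standard input. POS is the leftmost mapped base rather than the 5' end of the read, which should be close enough at the bin sizes typically used for trans contacts.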

So there may be multiple chromosomal positions (say on chr1, chr11, chr12, etc.) interacting with this region, but is there a way to find which regions have significantly more reads/coverage than the other regions that interact with it? I have tried GOTHiC, but it takes two input files whereas I only have one. Something simpler, like calculating coverage or depth, would also work.
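One simple way to frame the "coverage" idea is to count partner reads in fixed-size bins and compare each bin against a uniform background. The sketch below is only an illustration of that idea and is not GOTHiC's method. It assumes a 1 Mb bin size, a Poisson null in which the partner reads are spread evenly over all bins of the other chromosomes, and a Bonferroni correction. It ignores mappability, restriction-site density and other known Hi-C biases. `enriched_bins` and `poisson_sf` are hypothetical names, and `chrom_sizes` is assumed to hold the lengths of every chromosome except chr2.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by direct summation.

    Adequate for small per-bin counts; very small p-values round to 0.0.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # add P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def enriched_bins(partners, chrom_sizes, bin_size=1_000_000, alpha=0.05):
    """Return bins whose read count exceeds a uniform Poisson background.

    partners    : iterable of (chrom, pos) for the trans mates
    chrom_sizes : dict of chromosome lengths, excluding the anchor chromosome
    """
    counts = Counter((c, p // bin_size) for c, p in partners)
    n_bins = sum(math.ceil(size / bin_size) for size in chrom_sizes.values())
    total = sum(counts.values())
    lam = total / n_bins       # expected reads per bin under uniformity
    results = []
    for (chrom, b), k in counts.items():
        p = poisson_sf(k, lam)
        p_adj = min(1.0, p * n_bins)   # Bonferroni over all tested bins
        if p_adj < alpha:
            results.append((chrom, b * bin_size, (b + 1) * bin_size, k, p_adj))
    return sorted(results, key=lambda r: r[4])
```

Each returned tuple is `(chrom, bin_start, bin_end, read_count, adjusted_p)`. With sparse trans data a larger bin size may be needed before any bin stands out, and a background derived from each bin's genome-wide trans coverage would be a more realistic null than the uniform one assumed here.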

Thanks!

gothic hicup hic bowtie2 • 2.0k views
modified 3.7 years ago by Biostar • written 3.9 years ago by komal.rathi