5.8 years ago
fadhil.abubaker
I'm running intersectBed as follows:
intersectBed -a introns.bed -b accepted_hits.bam -wao > result.bed
Where introns.bed is a BED file containing all introns in hg38 and accepted_hits.bam is from STAR.
Here is a sample row from result.bed:
chr1 13220 14409 exon:NR_046018:3 . + refGene exon . ID=exon:NR_046018:3;Parent=NR_046018 chr1 13763 13864 SRR2149928_MCF10A_R1.17281814 0 - 101
In total there are 17 columns; I am a bit confused as to what the 15th and 17th columns represent, which in the example above have values of 0 and 101 respectively.
Can anyone guide me as to what they are?
The last column is the number of base pairs overlapping between the two features, triggered by the -wao option. The columns before that are simply the entire entries from -a and -b. Only the last column is appended by bedtools; the rest must already be present in your input files.
That definitely makes sense, although I'm a bit confused as to what the 15th column stands for. I understand it comes from the BAM file, but I'm not sure what it represents.
Ok, sorry, I misunderstood your initial question. This column is the 5th field of the BAM/SAM record, which bedtools reports as the BED score, and it indicates the mapping quality (MAPQ) of the read alignment. In this case it is 0, representing a read that aligned to multiple locations with equal score (a multimapper). This is not uncommon here: based on the coordinates, the read lies near the very start of chromosome 1, which is a repetitive (low-complexity) region.
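To make the column positions concrete, here is a small sketch that pulls the read name, MAPQ, and overlap out of the sample row from the question. The function name describe_row is just an illustrative choice, and the row is pasted here with spaces (the real result.bed is tab-separated, which awk handles the same way by default).

```shell
# Sample row from the question (whitespace-separated here for readability).
row='chr1 13220 14409 exon:NR_046018:3 . + refGene exon . ID=exon:NR_046018:3;Parent=NR_046018 chr1 13763 13864 SRR2149928_MCF10A_R1.17281814 0 - 101'

# Hypothetical helper: print the read name (column 14), the MAPQ that
# bedtools carries over from the BAM record (column 15), and the overlap
# in bp appended by -wao (column 17).
describe_row() {
  printf '%s\n' "$1" | awk '{ printf "read=%s mapq=%s overlap_bp=%s\n", $14, $15, $17 }'
}

describe_row "$row"
```

Because the read (13763 to 13864) lies entirely inside the -a feature (13220 to 14409), the overlap equals the read's aligned span, 13864 - 13763 = 101.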
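You should use the -split parameter with intersectBed, as this is RNA-seq: spliced reads span introns, and -split makes bedtools treat each aligned block of a read as a separate interval instead of counting the intron gap as overlap.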
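As a rough sketch, the original command with -split added could be wrapped like this. The wrapper name run_intersect_split and its argument order are made up for illustration; the flags themselves are the ones already discussed in this thread.

```shell
# Hypothetical wrapper around the command from the question, with -split added.
# $1 = BED file of features, $2 = BAM file of alignments, $3 = output file
run_intersect_split() {
  intersectBed -a "$1" -b "$2" -wao -split > "$3"
}

# Example call (requires bedtools on PATH and the input files to exist):
# run_intersect_split introns.bed accepted_hits.bam result.bed
```

The output layout stays the same, so the 15th column is still the MAPQ and the last column is still the overlap in bp.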