R: readAligned Only Junction Reads From BAM File
12.9 years ago

Hi,

I want to read a BAM file into R using the readAligned function from the ShortRead package. However, I am only interested in junction reads without indels, so I would like to filter on the CIGAR string and keep just the reads whose CIGAR contains only M and N operations. How can I do that? The only option I know of is the ScanBamParam argument "simpleCigar", which is not sufficient.

library("ShortRead")
param <- ScanBamParam(simpleCigar=FALSE)

This reads in both fully aligned reads and junction reads. I need a filter that reads in only the junction reads without indels.

readAligned(".", pattern="\\.bam$", type="BAM", param=param)

Thanks for helping me out!!

r bam cigar • 3.7k views
12.9 years ago

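You might take a look at readGappedAlignments in the GenomicRanges package (in current Bioconductor releases this is readGAlignments in the GenomicAlignments package). Once you have read in your alignments, call ngap() (since renamed njunc()) on the resulting object to get the number of gaps, i.e. N operations in the CIGAR, per read.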

12.9 years ago

In general, the fastest and most efficient way to filter SAM files is from the command line. To exclude insertions and deletions, keep only the alignments whose CIGAR string (column 6) contains neither I nor D, like so:

# copy the header lines
grep '^@' data.sam > small.sam
# append the alignments whose CIGAR (column 6) contains no I or D;
# header lines are skipped here so they are not written twice
awk -F'\t' '!/^@/ && $6 !~ /[ID]/' data.sam >> small.sam
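
The filter above only removes indels; it still keeps unspliced reads. A minimal sketch of a stricter filter for the original question (junction reads only, CIGAR made up of M and N operations and nothing else) could look like the following. The file names toy.sam and junctions.sam and the four reads are made up for illustration, and soft-clipped reads are dropped as well, which is an assumption about what "just M & Ns" should mean.

```shell
#!/bin/sh
# Build a tiny SAM file with made-up reads (hypothetical data, for illustration only).
printf '@HD\tVN:1.0\tSO:unsorted\n'                            >  toy.sam
printf '@SQ\tSN:chr1\tLN:1000\n'                               >> toy.sam
printf 'r1\t0\tchr1\t10\t60\t50M\t*\t0\t0\t*\t*\n'             >> toy.sam   # unspliced read
printf 'r2\t0\tchr1\t20\t60\t20M100N30M\t*\t0\t0\t*\t*\n'      >> toy.sam   # junction read
printf 'r3\t0\tchr1\t30\t60\t10M2I8M100N30M\t*\t0\t0\t*\t*\n'  >> toy.sam   # junction + insertion
printf 'r4\t0\tchr1\t40\t60\t5S20M100N25M\t*\t0\t0\t*\t*\n'    >> toy.sam   # junction + soft clip

# Keep header lines, plus alignments whose CIGAR (column 6)
#   1. consists only of M and N operations, and
#   2. contains at least one N (i.e. spans a junction).
awk -F'\t' '/^@/ || ($6 ~ /^([0-9]+[MN])+$/ && $6 ~ /N/)' toy.sam > junctions.sam
```

For a BAM file, the same awk expression can sit in a pipeline between "samtools view -h in.bam" and "samtools view -b -o out.bam -", assuming samtools is installed; the filtered BAM can then be read into R as usual.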