Question: R: readAligned Only Junction Reads From BAM File
7.8 years ago, lydia.herzel wrote:

Hi,

I want to read a BAM file into R using the readAligned function from the ShortRead package. However, I am only interested in junction reads without indels, and would like to filter on the CIGAR string so that only reads whose CIGAR contains nothing but M and N operations are kept. How can I do that? The only option I am aware of is the ScanBamParam argument "simpleCigar", which is not sufficient.

library("ShortRead")
param <- ScanBamParam(simpleCigar=FALSE)

This reads in both fully aligned reads and junction reads. I would need a filter that reads in only junction reads without indels.

readAligned(".", pattern="\\.bam$", type="BAM", param=param)

Thanks for helping me out!!

Tags: R, bam, cigar
7.8 years ago, Sean Davis (National Institutes of Health, Bethesda, MD) wrote:

You might take a look at readGappedAlignments in the GenomicRanges package. Once you read in your sequences, call ngap() on the resulting object to get the number of gaps per read.

7.8 years ago, Istvan Albert (University Park, USA) wrote:

In general, if you want to filter SAM files, the fastest and most efficient way is to filter from the command line. To exclude insertions and deletions, keep only the alignments whose CIGAR string contains neither I nor D, like so:

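# copy the header lines
grep '^@' data.sam > small.sam
# append alignments whose CIGAR string (field 6) contains no I or D;
# header lines are skipped here so they are not written twice
awk -F'\t' '!/^@/ && $6 !~ /[ID]/' data.sam >> small.sam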
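The filter above removes indels but still keeps unspliced reads. To get closer to what the question asks for (CIGAR strings made up only of M and N, with at least one N), the same idea could be tightened as in the rough sketch below. The file names toy.sam and junctions.sam and the four toy alignments are made up for illustration. The pattern also rejects soft- or hard-clipped reads (S, H), which may or may not be what you want.

```shell
# Toy SAM file (made-up data): two header lines and four alignments.
{
  printf '@HD\tVN:1.0\tSO:unsorted\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf 'r1\t0\tchr1\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\n'           # unspliced
  printf 'r2\t0\tchr1\t1\t60\t5M100N5M\t*\t0\t0\tACGTACGTAC\t*\n'      # junction read
  printf 'r3\t0\tchr1\t1\t60\t3M1I2M100N4M\t*\t0\t0\tACGTACGTAC\t*\n'  # junction + insertion
  printf 'r4\t0\tchr1\t1\t60\t3M1D2M100N5M\t*\t0\t0\tACGTACGTAC\t*\n'  # junction + deletion
} > toy.sam

# Keep header lines, plus alignments whose CIGAR (field 6) consists only of
# M and N operations and contains at least one N (a spliced alignment).
awk -F'\t' '
  /^@/ { print; next }
  $6 ~ /^([0-9]+[MN])+$/ && $6 ~ /N/ { print }
' toy.sam > junctions.sam

# List the names and CIGAR strings of the reads that survived the filter.
grep -v '^@' junctions.sam | cut -f1,6
```

With the toy data only r2 should survive. For a BAM file, the same awk program could presumably sit between `samtools view -h` (BAM to SAM with header) and `samtools view -b` (SAM back to BAM), so that the filtered file can then be read into R as usual.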