Soft-clipping read ends based on read group
2.5 years ago
Martyna • 0

Is it possible to clip (soft-clip preferably) n (for example, 3) nucleotides from both ends of reads in a bam file, but only for the reads with a certain defined read group?

I have merged BAMs for ancient DNA samples, and the BAMs that went into the merge come from both UDG-treated and non-UDG-treated libraries. I would like to clip 3 nt from the ends of only those reads that come from non-UDG-treated libraries. These are marked by a library label in the read group header: @RG LB:**dmg
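
For reference, a minimal sketch of how the read group IDs behind such a label could be looked up. It assumes that LB:**dmg means the library name ends in "dmg" and that the header text was obtained with samtools view -H; the function name read_groups_with_library_suffix is made up for illustration.

```python
def read_groups_with_library_suffix(header_text, suffix="dmg"):
    """Return the ID of every @RG header line whose LB value ends with `suffix`.

    `header_text` is the plain-text SAM header, e.g. the output of
    `samtools view -H merged.bam`. Matching on the end of LB is an
    assumption about how the libraries were named.
    """
    ids = []
    for line in header_text.splitlines():
        if not line.startswith("@RG\t"):
            continue  # only read group lines are of interest
        # Header fields look like TAG:value and are tab-separated
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "ID" in fields and fields.get("LB", "").endswith(suffix):
            ids.append(fields["ID"])
    return ids
```

The returned IDs are what matters for the filtering step, because samtools view selects reads by read group ID (the RG tag on each read), not by the LB field.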

As a result, I would like to have the same BAM as output, but with the "dmg"-labelled reads clipped.

I hoped that GATK ClipReads or perhaps the trimBam function of bamUtil would allow this, but I haven't found any clue as to whether and how it could be done.

Any tips appreciated.

Cheers, Martyna

2.5 years ago

Split the BAM per read group (samtools view --read-group xxx), clip the reads with one of the tools you mention above, then merge the groups back together (samtools merge).
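
The three steps could be scripted roughly as below. This is a sketch under assumptions, not a tested pipeline: it assumes the input is coordinate-sorted, that rg_ids_file lists one read group ID per line, that bamUtil is installed as "bam", and that your trimBam version accepts --clip for soft-clipping (check the trimBam usage message). It uses samtools view -R together with -U so the selected reads and the remainder are written in one pass. build_clip_commands, run_all and the intermediate file names are illustrative.

```python
import subprocess


def build_clip_commands(in_bam, rg_ids_file, out_bam, n=3, prefix="tmp"):
    """Build the split / clip / merge command lines without running them."""
    dmg = f"{prefix}.dmg.bam"            # reads from the listed read groups
    rest = f"{prefix}.rest.bam"          # everything else, left untouched
    clipped = f"{prefix}.dmg.clipped.bam"
    resorted = f"{prefix}.dmg.clipped.sorted.bam"
    return [
        # -R: keep reads whose RG is listed in the file; -U: write the others to `rest`
        ["samtools", "view", "-b", "-R", rg_ids_file, "-U", rest, "-o", dmg, in_bam],
        # Assumption: --clip asks trimBam to soft-clip instead of masking with N
        ["bam", "trimBam", dmg, clipped, str(n), "--clip"],
        # Soft-clipping the left end moves POS, so re-sort before merging
        ["samtools", "sort", "-o", resorted, clipped],
        # -c / -p: combine identical @RG / @PG headers instead of suffixing their IDs
        ["samtools", "merge", "-c", "-p", out_bam, rest, resorted],
    ]


def run_all(commands):
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Printing the list returned by build_clip_commands("merged.bam", "dmg_ids.txt", "merged.clipped.bam") before passing it to run_all is a cheap way to check the commands against your installed tool versions.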

2.5 years ago

I think you could use a multistep approach: split your BAM file by read group with

samtools split

then trim the relevant read groups with either ClipReads or bamUtil trimBam (soft-clipping is supported), and finally merge the BAM files back together with

samtools merge
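
To make concrete what soft-clipping n bases from both ends does to a record, here is a small pure-Python sketch that rewrites POS and CIGAR. It is an illustration only and not how trimBam or ClipReads are implemented; counting existing soft clips toward n and dropping a deletion that ends up at the edge of the alignment are my assumptions, and soft_clip is a hypothetical helper.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_OPS = set("MDN=X")  # operations that consume reference bases


def _clip_left(ops, n):
    """Soft-clip the first n read bases; return (new_ops, reference_shift)."""
    ops = list(ops)
    # A leading hard clip stays where it is
    head = [ops.pop(0)] if ops and ops[0][1] == "H" else []
    clipped = shift = 0
    while ops:
        length, op = ops[0]
        if op == "S":                      # existing soft clip is absorbed
            clipped += length
            ops.pop(0)
        elif op in "DNP":                  # cannot sit at the alignment edge
            shift += length if op in REF_OPS else 0
            ops.pop(0)
        elif clipped >= n or op == "H":    # done, or reached a trailing hard clip
            break
        else:                              # M, I, =, X consume read bases
            take = min(length, n - clipped)
            clipped += take
            if op in REF_OPS:
                shift += take
            if take == length:
                ops.pop(0)
            else:
                ops[0] = (length - take, op)
    soft = [(clipped, "S")] if clipped else []
    return head + soft + ops, shift


def soft_clip(pos, cigar, n):
    """Return (new_pos, new_cigar) after soft-clipping n bases from both ends."""
    ops = [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]
    ops, shift = _clip_left(ops, n)
    # Clip the right end by reversing; POS is unaffected by the right side
    ops, _ = _clip_left(ops[::-1], n)
    return pos + shift, "".join(f"{length}{op}" for length, op in ops[::-1])
```

For example, soft_clip(100, "10M", 3) returns (103, "3S4M3S"): the read sequence is unchanged, but the first aligned base moves three positions to the right. That shift is why a file clipped this way may need re-sorting before it is merged.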

Pierre Lindenbaum beat me to it


samtools split is new to me!


@Pierre Lindenbaum, @Istvan Albert, thanks a lot! My brain had somehow fixated on trying to do this in a single step.
