Question: Set bam read group from existing tag
3
gravatar for jbalberge
12 months ago by
jbalberge150
France/Nantes
jbalberge150 wrote:

In bam file from single-cell RNA-seq experiments processed with 10X's cellranger 3.0 pipeline, the @RG:SM tag is set by sample while the @CB is computed from cell barcode.

$ samtools view possorted_genome_bam.bam | head -n 1
> NS500651:49:"RUN":4:12405:6951:17414  256 1   11256   0   97M1S   *   0   0   "SEQ" "qual"
> NH:i:5    HI:i:2  AS:i:83 nM:i:6  RE:A:I  li:i:0  BC:Z:ATCTTTAG   QT:Z:A/AAAAEE   CR:Z:GAACGGACAGACGCAA   CY:Z:/AAAAEAEEEEEEEEE   CB:Z:GAACGGACAGACGCAA-1 UR:Z:TAGCTGTAAT UY:Z:EEEEEEEEEE UB:Z:TAGCTGTAAT RG:Z:MYSAMPLE:0:1:"RUN":4

For cell-specific genotyping, it would make more sense to treat the @CB barcode as individual's identity.

Is there an efficient way to move the @CB tag values to @RG tag ? This way, each unique @RG defines a cellular barcode.

Currently I do:

(1) split the bam by @CB tag (thanks to https://www.biostars.org/p/263346/#263348)

samtools view -@ 32 <input.bam> | cut -f 12- | tr "\t" "\n" | grep  "^CB:Z:" | cut -d ':' -f 3 | sort | uniq | while read S; do samtools view -h <input.bam> | awk -v tag="CB:Z:$S" '($0 ~ /^@/ || index($0,tag)>0)' > out/${S}.sam ; done

(2) edit @RG SM based on file name with Picard/AddOrReplaceReadGroups

for s in $(ls out/*.sam); do SM=$(basename $s .sam) ; java -jar /sandbox/apps/bioinfo/binaries/picard/2.19.0/picard.jar AddOrReplaceReadGroups I=$s O=out-RGSM/${SM}.bam RGID=1 RGLB=0.1 RGSM=${SM} RGPU=MYSAMPLE RGPL=ILLUMINA; done
samtools merge -r - out-RGSM/*.bam | samtools view -o merged-RGSM.bam -
samtools index merged-RGSM.bam

(3) call variants per sample

bcftools mpileup -r chr22 -Ou -f GRCh38.fasta merged-RGSM.bam | bcftools call -mv -Ob -o calls-RGSM.bcf

This is very inefficient and memory/disk-consuming. How could I improve this process by either (1) properly replace @RG:SM with CB or to run (2) bcftools mpileup with CB instead of RG:SM?

Thanks, JB

rna-seq bcftools bam samtools • 517 views
ADD COMMENTlink modified 12 months ago • written 12 months ago by jbalberge150

Hi JB, I am dealing with a similar problem and using your solution, thanks! .. but did you ( or somebody else) find a better solution for this? Cheers, Raúl

ADD REPLYlink written 4 months ago by rulicosentino0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1805 users visited in the last hour
_