Bam File Barcodes and UMIs not Formatted Correctly For Variant Calling
3.0 years ago
d • 0

I am working with BAM files from single-cell RNA sequencing data that store cell barcodes under the tag XX and UMIs under the tag SS. I am trying to use these BAM files as input to several variant-calling workflows (freebayes+vartrix, cellsnp-lite, etc.), but the outputs of all of them are empty and contain no variants. I know these workflows work with BAM files output by Cell Ranger, so I am wondering whether the problem is the way my BAM file is formatted. Since Cell Ranger uses the CB tag for barcodes, how can I change my XX tag to CB? Or is there another workaround?

vcf bam scRNAseq variant • 2.0k views

You may want to try the --cellTAG and --UMItag options of cellsnp-lite, which set the tag used for the cell barcode and the tag used for the UMI (see the full list of parameters with cellsnp-lite -h).

For example, cellsnp-lite <other options> --cellTAG XX --UMItag SS

3.0 years ago

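If you are comfortable with Python, a quick script with pysam should sort this out:

import pysam

# Open the input BAM and create an output BAM that reuses its header
inbam = pysam.AlignmentFile("input_file.bam", "rb")
outbam = pysam.AlignmentFile("output_bam_file.bam", "wb", template=inbam)

# until_eof=True reads every record in file order, so no index is needed
for read in inbam.fetch(until_eof=True):

    try:
        cell_barcode = read.get_tag("XX")
    except KeyError:
        # No XX tag: skip the read. To keep such reads without a CB tag
        # instead, call outbam.write(read) before the continue.
        continue

    # Copy the barcode into the CB tag that Cell Ranger-style tools expect
    read.set_tag("CB", cell_barcode)
    outbam.write(read)

outbam.close()
inbam.close()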

You could also use sed:

$ samtools view -h input_file.bam | sed 's/XX:Z/CB:Z/g' | samtools view -b - > output_bam_file.bam

However, this relies on the string XX:Z not appearing anywhere else in the file other than in the tags. This is likely true, but it's just possible it could appear in a read name.
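If that caveat is a worry, the same text-level rename can be restricted to the optional tag columns. The following is only a minimal sketch, not something from the tools above: the helper names rename_tags and filter_stream are made up for illustration, and the SS to UB mapping is an assumption based on Cell Ranger using UB for its UMI tag. It relies on the SAM layout, where header lines start with @ and the first 11 tab-separated columns are mandatory fields, so read names and quality strings are never touched.

```python
import io


def rename_tags(sam_line, mapping):
    """Rename optional-field tags in one SAM text line (no trailing newline).

    Header lines and the 11 mandatory columns are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line  # header line: leave as is
    fields = sam_line.split("\t")
    # Columns 12 onwards are optional fields of the form TAG:TYPE:VALUE
    for i in range(11, len(fields)):
        tag = fields[i][:2]
        if tag in mapping and fields[i][2:3] == ":":
            fields[i] = mapping[tag] + fields[i][2:]
    return "\t".join(fields)


def filter_stream(src, dst, mapping):
    """Copy SAM text from src to dst, renaming tags on alignment lines."""
    for line in src:
        dst.write(rename_tags(line.rstrip("\n"), mapping) + "\n")


# Small demonstration on an in-memory SAM record whose read name
# happens to contain the string "XX:Z".
demo = io.StringIO(
    "readXX:Z:1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\t"
    "XX:Z:AAAC\tSS:Z:GGTT\n"
)
result = io.StringIO()
filter_stream(demo, result, {"XX": "CB", "SS": "UB"})
print(result.getvalue(), end="")
```

To use it on a real file, one option would be to call filter_stream(sys.stdin, sys.stdout, ...) in a small script and place that script where sed sits in the pipeline above, between samtools view -h and samtools view -b.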
