samtools 1.17 has difficulty parsing a filter expression. Advice?
3 days ago
RNAseqer ▴ 270

I am working with some BAM files generated from single-cell sequencing. As part of a Python script I need to run samtools version 1.17, filtering on several tags, ideally in a single command line. So far I have:

cmd_final = f"samtools view -@ {cpus} -O BAM -e 'exists([CB]) && exists([UB]) && [CB]!=\"-\" && [UB]!=\"-\"' temp_filtered.bam > {te_bam}"

I have also tried:

cmd_final = f"samtools view -h -@ {cpus} -O BAM -e 'exists(CB) && exists(UB) && CB!=\"-\" && UB!=\"-\"' temp_filtered.bam > {output}"

but both result in the error:

ÄE::sam_passes_filterÅ Couldn't process filter expression

And I've tried many other command-line variations, but I just can't seem to get it right.

I have checked my input file temp_filtered.bam and it DOES contain reads with these tags and they are NOT malformed. It seems like there are some subtle and important differences in the many versions of samtools' commandlines, which complicates my task. Could anyone point out what I'm doing wrong and suggest a fix? I would greatly, GREATLY appreciate the help, as this problem has taken entirely too much of my time and I'm getting quite frustrated with it.

samtools cell rnaseq single • 287 views

Despite your explanation, I still lean towards a data integrity issue (and/or an encoding issue; I've seen this ÄE pop up before but can't recall why or when).

Can you check data integrity, or use a different file for testing?

3 days ago
aw7 ▴ 340

I tried this on the command line:

samtools view -e 'exists([CB]) && exists([UB]) && [CB]!="-" && [UB]!="-"' my.sam -o new.bam

It worked without complaint (admittedly for v1.21).

If you just print out cmd_final what does it look like?
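A minimal sketch of that check, in case it helps. The values for cpus and te_bam are placeholders I made up, not taken from the original script. Printing the repr() as well makes any stray or non-ASCII characters in the string visible:

```python
# Placeholder values (assumptions, not from the original post)
cpus = 4
te_bam = "te_filtered.bam"

cmd_final = (
    f"samtools view -@ {cpus} -O BAM "
    f"-e 'exists([CB]) && exists([UB]) && [CB]!=\"-\" && [UB]!=\"-\"' "
    f"temp_filtered.bam > {te_bam}"
)

# What the shell will actually receive
print(cmd_final)
# repr() exposes hidden or non-ASCII characters as escape sequences
print(repr(cmd_final))
# True only if every character in the command is plain ASCII
print(cmd_final.isascii())
```

If the printed command matches what works when typed by hand, the Python quoting is probably not the culprit.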


samtools quickcheck -v returned no warnings, and I have tried a few different inputs and got the same message.

While I'm not sure how it could be related, I noticed my UTF-8 encoding had been turned off. I reset my terminal and tried the code again, and your suggestion worked (albeit with the addition of escape backslashes before the double quotes, since I was running this inside a Python script).
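One way to sidestep the quote escaping altogether would be to pass the arguments as a list to subprocess.run, so that no shell parses the expression. This is only a sketch, assuming samtools is on PATH; the function names and the use of -o for the output file (instead of a shell redirect) are my own choices, not part of the script above:

```python
import subprocess

# Single-quoted Python string, so the double quotes need no backslashes
FILTER_EXPR = 'exists([CB]) && exists([UB]) && [CB]!="-" && [UB]!="-"'


def build_filter_cmd(in_bam, out_bam, cpus=1):
    """Return the samtools argument list (hypothetical helper)."""
    return [
        "samtools", "view",
        "-@", str(cpus),
        "-O", "BAM",
        "-e", FILTER_EXPR,
        "-o", out_bam,   # write output directly rather than via '>'
        in_bam,
    ]


def run_filter(in_bam, out_bam, cpus=1):
    """Run the filter; raises CalledProcessError if samtools fails."""
    subprocess.run(build_filter_cmd(in_bam, out_bam, cpus), check=True)


if __name__ == "__main__":
    # Show the argument list without running samtools
    print(build_filter_cmd("temp_filtered.bam", "te_filtered.bam", cpus=4))
```

Because each list element reaches samtools as one argument, the filter expression needs neither the outer single quotes nor escaped double quotes.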

I much appreciate the feedback!
