Extracting reads from PacBio Sequel BAM by read name
3.2 years ago
roryC • 0

Hi,

Does anyone know of a good way to filter reads from a PacBio BAM file (the Sequel data format)? I have identified some contaminant reads that I would like to remove before mapping the BAM file to my draft assembly with pbalign. I want to use Arrow, so I think it is important to keep the original BAM format for mapping (at least, I get errors from Arrow after converting to FASTA and mapping that with pbalign).

Picard FilterSamReads doesn't seem to work with the PacBio format, and neither does pysam.

Cheers!

BAM Pacbio Sequel Arrow pbalign • 1.3k views

Hello,

"Picard FilterSamReads doesn't seem to work with the PacBio format, and neither does pysam."

"doesn't work" is never a good description of the problem. Please give us more details.

fin swimmer

3.2 years ago
roryC • 0

Thanks,

On further inspection, the problem is the lack of an SM tag in the PacBio BAM header, which Picard expects: https://github.com/PacificBiosciences/blasr/issues/212

The workaround for now appears to be running Picard with "VALIDATION_STRINGENCY=LENIENT".
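
For anyone landing here later, this is a rough sketch of how that Picard call could be assembled and launched from Python. The file names (subreads.bam, contaminants.txt), the picard.jar location, and the helper names build_filter_command and run_filter are placeholders made up for illustration. It assumes the legacy KEY=VALUE argument style of Picard and a text file with one read name per line. I have not checked whether the filtered BAM keeps everything Arrow needs, so treat it as a starting point rather than a tested recipe.

```python
import shlex
import subprocess


def build_filter_command(in_bam, out_bam, read_list, picard_jar="picard.jar"):
    """Build a Picard FilterSamReads command that drops the listed reads.

    read_list is assumed to be a text file with one read name per line.
    picard_jar is a placeholder path; adjust it to your installation.
    """
    return [
        "java", "-jar", str(picard_jar), "FilterSamReads",
        f"I={in_bam}",
        f"O={out_bam}",
        f"READ_LIST_FILE={read_list}",
        # Exclude (rather than keep) the reads named in the list.
        "FILTER=excludeReadList",
        # Downgrade header validation errors (e.g. the missing SM tag) to warnings.
        "VALIDATION_STRINGENCY=LENIENT",
    ]


def run_filter(in_bam, out_bam, read_list, picard_jar="picard.jar"):
    """Run Picard; raises subprocess.CalledProcessError on a non-zero exit."""
    cmd = build_filter_command(in_bam, out_bam, read_list, picard_jar)
    subprocess.run(cmd, check=True)


# Print the command for inspection before actually running anything.
cmd = build_filter_command("subreads.bam", "subreads.filtered.bam", "contaminants.txt")
print(shlex.join(cmd))
```

Calling run_filter with the same arguments would execute the printed command; it needs Java and a Picard jar available on the machine.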
