BLASR Clipping for Pacbio Subreads
8.9 years ago
xianfan.jhu ▴ 40

I used BLASR to align PacBio raw reads in HDF5 format to the human reference. I believe these are long reads. To separate them into subreads, I chose the option "-clipping subread". However, the alignment didn't separate the subreads into different read names. Any suggestions on how to use this option? Also, is there any other tool that has this split function? I know that BLASR's pls2fasta can do this, but I cannot use it right now because the hdf5 package hasn't been installed. Any help will be appreciated. Thanks.

alignment • 3.6k views
8.9 years ago
mpauper ▴ 20

Hello Xianfan.

Certainly not an expert on this, but I'll try to help you with what I have figured out. If something is wrong, I hope someone corrects me.

It is not entirely clear from your question what you are trying to achieve. A PacBio raw read (a.k.a. polymerase read) is the continuous sequenced stretch and still contains the SMRTbell adapters. Each ZMW produces one polymerase read.

If you want to split the polymerase reads into subreads, you can use the PFilter module from smrtanalysis. Otherwise, if you provide raw polymerase reads to BLASR, it will split them into subreads by default, which you can see from the read ID in the output: <movie>/<ZMW>/X_Y

  • <movie> should not concern you right now, and should be the same for all reads
  • <ZMW> corresponds to the well on the SMRTcell where this specific molecule was sequenced
  • X, Y correspond to the coordinates on the polymerase read that define the start and end of the subread
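To check whether your alignment output really contains split subreads, you could parse the read IDs and count how many distinct subreads each ZMW has. Below is a minimal Python sketch, assuming the IDs follow the <movie>/<ZMW>/X_Y layout described above. The function names and the example IDs are made up for illustration and are not part of BLASR or smrtanalysis.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class SubreadId(NamedTuple):
    movie: str
    zmw: int
    start: int  # X: subread start on the polymerase read
    end: int    # Y: subread end on the polymerase read


def parse_subread_id(read_id: str) -> SubreadId:
    """Parse an ID of the form <movie>/<ZMW>/X_Y (assumed layout).

    Only the first three '/'-separated fields are used, so any extra
    trailing fields are ignored. Raises ValueError on other layouts.
    """
    movie, zmw, span = read_id.split("/")[:3]
    start, end = span.split("_")
    return SubreadId(movie, int(zmw), int(start), int(end))


def subreads_per_zmw(read_ids: Iterable[str]) -> Counter:
    """Count distinct subreads (unique X_Y spans) per (movie, ZMW)."""
    unique = {parse_subread_id(rid) for rid in read_ids}
    return Counter((s.movie, s.zmw) for s in unique)


if __name__ == "__main__":
    # Hypothetical IDs, not taken from real data.
    ids = ["movie1/42/100_2500", "movie1/42/2550_5000", "movie1/7/0_3000"]
    for (movie, zmw), n in sorted(subreads_per_zmw(ids).items()):
        print(movie, zmw, n)
```

If a ZMW shows up with more than one X_Y span, the polymerase read from that well was split into several subreads.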

Now, concerning the -clipping option in BLASR: it defines how BLASR reports the parts of a subread that do not map contiguously to the reference. For more information on clipping, please refer to the SAM specification.
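To make the clipping idea concrete: in the SAM specification, the CIGAR operation S marks soft-clipped bases (still present in the SEQ field) and H marks hard-clipped bases (removed from SEQ). Here is a small Python sketch that counts both from a CIGAR string. The function name and the example CIGAR strings are my own, just for illustration.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_bases(cigar: str) -> dict:
    """Return the number of soft- and hard-clipped bases in a CIGAR string."""
    counts = {"soft": 0, "hard": 0}
    for length, op in CIGAR_ELEMENT.findall(cigar):
        if op == "S":
            counts["soft"] += int(length)
        elif op == "H":
            counts["hard"] += int(length)
    return counts


if __name__ == "__main__":
    # Hypothetical CIGAR strings.
    print(clipped_bases("5S90M5S"))
    print(clipped_bases("10H80M"))
```

Either way, the clipped bases are the unaligned ends of the read, which is a different thing from splitting a polymerase read into subreads.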

8.9 years ago
thackl ★ 3.0k

dextract from the DEXTRACTOR package very efficiently converts .bas.h5 files to FASTA/FASTQ and, in the process, splits reads into subreads.

BLASR splits reads into subreads by default, unless you explicitly specify -noSplitSubreads.

And, as mpauper mentioned, -clipping controls the SAM output and not the subread splitting behaviour.
