I have been using fastx_barcode_splitter to demultiplex my reads. Today I found that some of the reads did not match any of the barcodes we used in the experiment. On closer inspection, these reads are left unsorted because there is at least one extra base at the beginning of the read.
Example FASTA sequence:
>HWI-ST863:238:C20G3ACXX:4:1204:18858:57161 1:N:0:AAACAAAA
TACTTACCTACTTCCGCTGGTCATCCTGCGCCAATTTGATGTGTGTGGTTTTTAATTGAGCTGTATAATCTGTTTATTTTGAGGCCAAAAAAAAAAAA
This read should count as a match, but it is not sorted into the corresponding barcode file.
The command I use is the following:
cat <file_name> | fastx_barcode_splitter.pl --bcfile mybarcodes.txt --bol --mismatches 3 --prefix code_ --suffix "_1" > code_1.stats
I tried the --partial option, but it was very slow (I almost had to kill the process) and it did not noticeably improve the barcode splitting.
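To make clear what kind of matching I am after, here is a minimal Python sketch. It is not how fastx_barcode_splitter works internally; it only illustrates the idea of trying the barcode at the read start and at a few shifted positions. The function names, the max_offset and max_mismatches parameters, and the barcodes in EXAMPLE_BARCODES are all made up for illustration and are not the barcodes from my experiment.

```python
# Hypothetical placeholder barcodes, NOT the ones used in the real experiment.
EXAMPLE_BARCODES = {"bc1": "ACTTACC", "bc2": "GGTCATC"}


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read, barcodes, max_offset=2, max_mismatches=1):
    """Return (barcode_name, offset) for the best match near the read start.

    Each barcode is compared against the read at offsets 0..max_offset,
    so a read with a few extra leading bases can still be assigned.
    Returns None if nothing matches within max_mismatches, or if two
    different barcodes tie for the best score (ambiguous read).
    """
    best = None        # (mismatches, offset, name) of the best hit so far
    ambiguous = False
    for offset in range(max_offset + 1):
        for name, bc in barcodes.items():
            window = read[offset:offset + len(bc)]
            if len(window) < len(bc):
                continue  # read too short to hold the barcode at this offset
            mm = hamming(window, bc)
            if mm > max_mismatches:
                continue
            if best is None or mm < best[0]:
                best = (mm, offset, name)
                ambiguous = False
            elif mm == best[0] and name != best[2]:
                ambiguous = True
    if best is None or ambiguous:
        return None
    return best[2], best[1]
```

With the placeholder barcodes above, assign_barcode("TACTTACCTACTTCC", EXAMPLE_BARCODES) returns ("bc1", 1): the barcode is found after skipping one leading base. A read with no match within the allowed mismatches returns None.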
Can someone help me understand whether there is a better way to handle this? Is there any other splitter that is easy to use and can easily be integrated into an existing pipeline?