Background: I have completed adapter trimming and checked QC on Illumina NextSeq miRNA single-end reads of length 75 bp. I want to run umi_tools to extract the UMI information before I align the reads to the reference, but umi_tools extract fails with an error. This is the command I ran:
umi_tools extract --stdin=XYZ_R1-trim.fastq.gz --bc-pattern=NNNNNNNNNNNN -L XYZ-extract.log --stdout=XYZ-UMIextracted.fastq.gz
I have a 12 bp UMI, and I suspect the pattern could be the culprit. I am new to UMI analysis; could anyone please share some insight into what the mistake is? Error message seen on screen:
Traceback (most recent call last):
  File "/home/xyz/.local/bin/umi_tools", line 11, in <module>
    sys.exit(main())
  File "/home/xyz/.local/lib/python2.7/site-packages/umi_tools/umi_tools.py", line 57, in main
    module.main(sys.argv)
  File "/home/xyz/.local/lib/python2.7/site-packages/umi_tools/extract.py", line 330, in main
    new_read = ReadExtractor(read)
  File "/home/xyz/.local/lib/python2.7/site-packages/umi_tools/umi_methods.py", line 971, in __call__
    umi_values = self.getBarcodes(read1, read2)
  File "/home/xyz/.local/lib/python2.7/site-packages/umi_tools/umi_methods.py", line 726, in _getBarcodesString
    umi_quals = [bc_qual1[x] for x in self.umi_bases]
IndexError: string index out of range
Also, am I supposed to run the whitelist command before extract? This is not single-cell RNA-seq data, so I omitted that step.
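One check that might help narrow this down: the IndexError in the traceback is raised while indexing the quality string at the UMI positions, which would be consistent with some trimmed reads being shorter than the 12 positions in the pattern (adapter trimming of miRNA reads can leave very short or empty reads). That is an assumption, not a confirmed diagnosis. Below is a minimal sketch, assuming Python 3 and a standard four-line-per-record gzipped FASTQ; the function name count_short_reads is made up for illustration and is not part of umi_tools.

```python
import gzip
from itertools import islice


def count_short_reads(fastq_path, umi_len=12):
    """Return (total_reads, short_reads) for a gzipped FASTQ file.

    A read counts as short when its sequence or quality string has
    fewer than umi_len characters. Hypothetical helper, not a
    umi_tools function.
    """
    total = 0
    short = 0
    with gzip.open(fastq_path, "rt") as handle:
        while True:
            # One FASTQ record: header, sequence, '+', qualities
            record = list(islice(handle, 4))
            if len(record) < 4:
                break
            total += 1
            seq = record[1].rstrip("\r\n")
            qual = record[3].rstrip("\r\n")
            if len(seq) < umi_len or len(qual) < umi_len:
                short += 1
    return total, short
```

Calling count_short_reads("XYZ_R1-trim.fastq.gz") returns the total number of reads and the number shorter than 12 bp. If the second number is nonzero, discarding reads below a minimum length at the trimming step and rerunning extract may be worth trying.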