I'm using BBMap (mapPacBio) to map cDNA reads from a nanopore sequencer. I got the results below, and I'm a little confused about how I can have a 75% mapped rate but an NA match rate for reads, with 86%+ for every error type. There are also <0.0065% perfect/semiperfect sites and <0.4% N.
I might be misunderstanding what these statistics mean. Is this because long reads are more likely to contain at least one error, and the read-level match rate would only count perfect matches?
Any help would be wonderful!
Thanks!
(I also used usemodulo due to memory limitations on the computer I have access to at home during lockdown. A fuller explanation of what this is would also be greatly appreciated!)
chris@chris-Virtual-Machine:~/data$ bbmap/mapPacBio.sh ref=genomic.fna in=bc01.fastq outm=bbout/C1_R1.sam outu=bbout/C1_R1_unmapped.sam -Xmx19g -da usemodulo=t qin=33 maxlen=5000
java -da -Xmx19g -cp /home/chris/data/bbmap/current/ align2.BBMapPacBio build=1 overwrite=true minratio=0.40 fastareadlen=6000 ambiguous=best minscaf=100 startpad=10000 stoppad=10000 midpad=6000 ref=genomic.fna in=bc01.fastq outm=bbout/C1_R1.sam outu=bbout/C1_R1_unmapped.sam -Xmx19g -da usemodulo=t qin=33 maxlen=5000
Executing align2.BBMapPacBio [build=1, overwrite=true, minratio=0.40, fastareadlen=6000, ambiguous=best, minscaf=100, startpad=10000, stoppad=10000, midpad=6000, ref=genomic.fna, in=bc01.fastq, outm=bbout/C1_R1.sam, outu=bbout/C1_R1_unmapped.sam, -Xmx19g, -da, usemodulo=t, qin=33, maxlen=5000]
Version 38.84
Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.400
Retaining first best site only for ambiguous mappings.
NOTE: Ignoring reference file because it already appears to have been processed.
NOTE: If you wish to regenerate the index, please manually delete ref/genome/1/summary.txt
Set genome to 1
Loaded Reference: 41.380 seconds.
Loading index for chunk 1-7, build 1
Generated Index: 3.107 seconds.
Analyzed Index: 2.173 seconds.
Started output stream: 0.273 seconds.
Started output stream: 0.006 seconds.
Cleared Memory: 0.138 seconds.
Processing reads in single-ended mode.
Started read stream.
Started 12 mapping threads.
Detecting finished threads: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
------------------ Results ------------------
Genome: 1
Key Length: 12
Max Indel: 100
Minimum Score Ratio: 0.4
Mapping Mode: normal
Reads Used: 130792 (246593288 bases)
Mapping: 13642.747 seconds.
Reads/sec: 9.59
kBases/sec: 18.08
Read 1 data:          pct reads    num reads    pct bases    num bases
mapped:                74.7026%        97705     78.9886%    194780668
unambiguous:           35.9250%        46987     45.4884%    112171292
ambiguous:             38.7776%        50718     33.5003%     82609376
low-Q discards:         0.1820%          238      0.0147%        36209
perfect best site:      0.0054%            7      0.0001%          303
semiperfect site:       0.0054%            7      0.0001%          303
Match Rate:                  NA           NA     77.7629%    166855321
Error Rate:            86.0866%        97698     22.1700%     47570066
Sub Rate:              86.0699%        97679      5.9545%     12776573
Del Rate:              85.9985%        97598      9.2225%     19788575
Ins Rate:              85.9818%        97579      6.9930%     15004918
N Rate:                 0.3622%          411      0.0670%       143856
Total time: 13690.052 seconds.
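
For anyone checking the arithmetic, here is a minimal Python sketch that recomputes the base-level percentages from the counts in the results table. It assumes the denominator is the sum of the match, sub, del, ins and N base counts, which is my guess rather than something taken from the BBMap documentation. The variable names are my own.

```python
# Counts copied from the "num bases" and "num reads" columns of the results above.
base_counts = {
    "match": 166_855_321,
    "sub": 12_776_573,
    "del": 19_788_575,
    "ins": 15_004_918,
    "n": 143_856,
}

# Assumption (not confirmed by the BBMap docs): the per-base denominator
# is the sum of all five categories.
total_bases = sum(base_counts.values())
base_pct = {name: 100 * count / total_bases for name, count in base_counts.items()}

# Error rate taken as substitutions + deletions + insertions.
base_pct["error"] = base_pct["sub"] + base_pct["del"] + base_pct["ins"]

# Read-level counts: mapped reads versus reads reported under "Error Rate".
mapped_reads = 97_705
reads_with_error = 97_698
error_free_reads = mapped_reads - reads_with_error

for name, pct in base_pct.items():
    print(f"{name:>5}: {pct:.4f}%")
print(f"mapped reads with no error: {error_free_reads}")
```

Under that assumption the recomputed values agree with the pct bases column (77.7629% match, 22.1700% error), and 97705 mapped reads minus 97698 reads under "Error Rate" leaves 7, the same as the perfect best site count. That would fit the read-level error columns counting reads that contain at least one error of that type, though I have not confirmed this.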