11.7 years ago
GPR
Hello, I ran Bowtie with the following command line:
bowtie -p 16 -q -n 3 -k 1 -m 1 --best --strata -S BowtieIndex *.fastq >& output.sam &
When running CleanSam.jar to remove overhanging reads, I get the following error: "Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields". I do not get this error when I align my reads with BWA or TopHat. Any suggestions on how to fix this one? Thanks, G.
Have you tried skimming through some of your output.sam file to see if something looks blatantly wrong?
I did actually, nothing that caught my eye. This is a snapshot:
HWI-ST974:67:C0545ACXX:2:1101:14321:25195 0 chr2 242170185 255 76M * 0 0 CCATGATTTTGCGAATGGCTTTGCCGCGGGCACCAATGATGCGGGCGTGAACGCGGTGGTCCAGCGGGACGTCCTC CCCFFFFFHHHHHIJJIGIIJFIHIIIIGIIJIEHIFHHHHHFFDBB;<B?C?BDD8B@-:@CDCB5;@;9;5?@@ XA:i:0 MD:Z:76 NM:i:0
When I try to convert the same SAM file to BAM with "samtools view -bT genome.fa input.sam > output.bam", I get the following error message: "reference 'HWI-ST974:67:C0545ACXX:2:1101:11263:2144' is recognized as '*'. Parse error at line 4698: unmatched CIGAR operation"
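Since both tools complain about specific lines, it may be quicker to scan the whole file than to skim it by eye. Below is a minimal Python sketch that lists every non-header line with fewer than the 11 mandatory tab-separated SAM fields. The function name `find_short_lines` and the tiny in-memory sample are made up for illustration; point it at your own output.sam instead.

```python
import io

MIN_SAM_FIELDS = 11  # a SAM alignment line has 11 mandatory tab-separated fields


def find_short_lines(handle, min_fields=MIN_SAM_FIELDS):
    """Yield (line_number, field_count) for non-header lines with too few fields."""
    for line_number, line in enumerate(handle, start=1):
        if line.startswith("@"):
            continue  # header lines follow different rules, skip them
        field_count = len(line.rstrip("\n").split("\t"))
        if field_count < min_fields:
            yield line_number, field_count


if __name__ == "__main__":
    # Hypothetical sample: one header line, one complete record, one broken line.
    good = "\t".join(
        ["read1", "0", "chr2", "100", "255", "4M", "*", "0", "0", "ACGT", "IIII"]
    )
    sample = "@HD\tVN:1.0\n" + good + "\n" + "read2\t0\tchr2\n"
    for line_number, field_count in find_short_lines(io.StringIO(sample)):
        print(f"line {line_number}: only {field_count} fields")
```

For a real file, replace the sample with `open("output.sam")` and look at the reported lines directly; anything that is not an alignment record will show up here.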
The "something is recognized as '*'" error in samtools happens when the reference you pass in (in your case, genome.fa) doesn't include all the reference names that the SAM file input.sam lists.
It looks like you mistakenly aligned the reads to another set of reads, as the reference-name "HWI-ST974:67:C0545ACXX:2:1101:11263:2144" is a standard name for a read and not a chromosome or anything else.
Looking at your original command, you missed inserting the reference. Here's the fixed command:
bowtie -p 16 -q -n 3 -k 1 -m 1 --best --strata -S genome.fa -1 first_reads.fastq -2 second_reads.fastq > output.sam &
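To see which reference names trigger the "recognized as '*'" message, you could compare the third column of the SAM file against the sequence names in the FASTA. This is only a rough Python sketch: the helper names `fasta_names` and `unknown_references` are hypothetical, and it assumes the sequence name is the FASTA header text up to the first whitespace.

```python
import io


def fasta_names(handle):
    """Collect sequence names: header text after '>' up to the first whitespace."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def unknown_references(sam_handle, known_names):
    """Return RNAME values (SAM column 3) that are not in known_names."""
    unknown = set()
    for line in sam_handle:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # too short to have a reference column
        rname = fields[2]
        if rname != "*" and rname not in known_names:
            unknown.add(rname)
    return unknown


if __name__ == "__main__":
    # Hypothetical in-memory stand-ins for genome.fa and input.sam.
    fasta = io.StringIO(">chr1 some description\nACGT\n>chr2\nACGT\n")
    sam = io.StringIO(
        "readA\t0\tchr2\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
        "readB\t0\tHWI-ST974:1\t5\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    )
    print(sorted(unknown_references(sam, fasta_names(fasta))))
```

If the names it prints look like read identifiers rather than chromosomes, the alignment was run against the wrong target.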
If I'm not mistaken, --best, --strata and -k 1 don't really make that much sense together. -k 1 just reports one alignment, while --strata tries to report all alignments that fall into the best stratum. What are you trying to achieve?
Incidentally, this usage does make sense. The key is the -m 1 (which supersedes -k 1), saying that reads with more than 1 match are not reported. --best and --strata control the matching criteria: --strata says that to count as a match, the alternate match must be in the same alignment stratum.
The relevant snippet from the bowtie documentation is: 'Intuitively, the -m option, when combined with the --best and --strata options, guarantees a principled, though weaker form of "uniqueness." A stronger form of uniqueness is enforced when -m is specified but --best and --strata are not.'
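The difference between the two forms of uniqueness can be shown with a toy model. The Python sketch below is not bowtie's actual code: it assumes each candidate alignment of a read is summarised by a single mismatch count and treats that count as its stratum, which is a simplification. The function name `is_reported` is made up for illustration.

```python
def is_reported(mismatch_counts, m=1, best_strata=True):
    """Toy model of -m: report a read only if at most m alignments count.

    mismatch_counts: one mismatch count per candidate alignment of the read
    (assumption: the mismatch count stands in for the alignment stratum).
    best_strata=True mimics --best --strata: only alignments in the best
    stratum count against the limit. best_strata=False counts all of them.
    """
    if not mismatch_counts:
        return False  # nothing aligned, so nothing to report
    if best_strata:
        best = min(mismatch_counts)
        counted = [c for c in mismatch_counts if c == best]
    else:
        counted = list(mismatch_counts)
    return len(counted) <= m


if __name__ == "__main__":
    # One perfect hit plus one 2-mismatch hit elsewhere.
    hits = [0, 2]
    print(is_reported(hits, best_strata=True))   # weaker uniqueness
    print(is_reported(hits, best_strata=False))  # stronger uniqueness
```

In this model a read with one perfect hit and a worse second hit is kept under --best --strata -m 1, but dropped when -m 1 is used alone, which matches the weaker versus stronger uniqueness described in the quote.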