Question: Bowtie Output "Not Enough Fields"
7.2 years ago, GPR wrote:

Hello, I ran Bowtie with the following command line: 'bowtie -p 16 -q -n 3 -k 1 -m 1 --best --strata -S BowtieIndex *.fastq >& output.sam &'

When running CleanSam.jar to remove overhanging reads, I get the following error: "Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields"

I do not get this error when I align my reads with BWA or TopHat. Any suggestions on how to fix this one? Thanks, G.


Have you tried skimming through some of your output.sam file to see if something looks blatantly wrong?

written 7.2 years ago by Steve Lianoglou
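Skimming a large file by eye can miss a single bad record, so it may help to let a tool do the scan. The SAM format requires at least 11 tab-separated fields on every alignment line (header lines start with '@'), so a short awk filter can list the lines that fall short. This is only a minimal sketch; the function name find_short_sam_lines is made up for illustration.

```shell
# Hypothetical helper: report alignment lines that have fewer than the
# 11 mandatory tab-separated SAM fields. Header lines ('@...') are skipped.
find_short_sam_lines() {
    awk -F'\t' '!/^@/ && NF < 11 { print NR ": " NF " fields" }' "$1"
}

# Usage (file name taken from the question):
#   find_short_sam_lines output.sam
```

Each line of output gives the line number in the file and the number of fields found there, which makes it easy to jump to the offending record with an editor or with sed.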

I did actually, nothing that caught my eye. This is a snapshot:

HWI-ST974:67:C0545ACXX:2:1101:14321:25195 0 chr2 242170185 255 76M * 0 0 CCATGATTTTGCGAATGGCTTTGCCGCGGGCACCAATGATGCGGGCGTGAACGCGGTGGTCCAGCGGGACGTCCTC CCCFFFFFHHHHHIJJIGIIJFIHIIIIGIIJIEHIFHHHHHFFDBB;<B?C?BDD8B@-:@CDCB5;@;9;5?@@ XA:i:0 MD:Z:76 NM:i:0


written 7.2 years ago by GPR

When I try to convert the same SAM file to BAM with "samtools view -bT genome.fa input.sam > output.bam", I get the following error message: "reference 'HWI-ST974:67:C0545ACXX:2:1101:11263:2144' is recognized as '*'. Parse error at line 4698: unmatched CIGAR operation"

written 7.2 years ago by GPR
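One possible cause worth ruling out, though nothing in this thread confirms it: the '>&' in the original command sends both standard output and standard error to output.sam, so any messages the aligner prints to standard error (summary lines, warnings) end up mixed in with the alignment records. The sketch below illustrates the difference with a made-up stand-in function, fake_aligner, rather than with bowtie itself; the file names are likewise only illustrative.

```shell
# Hypothetical stand-in for an aligner: one SAM-like record on stdout,
# one status message on stderr.
fake_aligner() {
    printf 'r1\t0\tchr2\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
    echo '# reads processed: 1' >&2
}

workdir=$(mktemp -d)

# Merged: stdout and stderr land in the same file (the effect of '>&').
fake_aligner > "$workdir/merged.sam" 2>&1

# Separated: alignments in one file, messages in a log.
fake_aligner > "$workdir/clean.sam" 2> "$workdir/aligner.log"
```

Here merged.sam contains the status message alongside the record, while clean.sam contains only the record. Applied to the command from the question, the same idea would be to replace '>& output.sam' with '> output.sam 2> bowtie.log'.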

The "something is recognized as *"-error in samtools happens when you use a genome-reference (in your case, genome.fa) that doesn't include all the references that the sam-file input.sam lists.

It looks like you mistakenly aligned the reads to another set of reads, as the reference-name "HWI-ST974:67:C0545ACXX:2:1101:11263:2144" is a standard name for a read and not a chromosome or anything else.

Looking at your original command, you missed inserting the reference. Here's the fixed command:

bowtie -p 16 -q -n 3 -k 1 -m 1 --best --strata -S genome.fa -1 first_reads.fastq -2 second_reads.fastq > output.sam &

If I'm not mistaken, --best, --strata and -k 1 don't really make that much sense together. -k 1 just reports one alignment, while --strata tries to report all alignments that fall into the best stratum. What are you trying to achieve?

written 7.2 years ago by Philipp Bayer

Incidentally, this usage does make sense. The key is the -m 1 (which supersedes -k 1), saying that reads with more than 1 match are not reported. --best and --strata control the matching criteria: --strata says that to count as a match, the alternate match must be in the same alignment stratum.

The relevant snippet from the bowtie documentation is: 'Intuitively, the -m option, when combined with the --best and --strata options, guarantees a principled, though weaker form of "uniqueness." A stronger form of uniqueness is enforced when -m is specified but --best and --strata are not.'

modified 7.2 years ago • written 7.2 years ago by matted