Hi all, I am having a problem assembling genomes with ABySS. The reads came straight from the MiSeq as fastq.gz, went through Trimmomatic and then into ABySS, so I'm not sure where an error might have crept in (especially as other genomes prepared in the same way have assembled well). I am getting the following error:
`/home/jf0781/AbyssGen2/TrimmedGen2/1Gen2Crop.p2.fastq': discarded 1 reads containing non-ACGT characters
snp-0.fa:0: warning: file is empty
snp-1.fa:0: warning: file is empty
snp-2.fa:0: warning: file is empty
snp-3.fa:0: warning: file is empty
snp-4.fa:0: warning: file is empty
snp-5.fa:0: warning: file is empty
snp-6.fa:0: warning: file is empty
snp-7.fa:0: warning: file is empty
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
Mateless 0
Unaligned 0
Singleton 1 2.94e-05%
FR 8 0.000235%
RF 0
FF 104219 3.06%
Different 3299742 96.9%
Total 3403970
abyss-fixmate: error: The mate pairs of this library are oriented forward-forward (FF), which is not supported by ABySS.
make: *** [Gen2-3.dist] Error 1
make: *** Deleting file `Gen2-3.dist'
The input read files for the assembly look like this (first few lines of each):
1Gen2Crop.p1.fastq
@DHWCT801_420_H3CYJBCXX_1_1101_1212_2153/1
TAATTTTTTAGTACTTTTATTATACAGTAAAATCTGGTTATGTCCAAATTGAGGGGACTGAATAAAAACTCCGACATAGGAGATCGGAAGAGCACACGTCTACTCCAGTCACCGCTATATC
+
IIIIIIIIIIGIIIIIIIIIIIIGIIIGIIIIIIIIGIIIIIIIIGGGIIIGIGGGAGGIIIIIIIIIIGGG<GIIIIIIIIIGIIIIIIIIIIIGGGIIIIIIIIIGIIIIIIGIIGGGG
@DHWCT801_420_H3CYJBCXX_1_1101_1443_2066/1
AGAAACAAAGTTAAATTGCCACCTAATAAAAAAAGATATGAACATGAAACTTGACATACATATAAATCTCAAGTTTCTTATTCAGAAAAATATCATCTTAATCTATAAAGTTATGTCAAAG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIG
@DHWCT801_420_H3CYJBCXX_1_1101_1376_2151/1
AGTCGGAATCGGTGTAGGAGTTGGATTTTTTTAATAACTTATGGTCGGAATGTAAAAGGAATTTTGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGCTATATCTCGTATGCCGTC
1Gen2Crop.p2.fastq
@DHWCT801_420_H3CYJBCXX_1_1101_1212_2153/2
AATTTGGACATAACCAGATTTTACTGTATAATAAAAGTACTAAAAAATTATTTATAAAAATATATAATGAAGTAATAAATAGATCGGAAGAGCGTCGTGTAGGGAAAGAGT
+
IIIIIIGIIIGIIIIIGGIIIGGIIIIIIIIIGIIIIIIIIIGGIGGIIIIGIIGIIIIIGIIGIGGGGIGGGGIGIIIIIIIII.GGGGIGGGAGAAGGIIGGGIGGGGG
@DHWCT801_420_H3CYJBCXX_1_1101_1443_2066/2
CTTATCATCAGNATGAGGTCTTTGACATAACTTTATAGATTAAGATGATATTTTTCTGAATAAGAAACTTGAGATTTATATGTATGTCAAGTTTCATGTTCATATCTTTTTTTATTAGGTG
+
IIIIIIIIIII#<GGIIIGIIIIIIIIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIIIG<GGGGGGIIIIIGGIIIIIIIIIIIIIIIIIIGGIIIIIIIIIIIIIIIIIIIIIIIIGGGA
@DHWCT801_420_H3CYJBCXX_1_1101_1376_2151/2
TATTAAAAAAATCCAACTCCTACACCGATTCCGACTCCTATAACTTACAAAATTCCTCTTAGACTCAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCG
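One sanity check on input like this is whether the two files still list the same reads in the same order, since the names above differ only in the /1 and /2 suffix. Below is a minimal Python sketch of such a check, assuming uncompressed FASTQ with exactly four lines per record and the /1, /2 naming shown above. The function names are hypothetical and are not part of ABySS or Trimmomatic.

```python
import itertools
import sys


def read_names(path):
    """Yield each read name from a 4-line-per-record FASTQ file,
    with the leading '@' and any trailing /1 or /2 removed."""
    with open(path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 0:
                continue  # only every fourth line is a header
            fields = line.split()
            name = fields[0] if fields else ""
            if not name.startswith("@"):
                raise ValueError(
                    f"{path}: line {line_no + 1} is not a FASTQ header"
                )
            name = name[1:]
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name


def count_mismatched_pairs(path1, path2):
    """Compare read names position by position.

    Returns (pairs_compared, mismatches). If one file is longer,
    its extra reads are counted as mismatches.
    """
    pairs = 0
    mismatches = 0
    names = itertools.zip_longest(read_names(path1), read_names(path2))
    for name1, name2 in names:
        pairs += 1
        if name1 != name2:
            mismatches += 1
    return pairs, mismatches


if __name__ == "__main__" and len(sys.argv) == 3:
    pairs, mismatches = count_mismatched_pairs(sys.argv[1], sys.argv[2])
    print(f"{pairs} pairs compared, {mismatches} with mismatched names")
```

Saved under a name of your choice and run with the two file names as arguments (for example 1Gen2Crop.p1.fastq and 1Gen2Crop.p2.fastq), a result of zero mismatches would suggest the files are still in sync. It says nothing about read orientation, so it can only rule out one possible cause.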
I have tried renaming the reads and reformatting them with reformat.sh, but that has not fixed the problem. Does anybody have any advice on the source of the problem, or on how to stop it occurring again?
Many thanks