4.0 years ago
Bioinfo
Hello everyone,
I have two files containing reads that I want to assemble.
I ran SPAdes and it shows this error message.
Error log:
== Running assembler: K27
0:00:00.000 4M / 4M INFO General (main.cpp : 74) Loaded config from /data/friesen/testdrive/agro-dir/spades_assembly/assembly/K27/configs/config.info
0:00:00.000 4M / 4M INFO General (memory_limit.cpp : 49) Memory limit set to 250 Gb
0:00:00.000 4M / 4M INFO General (main.cpp : 87) Starting SPAdes, built from refs/heads/spades_3.13.0, git revision 8ea46659e9b2aca35444a808db550ac333006f8b
0:00:00.000 4M / 4M INFO General (main.cpp : 88) Maximum k-mer length: 128
0:00:00.000 4M / 4M INFO General (main.cpp : 89) Assembling dataset (/data/friesen/testdrive/agro-dir/spades_assembly/assembly/dataset.info) with K=27
0:00:00.000 4M / 4M INFO General (main.cpp : 90) Maximum # of threads to use (adjusted due to OMP capabilities): 1
0:00:00.000 4M / 4M INFO General (launch.hpp : 51) SPAdes started
0:00:00.000 4M / 4M INFO General (launch.hpp : 58) Starting from stage: construction
0:00:00.000 4M / 4M INFO General (launch.hpp : 65) Two-step RR enabled: 0
0:00:00.000 4M / 4M INFO StageManager (stage.cpp : 132) STAGE == de Bruijn graph construction
0:00:00.008 4M / 4M INFO General (read_converter.hpp : 77) Converting reads to binary format for library #0 (takes a while)
0:00:00.008 4M / 4M INFO General (read_converter.hpp : 78) Converting paired reads
0:00:00.401 80M / 132M INFO General (binary_converter.hpp : 93) 16384 reads processed
0:00:00.606 92M / 132M INFO General (binary_converter.hpp : 93) 32768 reads processed
0:00:01.021 120M / 132M INFO General (binary_converter.hpp : 93) 65536 reads processed
0:00:02.071 184M / 184M INFO General (binary_converter.hpp : 93) 131072 reads processed
0:00:04.251 320M / 320M INFO General (binary_converter.hpp : 93) 262144 reads processed
0:00:06.537 464M / 464M ERROR General (paired_readers.hpp : 56) The number of right read-pairs is larger than the number of left read-pairs
0:00:06.537 464M / 464M ERROR General (paired_readers.hpp : 60) Unequal number of read-pairs detected in the following files: /data/friesen/testdrive/agro-dir/spades_assembly/corrected_1.fastq.gz /data/friesen/testdrive/agro-dir/spades_assembly/corrected_2.fastq.gz
== Error == system call for: "['/home/richard.white3/SPAdes-3.13.0-Linux/bin/spades-core', '/data/friesen/testdrive/agro-dir/spades_assembly/assembly/K27/configs/config.info']" finished abnormally, err code: 255
In case you have troubles running SPAdes, you can write to spades.support@cab.spbu.ru or report an issue on our GitHub repository github.com/ablab/spades
Please provide us with params.txt and spades.log files from the output directory.
I checked the inputs prior to error correction and they had the same number of reads. I tried to check the corrected reads, but they are formatted oddly, so I can't check them with FastQC:
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Midline
'AAAAAAAADDDDDDDDGGGGGFHIHFHHHHHHHHHHHHIHHHHHHHIIHHHHHHHHHHHHHHHFGGEGEDEGEGCEGGEDGGGG?DG?GGGGGGGGGGGGGGGGGGGGGGGGGDGGGGGGGDGGGGGGGGGGGGGGGGGGGEGGGAGDGGEGGGGGGGGGGGGGGGGGAGGEGGGGGCEGGGGGGEEGGGGGGGGGGGGGGDGGGEEGGGGGGGGGGGGGGGGDA>DGGGGGAGGGGGDGGGG'
didn't start with '+'
    at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:172)
    at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
    at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:76)
    at java.lang.Thread.run(Thread.java:748)
Failed to process file corrected_2.fastq.gz
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
    at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
    at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
    at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:76)
    at java.lang.Thread.run(Thread.java:748)
I used fastq_pair to find the reads that have a mate and to separate out the singletons. I obtained four files:
left.fastq.paired.fq
left.fastq.single.fq
right.fastq.paired.fq
right.fastq.single.fq
I should mention that the number of reads in the single files is far higher than the number of reads in the paired files:
Left paired: 37027
Right paired: 37027
Left single: 1512745
Right single: 1509165
My question is: how can I use the four files for the assembly, or is it okay to use only the paired files?
Thank you
You appear to have corrupted the data files so that they are no longer in FastQ format.
You may also have trimmed these files independently (not advisable), so the number and order of read pairs in your files no longer match.
The best option would likely be to start over with the raw data and redo your trimming (in proper pairs).
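One way to confirm whether a file is still valid FastQ, independently of FastQC, is to walk it four lines at a time and report the first record that breaks the layout. The following is only a minimal sketch: the function name `first_malformed_record` is hypothetical, and it assumes the simple four-lines-per-record layout (no wrapped sequences).

```python
import gzip


def first_malformed_record(path):
    """Return (line_number, reason) for the first broken record, or None if all look fine."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        line_number = 0  # number of lines consumed so far
        while True:
            record = [handle.readline() for _ in range(4)]
            if record[0] == "":
                return None  # clean end of file
            start = line_number + 1  # 1-based line number of this record's ID line
            header, sequence, midline, quality = (line.rstrip("\n") for line in record)
            if not header.startswith("@"):
                return start, "ID line does not start with '@'"
            if record[3] == "":
                return start, "record is truncated"
            if not midline.startswith("+"):
                return start + 2, "midline does not start with '+'"
            if len(sequence) != len(quality):
                return start + 3, "sequence and quality lengths differ"
            line_number += 4
```

Running this on corrected_2.fastq.gz and on each earlier intermediate file would show which file is the first to contain a broken record. A truncated gzip stream would surface as an exception from the `gzip` module rather than as a return value.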
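To check whether two files are still in proper pairs, the read names can be compared record by record. This is a rough sketch rather than a replacement for a dedicated tool: `first_pair_mismatch` and its helpers are hypothetical names, and it assumes four-line records and read names that differ at most by a trailing `/1` or `/2`.

```python
import gzip
import itertools


def open_fastq(path):
    """Open a FASTQ file as text, handling a .gz suffix."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_names(path):
    """Yield the read name from every fourth line, without a /1 or /2 suffix."""
    with open_fastq(path) as handle:
        for header in itertools.islice(handle, 0, None, 4):
            parts = header.split()
            name = parts[0] if parts else ""
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_pair_mismatch(left_path, right_path):
    """Return (record_index, left_name, right_name) at the first mismatch, or None if in sync."""
    pairs = itertools.zip_longest(read_names(left_path), read_names(right_path))
    for index, (left, right) in enumerate(pairs):
        if left != right:
            return index, left, right
    return None
```

A result of `None` means both files have the same number of records with matching names in the same order. A name of `None` on one side means that file ran out of records first, which is the situation SPAdes reports as an unequal number of read-pairs.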
Thank you very much for your message. I extracted the unmapped reads from the data, and I don't know exactly at which step the error occurred.
First I mapped my reads to the reference using bowtie2. After that I converted the SAM file to a BAM file using samtools, and I extracted the unmapped reads using samtools view -f 4 -h. Finally, I used this command to obtain the forward and reverse unmapped reads:
samtools fastq -1 file_1_Unmapped_reads.fastq -2 file_2_Unmapped_reads.fastq file_Unmapped_reads.bam
While those steps look OK, it is hard to say where the corruption occurred. It seems to be there, though, since the programs are complaining.
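One detail of those steps that could be worth ruling out is what the `-f 4` filter keeps. In the SAM flag field, bit 0x4 means "this read is unmapped" and bit 0x8 means "this read's mate is unmapped", and `samtools view -f` keeps a record only if all requested bits are set. The sketch below only illustrates that bit logic with made-up example flags (`passes_f` is a hypothetical helper, not a samtools API). It would explain unequal pair counts, not the FastQ format errors.

```python
READ_UNMAPPED = 0x4  # this read is unmapped
MATE_UNMAPPED = 0x8  # this read's mate is unmapped


def passes_f(flag, required):
    """Mimic the `samtools view -f` rule: keep a record only if all required bits are set."""
    return (flag & required) == required


# Illustrative pair where read 1 is unmapped but read 2 mapped.
read1_flag = 0x1 | READ_UNMAPPED | 0x40  # paired, unmapped, first in pair
read2_flag = 0x1 | MATE_UNMAPPED | 0x80  # paired, mate unmapped, second in pair

# With -f 4 only read 1 passes, so it would be kept without its mate.
kept_by_f4 = [passes_f(flag, 4) for flag in (read1_flag, read2_flag)]

# With -f 12 (4 + 8) neither read passes, so the pair is dropped as a whole.
kept_by_f12 = [passes_f(flag, 12) for flag in (read1_flag, read2_flag)]
```

Under this logic, filtering on 12 instead of 4 would keep only pairs in which both reads are unmapped. Whether that matches the intended selection depends on the analysis.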