Question: Losing all data after filter step using SGA
jolespin (United States) wrote 4.0 years ago:

Pipeline: raw FASTQ > preprocess > index > correct > index, which produced processed_reads.ec.fa (6.5 GB).

I took processed_reads.ec.fa and ran the following command:

sga filter -k 31 processed_reads.ec.fa

stderr:

sga: QCProcess.cpp:233: DuplicateCheckResult QCProcess::performDuplicateCheck(const SequenceWorkItem&): Assertion `fwdIntervals.interval[0].isValid() || rcIntervals.interval[0].isValid()' failed.

stdout:

...
[sga] Processed 8000000 sequences (912.782961s elapsed)
[sga] Processed 8050000 sequences (918.024303s elapsed)
[sga] Processed 8100000 sequences (925.718005s elapsed)
[sga] Processed 8150000 sequences (932.858792s elapsed)
Abort (core dumped)

The output files were:

processed_reads.ec.filter.pass.fa (1.4 MB)

processed_reads.ec.discard.fa (2.0 GB)

Does anyone know what's happening, and what this assertion failure means? I've tried different thread counts and different k-mer sizes without seeing any significant improvement.
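File sizes only hint at how many reads were lost. A minimal sketch for counting records instead, assuming the files are plain FASTA where every record starts with a ">" header line; count_fasta_records is a hypothetical helper name, not part of SGA:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an SGA command): count records in a FASTA file.
# Assumption: uncompressed FASTA, one ">" header line per read.
count_fasta_records() {
    local fasta="$1"
    if [ ! -r "$fasta" ]; then
        echo "cannot read: $fasta" >&2
        return 1
    fi
    # grep -c prints 0 but exits non-zero when nothing matches; mask that status
    grep -c '^>' "$fasta" || true
}
```

Running it on processed_reads.ec.fa, processed_reads.ec.filter.pass.fa and processed_reads.ec.discard.fa would show how many reads were written before the abort and how they were split between pass and discard.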

Tags: sga, assembly, kmer, filter
jolespin (United States) wrote 4.0 years ago:

Solved: sga filter needed the BWT file from the reindexing step, i.e. the index rebuilt on processed_reads.ec.fa after correction.
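For anyone hitting the same assertion, a pre-flight check along these lines may help catch a missing or stale index before a long run. This is only a sketch: check_sga_index is a hypothetical helper, and it assumes that sga index writes <prefix>.bwt, .rbwt, .sai and .rsai into the current directory, with <prefix> being the reads filename minus its last extension.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an SGA command): verify that index files exist
# for a reads file and are not older than the reads themselves.
# Assumption: index files live in the current directory and are named
# <prefix>.bwt, <prefix>.rbwt, <prefix>.sai, <prefix>.rsai.
check_sga_index() {
    local reads="$1"
    local prefix ext status=0
    prefix="$(basename "${reads%.*}")"
    for ext in bwt rbwt sai rsai; do
        if [ ! -s "${prefix}.${ext}" ]; then
            # File is absent or has zero size
            echo "missing or empty: ${prefix}.${ext}" >&2
            status=1
        elif [ "${prefix}.${ext}" -ot "$reads" ]; then
            # Index predates the reads, e.g. built before error correction
            echo "older than reads file: ${prefix}.${ext}" >&2
            status=1
        fi
    done
    return "$status"
}
```

In this thread's case that would mean running sga index on processed_reads.ec.fa first, then something like check_sga_index processed_reads.ec.fa && sga filter -k 31 processed_reads.ec.fa, so the filter only starts when the index matches the corrected reads.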
