Question: FastX toolkit - problems with Collapser
15 months ago by
jcarlosariute0 wrote:

Hello,

I have been trying to use FastX Toolkit's Collapser on my RNA-seq data. However, the collapsed output files all come out empty. Has anybody ever had this problem?

EDITED

Considering that Collapser is no longer used, what would be the next steps after merging my files?

Thank you all

rna-seq fastx collapser
modified 15 months ago • written 15 months ago by jcarlosariute0

That is not appropriate. You want to count reads so collapsing them would defeat the purpose.
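To make the point concrete, here is a minimal Python sketch of what collapsing does. It is an illustration only, not FASTX-Toolkit's actual code: the function name `collapse_reads` and the toy reads are made up for this example, and the `>rank-count` header merely mimics the style fastx_collapser is generally described as writing.

```python
from collections import Counter


def collapse_reads(sequences):
    """Collapse identical sequences into (sequence, count) pairs.

    Pairs are ordered from most to least abundant; ties keep the order
    in which the sequences were first seen.
    """
    return Counter(sequences).most_common()


# Hypothetical toy input: six reads, three distinct sequences.
reads = ["ACGT", "ACGT", "ACGT", "TTGA", "TTGA", "GGCC"]

for rank, (seq, count) in enumerate(collapse_reads(reads), start=1):
    # Header carries the rank and how many reads shared the sequence.
    print(f">{rank}-{count}")
    print(seq)
```

After collapsing, each distinct sequence appears once, so a downstream read counter would see three reads instead of six unless it parsed the counts back out of the headers. That is why collapsing works against expression quantification.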

written 15 months ago by genomax73k

What are the error messages? What is the output of:

module x && module load fastx-toolkit/0.0.14  && cd "$SCRATCH/dir" && fastx_collapser -v -i mergedR1_file.fastq > /dev/null
written 15 months ago by Pierre Lindenbaum123k

The output files all came out empty. It didn't even show me a specific error code. They were just empty.

written 15 months ago by jcarlosariute0

What exactly do you want to do? Fastx_toolkit is an ancient tool that does not support paired-end data well (or actually does not support it at all). Give some details on your aim so that we can direct you to a better tool.

written 15 months ago by ATpoint24k

Thank you for your help. I would like to remove the repeated reads of the same transcripts before assembling the transcriptome (at least that's what I thought of doing).

written 15 months ago by jcarlosariute0

And why would you do that?

written 15 months ago by ATpoint24k

"Considering that Collapser is no longer used, what would be the next steps after merging my files?"

You need to explain what you are trying to do. Why did you merge the files (did you mean to say you concatenated files)? Insert sizes for RNAseq libraries are generally in a range where even the longest possible Illumina reads should not allow R1/R2 reads to merge/overlap.
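As a rough illustration of the overlap argument, here is a small Python sketch. The function name `expected_overlap` and the example lengths are hypothetical, and "fragment length" is assumed to mean the sequenced insert without adapters.

```python
def expected_overlap(read_length, fragment_length):
    """Number of bases covered by both R1 and R2 of one fragment.

    Returns 0 when the fragment is long enough that the two reads,
    sequenced inward from opposite ends, never meet.
    """
    return max(0, min(fragment_length, 2 * read_length - fragment_length))


# Illustrative values only (2 x 150 bp reads):
print(expected_overlap(150, 400))  # fragment longer than 2 x read length
print(expected_overlap(150, 250))  # fragment shorter than 2 x read length
```

Under these assumptions, reads can only be merged when the fragment is shorter than twice the read length, so a library with longer fragments yields pairs with nothing to merge.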

Normally, one would take RNAseq data, scan/trim it as needed, align with a splice-aware aligner (if you expect splicing) and then the aligned reads are counted using featureCounts/htseq-count to generate raw counts that are then fed into DESeq2 for diff exp analysis.

If you are deviating from these steps then you need to have a good reason to do so.

modified 15 months ago • written 15 months ago by genomax73k
15 months ago by
michael.ante3.5k (Austria/Vienna) wrote:

Try the undocumented -Q33 option. The fastx toolkit is quite old and by default assumes Phred+64 quality encoding, whereas FastQ files are now encoded in Phred+33.
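If you are unsure which encoding a file uses, a quick heuristic check on the quality lines can help. The sketch below is an assumption-laden illustration, not part of the toolkit: the function names and the cut-offs (ASCII 59 as the lowest character used by the old +64 encodings, ASCII 74 as the usual top of Illumina Phred+33 output) are my own choices, and the result can be inconclusive.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) of FASTQ quality strings.

    Heuristic only: returns None when the observed characters are
    compatible with both encodings, or when there is no input.
    """
    codes = [ord(ch) for q in quality_strings for ch in q.strip()]
    if not codes:
        return None
    if min(codes) < 59:
        # Characters this low do not occur in the old +64 encodings.
        return 33
    if max(codes) > 74:
        # Above 'J', which is unusual for Illumina Phred+33 data.
        return 64
    return None


def read_quality_lines(path, max_reads=10000):
    """Yield the quality line of up to max_reads four-line FASTQ records."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 3:
                yield line.rstrip("\n")


print(guess_phred_offset(["IIII#5", "FFFF,F"]))  # contains low characters
print(guess_phred_offset(["hhhhffff"]))          # only high characters
```

On a real file this could be called as `guess_phred_offset(read_quality_lines("mergedR1_file.fastq"))`. If it reports 33, a run such as `fastx_collapser -Q33 -v -i mergedR1_file.fastq -o collapsed.fasta` would be the thing to try (the output file name here is just a placeholder).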

Cheers,

Michael

written 15 months ago by michael.ante3.5k