Question: FastX toolkit - problems with Collapser
24 months ago, jcarlosariute (0) wrote:

Hello,

I have been trying to use the FASTX-Toolkit's Collapser on my RNA-seq data. However, the collapsed output files all come out empty. Has anybody else had this problem?

EDITED

Considering that Collapser is no longer used, what would be the next steps after merging my files?

Thank you all

rna-seq fastx collapser • 657 views
modified 24 months ago • written 24 months ago by jcarlosariute (0)

That is not appropriate. You want to count reads so collapsing them would defeat the purpose.

written 24 months ago by genomax (85k)

What are the error messages? What is the output of

module x && module load fastx-toolkit/0.0.14  && cd "$SCRATCH/dir" && fastx_collapser -v -i mergedR1_file.fastq > /dev/null
written 24 months ago by Pierre Lindenbaum (129k)

The output files all came out empty. It didn't even show me a specific error code. They were just empty.

written 24 months ago by jcarlosariute (0)

What exactly do you want to do? The FASTX-Toolkit is an ancient tool that does not support paired-end data well (or actually does not support it at all). Give some details about your aim so that we can direct you to a better tool.

written 24 months ago by ATpoint (36k)

Thank you for the help. I would like to remove the repeated reads from the same transcripts before assembling the transcriptome (at least that is what I had in mind).

written 24 months ago by jcarlosariute (0)

And why would you do that?

written 24 months ago by ATpoint (36k)

Considering that Collapser is no longer used, what would be the next steps after merging my files?

You need to explain what you are trying to do. Why did you merge the files (did you mean to say you concatenated files)? Insert sizes for RNAseq libraries are generally in a range where even the longest possible Illumina reads should not allow R1/R2 reads to merge/overlap.

Normally, one would take RNAseq data, scan/trim it as needed, align with a splice-aware aligner (if you expect splicing) and then the aligned reads are counted using featureCounts/htseq-count to generate raw counts that are then fed into DESeq2 for diff exp analysis.

If you are deviating from these steps then you need to have a good reason to do so.
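
The usual workflow described above could be sketched roughly as follows. This is only a dry run that prints the commands instead of executing them. The choice of HISAT2 as the splice-aware aligner, the sample prefix, the index prefix and all file names are assumptions made for illustration and do not come from the thread. Trimming is left out because it depends on the data.

```shell
# Dry-run sketch: prints the commands instead of running them.
# All file names and the index prefix are hypothetical placeholders.
rnaseq_plan() {
    sample=$1   # sample prefix, e.g. "sampleA"
    index=$2    # HISAT2 index prefix (assumed to have been built already)
    gtf=$3      # gene annotation in GTF format

    # 1. Splice-aware alignment of the paired reads
    echo "hisat2 -x $index -1 ${sample}_R1.fastq -2 ${sample}_R2.fastq -S ${sample}.sam"
    # 2. Coordinate-sort the alignments into a BAM file
    echo "samtools sort -o ${sample}.bam ${sample}.sam"
    # 3. Count per gene; -p marks the data as paired-end
    #    (recent Subread releases also need --countReadPairs to count fragments)
    echo "featureCounts -p -a $gtf -o ${sample}_counts.txt ${sample}.bam"
}

rnaseq_plan sampleA genome_index genes.gtf
```

The resulting count table is what would then be loaded into DESeq2. Note that R1 and R2 stay as separate files throughout, which is why merging or collapsing them beforehand is not needed.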

modified 24 months ago • written 24 months ago by genomax (85k)

24 months ago, michael.ante (3.6k, Austria/Vienna) wrote:

Try the undocumented -Q33 option. The FASTX-Toolkit is quite old and assumes Phred+64 quality encoding by default, whereas current FASTQ files are encoded in Phred+33.
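
To check which encoding a file most likely uses before adding -Q33, something like the helper below may help. It is a rough heuristic, not part of the FASTX-Toolkit: the function name guess_phred_offset is made up here, it assumes plain 4-line FASTQ records, and a Phred+33 file that contains only high-quality bases would be misreported as 64.

```shell
# Guess the Phred offset from the lowest quality character in a FASTQ file.
# Heuristic only: assumes 4-line records (no wrapped sequence lines).
guess_phred_offset() {
    LC_ALL=C awk '
        BEGIN {
            # Map every printable ASCII character to its code.
            for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
            min = 127
        }
        NR % 4 == 0 {                     # every 4th line is a quality string
            n = length($0)
            for (j = 1; j <= n; j++) {
                c = ord[substr($0, j, 1)]
                if (c != "" && c < min) min = c
            }
        }
        END {
            if (min == 127)   print "unknown"   # no quality characters seen
            else if (min < 59) print 33         # below the Phred+64 range
            else               print 64         # consistent with Phred+64
        }
    ' "$1"
}

# Hypothetical usage with the file name from the thread (output name assumed):
#   guess_phred_offset mergedR1_file.fastq
#   fastx_collapser -Q33 -v -i mergedR1_file.fastq -o collapsed.fasta
```

If the function prints 33, passing -Q33 as suggested is the likely fix for the empty output.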

Cheers,

Michael

written 24 months ago by michael.ante (3.6k)