Prinseq lite data preprocessing
4 months ago
Adarsh Kuamr ▴ 40

Hello everyone,

I am learning RNA-seq analysis. As a first step, I am using Prinseq lite to preprocess my data.

I used command:

perl prinseq-lite.pl -fastq read_1.fastq -fastq2 read_2.fastq -out_format 5 -min_len 50 -min_qual_mean 25

I got three output files in the same folder for each input file: _prinseq_good_singletons_, _prinseq_good_, and _prinseq_bad_.

Also, the _prinseq_good_ file is larger than the input file. Is that OK?

Which of these files should I use for downstream analysis?

Data rna-seq preprocessing Forum Prinseq lite • 249 views

File sizes are not a good measure of anything by themselves. Does prinseq print a log file or a stats file of some sort? That would be useful in understanding what happens in the run. Also, read the manual - that should describe each output file.

4 months ago
GenoMax 101k

I am not a prinseq user, but based on the names, _prinseq_good_ and _prinseq_good_singletons_ would be the files you want. The good files contain pairs where both reads (R1 and R2) survived the trimming. Be cautious about using the singleton file: most aligners will not allow you to mix paired and singleton reads in the same alignment.

File sizes are never a good metric for anything (unless you are just making sure the file produced is not empty). Since your files don't appear to be compressed, hopefully the size difference is negligible. In general, file sizes change because the compressibility of the data changes as data is lost, for example via trimming/filtering.
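If you want a more meaningful comparison than file size, counting reads is a quick option. Here is a minimal sketch, assuming uncompressed FASTQ with standard 4-line records (no wrapped sequence or quality lines). `count_reads` is just a helper name made up for this example, not part of prinseq.

```shell
# Print the number of reads in an uncompressed FASTQ file.
# Assumption: every record is exactly 4 lines (header, sequence, "+", quality).
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}
```

You could run `count_reads read_1.fastq` on the input and then on each prinseq output file (using the actual file names in your folder). I would expect the good, singleton, and bad counts for one input file to add up to the input count.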


Thank you for your response

4 weeks ago
nzulapa • 0

_prinseq_good_singletons_: contains the reads that lost their mates

_prinseq_good_: contains the pairs that remain after removing duplicates, low-complexity reads,...
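Before passing the paired good files to an aligner, a cheap sanity check is to confirm that the R1 and R2 files hold the same number of reads. This is a rough sketch, assuming uncompressed 4-line FASTQ. `same_read_count` is a made-up helper name, and it only compares counts; it does not verify that the read IDs match.

```shell
# Succeed (exit status 0) only if both FASTQ files contain the same number
# of 4-line records. This does not check that read IDs match between files.
same_read_count() {
    n1=$(awk 'END { print int(NR / 4) }' "$1")
    n2=$(awk 'END { print int(NR / 4) }' "$2")
    [ "$n1" -eq "$n2" ]
}
```

For example, `same_read_count good_R1.fastq good_R2.fastq && echo "counts match"`, where the two file names are placeholders for your own good files.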
