Question

FastQC on RNA-seq data

0

8.3 years ago

ropni ▴ 10

Hi, I am doing a quality check on my RNA-seq data. Attached below are the data I'm working with. I have run QC with FastQC and I need some advice on all the data, such as at which position I should trim the bases, and how to analyse and interpret these results. Thank you.

[Image: QC on raw data 1]

[Image: QC on raw data 2]

[Image: QC on raw data 3]

RNA-Seq • 2.3k views

ADD COMMENT • link updated 8.3 years ago by bruce.moran ▴ 970 • written 8.3 years ago by ropni ▴ 10

0

Quality usually falls off towards the end of the read. It is not normally a big deal if it isn't too bad, and you can just align anyway; TopHat2 should take quality into consideration when aligning. Alternatively, you can trim the end of each read until a quality threshold is reached. A fast tool called Trimmomatic can do that for you.
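
As a rough sketch of the trimming option, the function below assembles a paired-end Trimmomatic command and prints it for review instead of running it. The file names, the thread count, and the `TRAILING:20` / `MINLEN:36` thresholds are placeholders chosen for illustration, not values taken from this thread. It also assumes a `trimmomatic` wrapper script is on your PATH (as conda installs provide); otherwise you would call the jar with `java -jar`.

```shell
# Sketch only: build a Trimmomatic paired-end command and print it.
# $1 = R1 file, $2 = R2 file, $3 = threads (defaults to 4).
# All names and thresholds here are illustrative assumptions.
trim_cmd() {
  echo trimmomatic PE -threads "${3:-4}" -phred33 \
    "$1" "$2" \
    "${1%.fastq.gz}.paired.fastq.gz" "${1%.fastq.gz}.unpaired.fastq.gz" \
    "${2%.fastq.gz}.paired.fastq.gz" "${2%.fastq.gz}.unpaired.fastq.gz" \
    TRAILING:20 MINLEN:36
}

# Print the command for hypothetical input files; inspect it before running.
trim_cmd sample_R1.fastq.gz sample_R2.fastq.gz 8
```

`TRAILING:20` removes bases from the 3' end while their quality is below 20, and `MINLEN:36` drops reads that end up shorter than 36 bp. Both numbers should be adjusted to what the FastQC plots actually show.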

ADD REPLY • link 8.3 years ago by chris86 ▴ 400

0

It looks like this may be PacBio Iso-Seq data? You would want to keep that in mind as you go forward with any analysis. The PacBio Iso-Seq wiki would be useful for you.

ADD REPLY • link 8.3 years ago by GenoMax 146k

score 0 · Answer 1 · 2016-06-28

0

8.3 years ago

emmapead2 ▴ 60

I would use Trim Galore

e.g. trim_galore --phred64 --paired --trim1 *.fastq

ADD COMMENT • link 8.3 years ago by emmapead2 ▴ 60

2

For speed, it is better to use a multithreaded tool such as Trimmomatic.

ADD REPLY • link 8.3 years ago by chris86 ▴ 400

0

What makes you think this is ascii-64 encoded? The charts indicate that it is ascii-33, and all modern data is ascii-33, so I would conclude that it is ascii-33.
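
If you want to check the encoding directly rather than read it off the FastQC chart, a heuristic like the one below may help. It is only a sketch: the function name is made up, it assumes an uncompressed FASTQ file with four lines per record, and it only looks at the first 1000 reads. Any quality character below ASCII 59 cannot occur in ascii-64 data, so it settles the question in favour of ascii-33. Characters above `J` (ASCII 74) are unusual for Illumina ascii-33 output and suggest ascii-64. Anything in between is reported as ambiguous.

```shell
# Heuristic sketch: guess the quality offset from the first 1000 reads.
# $1 = uncompressed FASTQ file (4 lines per record assumed).
guess_phred_offset() {
  head -n 4000 "$1" | awk '
    BEGIN {
      # Lookup table from printable character to ASCII code.
      for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
      min = 127; max = 0
    }
    NR % 4 == 0 {                      # every 4th line is a quality string
      n = length($0)
      for (j = 1; j <= n; j++) {
        ch = substr($0, j, 1)
        if (!(ch in ord)) continue     # skip stray characters such as \r
        c = ord[ch]
        if (c < min) min = c
        if (c > max) max = c
      }
    }
    END {
      if (max == 0)      print "unknown"
      else if (min < 59) print "phred33"
      else if (max > 74) print "phred64"
      else               print "ambiguous"
    }'
}

# Usage (hypothetical file name):
# guess_phred_offset reads.fastq
```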

ADD REPLY • link 8.3 years ago by Brian Bushnell 20k

score 0 · Answer 2 · 2016-06-28

0

8.3 years ago

bruce.moran ▴ 970

BBDuk in the BBMap package is incredibly fast and has many, many options. Highly recommended, with great support from the author.

https://sourceforge.net/projects/bbmap/

http://seqanswers.com/forums/showthread.php?t=42776

ADD COMMENT • link 8.3 years ago by bruce.moran ▴ 970

0

Is this multicore-enabled? If not, Trimmomatic should be faster.

ADD REPLY • link 8.3 years ago by chris86 ▴ 400

0

BBDuk is multithreaded and faster than Trimmomatic in my tests.

@ropni -

It appears that there is something seriously wrong with your data, and the files are corrupt. Illumina runs never produce 752bp or 896bp reads, which FastQC indicates you have. Are you sure this is the raw data? Where did the data come from, and what platform generated it?
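
One quick way to see which read lengths are really in a file, independent of FastQC, is to tabulate them yourself. This is a minimal sketch; the function name is invented for illustration and it assumes an uncompressed FASTQ file with four lines per record (no wrapped sequence lines).

```shell
# Sketch: print "read_length<TAB>count", sorted by length.
# $1 = uncompressed FASTQ file (4 lines per record assumed).
read_length_table() {
  awk 'NR % 4 == 2 { count[length($0)]++ }          # 2nd line of each record is the sequence
       END { for (len in count) print len "\t" count[len] }' "$1" | sort -n
}

# Usage (hypothetical file name):
# read_length_table reads.fastq
```

If the table shows many reads of several hundred bases, that would support the suspicion that the files are not raw Illumina output.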