Question: RNA-seq HiSeq 2000 Trimming
valopes wrote, 23 months ago:

Hi everybody,

I am working with some Illumina sequences and I have a few questions. Could someone help me, please? These sequences are from a HiSeq 2000, but the run dates from 2010.

I am not sure if it is single or paired end. How can I check that?

I am trimming the reads with Trimmomatic:

trimmomatic SE -threads 10 -phred64 \
../../rawdata/BRS_I24.fastq \
BRS_I24_trim.fastq \
ILLUMINACLIP:/data/apps/trimmomatic/0.36/adapters/TruSeq2-SE.fa:2:30:10 \
LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36

- However, around that time Illumina changed its quality score encoding and later went back to the old one, so I don't know which one I should use: -phred64 or -phred33? A read from the file looks like this:

@HWI-ST365_0091:2:1:1192:1999#TGTCAT/1
CCTGCTCTAAATGCTTCTATTTGCCGCATGATTCCAGTCTTGACAGTTGCATCTGCCACCAAGGATATATACTCCTCCAAATTGTTGATATCAACAATT
+HWI-ST365_0091:2:1:1192:1999#TGTCAT/1
YXYY[[X[[cccccccccccccccccccc_____c__ccccc_c[cccZccccccZccccc\_c[\]]Z]YYYY[ZXXRYYY]V[[[[XSUSUUXXXRZ

- Also, I have read that for GA II data it is better to use TruSeq2-SE.fa:2:30:10, while for HiSeq data it is better to use TruSeq3-SE.fa:2:30:10. Is that right? I know that the sequencing kit used was TruSeq(TM) SBS v5.

Thanks in advance

genomax wrote, 23 months ago:

Regarding "I am not sure if it is single or paired end":

If you only have one file then most likely the data is single-end.

That said, there is some chance that the file contains interleaved reads. Take a look at the read that follows the example you posted: if its fastq header ends in /2 (instead of /1 as in your example), then you have paired-end data interleaved in a single file.
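The header check described above can be automated for more than one pair of reads. The following is a minimal sketch, assuming old-style Illumina headers that end in /1 or /2 and plain four-line FASTQ records; the function names `read_headers` and `looks_interleaved` and the demo records are made up for illustration and do not come from any tool mentioned in this thread.

```python
from itertools import islice


def read_headers(lines, max_records=1000):
    """Yield the header line of each 4-line FASTQ record (first max_records only)."""
    for i, line in enumerate(islice(lines, 4 * max_records)):
        if i % 4 == 0:
            yield line.strip()


def looks_interleaved(headers):
    """Return True if consecutive headers pair up as name/1 followed by name/2."""
    headers = list(headers)
    if len(headers) < 2:
        return False  # a single record is not enough to see any pairing
    pairs = zip(headers[0::2], headers[1::2])
    return all(
        first.endswith("/1") and second.endswith("/2") and first[:-2] == second[:-2]
        for first, second in pairs
    )


# Hypothetical records, only to show the two outcomes.
single_end = ["@read_a/1", "ACGT", "+", "IIII", "@read_b/1", "ACGT", "+", "IIII"]
interleaved = ["@read_a/1", "ACGT", "+", "IIII", "@read_a/2", "ACGT", "+", "IIII"]

print(looks_interleaved(read_headers(single_end)))
print(looks_interleaved(read_headers(interleaved)))

# For a real file, pass an open handle instead of a list:
#   with open("BRS_I24.fastq") as handle:
#       print(looks_interleaved(read_headers(handle)))
```

Only the first 1000 records are inspected, which should be enough to tell the two layouts apart without reading the whole file.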


Thank you, genomax!

Do you also have any idea about my other questions?

(reply by valopes, 23 months ago)

The example you posted seems to be in Sanger fastq format. I generally use bbduk.sh from the BBMap suite for trimming. The software bundle includes an adapters.fa file in its resources directory that covers all commonly used adapter sequences, so you do not need to know exactly which version of adapters was used.
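The quality encoding can also be checked from the character codes rather than judged by eye. The sketch below is a common heuristic, not part of Trimmomatic or BBMap: characters below ASCII 59 can only occur with an offset of 33, and characters above ASCII 74 ('J') are not expected in offset-33 Illumina data. The function name `guess_phred_offset` and both thresholds are assumptions made for illustration. For the quality line posted in the question, the codes run from 82 ('R') to 99 ('c'). An offset of 33 would imply scores of 49 to 66, while an offset of 64 gives 18 to 35, so this heuristic reads the line as phred64. That is worth confirming on the whole file.

```python
def guess_phred_offset(quality_lines):
    """Guess the quality offset from the range of ASCII codes seen.

    Returns (guess, lowest_code, highest_code), where guess is one of
    "phred33", "phred64" or "ambiguous". The thresholds are heuristic.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        guess = "phred33"    # codes this low cannot occur with offset 64
    elif highest > 74:
        guess = "phred64"    # codes this high are unexpected with offset 33
    else:
        guess = "ambiguous"  # the observed range fits either encoding
    return guess, lowest, highest


# Quality line copied from the read posted in the question (raw string
# because it contains backslashes).
POSTED_QUALITY = r"YXYY[[X[[cccccccccccccccccccc_____c__ccccc_c[cccZccccccZccccc\_c[\]]Z]YYYY[ZXXRYYY]V[[[[XSUSUUXXXRZ"

guess, lowest, highest = guess_phred_offset([POSTED_QUALITY])
print(guess, lowest, highest)  # prints: phred64 82 99
```

A single read is a small sample, so passing the quality lines of a few thousand reads gives a more reliable answer.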

(reply by genomax, 23 months ago)