Question

How to trim poly-A/T sequences before or after trimming with Trimmomatic?

1

4.7 years ago

Apprentice ▴ 160

Hi.

I'm analyzing RNA-seq data.

My RNA-seq pipeline is as follows: (1) Trimmomatic removes adapter sequences and low-quality bases from my FASTQ files. (2) TopHat2 maps the reads to the reference genome (hg19).

Trimmomatic didn't remove the poly-A sequences from my FASTQ files.

I would like to know how to trim poly-A sequences before or after trimming adapter sequences and low-quality bases with Trimmomatic.

rna-seq • 5.2k views

ADD COMMENT • link updated 4.7 years ago by WouterDeCoster 47k • written 4.7 years ago by Apprentice ▴ 160

1

The trimmomatic directory should contain fasta files with the adapter sequences to be trimmed. From what I understand you can simply add new sequences to these files, in your case polyA and polyT.

ADD REPLY • link 4.7 years ago by ATpoint 82k

0

Thank you for your comment. I don't know how the polyA and polyT sequences should be added to the file.

Could you show me an example of the file?

ADD REPLY • link 4.7 years ago by Apprentice ▴ 160

2

Open the file in a text editor and add:

>p-A
AAAAAAAAAAAAAAAAAAAAAAAAA
>p-T
TTTTTTTTTTTTTTTTTTTTTTTTTTT

I am not familiar with Trimmomatic, so I cannot tell you where the files are; you will have to find out. This has probably been asked before, so check with the search function.
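A minimal sketch of that edit done from the command line, under some assumptions: `TruSeq3-PE.fa` stands in for whichever adapter file your Trimmomatic installation provides, and `adapters_with_poly.fa` is a hypothetical name for the extended copy. The Trimmomatic call in the closing comment uses the usual `ILLUMINACLIP:<file>:<seed mismatches>:<palindrome threshold>:<simple threshold>` form with illustrative values; whether homopolymer entries are clipped as thoroughly as real adapters is something to check on your own data.

```shell
#!/bin/sh
# Assumed/hypothetical file names: adjust to your own installation.
ADAPTERS_IN="TruSeq3-PE.fa"            # assumed: an adapter file shipped with Trimmomatic
ADAPTERS_OUT="adapters_with_poly.fa"   # hypothetical name for the extended copy

# Work on a copy so the original adapter file stays untouched.
# If the assumed input is not in the current directory, start from an empty file.
if [ -f "$ADAPTERS_IN" ]; then
    cp "$ADAPTERS_IN" "$ADAPTERS_OUT"
else
    : > "$ADAPTERS_OUT"
fi

# Make sure the copy ends with a newline so the new header starts on its own line.
if [ -s "$ADAPTERS_OUT" ] && [ -n "$(tail -c 1 "$ADAPTERS_OUT")" ]; then
    echo >> "$ADAPTERS_OUT"
fi

# Append the two homopolymer entries in FASTA format.
printf '>p-A\n%s\n>p-T\n%s\n' \
    "AAAAAAAAAAAAAAAAAAAAAAAAA" \
    "TTTTTTTTTTTTTTTTTTTTTTTTTTT" >> "$ADAPTERS_OUT"

# The extended file would then be passed to ILLUMINACLIP, for example
# (file names and thresholds here are illustrative only):
#   java -jar trimmomatic.jar PE in_1.fastq.gz in_2.fastq.gz \
#       out_1P.fastq.gz out_1U.fastq.gz out_2P.fastq.gz out_2U.fastq.gz \
#       ILLUMINACLIP:adapters_with_poly.fa:2:30:10 MINLEN:36
```

Working on a copy rather than the shipped file keeps the original adapter set available for runs where poly-A/T clipping is not wanted.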

ADD REPLY • link updated 4.7 years ago by WouterDeCoster 47k • written 4.7 years ago by ATpoint 82k

0

Thank you for your advice.

ADD REPLY • link 4.7 years ago by Apprentice ▴ 160

score 1 · Answer 1 · 2019-08-28

1

4.7 years ago

WouterDeCoster 47k

For removing polyA/T sequences I have been using prinseq with -trim_tail_right and -trim_tail_left, but that's probably just one of several tools that can do this.

ADD COMMENT • link 4.7 years ago by WouterDeCoster 47k

0

Thank you for your advice. I'll try it.

ADD REPLY • link 4.7 years ago by Apprentice ▴ 160

0

I would like to remove polyA tails using prinseq.

My RNA-seq data are paired-end, and my FASTQ files are gzipped.

It seems that prinseq can't read gzipped fastq files. I don't want to decompress the fastq files. Could you tell me how to use prinseq for paired-end gzipped fastq files?

ADD REPLY • link 4.7 years ago by Apprentice ▴ 160

1

Hmmm, I had been using Prinseq for SE data, in which case I could use piping:

zcat reads.fastq.gz | prinseq <other arguments> | gzip > trimmed_reads.fastq.gz

But that's probably not an option for your paired-end data. In that case the solution from ATpoint is probably best: modify the FASTA file containing the adapter sequences so that it also includes AAAAAAAAAAAAAAAA and TTTTTTTTTTTTTTTTTTT.
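For the single-end piping pattern mentioned above, here is a self-contained sketch of the idea: decompress to a pipe, trim, and recompress without writing an uncompressed file. To keep it runnable without prinseq installed, a small awk program stands in for the trimmer; it is not prinseq and only removes 3' runs of A. The two reads, the file names and the `MIN_TAIL` threshold are made up for illustration.

```shell
#!/bin/sh
# Build a tiny gzipped FASTQ file (made-up reads) so the sketch is self-contained.
printf '@read1\nACGTACGTAAAAAAAA\n+\nIIIIIIIIIIIIIIII\n@read2\nACGTACGTACGTACGA\n+\nIIIIIIIIIIIIIIII\n' \
    | gzip > reads.fastq.gz

MIN_TAIL=5   # hypothetical threshold: only trim 3' runs of at least 5 A bases

# Decompress to a pipe, trim, and recompress; no uncompressed file is written.
# The awk program is only a stand-in for a real trimmer such as prinseq.
gzip -dc reads.fastq.gz \
    | awk -v n="$MIN_TAIL" '
        NR % 4 == 2 {                       # sequence line
            seq = $0
            keep = length(seq)
            while (keep > 0 && substr(seq, keep, 1) == "A") keep--
            if (length(seq) - keep < n) keep = length(seq)   # tail too short: leave read as is
            print substr(seq, 1, keep)
            next
        }
        NR % 4 == 0 { print substr($0, 1, keep); next }      # quality line: cut to same length
        { print }                                            # header and "+" lines
      ' \
    | gzip > trimmed_reads.fastq.gz

# Show the result.
gzip -dc trimmed_reads.fastq.gz
```

`gzip -dc` is used here instead of `zcat` only because it behaves the same on Linux and macOS. A read consisting entirely of A would come out empty in this sketch, so a real pipeline would also want a minimum-length filter.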