How to trim poly-A/T sequences before or after trimming with Trimmomatic
2.3 years ago

Hi.

I'm analyzing RNA-seq data.

My RNA-seq pipeline was as follows: (1) Trimmomatic removed adapter sequences and low-quality bases from my FASTQ files; (2) TopHat2 mapped the reads to the reference genome (hg19).

However, Trimmomatic did not remove the poly-A sequences from my FASTQ files.

I would like to know how to trim poly-A/T sequences, either before or after Trimmomatic trims the adapter sequences and low-quality bases.

rna-seq • 2.1k views

The Trimmomatic directory should contain FASTA files with the adapter sequences to be trimmed. As far as I understand, you can simply add new sequences to these files; in your case, polyA and polyT.


Thank you for your comment. I don't know how the polyA and polyT sequences should be added to the file.

Could you show me an example of the file?


Open the adapter FASTA file in a text editor and add:

>p-A
AAAAAAAAAAAAAAAAAAAAAAAAA
>p-T
TTTTTTTTTTTTTTTTTTTTTTTTTTT


I am not familiar with Trimmomatic, so I cannot tell you where the files are; you will have to find out. This has probably been asked before, so check with the search function.
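
The suggestion above can be sketched as a few shell commands. This is only a minimal sketch under assumptions: the stock adapter path (`adapters/TruSeq3-PE.fa`), the output name `custom_adapters.fa`, and the `2:30:10` ILLUMINACLIP thresholds are illustrative choices, not something confirmed in this thread, and how well ILLUMINACLIP handles homopolymer entries has not been tested here.

```shell
# Hypothetical paths: adjust to wherever your Trimmomatic install keeps its adapter files.
STOCK_ADAPTERS="${STOCK_ADAPTERS:-adapters/TruSeq3-PE.fa}"
CUSTOM_ADAPTERS="custom_adapters.fa"

# Work on a copy so the stock file stays untouched (start empty if it is not found).
if [ -f "$STOCK_ADAPTERS" ]; then
  cp "$STOCK_ADAPTERS" "$CUSTOM_ADAPTERS"
else
  : > "$CUSTOM_ADAPTERS"
fi

# Make sure the copy ends with a newline before appending new records.
if [ -n "$(tail -c 1 "$CUSTOM_ADAPTERS")" ]; then
  echo >> "$CUSTOM_ADAPTERS"
fi

# Append the poly-A and poly-T entries (25 bases each; the length is an arbitrary choice).
printf '>p-A\n%s\n>p-T\n%s\n' \
  'AAAAAAAAAAAAAAAAAAAAAAAAA' \
  'TTTTTTTTTTTTTTTTTTTTTTTTT' >> "$CUSTOM_ADAPTERS"

# Then point Trimmomatic's ILLUMINACLIP step at the custom file, for example:
#   ILLUMINACLIP:custom_adapters.fa:2:30:10
```

Working on a copy rather than editing the shipped file keeps the original adapter set available for other runs.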

2.3 years ago

For removing polyA/T sequences I have been using prinseq with -trim_tail_right and -trim_tail_left, though it is probably just one of several tools that can do this.
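
To make concrete what trimming a poly-A tail means, here is a minimal awk sketch. It is not prinseq's implementation: the function name `trim_polya_tail` is hypothetical, it only handles a 3' run of A's, and it assumes plain four-line FASTQ records on standard input.

```shell
# trim_polya_tail MIN
# Reads 4-line FASTQ records on stdin and removes a 3' run of at least MIN A's,
# trimming the quality string to the same length. Illustrative only.
trim_polya_tail() {
  awk -v min="$1" '
    NR % 4 == 2 {
      seq = $0
      keep = length(seq)
      # Walk back over the trailing run of A bases.
      while (keep > 0 && substr(seq, keep, 1) == "A") keep--
      # If the run is shorter than the threshold, leave the read intact.
      if (length(seq) - keep < min) keep = length(seq)
      print substr(seq, 1, keep)
      next
    }
    # Quality line: cut to the same length as the trimmed sequence.
    NR % 4 == 0 { print substr($0, 1, keep); next }
    # Header and "+" lines pass through unchanged.
    { print }
  '
}
```

For example, `trim_polya_tail 5 < reads.fastq > trimmed.fastq` would strip tails of five or more A's. A read made up entirely of A's comes out with an empty sequence line, which a dedicated tool would normally filter out.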


I would like to remove polyA tails using prinseq.

My RNA-seq data are paired-end, and my FASTQ files are gzipped.

It seems that prinseq can't read gzipped fastq files. I don't want to decompress the fastq files. Could you tell me how to use prinseq for paired-end gzipped fastq files?


Hmmm, I had been using Prinseq for SE data, in which case I could use piping:

zcat reads.fastq.gz | prinseq <other arguments> | gzip > trimmed_reads.fastq.gz


But that's probably not an option for you. In that case the solution from ATpoint is probably best: modify the FASTA file containing the adapter sequences so that it also includes AAAAAAAAAAAAAAAA and TTTTTTTTTTTTTTTTTTT.
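
For reference, the single-end piping pattern above can be wrapped in a small helper and applied to each mate file in turn. This is only a sketch under assumptions: `stream_trim` and the file names are hypothetical, and processing mates separately keeps pairs in sync only if the trimming command reads stdin, writes stdout, and never drops a read, which you would need to verify for your tool.

```shell
# stream_trim IN.fastq.gz OUT.fastq.gz CMD [ARGS...]
# Decompress IN, pass the stream through CMD, and recompress into OUT,
# so no uncompressed FASTQ file is written to disk.
stream_trim() {
  src="$1"
  dst="$2"
  shift 2
  gzip -dc "$src" | "$@" | gzip > "$dst"
}

# Example with a stand-in filter ("cat" changes nothing); replace it with a
# trimmer that reads stdin and writes stdout:
#   stream_trim sample_1.fastq.gz trimmed_1.fastq.gz cat
#   stream_trim sample_2.fastq.gz trimmed_2.fastq.gz cat
```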


I'll try to use the solution from ATpoint.