Question: How to trim poly-A/T sequences before or after trimming with Trimmomatic?
13 months ago, Apprentice wrote:

Hi.

I'm analyzing RNA-seq data.

My RNA-seq pipeline is as follows. (1) Trimmomatic removes adapter sequences and low-quality bases from my fastq files. (2) TopHat2 maps my reads to the reference sequence (hg19).

Trimmomatic didn't remove the poly-A sequences from my fastq files.

I would like to know how to trim poly-A/T sequences before or after trimming adapter sequences and low-quality bases with Trimmomatic.

rna-seq • 814 views
modified 13 months ago by WouterDeCoster • written 13 months ago by Apprentice

The trimmomatic directory should contain fasta files with the adapter sequences to be trimmed. From what I understand you can simply add new sequences to these files, in your case polyA and polyT.

modified 13 months ago • written 13 months ago by ATpoint

Thank you for your comment. I don't know how the polyA and polyT sequences should be added to the file.

Could you show me an example of the file?

written 13 months ago by Apprentice

Open the file in a text editor and add:

>p-A
AAAAAAAAAAAAAAAAAAAAAAAAA
>p-T
TTTTTTTTTTTTTTTTTTTTTTTTTTT

I am not familiar with trimmomatic, so I cannot tell you where the files are; you will have to find out. This has probably been asked before, so check with the search function.
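The manual edit described above could also be scripted. This is only a rough sketch: the adapter path `adapters/TruSeq3-PE.fa` and the output name `adapters_with_poly.fa` are assumptions, and the 25-base length of the homopolymers is an arbitrary choice, not a recommendation from this thread.

```shell
# Sketch only: file names below are assumptions, adjust them to your setup.
ADAPTERS="${ADAPTERS:-adapters/TruSeq3-PE.fa}"   # assumed location of the adapter FASTA
OUT="adapters_with_poly.fa"                      # hypothetical output name

# Work on a copy so the file shipped with Trimmomatic stays untouched.
if [ -f "$ADAPTERS" ]; then
    cp "$ADAPTERS" "$OUT"
else
    : > "$OUT"   # no adapter file found: start from an empty file
fi

# Make sure the copy ends with a newline before appending new records.
if [ -s "$OUT" ] && [ -n "$(tail -c 1 "$OUT")" ]; then
    printf '\n' >> "$OUT"
fi

# Build 25-base homopolymers instead of typing them by hand.
poly_a=$(printf 'A%.0s' $(seq 1 25))
poly_t=$(printf 'T%.0s' $(seq 1 25))

# Append the two extra FASTA records.
printf '>p-A\n%s\n>p-T\n%s\n' "$poly_a" "$poly_t" >> "$OUT"
```

The resulting file can then be given to Trimmomatic's ILLUMINACLIP step in place of the original adapter file. Whether adapter-style matching trims poly-A tails well enough for your data is something to check on a subset of reads.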

modified 13 months ago by WouterDeCoster • written 13 months ago by ATpoint

Thank you for your advice.

written 13 months ago by Apprentice

13 months ago, WouterDeCoster (Belgium) wrote:

For removing polyA/T sequences I have been using prinseq with -trim_tail_right and -trim_tail_left, but that's probably just one of the tools which can do that.
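To make clear what trimming a tail from the right end means, here is a small awk sketch on made-up data. It is not prinseq and does not reproduce its behaviour exactly; the file names, the toy read, and the `MIN_TAIL` threshold are all hypothetical. It removes a trailing run of A bases from each read when the run is at least `MIN_TAIL` long, and shortens the quality string to match.

```shell
# Toy single-end FASTQ: 8 bases followed by a 12-base poly-A tail (made-up data).
tail_a=$(printf 'A%.0s' $(seq 1 12))
qual=$(printf 'I%.0s' $(seq 1 20))
printf '@read1\nACGTACGT%s\n+\n%s\n' "$tail_a" "$qual" > toy.fastq

MIN_TAIL=10   # hypothetical threshold: shorter A runs are left alone

awk -v min="$MIN_TAIL" '
NR % 4 == 2 {                      # sequence line of each 4-line record
    s = $0
    keep = length(s)
    # Walk back over the trailing A bases.
    while (keep > 0 && substr(s, keep, 1) == "A") keep--
    # Only trim when the tail is long enough.
    if (length(s) - keep < min) keep = length(s)
    print substr(s, 1, keep)
    next
}
NR % 4 == 0 {                      # quality line: cut to the same length
    print substr($0, 1, keep)
    next
}
{ print }                          # header and "+" lines pass through
' toy.fastq > toy.trimmed.fastq
```

A poly-T run at the left end could be handled the same way by scanning from the start of the read instead. For real data a dedicated tool such as prinseq is the safer choice, since this sketch ignores mismatches inside the tail and paired-end bookkeeping.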

written 13 months ago by WouterDeCoster

Thank you for your advice. I'll try it.

written 13 months ago by Apprentice

I would like to remove polyA tails using prinseq.

My RNA-seq data is paired end. My fastq files were gzipped.

It seems that prinseq can't read gzipped fastq files. I don't want to decompress the fastq files. Could you tell me how to use prinseq for paired-end gzipped fastq files?

written 13 months ago by Apprentice

Hmmm, I had been using Prinseq for SE data, in which case I could use piping:

zcat reads.fastq.gz | prinseq <other arguments> | gzip > trimmed_reads.fastq.gz

But that's probably not an option for you. In that case the solution from ATpoint is probably best: modify the fasta file containing the adapter sequences so that it also includes AAAAAAAAAAAAAAAA and TTTTTTTTTTTTTTTTTTT.

written 13 months ago by WouterDeCoster

Thank you for your reply.

I'll try to use the solution from ATpoint.

written 13 months ago by Apprentice