Question: How to trim poly-A/T sequences before or after trimming with Trimmomatic?
1
11 weeks ago by
Apprentice30
Apprentice30 wrote:

Hi.

I'm analyzing RNA-seq data.

My RNA-seq pipeline is as follows: (1) Trimmomatic removes adapter sequences and low-quality bases from my fastq files. (2) TopHat2 maps my reads to the reference sequence (hg19).

Trimmomatic didn't remove the poly-A sequences from my fastq files.

I would like to know how to trim poly-A sequences before or after trimming adapter sequences and low-quality bases with Trimmomatic.

rna-seq • 166 views
ADD COMMENTlink modified 11 weeks ago by WouterDeCoster42k • written 11 weeks ago by Apprentice30
1

The Trimmomatic directory should contain FASTA files with the adapter sequences to be trimmed. From what I understand, you can simply add new sequences to these files, in your case polyA and polyT.

ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by ATpoint25k

Thank you for your comment. I don't know how the polyA and polyT sequences should be added to the file.

Could you show me an example of the file?

ADD REPLYlink written 11 weeks ago by Apprentice30
2

Open in a text editor and add

>p-A
AAAAAAAAAAAAAAAAAAAAAAAAA
>p-T
TTTTTTTTTTTTTTTTTTTTTTTTTTT

I am not familiar with Trimmomatic, so I cannot tell you where the files are; you will have to find out. This has probably been asked before, so check with the search function.
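A minimal sketch of that edit done from the shell instead of a text editor. The file names `TruSeq3-PE.fa` and `adapters_with_polyAT.fa` are assumptions for illustration; point `ADAPTERS` at whichever adapter FASTA your Trimmomatic installation actually uses. It works on a copy, so the shipped file is left untouched.

```shell
set -eu

# ASSUMPTION: hypothetical file names, override them via the environment.
ADAPTERS="${ADAPTERS:-TruSeq3-PE.fa}"
CUSTOM="${CUSTOM:-adapters_with_polyAT.fa}"

# Copy the existing adapter file if it is present in this directory,
# otherwise start from an empty file.
if [ -f "$ADAPTERS" ]; then
  cp "$ADAPTERS" "$CUSTOM"
else
  : > "$CUSTOM"
fi

# Make sure the copy ends with a newline so the new header starts on its own line.
if [ -n "$(tail -c 1 "$CUSTOM")" ]; then
  echo >> "$CUSTOM"
fi

# Append the poly-A and poly-T entries from the comment above.
cat >> "$CUSTOM" <<'EOF'
>p-A
AAAAAAAAAAAAAAAAAAAAAAAAA
>p-T
TTTTTTTTTTTTTTTTTTTTTTTTTTT
EOF

# List the FASTA headers now present in the custom file.
grep '^>' "$CUSTOM"
```

The custom file would then be given to Trimmomatic's adapter clipping step in place of the original, for example `ILLUMINACLIP:adapters_with_polyAT.fa:2:30:10`. Those three thresholds are the values used in the Trimmomatic manual's example commands and are not tuned for poly-A/T, so check the result on a subset of reads first.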

ADD REPLYlink modified 10 weeks ago by WouterDeCoster42k • written 10 weeks ago by ATpoint25k

Thank you for your advice.

ADD REPLYlink written 10 weeks ago by Apprentice30
1
11 weeks ago by
Belgium
WouterDeCoster42k wrote:

For removing polyA/T sequences I have been using prinseq with -trim_tail_right and -trim_tail_left, but that's probably just one of the tools which can do that.
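To illustrate what 3' tail trimming does to a read, here is a minimal awk sketch. It is not prinseq and does not reproduce its options; the function name `trim_polya_tail` and the minimum tail length of 5 are assumptions for illustration. It expects a FASTQ stream with exactly four lines per record, removes a run of trailing A bases from each read, and shortens the quality string to match.

```shell
set -eu

# Hypothetical helper: trim a poly-A run of at least MIN bases (default 5)
# from the 3-prime end of every read in a 4-line-per-record FASTQ stream.
trim_polya_tail() {
  awk -v min="${1:-5}" '
    NR % 4 == 1 { header = $0; next }   # read identifier line
    NR % 4 == 2 { seq = $0; next }      # base calls
    NR % 4 == 3 { plus = $0; next }     # separator line
    NR % 4 == 0 {
      qual = $0
      n = length(seq)
      # Walk back from the last base while it is an A.
      while (n > 0 && substr(seq, n, 1) == "A") n--
      run = length(seq) - n
      # Only trim when the trailing run is long enough to look like a tail.
      if (run >= min) {
        seq  = substr(seq, 1, n)
        qual = substr(qual, 1, n)
      }
      print header; print seq; print plus; print qual
    }'
}

# Example: one read ending in eight A bases.
printf '@r1\nACGTACGTAAAAAAAA\n+\nIIIIIIIIIIIIIIII\n' | trim_polya_tail 5
```

Because the sketch never drops a read, it can be applied to each file of a pair separately without breaking pairing, and it reads from a pipe, e.g. `zcat reads_1.fastq.gz | trim_polya_tail 5 | gzip > trimmed_1.fastq.gz` (file names hypothetical). A read made up entirely of A bases comes out empty, so a minimum-length filter afterwards is advisable. A poly-T run at the 5' end would need the mirror-image logic.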

ADD COMMENTlink written 11 weeks ago by WouterDeCoster42k

Thank you for your advice. I'll try it.

ADD REPLYlink written 11 weeks ago by Apprentice30

I would like to remove polyA tails using prinseq.

My RNA-seq data is paired-end, and my fastq files are gzipped.

It seems that prinseq can't read gzipped fastq files. I don't want to decompress the fastq files. Could you tell me how to use prinseq for paired-end gzipped fastq files?

ADD REPLYlink written 10 weeks ago by Apprentice30
1

Hmmm, I had been using Prinseq for single-end (SE) data, in which case I could use piping:

zcat reads.fastq.gz | prinseq <other arguments> | gzip > trimmed_reads.fastq.gz

But that's probably not an option for you. In that case the solution from ATpoint is probably best: modify the FASTA file containing the adapter sequences so that it also includes AAAAAAAAAAAAAAAA and TTTTTTTTTTTTTTTTTTT.

ADD REPLYlink written 10 weeks ago by WouterDeCoster42k

Thank you for your reply.

I'll try to use the solution from ATpoint.

ADD REPLYlink written 10 weeks ago by Apprentice30