How to trim adapters and remove polyA tails from QuantSeq 3' FWD data
4.7 years ago
Chloe • 0

I am trying to work out the best way to trim poly A tails of variable length from the 3' end of my reads and, if possible, to remove adapters and other contaminating sequence from the 5' end at the same time.

My data are 75 bp single-end reads. Libraries were prepared using the QuantSeq 3' FWD kit and sequenced on an Illumina NextSeq 500.

I think that to remove the adapters etc. from the 5' end I just need to trim the first 12 nt (this is what the FAQ on the QuantSeq website says to do, although it does not specifically say how to remove adapters).
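To make the fixed 5' trim concrete, here is a minimal sketch in plain awk, independent of any trimming tool. It assumes a strict four-line-per-record FASTQ, and the function name trim_left is made up for this illustration. It drops the first n characters from the sequence and quality lines and leaves the header and "+" lines untouched.

```shell
#!/usr/bin/env bash

# trim_left N: read FASTQ on stdin, remove the first N bases and the
# matching N quality characters from every record, write FASTQ to stdout.
# Assumes exactly four lines per record (no wrapped sequence lines).
trim_left() {
  local n=$1
  awk -v n="$n" '
    NR % 4 == 2 || NR % 4 == 0 { print substr($0, n + 1); next }  # sequence and quality lines
    { print }                                                     # header and "+" lines
  '
}

# Toy record: 12 bases to discard followed by 5 bases to keep.
printf '@read1\nACGTACGTACGTTTGCA\n+\nIIIIIIIIIIIIFFFFF\n' | trim_left 12
```

A dedicated trimmer does the same thing faster and can also drop reads that end up too short, so this is only meant to show what the fixed-length trim does to each record.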

The QuantSeq website says to do this:

for sample in runID*R1_001.fastq; do cat ${sample} | bbduk.sh in=stdin.fq out=${sample}_trimmed_clean ref=/data/resources/polyA.fa.gz,/data/resources/truseq.fa.gz k=13 ktrim=r forcetrimleft=11 useshortkmers=t mink=5 qtrim=t trimq=10 minlength=20; done

I downloaded BBDuk to try this, but it did not come with the polyA.fa.gz file referenced by ref=/data/resources/polyA.fa.gz, and I am at a loss as to how to make it myself.

Does anyone have any ideas on how to do this in BBDuk, Trimmomatic or something else?

RNA-Seq quantseq trim poly-A tails • 3.7k views

Brian Bushnell may have included the polyA file in a past version of the BBMap suite, but that file is no longer there.

Can you try the following instead (replace path_to with the real path on your computer in the command below):

for sample in runID*R1_001.fastq; do cat ${sample} | bbduk.sh in=stdin.fq out=${sample}_trimmed_clean ref=/path_to/bbmap/resources/truseq.fa.gz literal=AAAAAAAAA k=13 ktrim=r forcetrimleft=11 useshortkmers=t mink=5 qtrim=t trimq=10 minlength=20; done
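If a reference file is preferred over literal=AAAAAAAAA, a poly A FASTA can be written by hand. The sketch below is an assumption about what such a file could look like, not a copy of the original polyA.fa.gz: the record name polyA and the length of 18 bases are illustrative choices.

```shell
#!/usr/bin/env bash

# Assumption: one poly A record of 18 bases. The length and the record
# name are illustrative and are not taken from the original file.
polya_len=18
polya_seq=$(printf 'A%.0s' $(seq 1 "$polya_len"))

# Write a one-record FASTA and gzip it into the current directory.
printf '>polyA\n%s\n' "$polya_seq" | gzip -c > polyA.fa.gz

# Print the decompressed file to check its contents.
gzip -dc polyA.fa.gz
```

The resulting file could then be passed as ref=polyA.fa.gz,/path_to/bbmap/resources/truseq.fa.gz in place of the literal= option, keeping the other options as they are.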

Thanks, I'll try that. Will that only remove poly A runs of that exact length from the end of the read? The reads are short and generated towards the 3' end, so some of the poly A tails seem to be in the middle of the read.

With short fragment lengths it is possible, even likely, that you sequenced first the mRNA, then the poly A tail and then ran into the adapter. So you would first have to remove the adapter to "expose" the terminal poly A sequence, and then remove that too. With NextSeq sequencing you can also get a poly G tail, because in the two-colour chemistry 'G' is called when there is no signal. So you would want to trim those too.
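The order of operations described above can be illustrated on a single toy read with plain sed. This is only a sketch of the logic, not a replacement for a real trimmer: the function name trim_read, the minimum run length of 10 and the toy sequences are made up for the example, and AGATCGGAAGAGC is assumed here as the start of the Illumina TruSeq adapter.

```shell
#!/usr/bin/env bash

# trim_read SEQ ADAPTER: apply three 3' trimming steps in order.
#   1. strip a terminal poly G run (no-signal calls on two-colour chemistry)
#   2. strip the adapter and everything after it
#   3. strip the poly A run that the adapter removal has exposed
# Assumption: runs of 10 or more bases count as a tail; the threshold is arbitrary.
trim_read() {
  local seq=$1 adapter=$2
  printf '%s\n' "$seq" | sed -E \
    -e 's/G{10,}$//' \
    -e "s/${adapter}.*\$//" \
    -e 's/A{10,}$//'
}

adapter='AGATCGGAAGAGC'                 # assumed TruSeq adapter prefix
mrna='ACGTTGCATCGATTGC'                 # toy transcript-derived sequence
polya=$(printf 'A%.0s' $(seq 1 15))     # 15 A
polyg=$(printf 'G%.0s' $(seq 1 12))     # 12 G

# Toy read laid out as mRNA, poly A, adapter, poly G.
trim_read "${mrna}${polya}${adapter}${polyg}" "$adapter"
```

If the poly A step ran before the adapter step, the A run would not be at the end of the read yet and would be left in place, which is why the adapter has to go first.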