primer trimming tools
6.9 years ago
J.F.Jiang ▴ 880

Hi all,

Because of the PCR-based amplification, we need to trim the primers off the paired-end (PE) reads.

However, the trimming seems to leave low-quality tails on the reads. Some reads even fall into the Q15-Q20 range, which might lead to low-quality variant calls with GATK.

cutadapt -q 10 -g file:$fprimer -G file:$rprimer --minimum-length 20 $read1 $read2 -o $read1_trim -p $read2_trim;

The command above is the one I use in my calling pipeline. I could raise -q to 30, but then quite a lot of reads would be removed.

Is there any smarter method to trim the primers?

Thanks.

Junfeng

primer trimming • 3.8k views
6.9 years ago

Doing adapter-trimming will not somehow reduce the quality of your reads. Variant callers are expected to take quality into consideration when making calls, and good mapping programs are robust against low quality tails on reads; the extra length improves mapping confidence.

I don't recommend trimming to a quality threshold above 12 for mapping. If cutadapt has trouble finding adapters in low-quality reads without aggressive quality-trimming, I suggest you use BBDuk instead. It does quality-trimming after adapter-trimming, because it does not need quality-trimming to increase sensitivity. Trimming for quality before trimming adapters reduces the number of adapter bases present, ultimately making the adapters harder to detect.
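To illustrate the last point, here is a toy Python sketch (not how cutadapt or BBDuk work internally). It assumes an example adapter string, a made-up read whose 3' end runs 10 bases into the adapter, and a simplified quality trimmer that just drops trailing bases below a cutoff; real tools use more elaborate algorithms. All names are hypothetical.

```python
ADAPTER = "AGATCGGAAGAGC"  # example adapter string, assumed for illustration


def trim_low_quality_tail(seq, quals, cutoff):
    """Drop trailing bases whose quality is below `cutoff` (simplified)."""
    end = len(seq)
    while end > 0 and quals[end - 1] < cutoff:
        end -= 1
    return seq[:end], quals[:end]


def adapter_overlap(seq, adapter):
    """Length of the longest adapter prefix that exactly matches the read's 3' end."""
    for k in range(min(len(seq), len(adapter)), 0, -1):
        if seq.endswith(adapter[:k]):
            return k
    return 0


# Made-up read: 20 bases of insert followed by the first 10 adapter bases,
# with quality dropping towards the 3' end.
insert = "ACGTACGTACGTACGTACGT"
read = insert + ADAPTER[:10]
quals = [35] * 20 + [30, 28, 25, 22, 18, 14, 11, 9, 7, 5]

untrimmed = adapter_overlap(read, ADAPTER)
after_q12 = adapter_overlap(trim_low_quality_tail(read, quals, 12)[0], ADAPTER)
after_q20 = adapter_overlap(trim_low_quality_tail(read, quals, 20)[0], ADAPTER)

print(untrimmed, after_q12, after_q20)  # adapter bases left to match: 10 6 4
```

In this toy case, the more aggressive the quality cutoff applied first, the fewer adapter bases remain for the adapter search to match.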

Edit - it looks like the OP was talking about targeted amplicon sequencing and trimming the 3' primer, in which case my answer is irrelevant.

6.9 years ago
J.F.Jiang ▴ 880

After some experimenting and searching the literature, I found that the most efficient way is to align the reads first with an aligner such as bwa-mem, and then use GATK ClipReads to remove the matched primer sequences. The primers help to align the reads to the reference genome, whereas trimming the reads first generates a lot of "bad" quality reads, which can result in lower depth of coverage or wasted input.

Another question: I calculated the Ti/Tv ratio for targeted sequencing of, e.g., the BRCA1 and BRCA2 genes, and obtained values varying from 2.X to 8.X across samples. Is this normal?
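For reference, the Ti/Tv ratio is the number of transitions (A<->G, C<->T) divided by the number of transversions (all other substitutions). The minimal Python sketch below uses made-up variant lists, not real BRCA1/BRCA2 calls, and hypothetical function names. It only shows the arithmetic: when a small target yields a handful of SNVs, a few calls can move the ratio a lot, which may be one reason for sample-to-sample variation.

```python
TRANSITION_PAIRS = ({"A", "G"}, {"C", "T"})


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    if ref == alt:
        raise ValueError("ref and alt must differ for an SNV")
    return {ref, alt} in TRANSITION_PAIRS


def ti_tv_ratio(snvs):
    """Ti/Tv for an iterable of (ref, alt) single-nucleotide variants."""
    snvs = list(snvs)
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    return ti / tv if tv else float("inf")


# Made-up samples with nine SNVs each.
sample_a = [("A", "G")] * 8 + [("A", "C")]      # 8 transitions, 1 transversion
sample_b = [("C", "T")] * 6 + [("G", "T")] * 3  # 6 transitions, 3 transversions

print(ti_tv_ratio(sample_a), ti_tv_ratio(sample_b))  # 8.0 2.0
```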


We also adopted the workflow of aligning with BWA-MEM first and soft-clipping the primers afterwards. Because our amplicons are nested, GATK ClipReads cannot handle the primer clipping properly, so we developed and now use BAMClipper (Scientific Reports 7:1567). A bonus is that errors in the primer sequences (from synthesis and/or sequencing) are tolerated.
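The align-first, clip-afterwards idea can be sketched in Python as follows. This is only an illustration under strong assumptions, not the implementation of GATK ClipReads or BAMClipper: the function name is hypothetical, the read is on the forward strand with a gap-free alignment, and coordinates are 0-based, half-open. Because the clip length comes from where the read aligns relative to the primer interval, the read's primer bases are never compared with the primer sequence, which is one way mismatches in the primer could be tolerated.

```python
def soft_clip_5prime_primer(read_start, read_len, primer_start, primer_end):
    """Return (new_start, cigar) after soft-clipping a 5' primer by position.

    Assumes a forward-strand read aligned without gaps; all coordinates are
    0-based, half-open reference positions.
    """
    read_end = read_start + read_len
    # Clip only when the alignment begins inside the primer interval.
    if primer_start <= read_start < primer_end:
        clip = min(primer_end, read_end) - read_start
    else:
        clip = 0

    if clip == 0:
        return read_start, f"{read_len}M"
    if clip == read_len:
        # The whole read lies inside the primer; report it as fully clipped.
        return read_start, f"{read_len}S"
    return read_start + clip, f"{clip}S{read_len - clip}M"


# Hypothetical primer occupying reference positions [1000, 1022).
print(soft_clip_5prime_primer(1000, 100, 1000, 1022))  # (1022, '22S78M')
print(soft_clip_5prime_primer(1003, 100, 1000, 1022))  # (1022, '19S81M')
print(soft_clip_5prime_primer(1500, 100, 1000, 1022))  # (1500, '100M')
```

A real tool would also have to handle reverse-strand reads, indels in the CIGAR, and mate information, none of which this sketch covers.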