Question

BBDuk aggressive ktrim on ancient DNA samples

3.6 years ago

ido.idobar ▴ 10

Hi,

I'm using BBDuk to trim adapters and quality-filter WGS reads from ancient DNA (samples more than 200 years old). The reads are 150 bp PE. When I use the recommended parameters for PE reads (which I've used many times before), a large proportion of my data (25-50%) is trimmed by ktrim=r.

This is my command:

bbduk.sh -Xmx1g ref=$BBMAP_DIR/bbmap-38.79-0/resources/adapters.fa ktrim=r k=23 pigz=f mink=11 hdist=1 qtrim=rl trimq=10 tpe tbo int minlen=30 ziplevel=9 threads=12 in=./D15_#.fastq.gz out=trimmed_reads/trimmed_D15_#.fastq.gz stats=D15.stats ow

And this is the output:

Input:                          93969406 reads          14189380306 bases.
QTrimmed:                       393623 reads (0.42%)    1688764 bases (0.01%)
KTrimmed:                       90265990 reads (96.06%)         6997869744 bases (49.32%)
Trimmed by overlap:             970534 reads (1.03%)    4869518 bases (0.03%)
Total Removed:                  347220 reads (0.37%)    7004428026 bases (49.36%)
Result:                         93622186 reads (99.63%)         7184952280 bases (50.64%)

I suspect that this might be due to the fragmented nature of the aDNA, resulting in short fragments, flanked by adapter sequences, but I'd like to have a second opinion, to make sure that I don't need to alter the parameters somehow to retain more "real" sequences.

Many thanks, Ido

wgs ancient DNA BBMap adapter trimming • 967 views

score 1 · Answer 1 · 2020-09-07

3.6 years ago

GenoMax 141k

"flanked by adapter sequences"

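That is very likely. On the other hand, the number of "Total Removed" reads is relatively small (0.37%). Those are reads that ended up shorter than minlen=30 after trimming, so they probably represent primer dimers (with no inserts) or fragments with very short inserts.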
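As a rough sanity check, the counts in the BBDuk summary from the question can be turned into average lengths. The sketch below only does arithmetic on those reported numbers; the variable names are made up for illustration, and it assumes that the bases removed by ktrim=r are mostly adapter read-through (the adapter match plus everything to its right).

```shell
#!/bin/sh
# Counts copied from the BBDuk summary in the question.
input_reads=93969406
input_bases=14189380306
ktrimmed_reads=90265990
ktrimmed_bases=6997869744
result_reads=93622186
result_bases=7184952280

# awk does the floating-point division; values are passed in with -v.
mean_input=$(awk -v b="$input_bases" -v r="$input_reads" \
    'BEGIN { printf "%.1f", b / r }')
mean_ktrim=$(awk -v b="$ktrimmed_bases" -v r="$ktrimmed_reads" \
    'BEGIN { printf "%.1f", b / r }')
mean_result=$(awk -v b="$result_bases" -v r="$result_reads" \
    'BEGIN { printf "%.1f", b / r }')

echo "Mean input read length:            $mean_input bp"   # 151.0
echo "Mean bases removed per k-trimmed read: $mean_ktrim bp" # 77.5
echo "Mean read length after trimming:   $mean_result bp"  # 76.7
```

If the assumption holds, roughly half of each 151 bp read is adapter read-through on average, which would put the typical insert at around 70-80 bp. That is in line with short, fragmented aDNA rather than with over-aggressive trimming parameters, though it is an average and says nothing about the shape of the length distribution.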