Hi all,
I have some metagenomics reads. I trim them with Trimmomatic and remove adapters using an Illumina adapter library I found on the internet.
Then I look at them in FastQC, and it reports that "Adapter Content" is very bad and "Kmer Content" is very bad as well (which must mean that there are indeed lots of adapters). The question is: how do I remove these adapters?
How does FastQC know they are there? Is it using some adapter database? If so, I would like to use that database to remove them! Where is the fullest possible Illumina adapter library (the one I'm using right now isn't complete enough)? Or is FastQC mistaken?
There is a nice tool called BBMap, from which you can use bbduck.sh, which contains what you need.
It would be nice if you could give us the tool, command and parameters; that would make the answer clearer.
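Since a command and parameters were asked for, here is a minimal sketch of an adapter-trimming call, written as a small Python helper that only builds the command line. The script name used is bbduk.sh (the actual spelling), and the flags follow the adapter-trimming example in the BBDuk usage guide. The file names and the path to adapters.fa are placeholders, and the values of k, mink and hdist are starting points rather than settings tuned for any particular dataset.

```python
import shlex


def build_bbduk_command(in1, in2, out1, out2, adapters="adapters.fa"):
    """Return an argv list for paired-end adapter trimming with bbduk.sh.

    All file names are placeholders; 'adapters' should point at the
    adapters.fa file shipped in the BBMap "resources" directory.
    """
    return [
        "bbduk.sh",
        f"in1={in1}",
        f"in2={in2}",
        f"out1={out1}",
        f"out2={out2}",
        f"ref={adapters}",  # FASTA file with adapter sequences
        "ktrim=r",          # trim the matching k-mer and everything to its right
        "k=23",             # k-mer length used for matching
        "mink=11",          # allow shorter k-mers at the very end of a read
        "hdist=1",          # tolerate one mismatch per k-mer
        "tpe",              # trim both reads of a pair to the same length
        "tbo",              # also trim adapters detected through pair overlap
    ]


cmd = build_bbduk_command(
    "reads_R1.fastq.gz", "reads_R2.fastq.gz",
    "clean_R1.fastq.gz", "clean_R2.fastq.gz",
)
print(shlex.join(cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True) on a machine where bbduk.sh is on the PATH.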
The program is actually called bbduk.sh. The BBMap suite also includes an adapters.fa file (in the "resources" directory of the source) that contains common commercial adapter sequences, so you don't need to search for dodgy sources on the net or type the sequences in by hand.
According to the documentation the author wrote, which will take you to the BBMap website (maybe I am mistaken about it).
"duk" stands for "decontamination using k-mers". You wrote bbduck.sh in the post above (a mistake I sometimes make myself). I was just correcting the program name. All BBTools are included in a single download.
You are totally right, I did not see this "c" (my bad :) ).
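To illustrate what "decontamination using k-mers" means in practice, here is a toy sketch of right-side k-mer trimming. It is not BBDuk's implementation: it uses exact matching only, with no mismatches, no shortened k-mers at read ends and no reverse complements. The adapter and read sequences below are made up for the example.

```python
def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def ktrim_right(read, adapter_kmers, k):
    """Cut the read at the leftmost k-mer that also occurs in an adapter.

    Everything from that k-mer to the 3' end is removed. If no k-mer
    matches, the read is returned unchanged.
    """
    for i in range(len(read) - k + 1):
        if read[i:i + k] in adapter_kmers:
            return read[:i]
    return read


k = 8
adapter = "AGATCGGAAGAGC"          # example adapter fragment
adapter_kmers = kmers(adapter, k)

insert = "ACGTACGTTTGCA"           # the part of the read we want to keep
read = insert + adapter + "ACAC"   # read that runs into the adapter

trimmed = ktrim_right(read, adapter_kmers, k)
print(trimmed)
```

A read with no adapter k-mer passes through unchanged, which is why a reference file that lacks the adapter actually present in the library leaves the reads untrimmed.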
@hed.robin Yes. FastQC uses a database kept inside its Configuration directory. Look for the files adapter_list.txt and contaminant_list.txt.
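If you want to reuse that list with a trimming tool, a small conversion to FASTA is usually enough. The sketch below assumes the adapter_list.txt layout of comment lines starting with "#" and entries made of a name, one or more tabs, and a sequence; check your own copy before relying on it. The sample text is an inline stand-in for the real file. Note also that these entries are short fragments meant for detection, so a trimmer may need a small k or the full-length adapter sequences instead.

```python
def adapter_list_to_fasta(text):
    """Convert FastQC-style 'name<TAB(s)>sequence' lines to FASTA text.

    Blank lines and lines starting with '#' are skipped. The assumed
    layout (tab-separated, sequence in the last field) should be checked
    against the actual file.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Several tabs may separate the name from the sequence
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 2:
            continue  # not a name/sequence pair
        name, seq = fields[0], fields[-1]
        records.append(f">{name.replace(' ', '_')}\n{seq}\n")
    return "".join(records)


# Inline stand-in for the contents of adapter_list.txt
sample = (
    "# Example adapter list\n"
    "\n"
    "Illumina Universal Adapter\t\t\t\tAGATCGGAAGAGC\n"
    "Nextera Transposase Sequence\t\t\tCTGTCTCTTATA\n"
)

fasta = adapter_list_to_fasta(sample)
print(fasta, end="")
```

With a real installation you would read the file from the Configuration directory and write the result to a .fa file that the trimmer can take as its adapter reference.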