I am trying to learn DNA sequencing analysis using the GIAB datasets [Link to Chinese trio - HG005_NA24631_son/]. The readme file provided with the data does not list the adapter sequences used. I tried
bbmerge.sh, as described in this Biostars post:
bbmerge.sh in1=r1.fq in2=r2.fq outa=adapters.fa
It identified the following as adapter sequences in the input file.
This matches the Illumina TruSeq adapter.
I used both of these sequences for adapter trimming with cutadapt:
cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTT -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT -m 20 -q 30 -o HG005_R1_trimmed.fastq.gz -p HG005_R2_trimmed.fastq.gz MPHG005_S4_L004_R1_001.fastq.gz MPHG005_S4_L004_R2_001.fastq.gz
However, the adapter content check still fails in FastQC.
Since the readme mentions that the library was prepared with the Nextera Mate Pair Sample Preparation Kit, I am now trying the Nextera Mate Pair adapter sequence from the Illumina support docs.
If the Nextera sequence works, does that mean the
bbmerge.sh approach is wrong, or am I doing something wrong with cutadapt? Any help would be appreciated.
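One way to check for leftover adapter sequence independently of FastQC is to count the reads that still contain an adapter prefix after trimming. The following is a minimal sketch, not part of the original workflow: the function name `count_adapter_reads` is hypothetical, and it only finds exact matches, so adapters containing sequencing errors or truncated to less than the prefix at the read end are missed.

```shell
# Count reads whose sequence line contains a given adapter prefix.
# Usage: count_adapter_reads <reads.fastq.gz> <adapter_prefix>
# (hypothetical helper; exact substring matching only)
count_adapter_reads() {
    fastq_gz="$1"
    adapter="$2"
    # FASTQ records span 4 lines; line 2 of each record is the sequence.
    gzip -cd "$fastq_gz" |
        awk -v a="$adapter" 'NR % 4 == 2 && index($0, a) > 0 { n++ } END { print n + 0 }'
}

# Example call, using the prefix shared by both sequences in the cutadapt command:
# count_adapter_reads HG005_R1_trimmed.fastq.gz AGATCGGAAGAGC
```

Running the same count with the Nextera Mate Pair sequence from the Illumina documentation, before and after trimming, would show which adapter is actually present in the reads.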
The BBMerge approach is not wrong. You identified and removed the primary adapter. A FastQC "failure" is not a reason to stop, and you do not need a "pass" in every FastQC category. Aligners will take care of any extraneous sequence remaining at this point, since they soft-clip the parts of reads that do not match the reference.
Note: You could also have used
bbduk.sh from BBTools to do the same trimming operation.
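As a rough illustration of that suggestion, the sketch below wraps a paired-end BBDuk adapter-trimming call in a function. The function name `trim_with_bbduk` and the output naming are hypothetical; the parameter values follow commonly documented BBDuk adapter-trimming settings and are assumptions, not settings given in this thread. It covers adapter removal and a minimum length of 20 only, not the `-q 30` quality trimming from the cutadapt command.

```shell
# Sketch: paired-end adapter trimming with BBDuk (hypothetical wrapper).
# Usage: trim_with_bbduk <r1.fq.gz> <r2.fq.gz> <adapters.fa> <out_prefix>
trim_with_bbduk() {
    r1="$1"; r2="$2"; adapters="$3"; prefix="$4"
    # ktrim=r       : trim the adapter match and everything to its right (3' end)
    # k=23 mink=11  : k-mer length, with shorter k-mers allowed at read ends
    # hdist=1       : allow one mismatch per k-mer
    # tpe tbo       : trim both mates to equal length; also trim by pair overlap
    # minlen=20     : discard reads shorter than 20 bases after trimming
    bbduk.sh in1="$r1" in2="$r2" \
        out1="${prefix}_R1_trimmed.fastq.gz" out2="${prefix}_R2_trimmed.fastq.gz" \
        ref="$adapters" ktrim=r k=23 mink=11 hdist=1 tpe tbo minlen=20
}

# Example call, using the adapters.fa written by bbmerge.sh (outa=adapters.fa):
# trim_with_bbduk MPHG005_S4_L004_R1_001.fastq.gz MPHG005_S4_L004_R2_001.fastq.gz adapters.fa HG005
```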
Hi GenoMax, thanks for the reply. I ran cutadapt and FastQC again on the same sample, this time with the Nextera Mate Pair adapters, and FastQC passed all checks, including adapter content.
I am curious (and very confused) about which adapter sequences to use for trimming in this case, as
bbmerge.sh seemed to suggest the Illumina TruSeq adapter, whereas the readme says Nextera Mate Pair sequencing, and trimming with the Nextera Mate Pair adapter sequence passed the FastQC adapter content check.
I used cutadapt because the tutorial I was following used it.
Thanks in advance!