Question

Closed: After UMI extraction, observing weird alignment percentage

5.4 years ago

Rituriya ▴ 40

Hi All,

Background: I have completed adapter trimming and QC on Illumina NextSeq miRNA single-end reads of length 75 bp. I wanted to run umi_tools to extract the UMI information before aligning the reads to the reference. UMI extraction failed at first, and I was then advised to run UMI extraction first and adapter trimming second. That worked fine!

Current scenario: When I run UMI extraction first and adapter trimming afterwards, there is a long trail of N's (approx. 20) at the end of every read. To check whether UMI extraction had changed something drastically, I redid adapter trimming more thoroughly and then ran UMI extraction on the trimmed reads.

Cutadapt command:

cutadapt -a AACTGTAGGCACCATCAAT -g GTTCAGAGTTCTACAGTCCGACGATC --discard-untrimmed --minimum-length=15 -a AGATCGGAAGAG -e 0.1 -o XYZ-3adapters-trimmed.fastq.gz ../XYZ.fastq

UMI extraction command:

umi_tools extract --stdin=XYZ-3adapters-trimmed.fastq --bc-pattern=NNNNNNNNNNNN -L XYZ-UMIextract.log --stdout=XYZ-3adapters-trimmed-UMIextracted.fastq
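
To make the effect of this pattern concrete, here is a minimal Python sketch of what a 12-N barcode pattern does to a single read, assuming the umi_tools default of taking the barcode from the 5' end of the read and appending it to the read name. The function name `extract_prefix_umi` and the example read are hypothetical, and returning `None` for too-short reads is a choice made for this sketch; it is not the umi_tools implementation.

```python
def extract_prefix_umi(name, seq, qual, pattern="NNNNNNNNNNNN"):
    """Move the first len(pattern) bases of a read into its name.

    Simplified illustration of a pattern made only of N (UMI) positions
    applied to the 5' end of the read.
    """
    n = len(pattern)
    if len(seq) < n:
        # Assumption for this sketch: reads shorter than the pattern are skipped.
        return None
    umi = seq[:n]
    # The UMI is appended to the read name; sequence and qualities lose n bases.
    return f"{name}_{umi}", seq[n:], qual[n:]


# Hypothetical 23 bp read (the average length reported after adapter trimming)
name, seq, qual = "read1", "TGAGGTAGTAGGTTGTATAGTTA", "I" * 23
new_name, new_seq, new_qual = extract_prefix_umi(name, seq, qual)
print(new_name, new_seq, len(new_seq))
```

Under these assumptions, whatever sits in the first 12 positions of the already trimmed read is treated as the UMI, whether or not it actually is one.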

This is the situation I am asking about:

Situation 1: Adapter-trimmed reads only. PhiX contamination (reads aligned with Bowtie1) = 0.03%. Average read length: 23 bp

Situation 2: Adapter-trimmed reads after UMI extraction. PhiX contamination (reads aligned with Bowtie1) = 78.72% (mind-bogglingly high). Average read length: 11 bp

I no longer see many trailing N's in my data, but this level of PhiX contamination is bothering me. Please help me understand this behaviour. I know the reads become shorter after UMI extraction, but this change is too drastic and suggests that something is wrong with the UMI extraction step. I confirmed with the wet-lab technician that 10% PhiX was spiked in, as expected.
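
The drop in average length from 23 bp to 11 bp is what removing a 12-base barcode from each read would produce (23 - 12 = 11). A minimal Python sketch for checking the read-length distribution of a FASTQ file before and after each step is shown here; the function names and the in-memory demo records are hypothetical, and the parser assumes plain four-line FASTQ records.

```python
import io
from collections import Counter


def read_length_counts(handle):
    """Count reads per sequence length in a four-line-per-record FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            counts[len(line.strip())] += 1
    return counts


def mean_length(counts):
    """Average read length from a {length: number_of_reads} mapping."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(length * n for length, n in counts.items()) / total


# Hypothetical two-read FASTQ held in memory instead of a real file
demo = io.StringIO(
    "@r1\nACGTACGTACG\n+\nIIIIIIIIIII\n"
    "@r2\nACGTACGTACGTA\n+\nIIIIIIIIIIIII\n"
)
counts = read_length_counts(demo)
print(dict(counts), mean_length(counts))
```

For a real file, the same functions can be applied to `open("XYZ-3adapters-trimmed-UMIextracted.fastq")`, or to `gzip.open(path, "rt")` for compressed input.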

Thank you, Rituriya.

umi_tools PhiX alignment cutadapt • 190 views
