Oxford Nanopore Single-Molecule Amplicon-Seq Phasing
19 months ago
KoppesEA ▴ 80

I am working on a research project to define novel variants in a compact gene of interest from patient samples using long-read, single-molecule Oxford Nanopore amplicon sequencing. A ~6.1 kb fragment was cleanly isolated by PCR and sent to Plasmidsaurus for amplicon sequencing. The summary files show SNPs that are clearly heterozygous, indicating two alleles for each patient. I would now like to phase the variants to determine with certainty whether each pair of SNPs lies in trans (on opposing alleles) or in cis (on the same allele).

As background, I am a molecular biologist, self-trained in command-line tools and competent with Illumina short-read RNA-Seq and WGS. However, I have no experience with single-molecule long-read sequencing.

If a knowledgeable expert in the community could direct me to an optimal pipeline, starting from raw .fastq reads, for QC, trimming (if necessary?), alignment to the reference PCR amplicon sequence, and phasing, that would be fantastic. Given some direction on which tools to use I can probably figure out the command line, although any pointers on critical command-line options are appreciated. I will post coding problems below if issues arise.

I’ve considered a partial solution: use a grep text search to extract fastq reads that a) contain the amplicon and b) can be separated based on SNPs. But I know a more elegant solution must exist. My other option is to clone each allele into a plasmid and sequence enough clones to recover each allele separately.
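
For reference, the grep idea above could look roughly like the sketch below. It is only a naive illustration, not a real phaser: it assumes one SNP, exact matches to a short flanking sequence, and forward-strand reads only, so reads in the reverse orientation or with a sequencing error near the SNP are silently dropped. The function name `split_by_snp` and all arguments are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Naive allele split for ONE SNP (illustrative sketch only).
# Usage: split_by_snp reads.fastq LEFT_FLANK REF_BASE ALT_BASE RIGHT_FLANK out_prefix
# The flanks are short unique sequences immediately left and right of the SNP.
# Assumptions: forward-strand reads, exact matching, no reverse-complement search.
split_by_snp() {
    local fastq=$1 left=$2 ref=$3 alt=$4 right=$5 prefix=$6
    awk -v refk="${left}${ref}${right}" -v altk="${left}${alt}${right}" \
        -v refout="${prefix}.ref.fastq" -v altout="${prefix}.alt.fastq" '
        # Buffer each 4-line FASTQ record.
        { rec[(NR - 1) % 4] = $0 }
        # On the 4th line, route the whole record by the allele-specific k-mer it contains.
        NR % 4 == 0 {
            seq = toupper(rec[1])
            out = ""
            if (index(seq, refk))      out = refout
            else if (index(seq, altk)) out = altout
            if (out != "")
                printf "%s\n%s\n%s\n%s\n", rec[0], rec[1], rec[2], rec[3] > out
        }' "$fastq"
}
```

Reads matching neither k-mer are written nowhere, so comparing the two output counts against the input count gives a rough idea of how many reads this approach loses.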

Thanks in advance, --EK

SNP Long-Read Phasing Oxford-Nanopore Single-Molecule • 1.1k views
module load minimap2/2.24
module load gcc/8.2.0
module load samtools/1.14   # note: Clair3 lists samtools 1.15 as a dependency
module load clair3/0.1-r12

inFolder=./raw_amplicon_fastq
outFolder=./phased_amplicons
refGenome=./Chr17Ref/Homo_sapiens.GRCh38.dna.chromosome.17.fa

# Start from a clean output folder
rm -rf "$outFolder"
mkdir -p "$outFolder"

# Clair3 expects the reference FASTA to be samtools-indexed
samtools faidx "$refGenome"

for fastq_file in "$inFolder"/*.fastq
do
    samplename=$(basename "$fastq_file" .fastq)
    outDir=$outFolder/$samplename
    echo "Processing $samplename -> $outDir"
    mkdir -p "$outDir"

    # Align with the Nanopore preset and sort directly (no intermediate SAM/BAM files)
    minimap2 -ax map-ont "$refGenome" "$fastq_file" \
        | samtools sort -O BAM -o "$outDir/${samplename}_sorted.bam" -
    samtools index -b "$outDir/${samplename}_sorted.bam"

    # Call variants and phase them
    run_clair3.sh \
        --bam_fn="$outDir/${samplename}_sorted.bam" \
        --ref_fn="$refGenome" \
        --output="$outDir" \
        --threads=4 \
        --platform=ont \
        --model_path=./r941_prom_sup_g5014/ \
        --enable_phasing

    # Append the sample name and its phased calls to one summary file.
    # This is for eyeballing only: with repeated headers it is not a valid VCF.
    echo "$samplename" >> "$outFolder/mygene_phaseSummary.vcf"
    gunzip -c "$outDir/phased_merge_output.vcf.gz" >> "$outFolder/mygene_phaseSummary.vcf"
done

This is what I came up with so far: map the PCR amplicon reads to the relevant chromosome, then call and phase the alleles with Clair3.
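
To read the cis/trans answer out of a phased VCF, something like the sketch below may help. It assumes a single-sample VCF in which phased genotypes are written as `0|1` or `1|0` and the phase set is stored in a `PS` FORMAT field; the function name `list_phased_hets` is made up for this example.

```shell
#!/usr/bin/env bash
# List phased heterozygous calls as: POS REF ALT GT PS
# Usage: list_phased_hets phased.vcf
#        gunzip -c phased.vcf.gz | list_phased_hets -
# Assumes a single-sample VCF (sample in column 10) with GT and, optionally, PS.
list_phased_hets() {
    awk -F'\t' '
        /^#/ { next }                       # skip header lines
        {
            # Find GT and PS by name in the FORMAT column (9) and read them from the sample column (10).
            nf = split($9, key, ":"); split($10, val, ":")
            gt = ""; ps = "."
            for (i = 1; i <= nf; i++) {
                if (key[i] == "GT") gt = val[i]
                if (key[i] == "PS") ps = val[i]
            }
            # Keep only phased heterozygous genotypes.
            if (gt == "0|1" || gt == "1|0") print $2, $4, $5, gt, ps
        }' "${1:--}"
}
```

Within one phase set (same PS value), two SNPs with the same orientation (both `0|1` or both `1|0`) have their ALT alleles in cis, and SNPs with opposite orientations are in trans. Calls left as `0/1` were not phased and are not listed.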

19 months ago
GenoMax 141k

For QC: pycoQC (LINK) (you will need the sequencing summary file from the Nanopore run) and NanoPlot (LINK).
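
Before running the full QC tools, a quick read-length check can already show whether most reads span the whole amplicon. The sketch below is a rough stand-in, not a replacement for pycoQC or NanoPlot; the function name `fastq_length_summary` and the length window are assumptions (for a ~6.1 kb amplicon, something like 5500 to 6500 might be a reasonable starting point).

```shell
#!/usr/bin/env bash
# Count reads, report mean length, and count reads inside an expected length window.
# Usage: fastq_length_summary reads.fastq MIN_LEN MAX_LEN
# Assumes a plain 4-line-per-record FASTQ (no wrapped sequence lines).
fastq_length_summary() {
    local fastq=$1 min=$2 max=$3
    awk -v min="$min" -v max="$max" '
        NR % 4 == 2 {                       # sequence line of each record
            n++; len = length($0); total += len
            if (len >= min && len <= max) inwin++
        }
        END {
            if (n == 0) { print "reads=0"; exit }
            printf "reads=%d mean_len=%.0f in_window=%d\n", n, total / n, inwin
        }' "$fastq"
}
```

A low `in_window` fraction would suggest many truncated or chimeric reads, which is worth knowing before phasing.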

For Alignments: minimap2 https://github.com/lh3/minimap2

If you want to filter for reads that contain the amplicon, then bbduk.sh should work. A guide is available: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/

Nanopore seems to have a phasing workflow (no personal experience): https://nanoporetech.com/resource-centre/snvs-and-phasing-workflow


Thanks! I appreciate the guidance and I'm looking into your suggestions. Will update when I start making some progress.

Following from the Nanopore link it looks like WhatsHap (https://whatshap.readthedocs.io/en/latest/) or something like Clair3 (https://github.com/HKU-BAL/Clair3) might be necessary for phasing.
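
If WhatsHap turns out to be the route, the command sequence might look something like the sketch below. This is untested on my data and only a starting point: the function `whatshap_commands` is hypothetical, all file names are placeholders, and it prints the commands instead of running them so they can be reviewed first. `--ignore-read-groups` is included on the assumption that the BAM has no read-group tags.

```shell
#!/usr/bin/env bash
# Print (not run) a possible WhatsHap phase + haplotag command sequence.
# Usage: whatshap_commands ref.fa calls.vcf.gz sorted.bam out_prefix
# All file names are placeholders; review the output, then pipe it to bash if it looks right.
whatshap_commands() {
    local ref=$1 vcf=$2 bam=$3 prefix=$4
    cat <<EOF
whatshap phase --ignore-read-groups --reference ${ref} -o ${prefix}.phased.vcf ${vcf} ${bam}
bgzip -f ${prefix}.phased.vcf
tabix -p vcf ${prefix}.phased.vcf.gz
whatshap haplotag --ignore-read-groups --reference ${ref} -o ${prefix}.haplotagged.bam ${prefix}.phased.vcf.gz ${bam}
EOF
}
```

The haplotagged BAM carries a haplotype tag per read, so the two alleles can be coloured or grouped separately in a viewer such as IGV.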


I have used Clair3 for SNP analysis and it is good.
