Upstream (5′) Coverage Depletion Observed in IGV
0
0
Entering edit mode
3 hours ago
safeassli ▴ 10

Hello, I’m hoping for some guidance. I’m working with ONT reads and want to randomly select 7,000 reads from a fastq.gz, keeping only those with Phred > 15. I use Chopper for quality filtering, a fixed random.seed for reproducibility, and reservoir sampling to draw the 7000 reads.

After mapping this subset to a reference with minimap2 and viewing the alignments in IGV, I observe what looks like a “funnel” pattern in the pileup (pic 1). To address this, I added a minimum-coverage filter for read acceptance and set it to 70; that produced pic 2, but some reads still don’t align end-to-end. Increasing mincov to 90 gave pic 3.

Because many reads pass Phred > 15, I sometimes choose 7000–12000 reads to keep runtime reasonable. However, when the resulting alignments don’t fully cover the reference, I worry I’m losing information. Is this expected?

Basecalling was performed for all Nanopore runs with Guppy version 3.2.9 or 4.3.4

In my scriprt I dont trim reads.

I’m puzzled because, using EPI2ME with only Phred > 15 and reads = 7000, the BAM it produces does not show the same profile in IGV; specifically, I do not see the left-flank (5′) coverage drop-off there.

PS: My ultimate goal is variant calling.

This might be a basic question, but I’m a bit lost. Any insights would be appreciated. I’m hoping for some guidance.

pic1:enter image description here

pic2enter image description here

pic3enter image description here

igv ont • 117 views
ADD COMMENT
0
Entering edit mode

Minimum-coverage logic:

For each read, I iterate over minimap2 hits and accept the read if at least one hit has coverage_read >= min_coverage, where:

aligner = mp.Aligner(reference_genome, preset="map-ont")

coverage_read_percent = (hit.q_en - hit.q_st) / len(read) * 100

hit.q_st, hit.q_en: start/end of the aligned segment on the read (query coordinates) len(read): total read length from the FASTQ

ADD REPLY
0
Entering edit mode

What locus are we even looking at?

ADD REPLY
0
Entering edit mode

thank you for replying it's cytochrome b gene

ADD REPLY
0
Entering edit mode

I’m puzzled because, using EPI2ME with only Phred > 15 and reads = 7000, the BAM it produces does not show the same profile in IGV; specifically, I do not see the left-flank (5′) coverage drop-off there.

With the same 7000 reads you select that you also manually align with minimap2? Are you using default parameters for alignment? This is a multi-copy gene?

ADD REPLY
0
Entering edit mode

yes , I’m using minimap2 with the default -x map-ont preset (no extra flags beyond read-group), then filtering with samtools view -F 2304 to drop secondary and supplementary alignments. Then I sort and index

ADD REPLY
0
Entering edit mode

It looks like ONT is using sligntly different pre-sets for their EPI2ME protocols. See the post explaining the change here --> https://epi2me.nanoporetech.com/how-to-align/ See if you are able to get similar results if you switch.

Is this data from whole genome sequencing or are you using selective sequencing to just get cyt b gene. Are you using the complete genome as reference?

ADD REPLY

Login before adding your answer.

Traffic: 4034 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6