Question

All sequenced reads in paired-end sequencing don't have mate pairs?

4.6 years ago

piyushjo ▴ 710

Hi,

I am using the new HISAT2 v2.2.0 to align paired-end RNA-seq data. The alignment report suggests that not all of the reads were paired, which confuses me: I thought all reads in PE sequencing have mates. The new HISAT2 summary looks like this.

HISAT2 summary stats:

    Total pairs: 76700832
            Aligned concordantly or discordantly 0 time: 3686808 (4.81%)
            Aligned concordantly 1 time: 60966514 (79.49%)
            Aligned concordantly >1 times: 11843091 (15.44%)
            Aligned discordantly 1 time: 204419 (0.27%)
    Total unpaired reads: 7373616
            Aligned 0 time: 3934307 (53.36%)
            Aligned 1 time: 2579590 (34.98%)
            Aligned >1 times: 859719 (11.66%)
    Overall alignment rate: 97.44%

The old HISAT2 summary, in contrast, used to look like this (taken from the HISAT2 website):

Alignment summary (not for the same data, just want to show it used to say 100% of reads were paired end)

10000 reads; of these:
  10000 (100.00%) were paired; of these:
    650 (6.50%) aligned concordantly 0 times
    8823 (88.23%) aligned concordantly exactly 1 time
    527 (5.27%) aligned concordantly >1 times
    ----
    650 pairs aligned concordantly 0 times; of these:
      34 (5.23%) aligned discordantly 1 time
    ----
    616 pairs aligned 0 times concordantly or discordantly; of these:
      1232 mates make up the pairs; of these:
        660 (53.57%) aligned 0 times
        571 (46.35%) aligned exactly 1 time
        1 (0.08%) aligned >1 times
96.70% overall alignment rate

By my calculation, almost 9% of my reads are now reported as not paired. Is that normal? Did older HISAT2 versions discard unpaired reads before alignment?

HISAT2 paired end sequencing • 1.3k views

ADD COMMENT • link updated 4.6 years ago by h.mon 35k • written 4.6 years ago by piyushjo ▴ 710

Did you manipulate the fastq files somehow? Like trimming or any custom kind of filtering?

ADD REPLY • link 4.6 years ago by ATpoint 85k

No. I just ran FastQC before aligning and didn't perform any trimming. The second summary is actually not for the same data. I will clarify that in the question.

This unrelated post also has an example where the read summary is separated into "paired end reads" and "unpaired end reads".

hisat2 --sra-acc with paired reads producing single read output

ADD REPLY • link 4.6 years ago by piyushjo ▴ 710

score 1 · Answer 1 · 2020-03-09

You should run the same sample with both versions of HISAT2 and then compare the summary output; that would be less confusing. It seems to me this is just a change in terminology in the summary output. In HISAT2 2.2.0 there is a section called unpaired reads:

Total unpaired reads: 7373616
        Aligned 0 time: 3934307 (53.36%)
        Aligned 1 time: 2579590 (34.98%)
        Aligned >1 times: 859719 (11.66%)

In HISAT2 2.1.0, this corresponds to the following section:

616 pairs aligned 0 times concordantly or discordantly; of these:
  1232 mates make up the pairs; of these:
    660 (53.57%) aligned 0 times
    571 (46.35%) aligned exactly 1 time
    1 (0.08%) aligned >1 times

The term unpaired read wasn't explicitly used, but the same information was present in the summary output.
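One way to check this reading is to redo the arithmetic on the numbers posted above. The sketch below is only an illustration: the function and variable names are made up for this post, and it assumes that "unpaired reads" in the new summary means the individual mates of pairs that did not align as a pair, and that the overall rate is computed over all mates (two per pair).

```python
def mates_and_rate(total_pairs, pairs_not_aligned_as_pair, mates_aligned_0):
    """Return (mates from pairs that failed to align as a pair,
    overall alignment rate in percent, rounded to 2 decimals).

    Assumption: every pair contributes two mates, and a mate only counts
    as unaligned if it also failed to align on its own.
    """
    mates = 2 * pairs_not_aligned_as_pair
    rate = 100 * (1 - mates_aligned_0 / (2 * total_pairs))
    return mates, round(rate, 2)

# HISAT2 2.2.0 summary from the question:
# 76700832 pairs, 3686808 aligned concordantly or discordantly 0 time,
# 3934307 unpaired reads aligned 0 time.
new = mates_and_rate(76_700_832, 3_686_808, 3_934_307)

# Old-style example summary:
# 10000 pairs, 616 aligned 0 times concordantly or discordantly,
# 660 mates aligned 0 times.
old = mates_and_rate(10_000, 616, 660)

print(new)  # compare with "Total unpaired reads" and "Overall alignment rate"
print(old)  # compare with "1232 mates make up the pairs" and "96.70%"
```

With the posted numbers, twice the count of pairs that did not align as a pair equals the reported "Total unpaired reads" (7373616), just as 2 × 616 gives the 1232 mates in the old summary, and the overall alignment rates come out the same as reported. So the "unpaired" reads appear to be mates of real pairs that were aligned individually, not reads that lacked a mate in the input.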

Again, you will get a better picture by comparing the summary of the same sample, instead of comparing two different samples, each run with a different version of hisat.