How do unique mappers become concordant pairs aligned >1 times
3.5 years ago
bertb ▴ 20

Hello,

I prepared .sam files from PE sequencing results as follows:

hisat2 -p 8 --rg-id=UWN_t3 --rg SM:UWN_t3 --rg LB:UWN_t3 --rg PL:ILLUMINA --rg PU:CE9PNANXX.8 -x $RNA_REF_INDEX --dta --rna-strandness RF -1 $RNA_DATA_DIR/trimmed/UW_N_Mix.trimmed_1.fastq.gz -2 $RNA_DATA_DIR/trimmed/UW_N_Mix.trimmed_2.fastq.gz -S ./UWN_t3.sam
#output
43288187 reads; of these:
  43288187 (100.00%) were paired; of these:
    9023789 (20.85%) aligned concordantly 0 times
    16704076 (38.59%) aligned concordantly exactly 1 time
    17560322 (40.57%) aligned concordantly >1 times
    ----
    9023789 pairs aligned concordantly 0 times; of these:
      3361606 (37.25%) aligned discordantly 1 time
    ----
    5662183 pairs aligned 0 times concordantly or discordantly; of these:
      11324366 mates make up the pairs; of these:
        9894957 (87.38%) aligned 0 times
        731747 (6.46%) aligned exactly 1 time
        697662 (6.16%) aligned >1 times
88.57% overall alignment rate

At that point I remembered that my organism (yeast) does not have any introns larger than 2500 bp, so I added the option --max-intronlen 2500 to the command and got the following output:

hisat2 -p 8 --rg-id=UWN_t4 --rg SM:UWN_t4 --rg LB:UWN_t4 --rg PL:ILLUMINA --rg PU:CE9PNANXX.8 --max-intronlen 2500 -x $RNA_REF_INDEX --dta --rna-strandness RF -1 $RNA_DATA_DIR/trimmed/UW_N_Mix.trimmed_1.fastq.gz -2 $RNA_DATA_DIR/trimmed/UW_N_Mix.trimmed_2.fastq.gz -S ./UWN_t4.sam
43288187 reads; of these:
  43288187 (100.00%) were paired; of these:
    20057896 (46.34%) aligned concordantly 0 times
    5856233 (13.53%) aligned concordantly exactly 1 time
    17374058 (40.14%) aligned concordantly >1 times
    ----
    20057896 pairs aligned concordantly 0 times; of these:
      3360282 (16.75%) aligned discordantly 1 time
    ----
    16697614 pairs aligned 0 times concordantly or discordantly; of these:
      33395228 mates make up the pairs; of these:
        9894750 (29.63%) aligned 0 times
        725032 (2.17%) aligned exactly 1 time
        22775446 (68.20%) aligned >1 times
88.57% overall alignment rate

What I mainly notice is that the 'aligned concordantly exactly 1 time' category has dropped by ~25 percentage points (38.59% to 13.53%). Those pairs have moved to the 'aligned concordantly 0 times' category and, within that, their mates now fall under 'aligned >1 times' (6.16% to 68.20%).
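To make the shift concrete, here is a small sketch that subtracts the first summary from the second. The counts are copied from the two HISAT2 outputs above; the dictionary keys (conc_0, mates_multi, and so on) are just labels I made up for illustration.

```python
# Counts copied from the two HISAT2 summaries above.
# Key names are hypothetical labels, not HISAT2 terminology.
TOTAL_PAIRS = 43288187

run_t3 = {  # without --max-intronlen
    "conc_0": 9023789,
    "conc_1": 16704076,
    "conc_multi": 17560322,
    "mates_0": 9894957,
    "mates_1": 731747,
    "mates_multi": 697662,
}
run_t4 = {  # with --max-intronlen 2500
    "conc_0": 20057896,
    "conc_1": 5856233,
    "conc_multi": 17374058,
    "mates_0": 9894750,
    "mates_1": 725032,
    "mates_multi": 22775446,
}

# Change in each category between the two runs.
delta = {key: run_t4[key] - run_t3[key] for key in run_t3}

for key, change in delta.items():
    print(f"{key:12s} {change:+12d}")

# Pairs lost from 'concordantly exactly 1 time', as a share of all pairs.
lost_unique_pct = 100 * -delta["conc_1"] / TOTAL_PAIRS
print(f"unique concordant pairs lost: {lost_unique_pct:.2f}% of all pairs")

# Mates gained in 'aligned >1 times' per pair gained in 'concordantly 0 times'.
mates_per_pair = delta["mates_multi"] / delta["conc_0"]
print(f"multi-mapped mates gained per newly non-concordant pair: {mates_per_pair:.3f}")
```

The three concordant categories change by amounts that cancel out exactly, and the gain in 'aligned >1 times' mates is almost exactly two mates per pair that left the concordant categories, which is the pattern I am asking about.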

My question: I understand how limiting the maximum intron length would move pairs into the 'aligned concordantly 0 times' category, but I don't understand why the majority of their mates are then reported as 'aligned >1 times', since those pairs aligned 'exactly 1 time' before the limit was applied.
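One way I could check this per read is to compare how each read name is reported in the two SAM files. Below is a minimal sketch, assuming that the proper-pair bit (0x2) in the SAM flag marks a concordant pair and that the NH:i tag gives the number of reported locations, as I understand the HISAT2 manual to describe. The function names, the category labels and the example records are all hypothetical; on the real files the dictionaries would be large, so a subsample is probably more practical.

```python
from collections import Counter

def classify_pairs(sam_lines):
    """Map read name -> (concordant?, NH), combined over all records of that name."""
    status = {}
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        nh = 0  # stays 0 when no NH tag is present (e.g. unaligned mate)
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        concordant = bool(flag & 0x2)  # proper-pair bit
        prev_conc, prev_nh = status.get(name, (False, 0))
        status[name] = (prev_conc or concordant, max(prev_nh, nh))
    return status

def label(entry):
    """Turn (concordant?, NH) into a readable category (labels are illustrative)."""
    concordant, nh = entry
    if nh == 0:
        return "unaligned"
    pairing = "concordant" if concordant else "not concordant"
    return f"{pairing}, NH{'=1' if nh == 1 else '>1'}"

def transitions(before, after):
    """Count how reads move between categories from one run to the other."""
    counts = Counter()
    for name, entry in before.items():
        if name in after:
            counts[(label(entry), label(after[name]))] += 1
    return counts

# Hypothetical records, only to show the expected input shape.
example_t3 = [
    "@HD\tVN:1.0",
    "r1\t99\tchrI\t100\t60\t4M\t=\t300\t204\tACGT\tFFFF\tNH:i:1",
    "r1\t147\tchrI\t300\t60\t4M\t=\t100\t-204\tACGT\tFFFF\tNH:i:1",
]
example_t4 = [
    "r1\t65\tchrI\t100\t1\t4M\tchrII\t500\t0\tACGT\tFFFF\tNH:i:3",
    "r1\t129\tchrII\t500\t1\t4M\tchrI\t100\t0\tACGT\tFFFF\tNH:i:3",
]

example_counts = transitions(classify_pairs(example_t3), classify_pairs(example_t4))
for (was, now), n in example_counts.items():
    print(f"{was} -> {now}: {n}")
```

For the real data, the input would be the lines of UWN_t3.sam and UWN_t4.sam (for example via open("UWN_t3.sam")), and the interesting row would be the reads going from "concordant, NH=1" to "not concordant, NH>1".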

Thanks in advance to anybody who can help!

RNA-Seq alignment • 682 views