Bowtie2 reads alignment question
5.6 years ago
afli ▴ 190

Hi, I'm using bowtie2 to align reads to a genome. When I build the index from chr1~chr12 only and then map, the read pair 'V100002715L1C001R026000072' is assigned to chr6 with a good mapping quality (MAPQ 33). But if I add chrC and chrM, build a new index, and map again, the same pair is assigned to chrM with a much lower mapping quality of 1. The command line I use is:

bowtie2 -x bowtie2_index -1 read_1.fa.gz -2 read_2.fa.gz --very-sensitive-local -p 10 -S result.sam

The full SAM records for the 'V100002715L1C001R026000072' read pair are below. For the index that includes chrM and chrC:

V100002715L1C001R026000072  83  ChrM    193580  1   50M =   193461  -169    TTGTTTTTCTTGTTCTTCTTTCTCGAAGAGATGGGTGCACCGCCTTGGAG  7G@F4>FFG=FBFBFG*FBEF5AF?FFEEE?CD1D@CED=GFFGFG<FFF  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  163 ChrM    193461  1   50M =   193580  169 GGACAATGGTTTTCTAGGTTGTTTCACCAATCTGTTGAATTGGAATGGAG  D<AFEEBEFF8C>FFF/BCF>?FFFGFGCEFCFFFF6FE9FGF>F>FF>9  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP

For the index with only chr1~chr12:

V100002715L1C001R026000072  83  Chr06   8173354 33  50M =   8173235 -169    TTGTTTTTCTTGTTCTTCTTTCTCGAAGAGATGGGTGCACCGCCTTGGAG  7G@F4>FFG=FBFBFG*FBEF5AF?FFEEE?CD1D@CED=GFFGFG<FFF  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  163 Chr06   8173235 33  50M =   8173354 169 GGACAATGGTTTTCTAGGTTGTTTCACCAATCTGTTGAATTGGAATGGAG  D<AFEEBEFF8C>FFF/BCF>?FFFGFGCEFCFFFF6FE9FGF>F>FF>9  AS:i:100    XS:i:70 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP

Could someone please tell me why the index that includes chrM and chrC gives a lower mapping quality? Do the reads really map to chrM? Thank you very much!

Aifu.


These are probably mitochondrial homologs: the reads match the mitochondrial genome and other sequences in the genome with equal identity (identical sequence, no mismatches). This is a good example of why it makes sense to always include all chromosomes in the alignment reference. Such reads are called multimappers and are assigned a mapping quality of 1 or 0 (I am not exactly sure how bowtie2 does it; bwa uses 0 for them, AFAIK). Either way, you cannot determine the true origin of these reads in the genome, so they should be treated carefully when quantifying reads over a given region. Typically they are removed outright. What do you plan to do with these data?
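To make the "equal identity" point concrete, here is a minimal Python sketch that compares the AS and XS tags of a record. It relies on the Bowtie 2 tag meanings (AS is the score of the reported alignment, XS the score of the best other alignment found). The function names `parse_tags` and `has_equal_alternative` are made up for this illustration, and the two records are shortened from the ones posted above, with SEQ and QUAL replaced by `*` and some tags dropped.

```python
def parse_tags(sam_line):
    """Return the optional fields (column 12 onward) of one SAM record as a dict."""
    tags = {}
    # split() handles both real tab-delimited SAM and the space-separated forum paste
    for field in sam_line.split()[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3:
            name, kind, value = parts
            tags[name] = int(value) if kind == "i" else value
    return tags


def has_equal_alternative(sam_line):
    """True when the best other alignment (XS) scores at least as well as the reported one (AS)."""
    tags = parse_tags(sam_line)
    return "AS" in tags and "XS" in tags and tags["XS"] >= tags["AS"]


# Shortened records from this thread (SEQ and QUAL replaced by "*" for brevity).
with_chrM = "V100002715L1C001R026000072 163 ChrM 193461 1 50M = 193580 169 * * AS:i:100 XS:i:100 NM:i:0 YT:Z:CP"
without_chrM = "V100002715L1C001R026000072 163 Chr06 8173235 33 50M = 8173354 169 * * AS:i:100 XS:i:70 NM:i:0 YT:Z:CP"

print(has_equal_alternative(with_chrM))     # tie between the reported and the best other hit
print(has_equal_alternative(without_chrM))  # reported hit scores higher than any other
```

A tie like the one in the chrM record is the situation in which the aligner cannot prefer one location over the other, which is consistent with the low MAPQ reported for it.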


Thank you ATpoint. I would like to keep these multi-mapped reads, as is recommended in this post: Bowtie 2 - is there a way to discard reads mapping to multiple locations?


I added the option -a to report all alignments.

For index with chrM and chrC,

V100002715L1C001R026000072  83  Chr06   8173354 1   50M =   8173235 -169    TTGTTTTTCTTGTTCTTCTTTCTCGAAGAGATGGGTGCACCGCCTTGGAG  7G@F4>FFG=FBFBFG*FBEF5AF?FFEEE?CD1D@CED=GFFGFG<FFF  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  163 Chr06   8173235 1   50M =   8173354 169 GGACAATGGTTTTCTAGGTTGTTTCACCAATCTGTTGAATTGGAATGGAG  D<AFEEBEFF8C>FFF/BCF>?FFFGFGCEFCFFFF6FE9FGF>F>FF>9  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  339 ChrM    193580  1   50M =   193461  -169    TTGTTTTTCTTGTTCTTCTTTCTCGAAGAGATGGGTGCACCGCCTTGGAG  7G@F4>FFG=FBFBFG*FBEF5AF?FFEEE?CD1D@CED=GFFGFG<FFF  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  419 ChrM    193461  1   50M =   193580  169 GGACAATGGTTTTCTAGGTTGTTTCACCAATCTGTTGAATTGGAATGGAG  D<AFEEBEFF8C>FFF/BCF>?FFFGFGCEFCFFFF6FE9FGF>F>FF>9  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  323 Chr10   9381217 1   50M =   9381336 169 CTCCAAGGCGGTGCACCCATCTCTTCGAGAAAGAAGAACAAGAAAAACAA  FFF<GFGFFG=DEC@D1DC?EEEFF?FA5FEBF*GFBFBF=GFF>4F@G7  AS:i:93 XS:i:100    XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:4C45     YS:i:70   YT:Z:CP
V100002715L1C001R026000072  435 Chr10   9381336 1   35M15S  =   9381217 -169    CTCCATTCCAATTCAACAGATTGGTGAAACAACCTAGAAAACCATTGTCC  9>FF>F>FGF9EF6FFFFCFECGFGFFF?>FCB/FFF>C8FFEBEEFA<D  AS:i:70 XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:35 YS:i:93 YT:Z:CP

For index without chrM and chrC,

V100002715L1C001R026000072  83  Chr06   8173354 33  50M =   8173235 -169    TTGTTTTTCTTGTTCTTCTTTCTCGAAGAGATGGGTGCACCGCCTTGGAG  7G@F4>FFG=FBFBFG*FBEF5AF?FFEEE?CD1D@CED=GFFGFG<FFF  AS:i:100    XS:i:100    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  163 Chr06   8173235 33  50M =   8173354 169 GGACAATGGTTTTCTAGGTTGTTTCACCAATCTGTTGAATTGGAATGGAG  D<AFEEBEFF8C>FFF/BCF>?FFFGFGCEFCFFFF6FE9FGF>F>FF>9  AS:i:100    XS:i:70 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:50 YS:i:100    YT:Z:CP
V100002715L1C001R026000072  323 Chr10   9381217 33  50M =   9381336 169 CTCCAAGGCGGTGCACCCATCTCTTCGAGAAAGAAGAACAAGAAAAACAA  FFF<GFGFFG=DEC@D1DC?EEEFF?FA5FEBF*GFBFBF=GFF>4F@G7  AS:i:93 XS:i:100    XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:4C45     YS:i:70   YT:Z:CP
V100002715L1C001R026000072  435 Chr10   9381336 33  35M15S  =   9381217 -169    CTCCATTCCAATTCAACAGATTGGTGAAACAACCTAGAAAACCATTGTCC  9>FF>F>FGF9EF6FFFFCFECGFGFFF?>FCB/FFF>C8FFEBEEFA<D  AS:i:70 XS:i:70 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:35 YS:i:93 YT:Z:CP

So, for the index with chrM and chrC, bowtie2 actually finds 3 positions. Chr06 is the primary position (inferred from the flag), but every alignment has mapping quality 1. Why not 33? And why was chrM chosen?
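For anyone following the "inferred from the flag" step, the FLAG column can be decoded bit by bit. The sketch below uses the bit meanings from the SAM specification; the helper names `decode_flag` and `is_primary` and the short labels in the dictionary are made up for this illustration.

```python
# Bit meanings of the SAM FLAG field (labels are informal shorthand).
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """List the names of all bits set in a SAM FLAG value, lowest bit first."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def is_primary(flag):
    """A record is primary when neither the secondary nor the supplementary bit is set."""
    return not (flag & 0x100) and not (flag & 0x800)


# The six FLAG values that appear in the -a output above.
for flag in (83, 163, 339, 419, 323, 435):
    print(flag, "primary" if is_primary(flag) else "secondary", decode_flag(flag))
```

With these definitions, 83 and 163 (the Chr06 pair) come out as primary, while 339, 419, 323 and 435 (the ChrM and Chr10 pairs) carry the secondary bit 0x100.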
