When working with strand-specific paired-end RNA-Seq data, sequenced from the 5' end and mapped with --library-type fr-secondstrand, I got some strange flags and XS:A tags, listed below.
H2208:7439:9018 pPr1 chr1 4887061 50 100M = 4887050 -111 XS:A:+ NH:i:1
H2208:7439:9018 pPR2 chr1 4887050 50 100M = 4887061 111 XS:A:+ NH:i:1
H16340:40651 pPR2 chr1 6284001 50 100M = 6284208 307 XS:A:- NH:i:1
H16340:40651 pPr1 chr1 6284208 50 100M = 6284001 -307 XS:A:- NH:i:1
Given my library type, I would expect these reads to come from the negative strand. Why, then, are the reads in the first pair assigned to the positive strand (XS:A:+)? About 1 in every 400 read pairs with flag pPr1 (83) looks like the first pair. Is this a bug in TopHat, or is my assumption wrong?
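To make my expectation explicit, here is a minimal Python sketch of how I am reading the flags. It decodes a numeric SAM flag into the letter form shown above and derives the strand I would expect under fr-secondstrand, assuming that read 1 has the same orientation as the transcript and read 2 the opposite one. The function names are my own and hypothetical, and the strand rule is my understanding of the library type, not something taken from the TopHat source.

```python
# SAM flag bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
REVERSE = 0x10
MATE_REVERSE = 0x20
READ1 = 0x40
READ2 = 0x80


def flag_to_string(flag):
    """Render a numeric SAM flag in letter form, e.g. 83 -> 'pPr1'.

    Only the bits that occur in the records above are handled.
    """
    letters = [
        (PAIRED, "p"),
        (PROPER_PAIR, "P"),
        (REVERSE, "r"),
        (MATE_REVERSE, "R"),
        (READ1, "1"),
        (READ2, "2"),
    ]
    return "".join(symbol for bit, symbol in letters if flag & bit)


def expected_strand_fr_secondstrand(flag):
    """Strand I would expect in XS:A for an fr-secondstrand library.

    Assumption (mine): read 1 aligns in the same orientation as the
    transcript, read 2 in the opposite orientation.
    """
    is_reverse = bool(flag & REVERSE)
    if flag & READ1:
        return "-" if is_reverse else "+"
    if flag & READ2:
        return "+" if is_reverse else "-"
    raise ValueError("flag is neither read 1 nor read 2")


for flag in (83, 163):
    print(flag, flag_to_string(flag), expected_strand_fr_secondstrand(flag))
```

Under that assumption both mates of a pPr1/pPR2 pair should carry XS:A:-, which is what the second pair shows and what the first pair does not.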