The FLAG field of a BAM/SAM record carries a lot of important information about that read.
I was wondering how to determine the gene strand/orientation based on the BAM flag.
In a single-end RNA-seq experiment, e.g. 10x 3' library v2, R2 is the single-end read. In such an experiment, if gene A is located on the reverse strand of the genome (meaning that the sequence of the gene A transcript equals the sequence of the reverse strand of the genome, i.e. the gene strand is "-"), reads aligned to gene A in the BAM should have a flag containing 16, right? I drew a diagram referring to this case: single-end RNA-seq, where R2 is the single-end read. R1 only contains the UMI and barcode sequence, so it is not relevant here.
Therefore, the R2 flag should indicate the same direction as the gene strand. Am I correct here?
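To make concrete what I mean by "flag containing 16": the flag is a bitwise field, and 16 (0x10) is the bit the SAM format uses for "read aligned to the reverse strand". Below is a minimal sketch of how I would check it in plain Python; `read_strand` is just a helper name I made up for illustration, not a function from any library.

```python
# Bit 0x10 (decimal 16) of the SAM FLAG: the read is aligned to the reverse strand.
REVERSE = 0x10

def read_strand(flag: int) -> str:
    """Return '-' if the reverse-strand bit is set in the flag, else '+'."""
    return "-" if flag & REVERSE else "+"

# Flags as they would appear in column 2 of a SAM record.
for flag in (0, 16, 272):  # 272 = 256 (secondary alignment) + 16
    print(flag, read_strand(flag))
```

Note that this only reports the alignment strand of the read itself. Whether that equals the gene strand is exactly the assumption I am asking about.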
Another case is paired-end RNA-seq. If gene A is on the reverse strand, the R2 flag should contain 16 and the R1 flag should not (bit 16 unset, i.e. 0). So the R2 flag indicates the same direction as the gene strand, while the R1 flag indicates the opposite. Right?
If gene A is on the forward strand, the R1 flag should contain 16 and the R2 flag should not, right? Therefore, the R2 flag still indicates the same direction as the gene.
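Written as code, my reasoning for both cases would look like the sketch below. It hard-codes my assumption that R2 (or the only aligned read in the single-end case) is in the same orientation as the transcript, which may not hold for every library preparation protocol. `implied_gene_strand` is a hypothetical helper name; the bit values are the standard SAM flag bits.

```python
# Standard SAM FLAG bits used here.
PAIRED = 0x1          # template has multiple segments (paired-end)
REVERSE = 0x10        # this read is aligned to the reverse strand
FIRST_IN_PAIR = 0x40  # this read is R1

def implied_gene_strand(flag: int) -> str:
    """Gene strand implied by one read, ASSUMING R2 is sense to the transcript.

    Single-end reads are treated like R2 (assumption taken from my question).
    """
    strand = "-" if flag & REVERSE else "+"
    if flag & PAIRED and flag & FIRST_IN_PAIR:
        # Under the assumption, R1 is antisense, so flip its alignment strand.
        return "+" if strand == "-" else "-"
    return strand

# 83/163 and 99/147 are the flag pairs of typical properly paired reads.
for flag in (16, 0, 83, 163, 99, 147):
    print(flag, implied_gene_strand(flag))
```

With this logic, R1 and R2 of the same properly paired fragment give the same implied gene strand, which is what I expect if my understanding is right.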
May I ask whether I understand all of the above correctly? Is it true that the strand of a gene is determined only by the R2 flag, no matter whether the sequencing is single-end or paired-end?
Is this method of extracting gene strand information from the flag safe, and is it consistent with established genome annotations such as the GENCODE 40 GTF?