Question: How to extract reads on the positive strand for a specific region from a BAM file?


venkat wrote:

Dear All,

I have paired-end RNA-seq data that were aligned to the hg38 genome using HISAT2 with the strand-specific options `--dta --rna-strandness RF`.

I have an lncRNA transcript on the positive strand, with start and end positions `9:115046349-115047199`. I have the BAM files and want to extract the reads on the `positive strand` for that region, extended by 5 kb on each side.

I tried with samtools:

```
samtools view -F 16 -b sample.sorted.bam "9:115041349-115052199" > sample_specificregion.bam
```

As I understand it, `-F 16` discards reads that have flag bit 16 set, i.e. reads mapped to the negative strand.

However, the reads in `sample_specificregion.bam` are still shown as `- strand`. Is `-F 16` the right way to get reads on the positive strand for that specific region?

```
D00535:34:CBMGRANXX:5:2109:12145:31393 99 9 115052134 60 92M = 115052157 120 AGAGGAAAATTCTGTCATTTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGC GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:92 YS:i:0 YT:Z:CP XS:A:- NH:i:1
D00535:34:CBMGRANXX:5:2103:4351:30673 99 9 115052135 60 93M = 115052143 106 GAGGAAAATTCTGTCATTTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGGAATAATCGAGGCACAGAAAGACAAATACTGCGT FBGGFGGBGGGGGGGGGGGEGGGGGGGGGGGCGGGGGGGGGGGGGGGGGEGGGGGGG<1FFGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGG AS:i:-5 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:59A33 YS:i:-5 YT:Z:CP XS:A:- NH:i:1
D00535:34:CBMGRANXX:6:2110:18471:54222 99 9 115052135 60 93M = 115052316 278 GAGGAAAATTCTGTCATTTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGCGT GGGGGGGGGGGGGGGGGGGGGGGGGGGGBGGGGGGGGGGGFGGGGGGGGGGGGGGFGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:93 YS:i:0 YT:Z:CP XS:A:- NH:i:1
D00535:34:CBMGRANXX:6:2107:12093:34963 97 9 115052135 60 89M = 115052133 -94 GAGGAAAATTCTGTCATTTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACT GFGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:89 YS:i:0 YT:Z:DP XS:A:- NH:i:1
D00535:34:CBMGRANXX:6:1301:21091:74975 99 9 115052135 60 93M = 115052143 106 GAGGAAAATTCTGTCATTTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGCGT FGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGFGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:93 YS:i:-4 YT:Z:CP XS:A:- NH:i:1
D00535:34:CBMGRANXX:6:1111:8020:42383 99 9 115052136 60 93M = 115052222 184 AGGAAAATTCTGTCATTTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGCGTG GGGGGGGGGGGGGGGGFGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGGGGGGGGGCGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:93 YS:i:0 YT:Z:CP XS:A:- NH:i:1
D00535:34:CBMGRANXX:5:1109:7064:82839 99 9 115052152 60 92M = 115052152 98 TTTCAACAACACGGATGAACCCAGAAGACATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGCGTGATCTCATTTGTGTAA GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:92 YS:i:0 YT:Z:CP XS:A:- NH:i:1
D00535:34:CBMGRANXX:5:1307:7118:29073 65 9 115052181 60 91M2S 1 66292253 0 ATTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGCGTGATCTCATTTGTGTAATCTAAGAAAGCTGAACACATAGAAGCACTC GGGGGGGGGGGGGGFGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:-2 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:91 YS:i:0 YT:Z:DP XS:A:- NH:i:1
D00535:34:CBMGRANXX:6:2315:7732:93870 99 9 115052182 60 92M = 115052270 186 TTGTACAGAGTGAAATAATCGAGGCACAGAAAGACAAATACTGCGTGATCTCATTTGTGTAATCTAAGAAAGCTGAACACATAGAAGCACAG GGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGFGGGGGGFGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:92 YS:i:0 YT:Z:CP XS:A:- NH:i:1
```
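The FLAG column of the records above (99, 97, 65) can be decoded to see why they pass `-F 16` yet carry `XS:A:-`. Below is a minimal sketch, assuming the usual convention for an RF (fr-firststrand) library that read 1 is antisense and read 2 is sense to the transcript. The function name `transcript_strand_rf` is a hypothetical helper, not part of samtools.

```shell
#!/bin/sh
# Hypothetical helper: infer the transcript strand implied by a SAM FLAG,
# assuming an RF (fr-firststrand) paired-end library.
transcript_strand_rf() {
    flag=$1
    is_reverse=$(( (flag & 16) != 0 ))   # 0x10: read aligned to the reverse strand
    is_read1=$(( (flag & 64) != 0 ))     # 0x40: first read in the pair
    is_read2=$(( (flag & 128) != 0 ))    # 0x80: second read in the pair
    if [ "$is_read1" -eq 1 ]; then
        # Assumption: in RF libraries read 1 is antisense to the transcript
        if [ "$is_reverse" -eq 1 ]; then echo "+"; else echo "-"; fi
    elif [ "$is_read2" -eq 1 ]; then
        # Assumption: in RF libraries read 2 is sense to the transcript
        if [ "$is_reverse" -eq 1 ]; then echo "-"; else echo "+"; fi
    else
        echo "?"   # not flagged as read 1 or read 2
    fi
}

# 99, 97 and 65 appear in the output above; 83 and 163 are shown for contrast.
for f in 99 97 65 83 163; do
    printf '%s\t%s\n' "$f" "$(transcript_strand_rf "$f")"
done
```

Under that assumption, flags 99, 97 and 65 are all read 1 aligned forward, so they come from minus-strand transcripts, which is consistent with the `XS:A:-` tag. `-F 16` filters on the alignment orientation of each read, not on the strand of the transcript it came from.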

Refer to Samtools View: Only Forward Or Reverse Strand
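A commonly used recipe for paired-end stranded data is to select the two read orientations separately and merge them. The sketch below assumes an RF library, where plus-strand transcripts yield read 2 aligned forward and read 1 aligned reverse. The function name `extract_plus_strand_rf` and the output file names are hypothetical; the region is the transcript from the question padded by 5 kb on each side.

```shell
#!/bin/sh
# Padded region: transcript coordinates from the question, +/- 5 kb.
chrom=9; start=115046349; end=115047199; pad=5000
region="${chrom}:$(( start - pad ))-$(( end + pad ))"

# Hypothetical helper: keep reads from plus-strand transcripts (RF library assumed).
# Arguments: input BAM (sorted and indexed), region, output BAM.
extract_plus_strand_rf() {
    bam=$1; reg=$2; out=$3
    # Read 2 aligned forward: require 0x80 (128), exclude 0x10 (16)
    samtools view -b -f 128 -F 16 -o "${out}.r2fwd.bam" "$bam" "$reg" &&
    # Read 1 aligned reverse: require 0x40 and 0x10 (64 + 16 = 80)
    samtools view -b -f 80 -o "${out}.r1rev.bam" "$bam" "$reg" &&
    # Combine both orientations and index the result
    samtools merge -f "$out" "${out}.r2fwd.bam" "${out}.r1rev.bam" &&
    samtools index "$out"
}

echo "$region"
# Usage (needs samtools on PATH and an indexed BAM):
# extract_plus_strand_rf sample.sorted.bam "$region" sample_plus_strand.bam
```

For minus-strand transcripts the two filters would be mirrored (`-f 144` for read 2 reverse, and `-f 64 -F 16` for read 1 forward). If the library were FR rather than RF, the roles of read 1 and read 2 would swap.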
