Read and write a BAM, duplicating the quality column (11) as an "OQ:Z:" tag
15 months ago
gnanakkan • 0

Hi All,

I have over 1000 BAM files that I need to read and write back out as BAM files. Each output BAM should keep the header and the alignments, but column 11 (quality) should appear twice: once in its usual place, and once more as an "OQ:Z:" tag at the end of the record. The script/package/tool should be robust. I would appreciate a script or a pointer to a tool that does this. Thanks in advance.

Example input BAM:

E00579:50:HK2VJALXX:6:1220:15300:41040  2115    chr1    9999    0       90H60M  chr5    18606598        0       GATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC    JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJFJJJJJFFJJJJJJJJJFJ<A    SA:Z:chr5,18606834,-,51S99M,37,0;     MD:Z:60 PG:Z:MarkDuplicates     RG:Z:HK2VJALXX.6        NM:i:0  AS:i:60 XS:i:58

E00579:50:HK2VJALXX:6:1212:4066:24884   113     chr1    9999    0       34S60M56S       chr5    18606897        0       CCTAGAACAGCTCTTCCTTTATTTTCTTTTTCTGGATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACAGAGATAACTATTGATACAACACCTTCATGACCCTAAGGTACTATCATAGAGTTCT<<-FA-7<AFF<<<7<FJAF-JFJFF-J<JJAAJFFF-AAJJJJFJJJJJAJJJJJJFAJJJJJJJJFJJJJJJJJJJF<FJJJFJJJJFJJFJFJJJJJJJFJJJJJJJJJJFJJJJFJJJJJJJJFJJJJJJJJJJJJJJJJJFFFAA  SA:Z:chr5,18606769,+,58M92S,0,0;chr5,18606834,+,107S43M,0,0;    MD:Z:60 PG:Z:MarkDuplicates     RG:Z:HK2VJALXX.6   NM:i:0     AS:i:60 XS:i:59

Desired output BAM:

E00579:50:HK2VJALXX:6:1220:15300:41040  2115    chr1    9999    0       90H60M  chr5    18606598        0       GATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC    JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJFJJJJJFFJJJJJJJJJFJ<A    SA:Z:chr5,18606834,-,51S99M,37,0;     MD:Z:60 PG:Z:MarkDuplicates     RG:Z:HK2VJALXX.6        NM:i:0  AS:i:60 XS:i:58 OQ:Z:JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJFJJJJJFFJJJJJJJJJFJ<A

E00579:50:HK2VJALXX:6:1212:4066:24884   113     chr1    9999    0       34S60M56S       chr5    18606897        0       CCTAGAACAGCTCTTCCTTTATTTTCTTTTTCTGGATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACAGAGATAACTATTGATACAACACCTTCATGACCCTAAGGTACTATCATAGAGTTCT<<-FA-7<AFF<<<7<FJAF-JFJFF-J<JJAAJFFF-AAJJJJFJJJJJAJJJJJJFAJJJJJJJJFJJJJJJJJJJF<FJJJFJJJJFJJFJFJJJJJJJFJJJJJJJJJJFJJJJFJJJJJJJJFJJJJJJJJJJJJJJJJJFFFAA  SA:Z:chr5,18606769,+,58M92S,0,0;chr5,18606834,+,107S43M,0,0;    MD:Z:60 PG:Z:MarkDuplicates     RG:Z:HK2VJALXX.6   NM:i:0     AS:i:60 XS:i:59  OQ:Z:T<<-FA-7<AFF<<<7<FJAF-JFJFF-J<JJAAJFFF-AAJJJJFJJJJJAJJJJJJFAJJJJJJJJFJJJJJJJJJJF<FJJJFJJJJFJJFJFJJJJJJJFJJJJJJJJJJFJJJJFJJJJJJJJFJJJJJJJJJJJJJJJJJFFFAA
alignment bam base quality pysam samtools • 295 views

Since OQ:Z tags hold the quality scores from before recalibration, I assume you want to recalibrate the Q scores rather than just, as you put it, "read and write them as bam files".

So perhaps GATK BaseRecalibrator followed by ApplyBQSR?
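
If the goal really is only to copy column 11 into an OQ:Z: tag without any recalibration, here is a minimal sketch that works on SAM text, so it can sit between two samtools calls. The function names (`add_oq_tag`, `convert`) and the script name used below are my own, not from any library. It assumes tab-separated SAM records, passes header lines through, and leaves a record alone if it already has an OQ tag or if QUAL is `*`.

```python
import sys


def add_oq_tag(sam_line):
    """Return one SAM line with QUAL (column 11) copied into an OQ:Z: tag."""
    line = sam_line.rstrip("\n")
    # Header lines start with '@'; pass them (and empty lines) through as-is.
    if not line or line.startswith("@"):
        return line
    fields = line.split("\t")
    if len(fields) < 11:
        raise ValueError("SAM record has fewer than 11 mandatory columns")
    qual = fields[10]
    # Do not add a second OQ tag if the record already carries one.
    has_oq = any(f.startswith("OQ:Z:") for f in fields[11:])
    # '*' means no qualities are stored, so there is nothing to copy.
    if qual != "*" and not has_oq:
        fields.append("OQ:Z:" + qual)
    return "\t".join(fields)


def convert(in_stream, out_stream):
    """Filter a whole SAM stream (header plus records) line by line."""
    for raw in in_stream:
        out_stream.write(add_oq_tag(raw) + "\n")


# Run as a stdin-to-stdout filter only when called with '-' as the argument.
if __name__ == "__main__" and sys.argv[1:] == ["-"]:
    convert(sys.stdin, sys.stdout)
```

Saved as, say, `add_oq.py` (a made-up name), it could be used per file like this: `samtools view -h in.bam | python add_oq.py - | samtools view -b -o out.bam -`. For 1000+ files you would wrap that in a shell loop. Streaming text keeps memory use flat, though a pysam-based version that sets the tag on each read would avoid the text round trip.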
