4.0 years ago
gnanakkan
Hi All,
I have over 1000 BAM files that I need to read and write back out as BAM files. The output BAM should keep the header and the alignments, but the quality string in column 11 should appear twice: once in column 11 as usual, and a second time as an "OQ:Z:" tag. The script/package/tool should be robust. I would appreciate a script or tool that does this. Thanks in advance.
E.g., input BAM:
E00579:50:HK2VJALXX:6:1220:15300:41040 2115 chr1 9999 0 90H60M chr5 18606598 0 GATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJFJJJJJFFJJJJJJJJJFJ<A SA:Z:chr5,18606834,-,51S99M,37,0; MD:Z:60 PG:Z:MarkDuplicates RG:Z:HK2VJALXX.6 NM:i:0 AS:i:60 XS:i:58
E00579:50:HK2VJALXX:6:1212:4066:24884 113 chr1 9999 0 34S60M56S chr5 18606897 0 CCTAGAACAGCTCTTCCTTTATTTTCTTTTTCTGGATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACAGAGATAACTATTGATACAACACCTTCATGACCCTAAGGTACTATCATAGAGTTCT<<-FA-7<AFF<<<7<FJAF-JFJFF-J<JJAAJFFF-AAJJJJFJJJJJAJJJJJJFAJJJJJJJJFJJJJJJJJJJF<FJJJFJJJJFJJFJFJJJJJJJFJJJJJJJJJJFJJJJFJJJJJJJJFJJJJJJJJJJJJJJJJJFFFAA SA:Z:chr5,18606769,+,58M92S,0,0;chr5,18606834,+,107S43M,0,0; MD:Z:60 PG:Z:MarkDuplicates RG:Z:HK2VJALXX.6 NM:i:0 AS:i:60 XS:i:59
Output BAM:
E00579:50:HK2VJALXX:6:1220:15300:41040 2115 chr1 9999 0 90H60M chr5 18606598 0 GATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJFJJJJJFFJJJJJJJJJFJ<A SA:Z:chr5,18606834,-,51S99M,37,0; MD:Z:60 PG:Z:MarkDuplicates RG:Z:HK2VJALXX.6 NM:i:0 AS:i:60 XS:i:58 OQ:Z:JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJFJJJJJFFJJJJJJJJJFJ<A
E00579:50:HK2VJALXX:6:1212:4066:24884 113 chr1 9999 0 34S60M56S chr5 18606897 0 CCTAGAACAGCTCTTCCTTTATTTTCTTTTTCTGGATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACAGAGATAACTATTGATACAACACCTTCATGACCCTAAGGTACTATCATAGAGTTCT<<-FA-7<AFF<<<7<FJAF-JFJFF-J<JJAAJFFF-AAJJJJFJJJJJAJJJJJJFAJJJJJJJJFJJJJJJJJJJF<FJJJFJJJJFJJFJFJJJJJJJFJJJJJJJJJJFJJJJFJJJJJJJJFJJJJJJJJJJJJJJJJJFFFAA SA:Z:chr5,18606769,+,58M92S,0,0;chr5,18606834,+,107S43M,0,0; MD:Z:60 PG:Z:MarkDuplicates RG:Z:HK2VJALXX.6 NM:i:0 AS:i:60 XS:i:59 OQ:Z:T<<-FA-7<AFF<<<7<FJAF-JFJFF-J<JJAAJFFF-AAJJJJFJJJJJAJJJJJJFAJJJJJJJJFJJJJJJJJJJF<FJJJFJJJJFJJFJFJJJJJJJFJJJJJJJJJJFJJJJFJJJJJJJJFJJJJJJJJJJJJJJJJJFFFAA
Since OQ:Z tags contain Q scores before recalibration, I assume you are looking to do recalibration of Q scores rather than just "I have over 1000 bam files and I need to read and write them as bam files". So perhaps GATK BaseRecalibrator followed by ApplyBQSR?
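If the goal really is only to copy column 11 into an OQ:Z tag without recalibrating anything, a minimal sketch along these lines might be enough. It assumes the records are streamed as SAM text (for example through samtools) and uses only the Python standard library. The script name add_oq.py, the function names, and the "-" argument convention are made up for this illustration. Skipping reads that already carry an OQ tag or have no stored qualities ("*") is a choice of this sketch, not something stated in the question.

```python
import sys


def add_oq(sam_line: str) -> str:
    """Return one SAM line with QUAL (column 11) copied into an OQ:Z tag."""
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        # Header lines pass through unchanged.
        return line
    fields = line.split("\t")
    if len(fields) < 11:
        raise ValueError("SAM record has fewer than 11 mandatory columns")
    qual = fields[10]
    # Assumption of this sketch: leave the record alone if it already has
    # an OQ tag, or if no qualities are stored ("*").
    has_oq = any(tag.startswith("OQ:Z:") for tag in fields[11:])
    if qual != "*" and not has_oq:
        fields.append("OQ:Z:" + qual)
    return "\t".join(fields)


def add_oq_stream(src, dst) -> None:
    """Copy SAM text from src to dst, adding OQ:Z to every alignment."""
    for line in src:
        dst.write(add_oq(line) + "\n")


# Hypothetical convention for this sketch: pass "-" to read SAM from stdin.
if __name__ == "__main__" and sys.argv[1:] == ["-"]:
    add_oq_stream(sys.stdin, sys.stdout)
```

One way to use it on a single file would be `samtools view -h in.bam | python add_oq.py - | samtools view -b -o out.bam -`, wrapped in a shell loop for the 1000 files. Checking a few output records with `samtools view` before running the whole batch seems prudent.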