Question: MuTect2 still discards many reads; how can I fix this?
Sharon wrote, 14 months ago:

Hello everyone,

MuTect2 from GATK3 discarded 87.33% of the reads while I was preparing a panel of normals (PON). I will use sp1.vcf in the PON creation.

sp1.vcf: sp1.bam
        java -jar ${GATK}/GenomeAnalysisTK.jar \
        -T MuTect2 \
        -R ${hg38}.fasta \
        -I:tumor sp1.bam \
        --dbsnp ${DBSNP} \
        --cosmic ${COSMIC} \
        --artifact_detection_mode \
        -o sp1.vcf

Does this seem okay to you? Is there anything I can do about the non-primary reads (42.77%) and the duplicate reads (44.56%)?

INFO  00:57:17,188 MicroScheduler - 158885515 reads were filtered out during the traversal out of approximately 181931389 total reads (87.33%) 
INFO  00:57:17,190 MicroScheduler -   -> 9210 reads (0.01% of total) failing BadCigarFilter 
INFO  00:57:17,192 MicroScheduler -   -> 81066454 reads (44.56% of total) failing DuplicateReadFilter 
INFO  00:57:17,193 MicroScheduler -   -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter 
INFO  00:57:17,195 MicroScheduler -   -> 0 reads (0.00% of total) failing MalformedReadFilter 
INFO  00:57:17,197 MicroScheduler -   -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter 
INFO  00:57:17,198 MicroScheduler -   -> 77809851 reads (42.77% of total) failing NotPrimaryAlignmentFilter 
INFO  00:57:17,200 MicroScheduler -   -> 0 reads (0.00% of total) failing UnmappedReadFilter 
------------------------------------------------------------------------------------------
Done.
------------------------------------------------------------------------------------------
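The percentages reported by MicroScheduler can be re-derived from the raw counts, which also makes the number of surviving reads explicit. Below is a minimal sketch in shell and awk. It assumes the log has been saved to a file; the temporary file here is only for illustration and holds three lines copied from the log above. The field positions are specific to this GATK3 log layout and may need adjusting for other versions.

```shell
# Illustrative only: a temporary file holding three lines copied from the log above.
log=$(mktemp)
cat > "$log" <<'EOF'
INFO  00:57:17,188 MicroScheduler - 158885515 reads were filtered out during the traversal out of approximately 181931389 total reads (87.33%)
INFO  00:57:17,192 MicroScheduler -   -> 81066454 reads (44.56% of total) failing DuplicateReadFilter
INFO  00:57:17,198 MicroScheduler -   -> 77809851 reads (42.77% of total) failing NotPrimaryAlignmentFilter
EOF

# On the summary line, fields 5 and 16 are the filtered and total counts.
# On each "failing" line, field 6 is the count and the last field is the filter name.
summary=$(awk '
  /reads were filtered out/ { filtered = $5; total = $16; next }
  / failing / && total > 0  { printf "%s\t%d\t%.2f%%\n", $NF, $6, 100 * $6 / total }
  END {
    if (total > 0)
      printf "kept\t%d\t%.2f%%\n", total - filtered, 100 * (total - filtered) / total
  }
' "$log")
echo "$summary"
rm -f "$log"
```

For this log, the `kept` row shows 23045874 reads, i.e. about 12.67% of the records in the BAM actually reach MuTect2.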

This is how the STAR log summary looks:

            UNIQUE READS:
               Uniquely mapped reads number |       29463720
                    Uniquely mapped reads % |       72.35%
                      Average mapped length |       271.00
                   Number of splices: Total |       21158459
        Number of splices: Annotated (sjdb) |       21067563
                   Number of splices: GT/AG |       20952149
                   Number of splices: GC/AG |       126813
                   Number of splices: AT/AC |       5622
           Number of splices: Non-canonical |       73875
                  Mismatch rate per base, % |       0.48%
                     Deletion rate per base |       0.02%
                    Deletion average length |       1.27
                    Insertion rate per base |       0.01%
                   Insertion average length |       1.84
                         MULTI-MAPPING READS:
    Number of reads mapped to multiple loci |       9507878
         % of reads mapped to multiple loci |       23.35%
    Number of reads mapped to too many loci |       355263
         % of reads mapped to too many loci |       0.87%
                              UNMAPPED READS:
   % of reads unmapped: too many mismatches |       0.00%
             % of reads unmapped: too short |       2.76%
                 % of reads unmapped: other |       0.67%
                              CHIMERIC READS:
                   Number of chimeric reads |       0
                        % of chimeric reads |       0.00%
Tags: rna-seq variant calling

dariober (WCIP | Glasgow | UK) wrote, 14 months ago:

If you are sure you want to retain non-primary alignments and duplicates, I think you can add the options --disable_read_filter NotPrimaryAlignmentFilter --disable_read_filter DuplicateReadFilter to the MuTect2 command (not tested). See some docs here.
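As a sketch of what that suggestion would look like, here is the command from the question with the two `--disable_read_filter` options appended. Like the suggestion itself, it is untested. `GATK`, `hg38`, `DBSNP` and `COSMIC` are the variables from the question and must already be set; the function name `mutect2_pon_cmd` and the `RUN` switch are hypothetical additions so the command can be inspected before it is executed.

```shell
# Hypothetical wrapper: prints the command by default, runs it only when RUN=1.
# GATK, hg38, DBSNP and COSMIC are assumed to be set as in the question.
mutect2_pon_cmd() {
  set -- java -jar "${GATK}/GenomeAnalysisTK.jar" \
    -T MuTect2 \
    -R "${hg38}.fasta" \
    -I:tumor sp1.bam \
    --dbsnp "${DBSNP}" \
    --cosmic "${COSMIC}" \
    --artifact_detection_mode \
    --disable_read_filter NotPrimaryAlignmentFilter \
    --disable_read_filter DuplicateReadFilter \
    -o sp1.vcf
  if [ "${RUN:-0}" = 1 ]; then
    "$@"          # actually launch GATK
  else
    echo "$*"     # dry run: show the full command line
  fi
}

mutect2_pon_cmd
```

Whether disabling these filters is a good idea is a separate question; the replies below caution against it.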


Thanks, dariober. I don't know if I should retain them or not!

written 14 months ago by Sharon

If you don't know, then you should probably not retain them. There is a reason why they are not retained by default. You can consult the GATK Best Practices for more info: https://software.broadinstitute.org/gatk/best-practices/workflow?id=11146

written 14 months ago by igor
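For context, the two filters discussed in this thread key off standard SAM FLAG bits: 0x100 (256) marks a secondary, i.e. not primary, alignment, and 0x400 (1024) marks a PCR or optical duplicate. The sketch below tests those bits for a decimal FLAG value. The helper name `flag_filters` is hypothetical, and the mapping of bits to filter names is my reading of the filter names rather than something stated in the thread.

```shell
# Hypothetical helper: given a decimal SAM FLAG, list which of the two
# filters from the log would be expected to drop the read.
flag_filters() {
  flag=$1
  hits=""
  # 0x100 (256): secondary / not primary alignment
  if [ $(( flag & 256 )) -ne 0 ]; then hits="$hits NotPrimaryAlignmentFilter"; fi
  # 0x400 (1024): PCR or optical duplicate
  if [ $(( flag & 1024 )) -ne 0 ]; then hits="$hits DuplicateReadFilter"; fi
  hits=${hits# }            # strip the leading space, if any
  echo "${hits:-none}"
}

flag_filters 99      # properly paired primary read
flag_filters 355     # 99 + 256: secondary alignment
flag_filters 1123    # 99 + 1024: marked duplicate
```

Multi-mapping reads are typically written as one primary record plus secondary records for the other loci, which is why a sample with many multi-mappers can show a large NotPrimaryAlignmentFilter count.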