8 months ago
Peter Chung
Hello, I am new to tackling contamination issues in BAM files. I used the packages below to do so:
capmq:
https://github.com/mcshane/capmq
verifybamid:
https://genome.sph.umich.edu/wiki/VerifyBamID
First I used capmq to cap the mapping quality in the BAM files at 30:
capmq -C30 input.bam
Then I ran GATK to produce the VCF files, and then ran:
verifyBamID --vcf {input.vcf} --bam {input.bam} --smID {params.smID} --out {output} --best
However, the FREEMIX values for all of my samples are above 1%, so they all fail. What should I do to decrease the FREEMIX values, or are there other packages that can tackle contamination? Please advise, thanks.
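For checking which samples exceed the 1% cutoff, here is a minimal sketch in Python. It assumes the verifyBamID `.selfSM` output is tab-delimited with a header row containing `#SEQ_ID` and `FREEMIX` columns; the function name `flag_contaminated` and the sample values are made up for illustration, not taken from real runs.

```python
import csv
import io


def flag_contaminated(selfsm_text, threshold=0.01):
    """Return (sample, freemix) pairs whose FREEMIX is above the threshold.

    Assumes tab-delimited text with '#SEQ_ID' and 'FREEMIX' header columns,
    as in verifyBamID .selfSM output (an assumption, check your own files).
    """
    reader = csv.DictReader(io.StringIO(selfsm_text), delimiter="\t")
    flagged = []
    for row in reader:
        freemix = float(row["FREEMIX"])
        if freemix > threshold:
            flagged.append((row["#SEQ_ID"], freemix))
    return flagged


# Hypothetical two-sample table, reduced to the two columns used here.
example = "#SEQ_ID\tFREEMIX\nsampleA\t0.00350\nsampleB\t0.04120\n"
print(flag_contaminated(example))
```

For a real file, the text could be read with `open("sample.selfSM").read()` and passed to the same function. The sketch only reports which samples fail the cutoff; it does not change the FREEMIX estimate itself.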
I guess you are trying to identify contamination/sample swap(?) rather than index hopping.
Index hopping means something different: it happens at the time of sequencing, when a read may be misassigned to a sample it does not belong to (LINK). I don't think what you are doing above is going to address index hopping (if that is what you are interested in). Index hopping did not become a significant issue until patterned flowcells started being used, and it can be minimized with some care.
Yes, you are correct, I am dealing with contamination. Even after running capmq, I still cannot get the FREEMIX value low enough to meet the criteria. How can I lower the FREEMIX value, or should I use other packages to detect contamination? Thanks.