Hi community, I am struggling to find the size of the genomic region (in Mb) needed for TMB calculations.
I am using the TCGA-READ data. Where can I get the Mb value?
I want to calculate TMB as follows: TMB = total number of mutations / Mb.
I searched previous Biostars posts and couldn't find an answer to my question. I also looked through TCGA and didn't find any information about the Mb (megabase) value.
Is there any way I can infer the Mb value from my data, i.e. the TCGA-READ MAF file?
I really appreciate any help you can provide.
Thank you.
Thank you for the help.
However, I could not identify the Mb value for TCGA-READ. Where should I run API queries to identify which capture kit was used for this dataset?
https://github.com/NCI-GDC/gdc-workflow-overview/blob/04b73036022ff1f53921dff5ef3b1b638b8ecfcd/gdc_target_capture_kit_size.tsv#L4
You can use this value. To clarify, this is not the exact capture kit TCGA used; it is just what GDC uses as the default. TCGA used 40+ different kits and kit combinations, and sometimes the same file was generated from read groups from different kits.
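Once you have a capture size in Mb, the TMB formula from the question can be sketched roughly like this. This is only a minimal sketch, not an official GDC or TCGA method: the function name `tmb_per_sample` and the parameter `capture_size_mb` are made up for illustration, and it assumes the MAF has the standard `Tumor_Sample_Barcode` column. It counts every row in the MAF; many TMB definitions count only nonsynonymous mutations, so you may want to filter first.

```python
import csv
from collections import Counter


def tmb_per_sample(maf_lines, capture_size_mb):
    """Return {Tumor_Sample_Barcode: mutations per Mb} from MAF text lines.

    maf_lines: any iterable of lines (an open file handle works).
    capture_size_mb: capture region size in Mb (hypothetical parameter;
    supply the value you decided to use, it is not hard-coded here).
    """
    if capture_size_mb <= 0:
        raise ValueError("capture_size_mb must be positive")

    # MAF files can begin with '#' comment lines (e.g. '#version ...'); skip them
    # so the first remaining line is the tab-separated header.
    rows = csv.DictReader(
        (line for line in maf_lines if not line.startswith("#")),
        delimiter="\t",
    )

    # One MAF row = one mutation call; count rows per tumor sample.
    counts = Counter(row["Tumor_Sample_Barcode"] for row in rows)

    # TMB = total number of mutations / Mb
    return {sample: n / capture_size_mb for sample, n in counts.items()}
```

To use it, open your TCGA-READ MAF as text and pass the handle plus the capture size, e.g. `tmb_per_sample(fh, capture_size_mb=my_size)`, where `my_size` is the value taken from the table linked above.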