Question: TCGA data: where to find the target area/basepairs coverage per sample
11 months ago, skr2345+bio wrote:

To calculate the mutation rate per Mb for the TCGA SKCM dataset, I am looking for the exact target-area coverage of each sample/patient. This value is reported as "#BasepairsCovered" in the supplementary table of the 2013 Nature paper "Mutational landscape and significance across 12 major cancer types". Here is the URL for the supplementary data (Table S_3a):

https://images.nature.com/original/nature-assets/nature/journal/v502/n7471/extref/nature12634-s1.zip

Can these values be found in the corresponding BAM files or in their XML files?

Tags: sequencing, next-gen

I have not come across '#BasepairsCovered' in relation to the TCGA in the past. Do they define it in their methods or in the table legend?

If it is literally the number of reference-genome bases covered at or above a specified depth, then you can compute it from the BAMs. Here is some code that I wrote to do this: Compute mean depth coverage for exome data with paired end, overlapping, features
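As a rough illustration of that idea (this is a sketch, not the code from the linked post), the Python below counts covered positions from per-base depth output and turns a mutation count into a rate per Mb. It assumes you have already produced a three-column depth file, for example with `samtools depth -a -b targets.bed sample.bam > sample.depth.txt`, where `targets.bed` and the file names are hypothetical. The function names and the `min_depth` threshold are placeholders of mine; whether this matches how the paper defines "#BasepairsCovered" is not confirmed here, so check their methods first.

```python
from typing import Iterable


def count_covered_bases(depth_lines: Iterable[str], min_depth: int = 1) -> int:
    """Count positions whose depth is at least min_depth.

    Expects tab-separated lines in the layout printed by `samtools depth`:
    chromosome, 1-based position, depth. The min_depth default of 1 is an
    assumption, not a value taken from the paper.
    """
    covered = 0
    for line in depth_lines:
        line = line.strip()
        # Skip blank lines and an optional header line starting with '#'.
        if not line or line.startswith("#"):
            continue
        _chrom, _pos, depth = line.split("\t")[:3]
        if int(depth) >= min_depth:
            covered += 1
    return covered


def mutations_per_mb(n_mutations: int, covered_bases: int) -> float:
    """Mutation rate per megabase: mutations divided by covered bases in Mb."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_mutations / (covered_bases / 1e6)
```

To use it on a real sample, open the depth file and pass the file handle straight to `count_covered_bases`, then feed the result and the sample's mutation count to `mutations_per_mb`.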

Reply written 11 months ago by Kevin Blighe