Question: TCGA data: where to find the target area/basepairs coverage per sample
3.3 years ago, skr2345+bio (10) wrote:

To calculate the mutation rate per Mb for the TCGA SKCM dataset, I am looking for the exact target-area coverage of each sample/patient. This value is listed as "#BasepairsCovered" in the supplementary table of the 2013 Nature paper "Mutational landscape and significance across 12 major cancer types". Here is the URL for the supplementary data (Table S_3a):

Can these values be found in the corresponding BAM files or in their XML files?
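
For reference, the calculation I have in mind is the mutation count divided by the covered target size in megabases. A minimal sketch follows; the function name and the example numbers are illustrative assumptions, not values from the paper or from TCGA:

```python
def mutations_per_mb(n_mutations: int, basepairs_covered: int) -> float:
    """Return mutations per megabase of covered sequence.

    basepairs_covered is assumed to play the role of the per-sample
    "#BasepairsCovered" value; how the paper derives it is not known here.
    """
    if basepairs_covered <= 0:
        raise ValueError("basepairs_covered must be a positive integer")
    # Convert covered bases to megabases, then normalise the mutation count.
    return n_mutations / (basepairs_covered / 1_000_000)


# Made-up numbers: 300 mutations over 30,000,000 covered bases.
print(mutations_per_mb(300, 30_000_000))  # 10.0
```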

sequencing next-gen • 1.4k views
modified 3.3 years ago • written 3.3 years ago by skr2345+bio (10)

I have not come across '#BasepairsCovered' in relation to the TCGA in the past. Do they define it in their methods or in the table legend?

If it is literally the number of reference genome bases covered at or above a specified read depth, then you can infer this from the BAMs. Here is some code that I wrote to do this: Compute mean depth coverage for exome data with paired end, overlapping, features
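
As a rough illustration of that idea (not the linked code, and not necessarily how the paper defines '#BasepairsCovered'), one could count the positions at or above a chosen depth in per-base depth output. This sketch assumes the three-column, tab-separated format written by `samtools depth` (chromosome, 1-based position, depth); the function name and the `min_depth` threshold are hypothetical:

```python
from typing import Iterable


def count_covered_bases(depth_lines: Iterable[str], *, min_depth: int) -> int:
    """Count positions whose depth is at least min_depth.

    Each line is assumed to be 'chrom<TAB>pos<TAB>depth', as produced by
    `samtools depth`. min_depth is a user-chosen threshold (an assumption;
    the threshold behind '#BasepairsCovered' is not stated in this thread).
    """
    covered = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _chrom, _pos, depth = line.split("\t")[:3]
        if int(depth) >= min_depth:
            covered += 1
    return covered


# Toy input standing in for real per-base depth output.
example = ["chr1\t100\t12", "chr1\t101\t7", "chr1\t102\t8", "chr2\t50\t0"]
print(count_covered_bases(example, min_depth=8))  # 2
```

In practice the lines would come from a file written by something like `samtools depth -b targets.bed sample.bam`, where `targets.bed` and `sample.bam` are placeholder names; restricting to the capture targets keeps the count comparable across exome samples.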

written 3.3 years ago by Kevin Blighe (70k)