Question: Getting read depth for normal and tumour
23 months ago by
A3.9k wrote:


I have called SNVs for a tumour and its matched normal. Now I have a .vcf and I want to get the tumour and normal sequencing depth, something like below:

> head(mut_data)
    Sample Type CHROM       POS REF ALT Tumor_Varcount Tumor_Depth Normal_Depth Gene_Name Driver
1 CHC2432T  SNV  chr1 102961055   G   A              4          64           62      <NA>   <NA>
2 CHC2432T  SNV  chr1 105492588   A   T              7          66           73      <NA>   <NA>
3 CHC2432T  SNV  chr1 108628724   C   T              4          45           54      <NA>   <NA>
4 CHC2432T  SNV  chr1 109692113   G   T              2          53           29      <NA>   <NA>
5 CHC2432T  SNV  chr1 109692114   G   T              2          53           31      <NA>   <NA>
6 CHC2432T  SNV  chr1 120676701   T   C              3          48           87      <NA>   <NA>

For the first few columns I know the format string is %CHROM\t%POS\t%REF\t%ALT{0}\n, but I really don't know how to get Tumor_Varcount, Tumor_Depth and Normal_Depth.

Any help please?

wgs R vcf • 851 views
ADD COMMENTlink written 23 months ago by A3.9k

What have you attempted so far?

ADD REPLYlink written 23 months ago by Joe19k

I tried to install mosdepth but I am getting an error:

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 pcre-8.41]$ conda install mosdepth
Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - mosdepth

Current channels:


To search for alternate channels that may provide the conda package you're
looking for, navigate to

and use the search bar at the top of the page.

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 pcre-8.41]$

I heard this tool calculates read depth and writes it out as a BED file.

ADD REPLYlink written 23 months ago by A3.9k

As a rule, always use conda install -c <channel_name> <package_name>. That way, you are controlling the source explicitly. Also, add conda-forge and bioconda to your channels ensuring conda-forge is added last (so it becomes the first source to check).

ADD REPLYlink modified 23 months ago • written 23 months ago by Ram32k

Did you read the error?

It's not available in the channels you currently use for conda.

Simply googling "conda mosdepth" will lead you to the page that shows you exactly the command you need.

ADD REPLYlink written 23 months ago by Joe19k

Thank you. Even after getting read depth from mosdepth, I still have to match the read depths to the positions in my filtered .vcf file, which would be another challenge. Is there any way to get these columns directly from the .vcf file itself?

ADD REPLYlink written 23 months ago by A3.9k

mosdepth will give you a bed file. With a bed file you can annotate your vcf file. For example look at vcfanno.

There were only a couple of minutes between the help you received from jrj.healey and your next question, which means that you did not think about or try anything at all to solve your problem. You have been in bioinformatics for years now, and the answer to many issues is just a Google search away. Or just try a couple of things. And there is nothing wrong with that: we google all the time. There are many coding patterns in my Python scripts which I have used tons of times and still can't remember. I don't know the syntax of samtools addreplacerg. Make us do bioinformatics without internet and we're in trouble. We know nothing :) but we can figure it out, and it's time that you learn that too. This looks like imposter syndrome, in which you have the impression that you cannot do this and everyone else is smarter, but I'm sure you can, too.

ADD REPLYlink written 23 months ago by WouterDeCoster45k

Following @jrj.healey's comment I installed mosdepth, and it is now running on my tumour and normal samples. In the meantime I was thinking ahead: once I have the depth, how do I get Tumor_Varcount? That is why I asked.

I have a syndrome in which I am afraid of being fired because I have so many different things to do :(

ADD REPLYlink modified 23 months ago • written 23 months ago by A3.9k

Believe it or not, I have that exact same fear of being fired because my boss "determines I'm not worth having around". The way I address that fear is by getting more involved at work and focusing on being the person that can find solutions, not becoming the person that already has the solutions. Your institution/supervisor needs problem solvers, not encyclopedias.

Focus on finding solutions yourself and see how that fear goes away.

ADD REPLYlink written 23 months ago by Ram32k

You aren't doing yourself any favours by not reading error messages and the like. You couldn't ask for a clearer description of the problem, and guidance on what to try next, than the message you got from conda.

In less than the time it took you to write this post, you could have searched google for conda and mosdepth and have answered your own question.

The forum is, of course, here to help, but by leaning on us too heavily you are doing yourself no favours. It's like being told the answers when you take a test: sure, it might get you through the test in the short term, but you're cheating yourself out of useful knowledge, and you won't have those lessons in hand when you inevitably need them the next time.

ADD REPLYlink written 23 months ago by Joe19k

Cross-posted on GitHub

ADD REPLYlink written 23 months ago by ATpoint46k

Sorry but this is really a difficult problem

I finished with mosdepth and I have a bed with sequencing depth for cancer and normal, but the positions do not match the positions in the .vcf at all.

This is my combined bed from mosdepth

Chr Start   Counts.100Tumor Counts.100Normal
1   0   0   0
1   10000   60.31   78.57
1   20000   34.62   46.38
1   30000   26.11   39.36
1   40000   16.95   21.6
1   50000   16.25   17.33
ADD REPLYlink written 23 months ago by A3.9k

It does not matter how difficult the problem is, GitHub issues are not the place to ask questions on how to use a tool. Unless you find a bug in a tool or have a really specific feature request, please do not open issues on GitHub.

ADD REPLYlink written 23 months ago by Ram32k

Please add the command you used for mosdepth.

ADD REPLYlink written 23 months ago by WouterDeCoster45k

Code I used

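# run mosdepth for tumor/normal (10 kb windows, no per-base output)
# mosdepth takes an output prefix before the BAM; with -b it writes <prefix>.regions.bed.gz
mosdepth --no-per-base -t 4 -b 10000 norm norm.bam
bgzip -d norm.regions.bed.gz
# keep chrom, start and mean depth; write to a new file rather than overwriting the input
cut -f 1,2,4 norm.regions.bed | awk 'BEGIN{print "Chr\tStart\tCounts.100"}1' > norm.bed

mosdepth --no-per-base -t 4 -b 10000 tumour tumour.bam
bgzip -d tumour.regions.bed.gz
cut -f 1,2,4 tumour.regions.bed | awk 'BEGIN{print "Chr\tStart\tCounts.100"}1' > tumour.bed

# load libraries (R from here on)
library(data.table)  # provides fread()

# read in the depth data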
tumor_depth <- fread("tumour.bed")
normal_depth <- fread("norm.bed")

# merge the two
all_depth <- merge(tumor_depth, normal_depth, by=c("Chr", "Start"), suffixes=c("Tumor", "Normal"))
ADD REPLYlink written 23 months ago by A3.9k

So you asked for 10 kb bins rather than per-base coverage, and then you're surprised it doesn't match?

ADD REPLYlink written 23 months ago by WouterDeCoster45k

I also ran it per base, but again nothing matches the positions in the .vcf.

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 WGS_Tumor.mosdepth]$ head -10 tumour.bed
Chr     Start   Counts.100
1       0       0
1       9999    2
1       10000   29
1       10001   43
1       10002   59
1       10003   68
1       10004   79
1       10005   100
1       10006   106
ADD REPLYlink written 23 months ago by A3.9k
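
A likely reason the per-base output still does not line up (a sketch, not a confirmed diagnosis): mosdepth's per-base BED is 0-based and half-open, and it merges consecutive bases with the same depth into one interval, whereas VCF POS is 1-based. Once the end column is dropped with cut -f 1,2,4, an exact match on Start cannot work. Below is a minimal Python sketch of an interval lookup instead. The function name depth_at is hypothetical, and the end coordinates in the toy data are assumed to equal the next row's start, because the end column is missing from the output above.

```python
from bisect import bisect_right

def depth_at(intervals, vcf_pos):
    """Look up depth for a 1-based VCF position.

    intervals: list of (start, end, depth) tuples for ONE chromosome,
               sorted by start, 0-based half-open as in BED.
    Returns the depth, or None if no interval covers the position.
    """
    zero_based = vcf_pos - 1  # convert VCF (1-based) to BED (0-based)
    starts = [start for start, _, _ in intervals]
    i = bisect_right(starts, zero_based) - 1  # last interval starting at or before the position
    if i >= 0:
        start, end, depth = intervals[i]
        if start <= zero_based < end:
            return depth
    return None

# Toy data built from the first rows of the tumour per-base output above.
# Assumption: each end equals the next start (the real end column was cut away).
intervals = [
    (0, 9999, 0),
    (9999, 10000, 2),
    (10000, 10001, 29),
    (10001, 10002, 43),
    (10002, 10003, 59),
]

print(depth_at(intervals, 10001))  # VCF position 10001 is BED base 10000 -> 29
print(depth_at(intervals, 5000))   # inside the long zero-depth run -> 0
print(depth_at(intervals, 20000))  # not covered by the toy data -> None
```

In practice it is simpler to keep all four BED columns and let a tool such as vcfanno do the overlap, or to take the depth straight from the VCF FORMAT fields.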

Can you post a couple of lines of your vcf file?

ADD REPLYlink written 23 months ago by Damian Kao15k

Sorry I have shared one of my .vcf files by this link

Actually I need to extract Tumor_Varcount (the number of variant bases at the position in the tumor sample), Tumor_Depth and Normal_Depth from that.

These are 2 lines of my .vcf:

##startTime=Fri Mar 29 16:46:32 2019
1   54586   .   T   C   .   PASS    DP=39;MQ=50.55;MQ0=0;NT=ref;QSS=48;QSS_NT=48;ReadPosRankSum=1.92;SGT=TT->CT;SNVSB=0.00;SOMATIC;SomaticEVS=10.83;TQSS=1;TQSS_NT=1    AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,0:0,0:20:0:0,0:0:0:20,20  0,0:6,6:18:0:0,0:0:0:12,13
1   103241  .   C   T   .   PASS    DP=120;MQ=24.94;MQ0=35;NT=ref;QSS=47;QSS_NT=47;ReadPosRankSum=2.09;SGT=CC->CT;SNVSB=0.00;SOMATIC;SomaticEVS=9.44;TQSS=2;TQSS_NT=2   AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,1:32,47:33:1:0,0:0:0:0,5  0,
ADD REPLYlink written 23 months ago by A3.9k

I am not sure what program you are using to generate the .vcfs. But you should look into the manual of the program and see if it outputs depths using the format fields in the vcf. For example, you have "DP" field in your vcf that shows the depth of the individual samples. Perhaps one of the other fields in there (SDP, SUBDP) will give you more specific depth info.

Edit: Looking at your vcf headers, it looks like you just have two samples, normal and tumor. So you can just use the DP info from the two samples and get your depth. For example, your first locus has the following format fields:

AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,0:0,0:20:0:0,0:0:0:20,20  0,0:6,6:18:0:0,0:0:0:12,13

So according to this (DP field of normal and tumor), your normal sample has a depth of 20 and tumor sample has a depth of 18.

ADD REPLYlink modified 23 months ago • written 23 months ago by Damian Kao15k

Thanks a lot, I called the mutations with Strelka.

ADD REPLYlink written 23 months ago by A3.9k
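
For completeness, Tumor_Varcount can be read from the same FORMAT fields. In Strelka somatic SNV records, AU/CU/GU/TU hold "tier1,tier2" read counts for each base, so the tier1 count of the ALT base in the tumour column gives the number of variant reads. The following is a minimal Python sketch under two assumptions: the sample columns are ordered NORMAL then TUMOR (as in the answer above), and the helper name strelka_snv_depths is hypothetical. It handles SNV records only, not indels.

```python
def strelka_snv_depths(vcf_line):
    """Parse one Strelka somatic SNV record.

    Returns (chrom, pos, ref, alt, tumor_varcount, tumor_depth, normal_depth).
    Assumes column 10 is the NORMAL sample and column 11 is the TUMOR sample.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    keys = fields[8].split(":")                      # e.g. AU:CU:DP:FDP:GU:SDP:SUBDP:TU
    normal = dict(zip(keys, fields[9].split(":")))
    tumor = dict(zip(keys, fields[10].split(":")))

    alt_base = alt.split(",")[0]                     # first ALT allele, like %ALT{0}
    # <base>U holds "tier1,tier2" counts; take tier1 for the ALT base
    tumor_varcount = int(tumor[alt_base + "U"].split(",")[0])
    return (chrom, int(pos), ref, alt_base,
            tumor_varcount, int(tumor["DP"]), int(normal["DP"]))

# First record posted above, with tabs between the VCF columns
line = "\t".join([
    "1", "54586", ".", "T", "C", ".", "PASS",
    "DP=39;MQ=50.55;MQ0=0;NT=ref;QSS=48;QSS_NT=48;ReadPosRankSum=1.92;"
    "SGT=TT->CT;SNVSB=0.00;SOMATIC;SomaticEVS=10.83;TQSS=1;TQSS_NT=1",
    "AU:CU:DP:FDP:GU:SDP:SUBDP:TU",
    "0,0:0,0:20:0:0,0:0:0:20,20",
    "0,0:6,6:18:0:0,0:0:0:12,13",
])

record = strelka_snv_depths(line)
print(record)  # ('1', 54586, 'T', 'C', 6, 18, 20)
```

Looping this over every non-header line of the filtered .vcf would give the Tumor_Varcount, Tumor_Depth and Normal_Depth columns without needing mosdepth at all.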