Getting read depth for normal and tumour
0
0
Entering edit mode
3.6 years ago
fi1d18 ★ 4.1k

Hi,

I have called SNV for tumour and matched normal, now I have .vcf and I want to tumour and normal sequencing depth something like below

> head(mut_data)
Sample Type CHROM       POS REF ALT **Tumor_Varcount Tumor_Depth Normal_Depth** Gene_Name Driver
1 CHC2432T  SNV  chr1 102961055   G   A              4          64           62      <NA>   <NA>
2 CHC2432T  SNV  chr1 105492588   A   T              7          66           73      <NA>   <NA>
3 CHC2432T  SNV  chr1 108628724   C   T              4          45           54      <NA>   <NA>
4 CHC2432T  SNV  chr1 109692113   G   T              2          53           29      <NA>   <NA>
5 CHC2432T  SNV  chr1 109692114   G   T              2          53           31      <NA>   <NA>
6 CHC2432T  SNV  chr1 120676701   T   C              3          48           87      <NA>   <NA>
>


For 5 columns at first I know this is the code %CHROM\t%POS\t%REF\t%ALT{0}\n but for getting Tumor_Varcount Tumor_Depth Normal_Depth I really don't know

R WGS vcf • 1.5k views
0
Entering edit mode

What have you attempted so far?

0
Entering edit mode

I tried to installed mosdepth but I am getting error

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 pcre-8.41]$conda install mosdepth Collecting package metadata: done Solving environment: failed PackagesNotFoundError: The following packages are not available from current channels: - mosdepth Current channels: - https://repo.anaconda.com/pkgs/main/linux-64 - https://repo.anaconda.com/pkgs/main/noarch - https://repo.anaconda.com/pkgs/free/linux-64 - https://repo.anaconda.com/pkgs/free/noarch - https://repo.anaconda.com/pkgs/r/linux-64 - https://repo.anaconda.com/pkgs/r/noarch To search for alternate channels that may provide the conda package you're looking for, navigate to https://anaconda.org and use the search bar at the top of the page. (/home/fi1d18/.conda/onco) [fi1d18@cyan01 pcre-8.41]$


I heard this tool calculate read depth as a bed

1
Entering edit mode

As a rule, always use conda install -c <channel_name> <package_name>. That way, you are controlling the source explicitly. Also, add conda-forge and bioconda to your channels ensuring conda-forge is added last (so it becomes the first source to check).

0
Entering edit mode

It's not available in the channels you currently use for conda.

Simple googling conda mosdepth will lead you to https://anaconda.org/bioconda/mosdepth which shows you exactly the command you need.

0
Entering edit mode

thank you, even after getting read depth by mosdepth, I have to adapt the read depths at position based on my .filtered .vcf file that would be another challenge. Is the anyway to get these columns directly from .vcf file itself?

1
Entering edit mode

mosdepth will give you a bed file. With a bed file you can annotate your vcf file. For example look at vcfanno.

There is only a couple of minutes between the help you receive from jrj.healey and your next question, which means that you did not think or try anything at all to solve your problem. You are in bioinformatics for years now, and the answer to many issues is just a google search away. Or just try a couple of things. And there is nothing wrong with that: we google all the time. There are many coding patterns in my python scripts which I have used tons of times and can't remember. I don't know the syntax of samtools addreplacerg. Make us do bioinformatics without internet and we're in trouble. We know nothing :) but we can figure it out, and it's time that you learn that too. This looks like imposter syndrome, in which you have the impression that you cannot do this and everyone else is smarter, but I'm sure you can, too.

0
Entering edit mode

By @jrj.healey comment I installed mosdepth and that is running on my tumour and normal samples meanwhile I imagined I have this depth then what can I do for Tumor_Varcount that is why I asked

I have a syndrome in which I afraid of being fired because I have different things to do :(

1
Entering edit mode

Believe it or not, I have that exact same fear of being fired because my boss "detemines I'm not worth having around". The way I address that fear is by getting more involved at work and focusing on being the person that can find solutions, not becoming the person that already has the solutions. Your institution/supervisor needs problem solvers, not encyclopedias.

Focus on finding solutions yourself and see how that fear goes away.

1
Entering edit mode

You aren't doing yourself any favours by not reading error messages and the like. You couldn't ask for a clearer description of the problem, and guidance on what to try next, than the message you got from conda.

In less than the time it took you to write this post, you could have searched google for conda and mosdepth and have answered your own question.

The forum is, of course, here to help - but by leaning on us too heavily you are doing yourself no favours. It's equivalent to being told the answers when you take a test, sure, it might get you through the test in the short term - but you're cheating yourself out of useful knowledge, and you won't have those lessons in hand when you inevitably come to need them the next time.

0
Entering edit mode
0
Entering edit mode

Sorry but this is really a difficult problem

I finished with mosdepth and I have a bed with sequencing depth for cancer and normal but the positions are not the same with positions in .vcf at all

This is my combined bed from mosdepth

Chr Start   Counts.100Tumor Counts.100Normal
1   0   0   0
1   10000   60.31   78.57
1   20000   34.62   46.38
1   30000   26.11   39.36
1   40000   16.95   21.6
1   50000   16.25   17.33

0
Entering edit mode

It does not matter how difficult the problem is, GitHub issues are not the place to ask questions on how to use a tool. Unless you find a bug in a tool or have a really specific feature request, please do not open issues on GitHub.

0
Entering edit mode

0
Entering edit mode

Code I used

# run mosdepth for tumor/normal
mosdepth --no-per-base -t 4 -b 10000  norm.bam
bgzip -d norm.bed.gz
cat norm.bed | cut -f 1,2,4 | awk 'BEGIN{print "Chr\tStart\tCounts.100"}1' > norm.bed

mosdepth --no-per-base -t 4 -b 10000 tumour.bam
bgzip -d tumour.bed.gz
cat tumour.bed | cut -f 1,2,4 | awk 'BEGIN{print "Chr\tStart\tCounts.100"}1' >tumour.bed

library(ggplot2)
library(data.table)
library(viridis)

# read in the depth depth data

# merge the two
all_depth <- merge(tumor_depth, normal_depth, by=c("Chr", "Start"), suffixes=c("Tumor", "Normal"))

2
Entering edit mode

So you ask bins per 10kb, and not per base coverage, and then you're surprised it doesn't match?

0
Entering edit mode

I used that per base but again nothing matches with positions in .vcf

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 WGS_Tumor.mosdepth]\$ head -10 tumour.bed
Chr     Start   Counts.100
1       0       0
1       9999    2
1       10000   29
1       10001   43
1       10002   59
1       10003   68
1       10004   79
1       10005   100
1       10006   106

0
Entering edit mode

Can you post a couple of lines of your vcf file?

0
Entering edit mode

Sorry I have shared one of my .vcf files by this link

https://www.dropbox.com/s/lfn9greycjybnrt/_double_filtered_013_pre_indels.vcf?dl=0

Actually I need to extract Tumor_Varcount = Number of variant bases at the position in the tumor sample , Tumor_Depth and Normal_Depth from that

This is 2 lines of my .vcf

##startTime=Fri Mar 29 16:46:32 2019
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  NORMAL  TUMOR
1   54586   .   T   C   .   PASS    DP=39;MQ=50.55;MQ0=0;NT=ref;QSS=48;QSS_NT=48;ReadPosRankSum=1.92;SGT=TT->CT;SNVSB=0.00;SOMATIC;SomaticEVS=10.83;TQSS=1;TQSS_NT=1    AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,0:0,0:20:0:0,0:0:0:20,20  0,0:6,6:18:0:0,0:0:0:12,13
1   103241  .   C   T   .   PASS    DP=120;MQ=24.94;MQ0=35;NT=ref;QSS=47;QSS_NT=47;ReadPosRankSum=2.09;SGT=CC->CT;SNVSB=0.00;SOMATIC;SomaticEVS=9.44;TQSS=2;TQSS_NT=2   AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,1:32,47:33:1:0,0:0:0:0,5  0,

2
Entering edit mode

I am not sure what program you are using to generate the .vcfs. But you should look into the manual of the program and see if it outputs depths using the format fields in the vcf. For example, you have "DP" field in your vcf that shows the depth of the individual samples. Perhaps one of the other fields in there (SDP, SUBDP) will give you more specific depth info.

Edit** Looking at your vcf headers, it looks like you just have two samples, normal and tumor. So you can just use the DP info from the two samples and get your depth. For example, your first loci has the following format fields:

AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,0:0,0:20:0:0,0:0:0:20,20  0,0:6,6:18:0:0,0:0:0:12,13


So according to this (DP field of normal and tumor), your normal sample has a depth of 20 and tumor sample has a depth of 18.

0
Entering edit mode

Thanks a lot, I called mutations by strelka