Getting read depth for normal and tumour
0
0
Entering edit mode
5.6 years ago
zizigolu ★ 4.3k

Hi,

I have called SNV for tumour and matched normal, now I have .vcf and I want to tumour and normal sequencing depth something like below

> head(mut_data)
    Sample Type CHROM       POS REF ALT `**Tumor_Varcount Tumor_Depth Normal_Depth**` Gene_Name Driver
1 CHC2432T  SNV  chr1 102961055   G   A              4          64           62      <NA>   <NA>
2 CHC2432T  SNV  chr1 105492588   A   T              7          66           73      <NA>   <NA>
3 CHC2432T  SNV  chr1 108628724   C   T              4          45           54      <NA>   <NA>
4 CHC2432T  SNV  chr1 109692113   G   T              2          53           29      <NA>   <NA>
5 CHC2432T  SNV  chr1 109692114   G   T              2          53           31      <NA>   <NA>
6 CHC2432T  SNV  chr1 120676701   T   C              3          48           87      <NA>   <NA>
>

For 5 columns at first I know this is the code %CHROM\t%POS\t%REF\t%ALT{0}\n but for getting Tumor_Varcount Tumor_Depth Normal_Depth I really don't know

Any help please?

R WGS vcf • 3.1k views
ADD COMMENT
0
Entering edit mode

What have you attempted so far?

ADD REPLY
0
Entering edit mode

I tried to installed mosdepth but I am getting error

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 pcre-8.41]$ conda install mosdepth                                                   Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - mosdepth

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/linux-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.


(/home/fi1d18/.conda/onco) [fi1d18@cyan01 pcre-8.41]$

I heard this tool calculate read depth as a bed

ADD REPLY
1
Entering edit mode

As a rule, always use conda install -c <channel_name> <package_name>. That way, you are controlling the source explicitly. Also, add conda-forge and bioconda to your channels ensuring conda-forge is added last (so it becomes the first source to check).

ADD REPLY
0
Entering edit mode

Did you read the error?

It's not available in the channels you currently use for conda.

Simple googling conda mosdepth will lead you to https://anaconda.org/bioconda/mosdepth which shows you exactly the command you need.

ADD REPLY
0
Entering edit mode

thank you, even after getting read depth by mosdepth, I have to adapt the read depths at position based on my .filtered .vcf file that would be another challenge. Is the anyway to get these columns directly from .vcf file itself?

ADD REPLY
1
Entering edit mode

mosdepth will give you a bed file. With a bed file you can annotate your vcf file. For example look at vcfanno.

There is only a couple of minutes between the help you receive from jrj.healey and your next question, which means that you did not think or try anything at all to solve your problem. You are in bioinformatics for years now, and the answer to many issues is just a google search away. Or just try a couple of things. And there is nothing wrong with that: we google all the time. There are many coding patterns in my python scripts which I have used tons of times and can't remember. I don't know the syntax of samtools addreplacerg. Make us do bioinformatics without internet and we're in trouble. We know nothing :) but we can figure it out, and it's time that you learn that too. This looks like imposter syndrome, in which you have the impression that you cannot do this and everyone else is smarter, but I'm sure you can, too.

ADD REPLY
0
Entering edit mode

By @jrj.healey comment I installed mosdepth and that is running on my tumour and normal samples meanwhile I imagined I have this depth then what can I do for Tumor_Varcount that is why I asked

I have a syndrome in which I afraid of being fired because I have different things to do :(

ADD REPLY
1
Entering edit mode

Believe it or not, I have that exact same fear of being fired because my boss "detemines I'm not worth having around". The way I address that fear is by getting more involved at work and focusing on being the person that can find solutions, not becoming the person that already has the solutions. Your institution/supervisor needs problem solvers, not encyclopedias.

Focus on finding solutions yourself and see how that fear goes away.

ADD REPLY
1
Entering edit mode

You aren't doing yourself any favours by not reading error messages and the like. You couldn't ask for a clearer description of the problem, and guidance on what to try next, than the message you got from conda.

In less than the time it took you to write this post, you could have searched google for conda and mosdepth and have answered your own question.

The forum is, of course, here to help - but by leaning on us too heavily you are doing yourself no favours. It's equivalent to being told the answers when you take a test, sure, it might get you through the test in the short term - but you're cheating yourself out of useful knowledge, and you won't have those lessons in hand when you inevitably come to need them the next time.

ADD REPLY
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Sorry but this is really a difficult problem

I finished with mosdepth and I have a bed with sequencing depth for cancer and normal but the positions are not the same with positions in .vcf at all

This is my combined bed from mosdepth

Chr Start   Counts.100Tumor Counts.100Normal
1   0   0   0
1   10000   60.31   78.57
1   20000   34.62   46.38
1   30000   26.11   39.36
1   40000   16.95   21.6
1   50000   16.25   17.33
ADD REPLY
0
Entering edit mode

It does not matter how difficult the problem is, GitHub issues are not the place to ask questions on how to use a tool. Unless you find a bug in a tool or have a really specific feature request, please do not open issues on GitHub.

ADD REPLY
0
Entering edit mode

Please add the command you used for mosdepth.

ADD REPLY
0
Entering edit mode

Code I used

# run mosdepth for tumor/normal
mosdepth --no-per-base -t 4 -b 10000  norm.bam
bgzip -d norm.bed.gz
cat norm.bed | cut -f 1,2,4 | awk 'BEGIN{print "Chr\tStart\tCounts.100"}1' > norm.bed

mosdepth --no-per-base -t 4 -b 10000 tumour.bam
bgzip -d tumour.bed.gz
cat tumour.bed | cut -f 1,2,4 | awk 'BEGIN{print "Chr\tStart\tCounts.100"}1' >tumour.bed


# load libraries
library(ggplot2)
library(data.table)
library(viridis)

# read in the depth depth data
tumor_depth <- fread("tumour.bed")
normal_depth <- fread("norm.bed")

# merge the two
all_depth <- merge(tumor_depth, normal_depth, by=c("Chr", "Start"), suffixes=c("Tumor", "Normal"))
ADD REPLY
2
Entering edit mode

So you ask bins per 10kb, and not per base coverage, and then you're surprised it doesn't match?

ADD REPLY
0
Entering edit mode

I used that per base but again nothing matches with positions in .vcf

(/home/fi1d18/.conda/onco) [fi1d18@cyan01 WGS_Tumor.mosdepth]$ head -10 tumour.bed
Chr     Start   Counts.100
1       0       0
1       9999    2
1       10000   29
1       10001   43
1       10002   59
1       10003   68
1       10004   79
1       10005   100
1       10006   106
ADD REPLY
0
Entering edit mode

Can you post a couple of lines of your vcf file?

ADD REPLY
0
Entering edit mode

Sorry I have shared one of my .vcf files by this link

https://www.dropbox.com/s/lfn9greycjybnrt/_double_filtered_013_pre_indels.vcf?dl=0

Actually I need to extract Tumor_Varcount = Number of variant bases at the position in the tumor sample , Tumor_Depth and Normal_Depth from that

This is 2 lines of my .vcf

##startTime=Fri Mar 29 16:46:32 2019
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  NORMAL  TUMOR
1   54586   .   T   C   .   PASS    DP=39;MQ=50.55;MQ0=0;NT=ref;QSS=48;QSS_NT=48;ReadPosRankSum=1.92;SGT=TT->CT;SNVSB=0.00;SOMATIC;SomaticEVS=10.83;TQSS=1;TQSS_NT=1    AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,0:0,0:20:0:0,0:0:0:20,20  0,0:6,6:18:0:0,0:0:0:12,13
1   103241  .   C   T   .   PASS    DP=120;MQ=24.94;MQ0=35;NT=ref;QSS=47;QSS_NT=47;ReadPosRankSum=2.09;SGT=CC->CT;SNVSB=0.00;SOMATIC;SomaticEVS=9.44;TQSS=2;TQSS_NT=2   AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,1:32,47:33:1:0,0:0:0:0,5  0,
ADD REPLY
2
Entering edit mode

I am not sure what program you are using to generate the .vcfs. But you should look into the manual of the program and see if it outputs depths using the format fields in the vcf. For example, you have "DP" field in your vcf that shows the depth of the individual samples. Perhaps one of the other fields in there (SDP, SUBDP) will give you more specific depth info.

Edit** Looking at your vcf headers, it looks like you just have two samples, normal and tumor. So you can just use the DP info from the two samples and get your depth. For example, your first loci has the following format fields:

AU:CU:DP:FDP:GU:SDP:SUBDP:TU    0,0:0,0:20:0:0,0:0:0:20,20  0,0:6,6:18:0:0,0:0:0:12,13

So according to this (DP field of normal and tumor), your normal sample has a depth of 20 and tumor sample has a depth of 18.

ADD REPLY
0
Entering edit mode

Thanks a lot, I called mutations by strelka

ADD REPLY

Login before adding your answer.

Traffic: 858 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6