Question: Convert the CNV calls to vcf format
0
gravatar for Ati
11 months ago by
Ati10
Ati10 wrote:

Dear all,

I'm trying to convert the CNV calls which generated by ExomeDepth package to VCF format. I've been searching for different ways and one solution using bedr package but was resulted to this error message:

The CNV file format from ExomeDepth package:

 CNV.calls
start.p     end.p        type     nexons    start            end     chromosome    id                          BF         reads.expected
 1       1     1          deletion      1      69091      70008       1     chr1:69091-70008          39.80            228
 2     350   351     duplication   2      1569569  1570001       1       chr1:1569569-1570001  7.30            161
 reads.observed     reads.ratio
 1              2           0.00877
 2            259         1.61000

 bedfile<- CNV.calls[,c("chromosome", "start", "end")]
 bedfile$chromosome<- sub("^", "chr", bedfile$chromosome)
 library (bedr)
 bed2vcf(bedfile, zero.based = FALSE, fasta = NULL, header = NULL)

 ERROR: Looks like bedtools had a problem
 Error in bedr(engine = "bedtools", input = list(bed = x), method = "getfasta",  : 
  In addition: Warning message:
  In fread(paste(outputDir, "/", outputFile, sep = ""), header = FALSE) :
  File '//tmp/RtmpgLGDR3/get.fasta5d33a3b640afc.bed' has size 0. Returning a NULL data.table.

Any thoughts on that or does anyone have a another solution for converting the CNV file or bed file to vcf file?

Thank you in advance for your help,

Ati

bedr cnv exomedepth vcf • 625 views
ADD COMMENTlink modified 11 months ago by Kevin Blighe68k • written 11 months ago by Ati10
1
gravatar for Kevin Blighe
11 months ago by
Kevin Blighe68k
Republic of Ireland
Kevin Blighe68k wrote:

You need to specify a FASTA file so that it can extract the sequence from it, for example:

require(bedr)

regions <- get.random.regions(n = 10, size.mean = 1)
regions
     chr     start       end
2  chr11  15172184  15172186
9  chr11 128006613 128006615
7  chr13  92245535  92245537
10 chr14  13058158  13058160
1  chr14  26692040  26692042
3   chr2  65159475  65159478
4   chr3  34905392  34905394
8   chr3 168564543 168564545
5   chr6 144121887 144121889
6   chr7  37255515  37255518

bed2vcf(
  regions,
  filename = 'test.vcf',
  zero.based = FALSE,
  fasta = '/ReferenceMaterial/hg38/hg38.fasta',
  header = NULL)

VALIDATE REGIONS
 * Check if index is a string... PASS
 * Check index pattern... PASS
 * Check for missing values... PASS
 * Check for larger start position... PASS.
 * Check if zero based... PASS
CONVERT TO BED
 * Checking input type... PASS
   Input is in bed format
VALIDATE REGIONS
 * Check if index is a string... PASS
 * Check index pattern... PASS
 * Check for missing values... PASS
 * Check for larger start position... PASS.
 * Check if zero based... PASS
 * Checking sort order... PASS
 * Checking for overlapping 'contiguous' regions... PASS
FASTA-QUERY
VALIDATE REGIONS
 * Check if index is a string... PASS
 * Check index pattern... PASS
 * Check for missing values... PASS
 * Check for larger start position... PASS.
 * Check if zero based... PASS
 * Processing input (1): bed
CONVERT TO BED
 * Checking input type... PASS
   Input is in bed format
 * Checking sort order... PASS
 * Checking for overlapping 'contiguous' regions... PASS
   bedtools getfasta -bed /tmp/Rtmp6O0uGa/bed_16d869940993.bed -fi /ReferenceMaterial/hg38/hg38.fasta -tab -fo stdout
WRITING VCF
NULL

Then, outside R:

cat test.vcf 

##fileformat=VCFv4.1
##fileDate=2019-25-20
##source=bedr
##reference=/ReferenceMaterial/hg38/hg38.fasta
#CHROM  POS ID  REF QUAL    FILTER  INFO
chr11   15172185    NA  TG  NA  NA  NA
chr11   128006614   NA  ag  NA  NA  NA
chr13   92245536    NA  TT  NA  NA  NA
chr14   13058159    NA  NN  NA  NA  NA
chr14   26692041    NA  ga  NA  NA  NA
chr2    65159476    NA  ttg NA  NA  NA
chr3    34905393    NA  aa  NA  NA  NA
chr3    168564544   NA  at  NA  NA  NA
chr6    144121888   NA  tg  NA  NA  NA
chr7    37255516    NA  GTT NA  NA  NA

This is obviously a minimalised VCF and would likely fail VCF specificatino checks in certain programs.

Kevin

ADD COMMENTlink modified 11 months ago • written 11 months ago by Kevin Blighe68k
1

Thank you so much. It worked and also I should have added "chr" to my fasta file!

ADD REPLYlink written 11 months ago by Ati10

Dear Kevin, Are you aware of any other tool/script to get the more comprehensive vcf file from CNV file?

ADD REPLYlink written 11 months ago by Ati10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1108 users visited in the last hour