Question

Suggestion of specific tools/pipelines for the manipulation and analysis of VCF files after somatic copy number variant calling


4.3 years ago

svlachavas ▴ 790

Dear Community,

Briefly: I ran somatic copy number variant calling with the cnv_facets command-line tool, which is based on the original FACETS algorithm, and obtained a set of VCF files containing the final CNAs. My main issue concerns the manipulation of these VCF files:

1) Filter each VCF file on the FILTER column to keep only the PASS CNV calls (which is simple and doable).

2) In addition to the standard columns, keep specific values from the INFO column, such as END and TCN_EM, so that I have all the columns needed for downstream analysis and annotation of the resulting variants. As you can see from the attached file, this is the major issue: these values are embedded in the INFO column rather than stored as individual columns like the others. I also tried converting the file manually to a txt file, but the same problem persists there.
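
To make the two steps above concrete, here is a minimal Python sketch of the transformation I am after, using only the standard library. It is an illustration under assumptions, not a tested pipeline: the records in EXAMPLE_VCF are made up (including the SVTYPE values and the LowQual filter label), the function names are hypothetical, and only the END and TCN_EM keys are taken from my actual files.

```python
import csv
import io

# Made-up records for illustration only; real cnv_facets output has more INFO keys.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t1000\t.\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=5000;TCN_EM=3\n"
    "1\t6000\t.\tN\t<DEL>\t.\tLowQual\tSVTYPE=DEL;END=9000;TCN_EM=1\n"
    "2\t200\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=800\n"
)


def parse_info(info):
    """Split a VCF INFO string into a dict; flag entries without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def pass_calls_to_rows(lines, info_keys=("END", "TCN_EM")):
    """Keep PASS records and pull the requested INFO keys into their own columns."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.split("\t")
        chrom, pos, alt, filt, info = cols[0], cols[1], cols[4], cols[6], cols[7]
        if filt != "PASS":
            continue  # step 1: keep only PASS calls
        parsed = parse_info(info)
        # step 2: one column per requested INFO key, "." when the key is absent
        rows.append([chrom, pos, alt] + [str(parsed.get(k, ".")) for k in info_keys])
    return rows


def rows_to_tsv(rows, info_keys=("END", "TCN_EM")):
    """Render the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["CHROM", "POS", "ALT", *info_keys])
    writer.writerows(rows)
    return buffer.getvalue()


rows = pass_calls_to_rows(EXAMPLE_VCF.splitlines())
table = rows_to_tsv(rows)
print(table)
```

For a real file, the lines could come from an open file handle instead of EXAMPLE_VCF.splitlines(), and a missing INFO key is written as "." rather than raising an error.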

Is there a tool or pipeline that you would suggest for this kind of manipulation, especially for the second point, and that could also convert the VCF file to a different format?

I have also included a link to an example input VCF file:

https://www.dropbox.com/s/qh9uwlb7j07qh4p/example.VCF.vcf?dl=0

Thank you in advance,

Efstathios

vcf cnv facets CNA VCF analysis • 875 views
