How to annotate CNV events with gene information?
7.9 years ago

Hello friends,

I have CNV calls from four different CNV callers, and I would like to annotate each CNV call with gene information.

What are commonly used tools to annotate CNVs?

How much overlap should I require between CNV calls and gene coordinates if I use bedtools intersect to annotate the calls?

I have 300 samples, so I am looking for command-line options.

SNP RNA DNA-seq CNV events annovar • 5.4k views

CNV annotation can be easily automated (with OMIM, DGV, 1000g, haploinsufficiency, TAD, ... and also with your own in-house information)!

You can look at this post describing the AnnotSV tool: "Annotation for SV and CNV".

7.9 years ago
Amitm ★ 2.3k

Hi,

Passing 0.5 to -f in intersectBed seems reasonable; -f sets the minimum overlap as a fraction of each -a feature (here, the CNV call). Once you are happy with the threshold, make a shell script like this:

#!/bin/sh
# Annotate one CNV BED file ($1) with the genes it overlaps.
bedtools2-2.20.1/bin/intersectBed \
-a "$1" \
-b Homo_sapiens.GRCh37_BED_SORTD.txt \
-wao \
-f 0.5 \
> "${1}_ANNO"

Save it in a file, say CNV_anno.sh. Then you could run something like this in the shell:

for cnv_file in cnv_result_*; do
    case "$cnv_file" in *_ANNO) continue ;; esac  # skip outputs from an earlier run
    sh CNV_anno.sh "$cnv_file"
done

This assumes that your result files start with the prefix cnv_result_. Alter the pattern to match your exact filenames and directory.

The output files get the suffix _ANNO, which you can change in the script.

The -wao option in intersectBed prints both features for every overlap, followed by the number of overlapping base pairs; -a features with no qualifying overlap are also reported, with an overlap of 0.
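To make the choice of threshold less of a guess, the overlap column that -wao appends can be turned into explicit fractions. The following is only a rough sketch under assumptions that are not in the commands above: the CNV file is a 3-column BED (chrom, start, end), the gene file is a 4-column BED (chrom, start, end, gene name), so the -wao output has 8 columns with the overlap in column 8. The function name overlap_fractions is made up for illustration.

```shell
# Hypothetical helper: reads intersectBed -wao output on stdin.
# Assumed layout: cols 1-3 = CNV (chrom, start, end),
#                 cols 4-7 = gene (chrom, start, end, name),
#                 col 8    = overlap in bp.
# Prints: CNV chrom, start, end, gene name,
#         fraction of the CNV covered, fraction of the gene covered.
overlap_fractions() {
    awk 'BEGIN { FS = OFS = "\t" }
    $8 > 0 {                      # skip CNVs reported with no overlap
        cnv_len  = $3 - $2        # BED intervals are half-open
        gene_len = $6 - $5
        printf "%s\t%d\t%d\t%s\t%.2f\t%.2f\n", \
            $1, $2, $3, $7, $8 / cnv_len, $8 / gene_len
    }'
}
```

It could be run as overlap_fractions < cnv_result_1_ANNO (file name assumed). The two fractions often differ a lot: a small gene lying entirely inside a large CNV covers 100% of the gene but only a small fraction of the CNV, so a threshold on the CNV side (-f with the CNV file as -a) can drop it. Recent bedtools releases also offer -F for a threshold on the -b side, which may be closer to what gene annotation needs.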


Thanks, Amit. I was away for a conference.

bedtools intersect -wa -wb -a Homo_sapiens.GRCh37_BED_SORTD.bed -b Sample1_cnv_file.bed -f 0.5 -r > GRCg37_Sample1_overlap.txt

Also, I am now annotating my CNV events against the DGV database using ANNOVAR.

First I tried this command: annotate_variation.pl -regionanno -build hg19 -out ex1 -dbtype dgvMerged example/ex1.avinput humandb/. All 500 of my CNV events were annotated.

Do I need to increase the minimum overlap fraction?

Does this mean that all my CNV events are common in the population?

How do I check whether my CNVs are pathogenic?


I was able to follow this approach and get annotations for the CNVs. Essentially, I extracted the genes that overlap CNVs and then assigned a status (Amp/Del/Neutral) to each gene according to the status of the overlapping CNV. However, this is a simple overlap approach. What is your opinion on using this Amp/Del status directly in visualization tools such as maftools? I know that tools like GISTIC can be run, but our data is non-human, and GISTIC and many other standard tools may not work.
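For anyone wanting to reproduce the gene-level assignment described here, it might look roughly like the sketch below. This is an assumption-laden illustration, not the exact procedure used above: the input is taken to be a tab-separated list with one row per overlapping gene/CNV pair (gene name, CNV status), the function name gene_cnv_status and the "Mixed" label for genes hit by both an Amp and a Del are invented, and genes with no overlapping CNV simply do not appear (they would be Neutral).

```shell
# Hypothetical helper: collapse gene/CNV overlaps into one status per gene.
# Usage: gene_cnv_status SAMPLE_NAME < overlaps.tsv
# Assumed input columns (tab-separated): gene name, CNV status (Amp or Del).
# Output columns: gene name, sample name, status.
gene_cnv_status() {
    awk -v sample="$1" 'BEGIN { FS = OFS = "\t" }
    {
        gene = $1; status = $2
        if (!(gene in seen)) {
            seen[gene] = status
            order[++n] = gene          # remember first-seen order
        } else if (seen[gene] != status) {
            seen[gene] = "Mixed"       # conflicting calls for this gene
        }
    }
    END {
        for (i = 1; i <= n; i++) print order[i], sample, seen[order[i]]
    }'
}
```

The three-column layout (gene, sample, status) is chosen to resemble a per-sample custom copy-number table; whether a given visualization tool accepts it, and how it treats a label like "Mixed", should be checked against that tool's documentation.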
