Hi
I have a large ANNOVAR annotation of my SNVs and indels from whole-genome sequencing.
In the Chromosome column I have chr1 to chr22 as well as other contigs, like this:
> unique(anno_maf$Chromosome)
[1] "chr1" "chr6"
[3] "chr16" "chr17"
[5] "chr20" "chr2"
[7] "chr3" "chr4"
[9] "chr14" "chr19"
[11] "chr5" "chr10"
[13] "chr9" "chr12"
[15] "chr13" "chr11"
[17] "chr22" "chr7"
[19] "chr15" "chr8"
[21] "chr18" "chr21"
[23] "chrX" "chrY"
[25] "chr4_gl000194_random" "chr17_gl000205_random"
[27] "chrUn_gl000241" "hs37d5"
[29] "chrUn_gl000219" "chrUn_gl000234"
[31] "chr1_gl000191_random" "chrUn_gl000211"
[33] "chrUn_gl000224" "chrUn_gl000225"
[35] "chr17_gl000203_random" "chrUn_gl000212"
[37] "chrUn_gl000243" "chrUn_gl000214"
[39] "chrM" "chr1_gl000192_random"
[41] "chr7_gl000195_random" "chrUn_gl000232"
[43] "chr4_gl000193_random" "chr19_gl000208_random"
[45] "chrUn_gl000226" "chrUn_gl000218"
[47] "chr9_gl000199_random" "chrUn_gl000217"
[49] "chrUn_gl000229" "chrUn_gl000216"
[51] "chrUn_gl000231" "chr9_gl000198_random"
[53] "chr17_gl000204_random" "chrUn_gl000220"
[55] "chrUn_gl000235" "chr11_gl000202_random"
[57] "chrUn_gl000222" "chrUn_gl000240"
[59] "chrUn_gl000233" "chrUn_gl000230"
[61] "chrUn_gl000213" "chrUn_gl000238"
[63] "chr19_gl000209_random" "chrUn_gl000237"
[65] "#CHROM"
In your experience, should I ignore everything other than chromosomes 1 to 22 in my analysis? For instance, if my goal is to compare somatic variants and copy number changes between two conditions, does it make sense to analyse only chr1 to chr22 and ignore the random, unplaced, or otherwise poorly annotated parts of the genome? Would that hurt at all?
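To make the filtering in question concrete, here is a minimal sketch in R. It assumes `anno_maf` is a data frame with a `Chromosome` column, as in the output above. The toy table and the names `primary`, `autosomes`, `anno_primary` and `anno_autosomal` are made up for illustration. Whether to keep chrX, chrY and chrM depends on the analysis, so treat the lists as a starting point rather than a recommendation.

```r
# Toy stand-in for the real annotation table (hypothetical rows only).
anno_maf <- data.frame(
  Chromosome = c("chr1", "chr22", "chrX", "chrY", "chrM",
                 "chrUn_gl000241", "chr4_gl000194_random",
                 "hs37d5", "#CHROM"),
  stringsAsFactors = FALSE
)

# Names of the primary chromosomes: autosomes plus the sex chromosomes.
autosomes <- paste0("chr", 1:22)
primary   <- c(autosomes, "chrX", "chrY")

# Check how many rows would be kept (TRUE) or dropped (FALSE) before filtering.
print(table(anno_maf$Chromosome %in% primary))

# Keep only rows on the primary chromosomes. This also drops the stray
# "#CHROM" header row and the hs37d5 decoy contig.
anno_primary <- anno_maf[anno_maf$Chromosome %in% primary, , drop = FALSE]

# Autosome-only variant of the same filter.
anno_autosomal <- anno_maf[anno_maf$Chromosome %in% autosomes, , drop = FALSE]
```

Tabulating the dropped rows first shows how much of the call set sits on the random and unplaced contigs before you commit to discarding it.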
As I understand it, the autosomes (chr1 to chr22) and the sex chromosomes (chrX and chrY) are the preferred chromosomes for alignment-based analysis. Please refer to the following links: