rDGIdb - VCF input error using locateVariants command (VariantAnnotation)
4.4 years ago

Hi Everyone!

I am having trouble using rDGIdb to search for possible drug candidates. I want to use the VCF input option and am following the code in the vignette at https://bioconductor.org/packages/release/bioc/vignettes/rDGIdb/inst/doc/vignette.pdf. However, the last step always fails with an error:

library(VariantAnnotation)

library(TxDb.Hsapiens.UCSC.hg19.knownGene)

library(org.Hs.eg.db)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

vcf <- readVcf("somatic-wgs.vcf.gz", "hg19")

seqlevels(vcf) <- paste("chr", seqlevels(vcf), sep = "")

rd <- rowRanges(vcf)

loc <- locateVariants(rd, txdb, CodingVariants())

And the error is: Error in compatibleSeqnames(rep(seqnames(x), elementNROWS(y)), seqnames(y@unlistData)) : Level set of 'x' must be subset of that of 'y', or vice versa

I checked the seqlevels of both my VCF file and the TxDb:

seqlevels(vcf)

[1] "chr1" "chr2" "chr3" "chr4" "chr5"
[6] "chr6" "chr7" "chr8" "chr9" "chr10"
[11] "chr11" "chr12" "chr13" "chr14" "chr15"
[16] "chr16" "chr17" "chr18" "chr19" "chr20"
[21] "chr21" "chr22" "chrX" "chrY" "chrMT"
[26] "chrGL000207.1" "chrGL000226.1" "chrGL000229.1" "chrGL000231.1" "chrGL000210.1"
[31] "chrGL000239.1" "chrGL000235.1" "chrGL000201.1" "chrGL000247.1" "chrGL000245.1"
[36] "chrGL000197.1" "chrGL000203.1" "chrGL000246.1" "chrGL000249.1" "chrGL000196.1"
[41] "chrGL000248.1" "chrGL000244.1" "chrGL000238.1" "chrGL000202.1" "chrGL000234.1"
[46] "chrGL000232.1" "chrGL000206.1" "chrGL000240.1" "chrGL000236.1" "chrGL000241.1"
[51] "chrGL000243.1" "chrGL000242.1" "chrGL000230.1" "chrGL000237.1" "chrGL000233.1"
[56] "chrGL000204.1" "chrGL000198.1" "chrGL000208.1" "chrGL000191.1" "chrGL000227.1"
[61] "chrGL000228.1" "chrGL000214.1" "chrGL000221.1" "chrGL000209.1" "chrGL000218.1"
[66] "chrGL000220.1" "chrGL000213.1" "chrGL000211.1" "chrGL000199.1" "chrGL000217.1"
[71] "chrGL000216.1" "chrGL000215.1" "chrGL000205.1" "chrGL000219.1" "chrGL000224.1"
[76] "chrGL000223.1" "chrGL000195.1" "chrGL000212.1" "chrGL000222.1" "chrGL000200.1"
[81] "chrGL000193.1" "chrGL000194.1" "chrGL000225.1" "chrGL000192.1"

seqlevels(txdb)

[1] "chr1" "chr2" "chr3"
[4] "chr4" "chr5" "chr6"
[7] "chr7" "chr8" "chr9"
[10] "chr10" "chr11" "chr12"
[13] "chr13" "chr14" "chr15"
[16] "chr16" "chr17" "chr18"
[19] "chr19" "chr20" "chr21"
[22] "chr22" "chrX" "chrY"
[25] "chrM" "chr1_gl000191_random" "chr1_gl000192_random"
[28] "chr4_ctg9_hap1" "chr4_gl000193_random" "chr4_gl000194_random"
[31] "chr6_apd_hap1" "chr6_cox_hap2" "chr6_dbb_hap3"
[34] "chr6_mann_hap4" "chr6_mcf_hap5" "chr6_qbl_hap6"
[37] "chr6_ssto_hap7" "chr7_gl000195_random" "chr8_gl000196_random"
[40] "chr8_gl000197_random" "chr9_gl000198_random" "chr9_gl000199_random"
[43] "chr9_gl000200_random" "chr9_gl000201_random" "chr11_gl000202_random"
[46] "chr17_ctg5_hap1" "chr17_gl000203_random" "chr17_gl000204_random"
[49] "chr17_gl000205_random" "chr17_gl000206_random" "chr18_gl000207_random"
[52] "chr19_gl000208_random" "chr19_gl000209_random" "chr21_gl000210_random"
[55] "chrUn_gl000211" "chrUn_gl000212" "chrUn_gl000213"
[58] "chrUn_gl000214" "chrUn_gl000215" "chrUn_gl000216"
[61] "chrUn_gl000217" "chrUn_gl000218" "chrUn_gl000219"
[64] "chrUn_gl000220" "chrUn_gl000221" "chrUn_gl000222"
[67] "chrUn_gl000223" "chrUn_gl000224" "chrUn_gl000225"
[70] "chrUn_gl000226" "chrUn_gl000227" "chrUn_gl000228"
[73] "chrUn_gl000229" "chrUn_gl000230" "chrUn_gl000231"
[76] "chrUn_gl000232" "chrUn_gl000233" "chrUn_gl000234"
[79] "chrUn_gl000235" "chrUn_gl000236" "chrUn_gl000237"
[82] "chrUn_gl000238" "chrUn_gl000239" "chrUn_gl000240"
[85] "chrUn_gl000241" "chrUn_gl000242" "chrUn_gl000243"
[88] "chrUn_gl000244" "chrUn_gl000245" "chrUn_gl000246"
[91] "chrUn_gl000247" "chrUn_gl000248" "chrUn_gl000249"

Any suggestions? Thank you!

14 months ago
Gaurav • 0

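I think the issue is that the seqlevels do not match; as the error message says, one set must be a subset of the other. Your VCF has names such as chrMT and chrGL000207.1, while the TxDb uses chrM and names like chr18_gl000207_random, so neither set contains the other. You could restrict the VCF to the main chromosomes by setting its seqlevels to only those, and then it should work. If you want to use all contigs, make sure the names match exactly.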
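Something along these lines might do it. This is only a sketch based on the code in the question, not something tested on your file: it renames chrMT to chrM, then keeps only the sequence names that the VCF ranges and the TxDb share. The variable name `shared` is just for illustration, and `pruning.mode = "coarse"` drops the variants that sit on the removed contigs.

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
vcf <- readVcf("somatic-wgs.vcf.gz", "hg19")
seqlevels(vcf) <- paste0("chr", seqlevels(vcf))

rd <- rowRanges(vcf)

# Names present in the VCF ranges but unknown to the TxDb
# (from the output in the question: "chrMT" and the "chrGL..." contigs)
setdiff(seqlevels(rd), seqlevels(txdb))

# Rename the mitochondrial contig to the UCSC name used by the TxDb
seqlevels(rd)[seqlevels(rd) == "chrMT"] <- "chrM"

# Keep only sequences known to both objects; variants on other contigs are dropped
shared <- intersect(seqlevels(rd), seqlevels(txdb))
rd <- keepSeqlevels(rd, shared, pruning.mode = "coarse")

loc <- locateVariants(rd, txdb, CodingVariants())
```

After this, `seqlevels(rd)` should be a subset of `seqlevels(txdb)`, which is what the error message asks for. The unplaced GL contigs are simply discarded here; keeping them would need a name-by-name mapping to the UCSC style.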
