Question: Picard tool SortVcf gives an empty result file
3.6 years ago by
haiying.kong250
Germany
haiying.kong250 wrote:

  I have downloaded the dbSNP data and am trying to sort the .vcf file with the following command line:

java -Xms10g -Xmx20g -Djava.io.tmpdir=tmp -jar picard.jar SortVcf INPUT=00-All_chr.vcf OUTPUT=00-All_chr_sorted.vcf SEQUENCE_DICTIONARY=hg38.dict

  I added the "chr" prefix to the chromosome names in the dbSNP data.

  The following is the end of the log file produced by this command. I did not get any error message, and it ran for more than an hour, but the result file is empty.

  Can anyone please help me with this?

INFO    2015-09-17 19:10:44     SortVcf read   143,025,000 records.  Elapsed time: 04:21:07s.  Time for last 25,000:    3s.  Last read position: chrY:2,405,259
INFO    2015-09-17 19:10:44     SortVcf read   143,050,000 records.  Elapsed time: 04:21:07s.  Time for last 25,000:    0s.  Last read position: chrY:3,851,348
INFO    2015-09-17 19:10:44     SortVcf read   143,075,000 records.  Elapsed time: 04:21:07s.  Time for last 25,000:    0s.  Last read position: chrY:7,480,875
INFO    2015-09-17 19:10:44     SortVcf read   143,100,000 records.  Elapsed time: 04:21:07s.  Time for last 25,000:    0s.  Last read position: chrY:10,151,437
INFO    2015-09-17 19:10:45     SortVcf read   143,125,000 records.  Elapsed time: 04:21:08s.  Time for last 25,000:    0s.  Last read position: chrY:11,324,243
INFO    2015-09-17 19:10:45     SortVcf read   143,150,000 records.  Elapsed time: 04:21:08s.  Time for last 25,000:    0s.  Last read position: chrY:12,617,451
INFO    2015-09-17 19:10:45     SortVcf read   143,175,000 records.  Elapsed time: 04:21:08s.  Time for last 25,000:    0s.  Last read position: chrY:15,351,020
INFO    2015-09-17 19:10:45     SortVcf read   143,200,000 records.  Elapsed time: 04:21:08s.  Time for last 25,000:    0s.  Last read position: chrY:19,370,690
INFO    2015-09-17 19:10:45     SortVcf read   143,225,000 records.  Elapsed time: 04:21:08s.  Time for last 25,000:    0s.  Last read position: chrY:21,843,643
INFO    2015-09-17 19:10:46     SortVcf read   143,250,000 records.  Elapsed time: 04:21:09s.  Time for last 25,000:    0s.  Last read position: chrY:57,084,728
[Thu Sep 17 19:10:59 CEST 2015] picard.vcf.SortVcf done. Elapsed time: 261.39 minutes.
Runtime.totalMemory()=12853444608
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.NullPointerException
        at htsjdk.variant.variantcontext.VariantContextComparator.compare(VariantContextComparator.java:84)
        at htsjdk.variant.variantcontext.VariantContextComparator.compare(VariantContextComparator.java:21)
        at java.util.TimSort.countRunAndMakeAscending(TimSort.java:360)
        at java.util.TimSort.sort(TimSort.java:234)
        at java.util.Arrays.sort(Arrays.java:1512)
        at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:218)
        at htsjdk.samtools.util.SortingCollection.doneAdding(SortingCollection.java:190)
        at htsjdk.samtools.util.SortingCollection.iterator(SortingCollection.java:265)
        at htsjdk.samtools.util.SortingCollection.iterator(SortingCollection.java:58)
        at picard.vcf.SortVcf.writeSortedOutput(SortVcf.java:171)
        at picard.vcf.SortVcf.doWork(SortVcf.java:90)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:206)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
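
The tail of the log above does include a java.lang.NullPointerException after the "done" line, so it can help to scan the log and the output file mechanically. A minimal sketch, assuming the Picard output was redirected to a log file; the function names and the file name sortvcf.log are hypothetical, not part of Picard:

```shell
# Print every log line that mentions a Java exception, with its line number.
# Returns non-zero when nothing is found (grep's usual behaviour).
find_exceptions() {
    grep -n 'Exception in thread' "$1"
}

# Count data (non-header) records in a VCF; an empty result file prints 0.
count_records() {
    awk '!/^#/ { n++ } END { print n + 0 }' "$1"
}

# Hypothetical usage:
#   find_exceptions sortvcf.log
#   count_records 00-All_chr_sorted.vcf
```

If the first call prints a line and the second prints 0, the job stopped with an error before any sorted records were written.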

 


Additional information.

 

[kong@hpc85 hg38]$ grep SQ hg38.dict | cut -f 2
SN:chr1
SN:chr10
SN:chr11
SN:chr11_KI270721v1_random
SN:chr12
SN:chr13
SN:chr14
SN:chr14_GL000009v2_random
SN:chr14_GL000225v1_random
SN:chr14_KI270722v1_random
SN:chr14_GL000194v1_random
SN:chr14_KI270723v1_random
SN:chr14_KI270724v1_random
SN:chr14_KI270725v1_random
SN:chr14_KI270726v1_random
SN:chr15
SN:chr15_KI270727v1_random
SN:chr16
SN:chr16_KI270728v1_random
SN:chr17
SN:chr17_GL000205v2_random
SN:chr17_KI270729v1_random
SN:chr17_KI270730v1_random
SN:chr18
SN:chr19
SN:chr1_KI270706v1_random
SN:chr1_KI270707v1_random
SN:chr1_KI270708v1_random
SN:chr1_KI270709v1_random
SN:chr1_KI270710v1_random
SN:chr1_KI270711v1_random
SN:chr1_KI270712v1_random
SN:chr1_KI270713v1_random
SN:chr1_KI270714v1_random
SN:chr2
SN:chr20
SN:chr21
SN:chr22
SN:chr22_KI270731v1_random
SN:chr22_KI270732v1_random
SN:chr22_KI270733v1_random
SN:chr22_KI270734v1_random
SN:chr22_KI270735v1_random
SN:chr22_KI270736v1_random
SN:chr22_KI270737v1_random
SN:chr22_KI270738v1_random
SN:chr22_KI270739v1_random
SN:chr2_KI270715v1_random
SN:chr2_KI270716v1_random
SN:chr3
SN:chr3_GL000221v1_random
SN:chr4
SN:chr4_GL000008v2_random
SN:chr5
SN:chr5_GL000208v1_random
SN:chr6
SN:chr7
SN:chr8
SN:chr9
SN:chr9_KI270717v1_random
SN:chr9_KI270718v1_random
SN:chr9_KI270719v1_random
SN:chr9_KI270720v1_random
SN:chr1_KI270762v1_alt
SN:chr1_KI270766v1_alt
SN:chr1_KI270760v1_alt
SN:chr1_KI270765v1_alt
SN:chr1_GL383518v1_alt
SN:chr1_GL383519v1_alt
SN:chr1_GL383520v2_alt
SN:chr1_KI270764v1_alt
SN:chr1_KI270763v1_alt
SN:chr1_KI270759v1_alt
SN:chr1_KI270761v1_alt
SN:chr2_KI270770v1_alt
SN:chr2_KI270773v1_alt
SN:chr2_KI270774v1_alt
SN:chr2_KI270769v1_alt
SN:chr2_GL383521v1_alt
SN:chr2_KI270772v1_alt
SN:chr2_KI270775v1_alt
SN:chr2_KI270771v1_alt
SN:chr2_KI270768v1_alt
SN:chr2_GL582966v2_alt
SN:chr2_GL383522v1_alt
SN:chr2_KI270776v1_alt
SN:chr2_KI270767v1_alt
SN:chr3_JH636055v2_alt
SN:chr3_KI270783v1_alt
SN:chr3_KI270780v1_alt
SN:chr3_GL383526v1_alt
SN:chr3_KI270777v1_alt
SN:chr3_KI270778v1_alt
SN:chr3_KI270781v1_alt
SN:chr3_KI270779v1_alt
SN:chr3_KI270782v1_alt
SN:chr3_KI270784v1_alt
SN:chr4_KI270790v1_alt
SN:chr4_GL383528v1_alt
SN:chr4_KI270787v1_alt
SN:chr4_GL000257v2_alt
SN:chr4_KI270788v1_alt
SN:chr4_GL383527v1_alt
SN:chr4_KI270785v1_alt
SN:chr4_KI270789v1_alt
SN:chr4_KI270786v1_alt
SN:chr5_KI270793v1_alt
SN:chr5_KI270792v1_alt
SN:chr5_KI270791v1_alt
SN:chr5_GL383532v1_alt
SN:chr5_GL949742v1_alt
SN:chr5_KI270794v1_alt
SN:chr5_GL339449v2_alt
SN:chr5_GL383530v1_alt
SN:chr5_KI270796v1_alt
SN:chr5_GL383531v1_alt
SN:chr5_KI270795v1_alt
SN:chr6_GL000250v2_alt
SN:chr6_KI270800v1_alt
SN:chr6_KI270799v1_alt
SN:chr6_GL383533v1_alt
SN:chr6_KI270801v1_alt
SN:chr6_KI270802v1_alt
SN:chr6_KB021644v2_alt
SN:chr6_KI270797v1_alt
SN:chr6_KI270798v1_alt
SN:chr7_KI270804v1_alt
SN:chr7_KI270809v1_alt
SN:chr7_KI270806v1_alt
SN:chr7_GL383534v2_alt
SN:chr7_KI270803v1_alt
SN:chr7_KI270808v1_alt
SN:chr7_KI270807v1_alt
SN:chr7_KI270805v1_alt
SN:chr8_KI270818v1_alt
SN:chr8_KI270812v1_alt
SN:chr8_KI270811v1_alt
SN:chr8_KI270821v1_alt
SN:chr8_KI270813v1_alt
SN:chr8_KI270822v1_alt
SN:chr8_KI270814v1_alt
SN:chr8_KI270810v1_alt
SN:chr8_KI270819v1_alt
SN:chr8_KI270820v1_alt
SN:chr8_KI270817v1_alt
SN:chr8_KI270816v1_alt
SN:chr8_KI270815v1_alt
SN:chr9_GL383539v1_alt
SN:chr9_GL383540v1_alt
SN:chr9_GL383541v1_alt
SN:chr9_GL383542v1_alt
SN:chr9_KI270823v1_alt
SN:chr10_GL383545v1_alt
SN:chr10_KI270824v1_alt
SN:chr10_GL383546v1_alt
SN:chr10_KI270825v1_alt
SN:chr11_KI270832v1_alt
SN:chr11_KI270830v1_alt
SN:chr11_KI270831v1_alt
SN:chr11_KI270829v1_alt
SN:chr11_GL383547v1_alt
SN:chr11_JH159136v1_alt
SN:chr11_JH159137v1_alt
SN:chr11_KI270827v1_alt
SN:chr11_KI270826v1_alt
SN:chr12_GL877875v1_alt
SN:chr12_GL877876v1_alt
SN:chr12_KI270837v1_alt
SN:chr12_GL383549v1_alt
SN:chr12_KI270835v1_alt
SN:chr12_GL383550v2_alt
SN:chr12_GL383552v1_alt
SN:chr12_GL383553v2_alt
SN:chr12_KI270834v1_alt
SN:chr12_GL383551v1_alt
SN:chr12_KI270833v1_alt
SN:chr12_KI270836v1_alt
SN:chr13_KI270840v1_alt
SN:chr13_KI270839v1_alt
SN:chr13_KI270843v1_alt
SN:chr13_KI270841v1_alt
SN:chr13_KI270838v1_alt
SN:chr13_KI270842v1_alt
SN:chr14_KI270844v1_alt
SN:chr14_KI270847v1_alt
SN:chr14_KI270845v1_alt
SN:chr14_KI270846v1_alt
SN:chr15_KI270852v1_alt
SN:chr15_KI270851v1_alt
SN:chr15_KI270848v1_alt
SN:chr15_GL383554v1_alt
SN:chr15_KI270849v1_alt
SN:chr15_GL383555v2_alt
SN:chr15_KI270850v1_alt
SN:chr16_KI270854v1_alt
SN:chr16_KI270856v1_alt
SN:chr16_KI270855v1_alt
SN:chr16_KI270853v1_alt
SN:chr16_GL383556v1_alt
SN:chr16_GL383557v1_alt
SN:chr17_GL383563v3_alt
SN:chr17_KI270862v1_alt
SN:chr17_KI270861v1_alt
SN:chr17_KI270857v1_alt
SN:chr17_JH159146v1_alt
SN:chr17_JH159147v1_alt
SN:chr17_GL383564v2_alt
SN:chr17_GL000258v2_alt
SN:chr17_GL383565v1_alt
SN:chr17_KI270858v1_alt
SN:chr17_KI270859v1_alt
SN:chr17_GL383566v1_alt
SN:chr17_KI270860v1_alt
SN:chr18_KI270864v1_alt
SN:chr18_GL383567v1_alt
SN:chr18_GL383570v1_alt
SN:chr18_GL383571v1_alt
SN:chr18_GL383568v1_alt
SN:chr18_GL383569v1_alt
SN:chr18_GL383572v1_alt
SN:chr18_KI270863v1_alt
SN:chr19_KI270868v1_alt
SN:chr19_KI270865v1_alt
SN:chr19_GL383573v1_alt
SN:chr19_GL383575v2_alt
SN:chr19_GL383576v1_alt
SN:chr19_GL383574v1_alt
SN:chr19_KI270866v1_alt
SN:chr19_KI270867v1_alt
SN:chr19_GL949746v1_alt
SN:chr20_GL383577v2_alt
SN:chr20_KI270869v1_alt
SN:chr20_KI270871v1_alt
SN:chr20_KI270870v1_alt
SN:chr21_GL383578v2_alt
SN:chr21_KI270874v1_alt
SN:chr21_KI270873v1_alt
SN:chr21_GL383579v2_alt
SN:chr21_GL383580v2_alt
SN:chr21_GL383581v2_alt
SN:chr21_KI270872v1_alt
SN:chr22_KI270875v1_alt
SN:chr22_KI270878v1_alt
SN:chr22_KI270879v1_alt
SN:chr22_KI270876v1_alt
SN:chr22_KI270877v1_alt
SN:chr22_GL383583v2_alt
SN:chr22_GL383582v2_alt
SN:chrX_KI270880v1_alt
SN:chrX_KI270881v1_alt
SN:chr19_KI270882v1_alt
SN:chr19_KI270883v1_alt
SN:chr19_KI270884v1_alt
SN:chr19_KI270885v1_alt
SN:chr19_KI270886v1_alt
SN:chr19_KI270887v1_alt
SN:chr19_KI270888v1_alt
SN:chr19_KI270889v1_alt
SN:chr19_KI270890v1_alt
SN:chr19_KI270891v1_alt
SN:chr1_KI270892v1_alt
SN:chr2_KI270894v1_alt
SN:chr2_KI270893v1_alt
SN:chr3_KI270895v1_alt
SN:chr4_KI270896v1_alt
SN:chr5_KI270897v1_alt
SN:chr5_KI270898v1_alt
SN:chr6_GL000251v2_alt
SN:chr7_KI270899v1_alt
SN:chr8_KI270901v1_alt
SN:chr8_KI270900v1_alt
SN:chr11_KI270902v1_alt
SN:chr11_KI270903v1_alt
SN:chr12_KI270904v1_alt
SN:chr15_KI270906v1_alt
SN:chr15_KI270905v1_alt
SN:chr17_KI270907v1_alt
SN:chr17_KI270910v1_alt
SN:chr17_KI270909v1_alt
SN:chr17_JH159148v1_alt
SN:chr17_KI270908v1_alt
SN:chr18_KI270912v1_alt
SN:chr18_KI270911v1_alt
SN:chr19_GL949747v2_alt
SN:chr22_KB663609v1_alt
SN:chrX_KI270913v1_alt
SN:chr19_KI270914v1_alt
SN:chr19_KI270915v1_alt
SN:chr19_KI270916v1_alt
SN:chr19_KI270917v1_alt
SN:chr19_KI270918v1_alt
SN:chr19_KI270919v1_alt
SN:chr19_KI270920v1_alt
SN:chr19_KI270921v1_alt
SN:chr19_KI270922v1_alt
SN:chr19_KI270923v1_alt
SN:chr3_KI270924v1_alt
SN:chr4_KI270925v1_alt
SN:chr6_GL000252v2_alt
SN:chr8_KI270926v1_alt
SN:chr11_KI270927v1_alt
SN:chr19_GL949748v2_alt
SN:chr22_KI270928v1_alt
SN:chr19_KI270929v1_alt
SN:chr19_KI270930v1_alt
SN:chr19_KI270931v1_alt
SN:chr19_KI270932v1_alt
SN:chr19_KI270933v1_alt
SN:chr19_GL000209v2_alt
SN:chr3_KI270934v1_alt
SN:chr6_GL000253v2_alt
SN:chr19_GL949749v2_alt
SN:chr3_KI270935v1_alt
SN:chr6_GL000254v2_alt
SN:chr19_GL949750v2_alt
SN:chr3_KI270936v1_alt
SN:chr6_GL000255v2_alt
SN:chr19_GL949751v2_alt
SN:chr3_KI270937v1_alt
SN:chr6_GL000256v2_alt
SN:chr19_GL949752v1_alt
SN:chr6_KI270758v1_alt
SN:chr19_GL949753v2_alt
SN:chr19_KI270938v1_alt
SN:chrM
SN:chrUn_KI270302v1
SN:chrUn_KI270304v1
SN:chrUn_KI270303v1
SN:chrUn_KI270305v1
SN:chrUn_KI270322v1
SN:chrUn_KI270320v1
SN:chrUn_KI270310v1
SN:chrUn_KI270316v1
SN:chrUn_KI270315v1
SN:chrUn_KI270312v1
SN:chrUn_KI270311v1
SN:chrUn_KI270317v1
SN:chrUn_KI270412v1
SN:chrUn_KI270411v1
SN:chrUn_KI270414v1
SN:chrUn_KI270419v1
SN:chrUn_KI270418v1
SN:chrUn_KI270420v1
SN:chrUn_KI270424v1
SN:chrUn_KI270417v1
SN:chrUn_KI270422v1
SN:chrUn_KI270423v1
SN:chrUn_KI270425v1
SN:chrUn_KI270429v1
SN:chrUn_KI270442v1
SN:chrUn_KI270466v1
SN:chrUn_KI270465v1
SN:chrUn_KI270467v1
SN:chrUn_KI270435v1
SN:chrUn_KI270438v1
SN:chrUn_KI270468v1
SN:chrUn_KI270510v1
SN:chrUn_KI270509v1
SN:chrUn_KI270518v1
SN:chrUn_KI270508v1
SN:chrUn_KI270516v1
SN:chrUn_KI270512v1
SN:chrUn_KI270519v1
SN:chrUn_KI270522v1
SN:chrUn_KI270511v1
SN:chrUn_KI270515v1
SN:chrUn_KI270507v1
SN:chrUn_KI270517v1
SN:chrUn_KI270529v1
SN:chrUn_KI270528v1
SN:chrUn_KI270530v1
SN:chrUn_KI270539v1
SN:chrUn_KI270538v1
SN:chrUn_KI270544v1
SN:chrUn_KI270548v1
SN:chrUn_KI270583v1
SN:chrUn_KI270587v1
SN:chrUn_KI270580v1
SN:chrUn_KI270581v1
SN:chrUn_KI270579v1
SN:chrUn_KI270589v1
SN:chrUn_KI270590v1
SN:chrUn_KI270584v1
SN:chrUn_KI270582v1
SN:chrUn_KI270588v1
SN:chrUn_KI270593v1
SN:chrUn_KI270591v1
SN:chrUn_KI270330v1
SN:chrUn_KI270329v1
SN:chrUn_KI270334v1
SN:chrUn_KI270333v1
SN:chrUn_KI270335v1
SN:chrUn_KI270338v1
SN:chrUn_KI270340v1
SN:chrUn_KI270336v1
SN:chrUn_KI270337v1
SN:chrUn_KI270363v1
SN:chrUn_KI270364v1
SN:chrUn_KI270362v1
SN:chrUn_KI270366v1
SN:chrUn_KI270378v1
SN:chrUn_KI270379v1
SN:chrUn_KI270389v1
SN:chrUn_KI270390v1
SN:chrUn_KI270387v1
SN:chrUn_KI270395v1
SN:chrUn_KI270396v1
SN:chrUn_KI270388v1
SN:chrUn_KI270394v1
SN:chrUn_KI270386v1
SN:chrUn_KI270391v1
SN:chrUn_KI270383v1
SN:chrUn_KI270393v1
SN:chrUn_KI270384v1
SN:chrUn_KI270392v1
SN:chrUn_KI270381v1
SN:chrUn_KI270385v1
SN:chrUn_KI270382v1
SN:chrUn_KI270376v1
SN:chrUn_KI270374v1
SN:chrUn_KI270372v1
SN:chrUn_KI270373v1
SN:chrUn_KI270375v1
SN:chrUn_KI270371v1
SN:chrUn_KI270448v1
SN:chrUn_KI270521v1
SN:chrUn_GL000195v1
SN:chrUn_GL000219v1
SN:chrUn_GL000220v1
SN:chrUn_GL000224v1
SN:chrUn_KI270741v1
SN:chrUn_GL000226v1
SN:chrUn_GL000213v1
SN:chrUn_KI270743v1
SN:chrUn_KI270744v1
SN:chrUn_KI270745v1
SN:chrUn_KI270746v1
SN:chrUn_KI270747v1
SN:chrUn_KI270748v1
SN:chrUn_KI270749v1
SN:chrUn_KI270750v1
SN:chrUn_KI270751v1
SN:chrUn_KI270752v1
SN:chrUn_KI270753v1
SN:chrUn_KI270754v1
SN:chrUn_KI270755v1
SN:chrUn_KI270756v1
SN:chrUn_KI270757v1
SN:chrUn_GL000214v1
SN:chrUn_KI270742v1
SN:chrUn_GL000216v2
SN:chrUn_GL000218v1
SN:chrX
SN:chrY
SN:chrY_KI270740v1_random

 

 

[kong@hpc22 dbSNP]$ grep -v "#" 00-All_chr.vcf | cut -f 1 | uniq | sort | uniq
chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chrMT
chrX
chrY

 

written 3.6 years ago by haiying.kong250
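
The two listings above can also be compared mechanically instead of by eye. A minimal sketch, assuming a tab-separated sequence dictionary with @SQ lines carrying SN: fields (as the grep output above suggests); the function name missing_contigs is hypothetical:

```shell
# Print each contig name used in the VCF data lines that has no @SQ SN: entry
# in the sequence dictionary. Each name is printed once, in order of first use.
missing_contigs() {
    vcf=$1
    dict=$2
    awk -F '\t' '
        FILENAME == ARGV[1] {                 # first file: the dictionary
            if ($1 == "@SQ")
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^SN:/) known[substr($i, 4)] = 1
            next
        }
        /^#/ || NF == 0 { next }              # skip VCF header and blank lines
        !($1 in known) && !($1 in seen) { seen[$1] = 1; print $1 }
    ' "$dict" "$vcf"
}

# Hypothetical usage:
#   missing_contigs 00-All_chr.vcf hg38.dict
```

Going by the listings posted above, chrMT appears among the VCF contigs while the dictionary lists chrM, so a check like this would be expected to report chrMT. Whether that mismatch explains the NullPointerException is not established in this thread.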
3.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum119k wrote:

Adding a 'chr' prefix is not enough: for example, the mitochondrial chromosome is named 'MT' in the NCBI data and 'chrM' in hg38.
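
One way to handle a name that differs by more than the prefix is to rename it explicitly. A minimal sketch, assuming the only remaining difference is chrMT versus chrM and that any ##contig header lines (if the file has them) should be renamed too; the function name rename_contig and the output file name are hypothetical:

```shell
# Rename one contig in a VCF read from stdin: rewrites column 1 of data lines
# and the ID of a matching ##contig header line; all other lines pass through.
rename_contig() {
    old=$1
    new=$2
    awk -v old="$old" -v new="$new" '
        BEGIN { FS = OFS = "\t" }
        /^##contig=<ID=/ {
            prefix = "##contig=<ID=" old
            rest = substr($0, length(prefix) + 1)
            # Match the whole ID only, i.e. it must be followed by "," or ">".
            if (index($0, prefix) == 1 && rest ~ /^[,>]/) {
                print "##contig=<ID=" new rest
                next
            }
        }
        /^#/ { print; next }                  # other header lines unchanged
        $1 == old { $1 = new }
        { print }
    '
}

# Hypothetical usage:
#   rename_contig chrMT chrM < 00-All_chr.vcf > 00-All_chrM.vcf
```

Matching the whole first column, rather than doing a text substitution on the line, avoids touching other fields that happen to contain the same string.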


Thanks for your quick reply.

Then how should I change the chromosome names in dbSNP to match the reference human genome file (hg38) that I downloaded from the UCSC website?

written 3.6 years ago by haiying.kong250

Hmm... I just checked: the 'MT' genome is declared as 'MT' in the VCF too, so this is not the origin of the error...

written 3.6 years ago by Pierre Lindenbaum119k

Dear Dr. Lindenbaum,

  Could you please help me with this? I need to get this dbSNP database sorted out to use GATK.

  Thank you very much.

  Best regards,

written 3.6 years ago by haiying.kong250

I cannot help without knowing what's in your files; it may be a bug. You'd better ask the Picard mailing list.
 

written 3.6 years ago by Pierre Lindenbaum119k