Question: Annotate merged vcfs
Asked 3 months ago by banerjeeshayantan:

I have 10 normal/tumor samples, and I did the following:
1. Generated VCFs
2. Normalised them
3. Merged them together
Now I want to annotate the merged VCF using java -Xmx4g -jar snpEff.jar -v -stats ex1.html GRCm38.75 finalMerge.vcf > final.anno.vcf
but I am getting this error:


00:00:00    SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani
00:00:00    Command: 'ann'
00:00:00    Reading configuration file 'snpEff.config'. Genome: 'GRCm38.75'
00:00:00    Reading config file: /media/shayantanbanerjee/disk1/20_cancer_samples/dfata/snpEff/snpEff.config
00:00:00    done
00:00:00    Reading database for genome version 'GRCm38.75' from file '/media/shayantanbanerjee/disk1/20_cancer_samples/dfata/snpEff/./data/GRCm38.75/snpEffectPredictor.bin' (this might take a while)
00:00:09    done
00:00:09    Loading Motifs and PWMs
00:00:09    Building interval forest
00:00:12    done.
00:00:12    Genome stats :
#-----------------------------------------------
# Genome name                : 'Mus_musculus'
# Genome version             : 'GRCm38.75'
# Genome ID                  : 'GRCm38.75[0]'
# Has protein coding info    : true
# Has Tr. Support Level info : true
# Genes                      : 39179
# Protein coding genes       : 23232
#-----------------------------------------------
# Transcripts                : 94929
# Avg. transcripts per gene  : 2.42
# TSL transcripts            : 0
#-----------------------------------------------
# Checked transcripts        : 
#               AA sequences :  52148 ( 99.75% )
#              DNA sequences :  82932 ( 87.36% )
#-----------------------------------------------
# Protein coding transcripts : 52278
#              Length errors :   4872 ( 9.32% )
#  STOP codons in CDS errors :     40 ( 0.08% )
#         START codon errors :   4814 ( 9.21% )
#        STOP codon warnings :   3254 ( 6.22% )
#              UTR sequences :  48094 ( 50.66% )
#               Total Errors :   8962 ( 17.14% )
#-----------------------------------------------
# Cds                        : 432486
# Exons                      : 628052
# Exons with sequence        : 628052
# Exons without sequence     : 0
# Avg. exons per transcript  : 6.62
# WARNING!                   : Mitochondrion chromosome 'MT' does not have a mitochondrion codon table (codon table = 'Standard'). You should update the config file.
#-----------------------------------------------
# Number of chromosomes      : 80
# Chromosomes                : Format 'chromo_name size codon_table'
#       '1' 195471971   Standard
#       '2' 182113224   Standard
#       'MG4214_PATCH'  171031749   Standard
#       'X' 171031299   Standard
#       '3' 160039680   Standard
#       '4' 156508116   Standard
#       'MG4136_PATCH'  156508116   Standard
#       'MG4212_PATCH'  151862668   Standard
#       '5' 151834684   Standard
#       '6' 149736546   Standard
#       '7' 145441459   Standard
#       'MG4151_PATCH'  145439975   Standard
#       '10'    130694993   Standard
#       '8' 129401213   Standard
#       '14'    124902244   Standard
#       '9' 124595110   Standard
#       'MG132_PATCH'   124595110   Standard
#       '11'    122082543   Standard
#       'MG3829_PATCH'  122082543   Standard
#       '13'    120421639   Standard
#       'MG4180_PATCH'  120129530   Standard
#       '12'    120129022   Standard
#       '15'    104043685   Standard
#       'MG3833_PATCH'  98208653    Standard
#       '16'    98207768    Standard
#       '17'    94987271    Standard
#       'MG4222_MG3908_PATCH'   94987243    Standard
#       'MG4211_PATCH'  91797447    Standard
#       'MG4209_PATCH'  91793962    Standard
#       'Y' 91744698    Standard
#       'MG4213_PATCH'  91736668    Standard
#       'MG3835_PATCH'  90835696    Standard
#       '18'    90702639    Standard
#       '19'    61431566    Standard
#       'MG153_PATCH'   61431565    Standard
#       'JH584299.1'    953012  Standard
#       'GL456233.1'    336933  Standard
#       'JH584301.1'    259875  Standard
#       'GL456211.1'    241735  Standard
#       'GL456350.1'    227966  Standard
#       'JH584293.1'    207968  Standard
#       'GL456221.1'    206961  Standard
#       'JH584297.1'    205776  Standard
#       'JH584296.1'    199368  Standard
#       'GL456354.1'    195993  Standard
#       'JH584294.1'    191905  Standard
#       'JH584298.1'    184189  Standard
#       'JH584300.1'    182347  Standard
#       'GL456219.1'    175968  Standard
#       'GL456210.1'    169725  Standard
#       'JH584303.1'    158099  Standard
#       'JH584302.1'    155838  Standard
#       'GL456212.1'    153618  Standard
#       'JH584304.1'    114452  Standard
#       'GL456379.1'    72385   Standard
#       'GL456216.1'    66673   Standard
#       'GL456393.1'    55711   Standard
#       'GL456366.1'    47073   Standard
#       'GL456367.1'    42057   Standard
#       'GL456239.1'    40056   Standard
#       'GL456213.1'    39340   Standard
#       'GL456383.1'    38659   Standard
#       'GL456385.1'    35240   Standard
#       'GL456360.1'    31704   Standard
#       'GL456378.1'    31602   Standard
#       'GL456389.1'    28772   Standard
#       'GL456372.1'    28664   Standard
#       'GL456370.1'    26764   Standard
#       'GL456381.1'    25871   Standard
#       'GL456387.1'    24685   Standard
#       'GL456390.1'    24668   Standard
#       'GL456394.1'    24323   Standard
#       'GL456392.1'    23629   Standard
#       'GL456382.1'    23158   Standard
#       'GL456359.1'    22974   Standard
#       'GL456396.1'    21240   Standard
#       'GL456368.1'    20208   Standard
#       'MT'    16299   Standard
#       'JH584292.1'    14945   Standard
#       'JH584295.1'    1976    Standard
#-----------------------------------------------

00:00:14    Predicting variants
VcfFileIterator.parseVcfLine(132):  Fatal error reading file 'finalMerge.vcf' (line: 1):
�BC�4͝
         @�U����� 7AaDTn�0�g��"���xM�aC@@�j7ʴ{�����Vn��v�6�ֿ�\j�J�.[v�{�ٶV��<�}�y���3���lm���}���|�s��甕O����иq-m���e�^��)+s3u���M��i�l0�TW�-�f�W�z��ۺz�:;�ɖ�v-�T���G�e��5'�WW�3f��'{��z;;�{�����ٙ�L]aFnnnNAS3>���b�u4ٌp墬����e��.k{�ue�Ҭe�
java.lang.RuntimeException: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
�BC�4͝
         @�U����� 7AaDTn�0�g��"���xM�aC@@�j7ʴ{�����Vn��v�6�ֿ�\j�J�.[v�{�ٶV��<�}�y���3���lm���}���|�s��甕O����иq-m���e�^��)+s3u���M��i�l0�TW�-�f�W�z��ۺz�:;�ɖ�v-�T���G�e��5'�WW�3f��'{��z;;�{�����ٙ�L]aFnnnNAS3>���b�u4ٌp墬����e��.k{�ue�Ҭe�
    at org.snpeff.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:133)
    at org.snpeff.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:184)
    at org.snpeff.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:57)
    at org.snpeff.fileIterator.FileIterator.hasNext(FileIterator.java:123)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.annotateVcf(SnpEffCmdEff.java:467)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:142)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:1029)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:984)
    at org.snpeff.SnpEff.run(SnpEff.java:1183)
    at org.snpeff.SnpEff.main(SnpEff.java:162)
Caused by: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
�BC�4͝
         @�U����� 7AaDTn�0�g��"���xM�aC@@�j7ʴ{�����Vn��v�6�ֿ�\j�J�.[v�{�ٶV��<�}�y���3���lm���}���|�s��甕O����иq-m���e�^��)+s3u���M��i�l0�TW�-�f�W�z��ۺz�:;�ɖ�v-�T���G�e��5'�WW�3f��'{��z;;�{�����ٙ�L]aFnnnNAS3>���b�u4ٌp墬����e��.k{�ue�Ҭe�
    at org.snpeff.vcf.VcfEntry.parse(VcfEntry.java:1007)
    at org.snpeff.vcf.VcfEntry.<init>(VcfEntry.java:219)
    at org.snpeff.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:130)
    ... 9 more
00:00:14    Logging
00:00:15    Checking for updates...
00:00:16    Done.
sequencing next-gen

Hello,

Please post the first lines of your finalMerge.vcf. It doesn't look like a plain-text VCF file; there seems to be binary data in it (BCF?).
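One quick way to check this without opening the file in an editor is to look at its first bytes. The sketch below is only an illustration: the function name `sniff_variant_file` is made up for this example, and it relies on two facts, that gzip/BGZF data (which includes compressed BCF and bgzipped VCF) starts with the bytes 1f 8b, and that a plain-text VCF starts with `##fileformat=VCF`.

```shell
# Hypothetical helper (name invented for this example): report whether a
# file looks like gzip/BGZF-compressed data or a plain-text VCF.
sniff_variant_file() {
  # First two bytes as lowercase hex, e.g. "1f8b" for gzip/BGZF data.
  magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$magic" = "1f8b" ]; then
    echo "compressed"
  # A plain-text VCF begins with the 16 characters "##fileformat=VCF".
  elif [ "$(head -c 16 "$1" | tr -d '\0')" = "##fileformat=VCF" ]; then
    echo "text-vcf"
  else
    echo "unknown"
  fi
}

# Usage (file name taken from the question):
#   sniff_variant_file finalMerge.vcf
```

If this reports "compressed" for a file that is supposed to be plain text, the file extension is misleading and the file needs to be decoded before a text-only parser can read it.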

fin swimmer

Reply written 3 months ago by finswimmer

Thanks for replying. The first lines of the VCF file are illegible too, something like this: \8B\00\00\00\00\00\FF\00BC\00\D04͝
The command used to merge the vcfs was: bcftools merge -Ob -m -any s1.vcf.gz s2.vcf.gz ..... > merge.vcf

Reply written 3 months ago by banerjeeshayantan (modified 3 months ago by genomax)
Answered 3 months ago by finswimmer (Germany):
bcftools merge -Ob -m -any s1.vcf.gz s2.vcf.gz ..... > merge.vcf

With the -Ob option you create a compressed BCF file, which is a binary format. I guess SnpEff cannot read this file type.

You could use bcftools convert to get a VCF and stream the result directly to SnpEff, or just use -Ov in your merge command to write a plain (uncompressed) VCF file.
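As a rough sketch of the conversion route, the already merged file can be decoded to plain-text VCF without repeating the merge. This example uses `bcftools view -Ov`, which is a swapped-in alternative to the `bcftools convert` mentioned above; the wrapper name `bcf_to_vcf` and the output name `finalMerge.plain.vcf` are made up for illustration.

```shell
# Hypothetical wrapper (name invented for this example): decode a BCF or
# compressed VCF ($1) into an uncompressed, plain-text VCF ($2).
bcf_to_vcf() {
  # -Ov selects uncompressed VCF output; -o names the output file.
  bcftools view -Ov -o "$2" "$1"
}

# Usage with the file names from the question (output name is an assumption):
#   bcf_to_vcf finalMerge.vcf finalMerge.plain.vcf
#   java -Xmx4g -jar snpEff.jar -v -stats ex1.html GRCm38.75 finalMerge.plain.vcf > final.anno.vcf
```

Re-running the merge with -Ov instead of -Ob gives the same kind of file directly; the wrapper only saves that step when the merge has already been done.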

fin swimmer


Thanks a lot. That worked like magic. Guess I was going wrong on the compressed file part.

Reply written 3 months ago by banerjeeshayantan

Glad I could help.

Please mark my answer as accepted so everyone can see that your problem is solved.

fin swimmer

Reply written 3 months ago by finswimmer