Question: Determine level of heterozygosity
max_19 wrote (12 months ago):

Hi all,

I am trying to determine the level of heterozygosity in my de novo assembled genome. To do this, I think I should measure the SNP frequency. So far I have called SNPs against my final assembly and written the result to a VCF file using the command below:

bcftools mpileup -Ou -f ../scaffolds.fa sorted.bam | bcftools call -Ou -mv | bcftools norm -Ou -f ../scaffolds.fa > file.vcf

This seems to work fine and I obtain a VCF file. However, when I view it with "less", the first few lines (the header) look normal, but everything after the column names is unreadable. For example:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sorted.bam ^@t^@^@^@^K^@^@^@^B^@^@^@^@^@^@^@^A^@^@^@<8F>j^XB^N^@^B^@^A^@^@^B^G^WT<87>TGTCGATT^@^Q^A^@^Q^B^Q^B^Q^C^U9<8E>c>^Q^D^Q ^Q^E^U-<F6><DD>6^Q

I am not sure whether this is normal, and I do not know how to use this file to get the overall SNP frequency or infer the level of heterozygosity. Any help is greatly appreciated.

Thank you.

Brice Sarver wrote (12 months ago):

Your -Ou options make every step, including the last one, write uncompressed BCF (binary VCF), which is why the records are unreadable in less. In your last call, bcftools norm, pass -Ov for an uncompressed VCF or -Oz for a compressed one. You can also specify the output file name with -o instead of redirecting stdout.
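As a minimal sketch of that change, the pipeline from the question can keep -Ou on the intermediate steps (BCF is fine inside a pipe) and switch only the final bcftools norm to -Ov with -o. The wrapper name call_to_vcf is made up for illustration; the file names in the usage line are the ones from the question.

```shell
# Hypothetical wrapper around the pipeline from the question.
# $1 = reference FASTA, $2 = sorted BAM, $3 = output VCF path
call_to_vcf() {
  # Intermediate steps stay as uncompressed BCF (-Ou), which is fast in a pipe.
  # Only the last step writes plain-text VCF (-Ov) to the file named by -o.
  bcftools mpileup -Ou -f "$1" "$2" \
    | bcftools call -Ou -mv \
    | bcftools norm -Ov -f "$1" -o "$3"
}

# Usage with the paths from the question:
# call_to_vcf ../scaffolds.fa sorted.bam file.vcf
```

For a compressed result, -Oz with an output name such as file.vcf.gz should work the same way. The file that was already written should also be viewable without re-running anything via bcftools view file.vcf | less, since bcftools detects the format from the file content rather than the extension.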

For more information, see the bcftools manual.
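On the second part of the question, one rough way to turn the text VCF into a heterozygosity figure is to count heterozygous SNP calls and divide by the number of bases considered. The sketch below assumes a single-sample, plain-text VCF (with GT as the first FORMAT field) and uses the total assembly length as the denominator, which is only an approximation because not every base is callable. The function names count_het_snps and het_rate are hypothetical.

```shell
# Count biallelic SNP records whose first sample has a heterozygous genotype.
# $1 = plain-text (uncompressed) VCF
count_het_snps() {
  awk -F'\t' '
    /^#/ { next }                                   # skip header lines
    $4 ~ /^[ACGTacgt]$/ && $5 ~ /^[ACGTacgt]$/ {    # single-base REF and ALT only
      split($10, f, ":")                            # GT is the first FORMAT field
      n = split(f[1], a, "[/|]")                    # unphased (/) or phased (|)
      if (n == 2 && a[1] != "." && a[2] != "." && a[1] != a[2]) het++
    }
    END { print het + 0 }
  ' "$1"
}

# Heterozygous SNPs per base.
# $1 = VCF, $2 = number of bases used as the denominator (assumed: assembly length)
het_rate() {
  het=$(count_het_snps "$1")
  awk -v h="$het" -v len="$2" 'BEGIN { printf "%.6f\n", h / len }'
}
```

Calling het_rate file.vcf followed by the assembly length in bp prints the fraction of sites called heterozygous. Filtering on QUAL or depth before counting would likely give a more reliable number than the raw calls.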
