Question: Determine level of heterozygosity
max_19 wrote (4 months ago):

Hi all,

I am trying to determine the level of heterozygosity in my de novo assembled genome. To do this, I think I should measure the SNP frequency. So far, I have called SNPs against my final assembly and written the result to a VCF file with the command below:

bcftools mpileup -Ou -f ../scaffolds.fa sorted.bam | bcftools call -Ou -mv | bcftools norm -Ou -f ../scaffolds.fa > file.vcf

This seems to work and I obtain a VCF file. However, when I view it with "less", the header lines look normal, but everything after the column names is unreadable. For example:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sorted.bam ^@t^@^@^@^K^@^@^@^B^@^@^@^@^@^@^@^A^@^@^@<8F>j^XB^N^@^B^@^A^@^@^B^G^WT<87>TGTCGATT^@^Q^A^@^Q^B^Q^B^Q^C^U9<8E>c>^Q^D^Q ^Q^E^U-<F6><DD>6^Q

I am not sure whether this is normal, and I am unsure how to use this file to get the overall SNP frequency or infer the level of heterozygosity. Any help is greatly appreciated.

Thank you.

Brice Sarver (United States) wrote (4 months ago):

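The -Ou option tells bcftools to write uncompressed BCF (binary VCF), which is why the records are unreadable in less. In your last call, the one to bcftools norm, pass -Ov for an uncompressed VCF or -Oz for a compressed one. Keeping -Ou on the mpileup and call steps is fine, since uncompressed BCF is the efficient format for piping between bcftools commands. You can also specify an output file name with -o instead of redirecting stdout.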
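As a rough sketch of that change, the pipeline from the question could be wrapped as below. The function name call_variants and its argument order are made up for illustration. The only differences from the original command are -Ov and -o on the final bcftools norm step.

```shell
# Hypothetical wrapper around the pipeline from the question.
# Arguments: reference FASTA, sorted BAM, output VCF path.
call_variants() {
    ref=$1
    bam=$2
    out=$3
    # Intermediate steps keep -Ou (uncompressed BCF) for fast piping;
    # the last step writes plain-text VCF (-Ov) to the file given by -o.
    bcftools mpileup -Ou -f "$ref" "$bam" \
        | bcftools call -Ou -mv \
        | bcftools norm -Ov -f "$ref" -o "$out"
}

# Example call with the file names from the question:
# call_variants ../scaffolds.fa sorted.bam file.vcf
```

Swapping -Ov for -Oz (and naming the output file.vcf.gz) should give a compressed VCF instead, which tools such as zless can read.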

For more info, see the bcftools manual.
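Once the VCF is plain text, one simple way to approach the SNP frequency part of the question is to count heterozygous SNP calls and divide by the assembly length. The sketch below is only that, a sketch: the function names genome_length and het_snp_rate are invented here, it assumes a single-sample, uncompressed VCF with GT as the first FORMAT field, and it applies no quality or depth filtering. Using total assembly length as the denominator is also an assumption; the number of callable sites would arguably be a better one.

```shell
# Hypothetical helper: total number of bases in a FASTA file
# (sums the lengths of all non-header lines).
genome_length() {
    awk '!/^>/ { n += length($0) } END { print n + 0 }' "$1"
}

# Hypothetical helper: count heterozygous SNP records in a
# single-sample, plain-text VCF and divide by a genome length.
# Prints: <het SNP count><TAB><het SNPs per base>
het_snp_rate() {
    vcf=$1
    len=$2
    awk -F'\t' -v len="$len" '
        /^#/ { next }                              # skip header lines
        length($4) != 1 { next }                   # REF must be one base
        $5 !~ /^[ACGTN](,[ACGTN])*$/ { next }      # ALT alleles must be single bases
        {
            split($10, fmt, ":")                   # GT is the first subfield of the sample column
            n = split(fmt[1], allele, "[/|]")      # unphased (/) or phased (|) genotype
            if (n == 2 && allele[1] != "." && allele[2] != "." && allele[1] != allele[2])
                het++
        }
        END { printf "%d\t%.6f\n", het, (len > 0 ? het / len : 0) }
    ' "$vcf"
}

# Example call with the file names from the question:
# het_snp_rate file.vcf "$(genome_length ../scaffolds.fa)"
```

Treat the result as a first approximation. In practice you would probably want to filter on QUAL and depth first, since low-quality calls and collapsed repeats can inflate the heterozygous count.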
