Hi
I am using HAPCUT to build haplotypes from a sorted BAM file and a VCF file containing variants. When I run this command:
./extract_hairs --VCF variantfile.VCF --bam alignedreads.sorted.bam --maxIS 600 > fragment_matrix_file
I get this error:
[bam_header_read] EOF marker is absent
The same error also appeared earlier, when I sorted the BAM file with samtools. However, samtools still went on to produce a sorted BAM, and that is the file I used with HAPCUT.
Thanks for any advice on this problem.
HAPCUT is interesting. How long are the haplotypes it generates from paired-end sequencing data?
If you get this error, ask yourself whether the BAM file was created correctly. The message means the end-of-file block that every complete BAM should end with is missing, which usually points to a truncated file. When I see it, I go back to the SAM file and convert it to BAM again. Don't run these steps in parallel, since that can cause errors. After the conversion, sort the BAM file and index it again. Normally the problem does not come back after that.
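The steps above could be sketched roughly like this. It assumes samtools 1.3 or newer is on the PATH (for `sort -o` and `quickcheck`) and that the original SAM file is still available. The function name `rebuild_bam` and the file names are placeholders, and the final `quickcheck` step is an extra sanity check, not part of the advice above.

```shell
# Rebuild a BAM from the original SAM, one step at a time (no parallel runs).
# Usage (placeholder names): rebuild_bam alignedreads.sam alignedreads
rebuild_bam() {
    sam="$1"       # input SAM file
    prefix="$2"    # prefix for the output files

    # Each step runs only if the previous one succeeded.
    samtools view -b -o "$prefix.bam" "$sam" &&
    samtools sort -o "$prefix.sorted.bam" "$prefix.bam" &&
    samtools index "$prefix.sorted.bam" &&
    # Extra check: fails if the header is unreadable or the EOF block is missing.
    samtools quickcheck -v "$prefix.sorted.bam"
}
```

If the function returns successfully, the resulting `.sorted.bam` can be passed to extract_hairs again.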
A good pipeline converts the alignment stream to BAM right away; whether it is multithreaded or not does not matter. If the file is corrupt, the most likely cause is a memory shortage or a failure during the alignment.
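To see whether a BAM was cut short, one option is to look at its last 28 bytes, which in a complete file are the fixed BGZF end-of-file block described in the SAM/BAM specification. A minimal sketch using only `tail`, `od` and `tr`; the function name `has_bgzf_eof` is made up for illustration.

```shell
# The 28-byte BGZF EOF block from the SAM/BAM specification, as hex,
# written in 4-byte groups so it is easy to compare against the spec.
BGZF_EOF_HEX="1f8b0804""00000000""00ff0600""42430200""1b000300""00000000""00000000"

# Succeeds (exit status 0) if the file ends with the EOF block.
has_bgzf_eof() {
    # Dump the last 28 bytes as lowercase hex with no spaces or newlines.
    tail_hex=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$tail_hex" = "$BGZF_EOF_HEX" ]
}
```

For example, `has_bgzf_eof alignedreads.sorted.bam || echo "EOF marker is absent"` reports a file that was probably truncated while it was being written. This only checks the end of the file, so a pass does not prove the rest of the BAM is intact.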