I have FASTQ files, where each file is the sequence of a distinct haploid individual. I need to run these through GATK as though they were diploids, in order to use software that takes only a VCF with diploid samples as input. This post: How to merge two haploid samples (vcf, or g.vcf) into a pseudo-diploid? suggests merging BAM files, followed by some post-merge processing, but couldn't I just merge pairs of my FASTQ files to get two haplotypes into one file? E.g.:
> cat read1_indv1.fq.gz read1_indv2.fq.gz > read1_combined.fq.gz
> cat read2_indv1.fq.gz read2_indv2.fq.gz > read2_combined.fq.gz
Then do all the SAM/BAM and GATK processing on the combined files afterwards?
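
To make the concatenation step concrete, here is a minimal sketch, assuming paired-end gzipped FASTQ files with four lines per record. `combine_fastq` is a hypothetical helper name, not part of any tool. It relies on the fact that concatenated gzip files form a valid gzip stream, and it prints the record count so the read1 and read2 outputs can be compared:

```shell
# Hypothetical helper (not from any existing tool): concatenate two gzipped
# FASTQ files into one and print the number of records in the result.
#   $1, $2: input .fq.gz files; $3: output .fq.gz file
combine_fastq() {
    # Concatenated gzip members decompress as a single stream.
    cat "$1" "$2" > "$3"
    # Assumes standard 4-line FASTQ records: records = lines / 4.
    echo $(( $(gzip -dc "$3" | wc -l) / 4 ))
}
```

Calling it once for the read1 files and once for the read2 files, with the individuals in the same order both times, should give equal counts; if the two numbers differ, the mates are out of sync and a paired-end aligner will likely complain.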