Question: How to merge two haploid samples (vcf, or g.vcf) into a pseudo-diploid?
Pistachio (Canada/Toronto), 7 months ago, wrote:

Hi, I'm currently working with some bumblebee genomes. My data contain 10 drones (haploids), which I want to pair up into pseudo-diploids.

I have already tried GATK CombineGVCFs and bcftools merge, but both only give me a typical merged VCF with more individuals.

The GT field in the vcf currently contains a single number.

GT:AD:DP:GQ:PL  1:0,11:11:99:433,0

I want to take the data from two drones and make them appear as a single diploid, so that the GT field becomes something like 1/1 or 0/1.

Would appreciate any guidance or pointers.

---------------------Update-----------------------

[Solved!] In case anyone else runs into a similar situation in the future, this is what I did.

Step 1) Merge the BAM files in pairs using samtools:

samtools merge (merged_output_file).bam (input_file#1).bam (input_file#2).bam
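With 10 drones there are 5 pairs to merge, so a small loop can help. The following is only a sketch: the function name print_merge_commands and the file names drone1.bam, pair1.bam, etc. are hypothetical placeholders, and the pairs are simply formed in the order the files are listed. It prints the `samtools merge` commands rather than running them, so they can be checked first.

```shell
# Sketch: print one `samtools merge` command per consecutive pair of BAMs.
# All file names here are hypothetical placeholders.
print_merge_commands() {
    # Arguments: an even number of haploid BAM paths, paired in the order given.
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "need an even number of BAM files" >&2
        return 1
    fi
    n=1
    while [ $# -ge 2 ]; do
        echo "samtools merge pair${n}.bam $1 $2"
        shift 2
        n=$((n + 1))
    done
}

# Example: four drones give two pseudo-diploid pairs.
print_merge_commands drone1.bam drone2.bam drone3.bam drone4.bam
```

Piping the output to `sh` would execute the printed commands once they look right.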

Step 2) To make a g.vcf from each merged BAM, I first needed an index file for each one.

samtools index -b (merged_sample).bam

** -b makes .bai files

**The steps above produce a multi-sample BAM; if you check the header, it will have 2 RG tags. I needed to edit the BAM so that it looked like a single-sample BAM file, because I wanted to run HaplotypeCaller with -ERC GVCF later, which only works on a single sample. This leads to step 3.
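One way to check the RG tags is to count the @RG lines in the BAM header. This is a minimal sketch: count_read_groups is a hypothetical helper name, and the header in the demo is made up so the snippet runs without samtools. On a real file the input would come from `samtools view -H (merged_sample).bam`.

```shell
# Count @RG lines in SAM header text read from stdin.
count_read_groups() {
    # grep -c exits non-zero when it counts 0 matches, hence the "|| true".
    grep -c '^@RG' || true
}

# Demo with a made-up header holding two read groups.
# Real usage: samtools view -H merged.bam | count_read_groups
printf '@HD\tVN:1.6\n@RG\tID:drone1\tSM:drone1\n@RG\tID:drone2\tSM:drone2\n' | count_read_groups
```

A count above 1 after merging suggests the read groups still need to be replaced before running HaplotypeCaller in GVCF mode.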

Step 3) Replace the RG tags and add a single new one using Picard:

picard AddOrReplaceReadGroups I=(merged_sample).bam O=(outputfile).bam RGID=(new ID) RGLB=(new LB) RGPL=(new PL) RGPU=(new PU) RGSM=(new SM)

**not sure if I should've made the index after this step, will update if I hit a problem.
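On the indexing question: AddOrReplaceReadGroups writes a new BAM, so an index built for the earlier merged file does not belong to the new one, and HaplotypeCaller expects an index for the BAM it is given. A minimal sketch of a check, assuming the index sits next to the BAM as either (name).bam.bai or (name).bai; bam_needs_index is a hypothetical helper name:

```shell
# Return 0 (true) when the BAM has no index, or only indexes older than the BAM.
bam_needs_index() {
    bam=$1
    for idx in "${bam}.bai" "${bam%.bam}.bai"; do
        if [ -e "$idx" ] && [ ! "$bam" -nt "$idx" ]; then
            return 1   # found an index at least as new as the BAM
        fi
    done
    return 0
}

# Usage (file name is a placeholder):
#   if bam_needs_index pair1.rg.bam; then samtools index -b pair1.rg.bam; fi
```

Picard can also write the index itself if CREATE_INDEX=true is added to the AddOrReplaceReadGroups command.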

Step 4) Make a g.vcf using GATK HaplotypeCaller in -ERC GVCF mode, with the default ploidy of 2.
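A sketch of what the step 4 command could look like, assuming GATK4 command-line syntax. The file names reference.fasta, pair1.rg.bam and pair1.g.vcf.gz are placeholders, and run and call_gvcf are hypothetical helper names. Ploidy 2 is the default, so --sample-ploidy is written out only to make the intent explicit. By default the snippet just prints the command; setting DRY_RUN=0 would execute it.

```shell
# Print the command by default; set DRY_RUN=0 to actually run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

call_gvcf() {
    # $1 = reference FASTA, $2 = single-sample (pseudo-diploid) BAM, $3 = output g.vcf
    run gatk HaplotypeCaller -R "$1" -I "$2" -O "$3" -ERC GVCF --sample-ploidy 2
}

# Placeholder file names; replace with the real reference and re-tagged BAM.
call_gvcf reference.fasta pair1.rg.bam pair1.g.vcf.gz
```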


I'll try merging the BAMs and making new VCFs. Thank you!

Reply by Pistachio, 7 months ago
Alice (USA), 7 months ago, wrote:

If you have alignments, you can merge 2 BAMs and call genotypes as if the merged BAM were diploid. Otherwise I do not think there is a straightforward way to do this. You basically need to recalculate the entire VCF.


Solved, it worked! Thank you!

Reply by Pistachio, 7 months ago