Merge BAM With VCF Files: Tips About The Correct Workflow
11.4 years ago

Dear All, I am very new to the analysis of NGS data.

I would like to merge the data for HGDP sample 1029 (http://cdna.eva.mpg.de/denisova/VCF/human/HGDP01029.hg19_1000g.12.mod.vcf.gz) with the San sample from Schuster et al. 2010 (ftp://ftp.bx.psu.edu/data/bushman/hg18/bam/KB1illumChr12.bam).

If I understood correctly, I should call variants from the BAM file and then merge the result with the VCF. Is that correct?

Could you kindly suggest the best way to do this, in your opinion? At what point should I convert my files to the same reference sequence?

I am really sorry if I am saying something completely wrong, but I have only just started working with this kind of data.


I'm not sure what you are planning to do, but Denisova and Bushman SNPs for hg18/hg19 are in Kaviar: http://db.systemsbiology.net/kaviar/cgi-pub/Kaviar2.pl?show=sources


Thank you for your help, but I don't need the Denisova genome; I need HGDP01029 (the Mandenka genome) and the San genome. Unfortunately, the data are in two different formats and I don't know how to compare them. Do you have any suggestions?


Well, you have two options: 1) map the San genome reads to hg19 and then call variants with samtools or GATK, or 2) call variants using the hg18 BAM and then convert the coordinates to hg19 with UCSC LiftOver. After that you can merge with VCFtools.
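For option 2, one practical detail is that VCF positions are 1-based while the BED intervals that liftOver reads are 0-based and half-open. Below is a minimal sketch of that conversion, assuming tab-separated VCF lines; the function names and the `chrom:pos:ref:alt` name column are illustrative choices, not part of any tool. The lifted BED could then be joined back to the hg18 calls via that name column.

```python
def vcf_line_to_bed(line):
    """Convert one VCF data line to a BED-style tuple (chrom, start, end, name)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _vid, ref, alt = fields[:5]
    start = int(pos) - 1       # VCF POS is 1-based, BED start is 0-based
    end = start + len(ref)     # BED end is exclusive and covers the REF allele
    # Hypothetical key: keeps the original hg18 record identity so the
    # lifted interval can be matched back to its genotype afterwards.
    name = f"{chrom}:{pos}:{ref}:{alt}"
    return chrom, start, end, name


def vcf_to_bed(lines):
    """Convert all data lines of a VCF, skipping headers and blank lines."""
    return [
        vcf_line_to_bed(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    ]


# The resulting intervals, written as tab-separated text, would be the input to
# something like (file names are assumptions):
#   liftOver calls.hg18.bed hg18ToHg19.over.chain.gz calls.hg19.bed unmapped.bed
```

Note that lifting coordinates does not check whether the reference allele is the same in hg18 and hg19, so sites where the two assemblies differ would still need attention.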


Thank you, I will try it today and hope that it works. I will call variants using samtools mpileup, but I have no idea how to map the San genome. Do you have any suggestions?

Thank you again.

11.4 years ago
Marvin ▴ 890

You should probably align the Schuster data to hg19 (the MPG group used the same genome as the 1000 Genomes Project), probably using BWA. Then call genotypes, using either "samtools mpileup" or GATK. Then merge the files, maybe using vcf-merge from VCFtools. In my experience, VCFtools is so buggy as to be useless, so you'll probably end up writing your own merging code for the two VCF files.
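If you do end up writing your own merging code, a minimal sketch might look like the following. It assumes two single-sample VCFs on the same reference with a GT entry in the FORMAT column, and it matches records on (chrom, pos, ref, alt); the function names are made up for illustration. A site missing from one file is reported as "./." because, in a variants-only VCF, absence can mean either homozygous reference or no coverage.

```python
def read_vcf(lines):
    """Return (sample_name, {(chrom, pos, ref, alt): genotype}) for a single-sample VCF."""
    sample = None
    calls = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample = fields[9]  # first (and only) sample column
            continue
        chrom, pos, _vid, ref, alt = fields[:5]
        format_keys = fields[8].split(":")
        genotype = fields[9].split(":")[format_keys.index("GT")]
        calls[(chrom, int(pos), ref, alt)] = genotype
    return sample, calls


def merge_calls(vcf_a, vcf_b):
    """Merge two single-sample VCFs into rows of (chrom, pos, ref, alt, gt_a, gt_b)."""
    name_a, calls_a = read_vcf(vcf_a)
    name_b, calls_b = read_vcf(vcf_b)
    rows = []
    for key in sorted(set(calls_a) | set(calls_b)):
        # "./." marks a site that is absent from one of the two files
        rows.append(key + (calls_a.get(key, "./."), calls_b.get(key, "./.")))
    return (name_a, name_b), rows
```

For gzipped inputs, the lines could come from `gzip.open(path, "rt")`. This sketch ignores multi-allelic splitting, indel normalization and differing chromosome names ("12" versus "chr12"), all of which would need handling on real data.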


But if I understood correctly, BWA accepts only FASTQ input, and my Schuster data is in BAM format. Is that correct? I am really sorry about this, I am feeling "alone in the dark"... I absolutely need to attend a course on NGS.

Thank you again


The man page for BWA is here: http://bio-bwa.sourceforge.net/bwa.shtml
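One common way around the FASTQ question is to extract the reads from the BAM first and then realign them. Below is a rough sketch of that sequence as a Python wrapper, assuming single-end reads, a reasonably recent samtools (with the `fastq` subcommand) and `bwa mem`; the file names and the helper functions are placeholders, and read groups and paired-end handling are left out.

```python
import subprocess


def remap_commands(bam, ref_fasta, prefix):
    """Build (argv, stdout_path) pairs for a BAM -> FASTQ -> hg19 BAM remapping."""
    fastq = f"{prefix}.fq"
    sam = f"{prefix}.hg19.sam"
    sorted_bam = f"{prefix}.hg19.sorted.bam"
    return [
        (["samtools", "fastq", bam], fastq),                 # reads back to FASTQ
        (["bwa", "index", ref_fasta], None),                 # index the hg19 FASTA once
        (["bwa", "mem", ref_fasta, fastq], sam),             # align; SAM goes to stdout
        (["samtools", "sort", "-o", sorted_bam, sam], None), # coordinate-sort into BAM
        (["samtools", "index", sorted_bam], None),           # index for variant calling
    ]


def run(commands):
    """Run each command in order, redirecting stdout to a file where one is given."""
    for argv, stdout_path in commands:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, stdout=out, check=True)
```

Calling `run(remap_commands("KB1illumChr12.bam", "hg19.fa", "KB1"))` would execute the steps, provided both tools are installed and on the PATH; the sorted BAM is then the input for variant calling.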
