Hello everyone. I will try to explain what I need with an example.
We have a VCF file that gives the positions of some SNPs:
#CHROM  POS      ID         REF  ALT  QUAL  FILTER  INFO                               FORMAT       NA00001  NA00002  NA00003
20      14370    rs6054257  G    A    29    PASS    NS=3;DP=14;AF=0.5;DB;H2            GT:GQ:DP:HQ
20      17330    .          T    A    3     q10     NS=3;DP=11;AF=0.017                GT:GQ:DP:HQ
20      1110696  rs6040355  A    G,T  67    PASS    NS=2;DP=10;AF=0.333,0.667;AA=T;DB  GT:GQ:DP:HQ
20      1230237  .          T    .    47    PASS    NS=3;DP=13;AA=T                    GT:GQ:DP:HQ
We also have a BAM file with the alignments to the reference. The approximate depth is 20x and all reads are 454 reads. From this information, I want to extract the base that each read carries at each of these positions. I need output something like this:
Read Name  Position1  Position2  Position3  Position4
           14370      17330      1110696    1230237
Read1      G          T          A          T
Read2      A          A          G          .
Read3      A          A          G          .
Read4      G          A          A          T
Read5      A          T          T          .
Read6      A          T          T          .
Read7      G          A          A          T
I would like to build such a table across all reads for the specified positions. Is this possible with samtools mpileup or vcftools, or does anyone have a script that solves this problem? From this information I will be extracting the haplotype information for specific genotypes.
I did something similar a long time ago. Wouldn't the SFF tools from 454 have something to help?
I looked into them, but they were not much help. Can we do this with samtools or vcftools?
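For what it's worth, here is a minimal Python sketch of the core step: walking each read's CIGAR string to find the base it carries at a given reference position, then collecting one row per read. It is not a samtools or vcftools feature. It assumes the reads have already been pulled out of the BAM as (name, 1-based start, CIGAR, sequence) tuples, for example with a BAM parser such as pysam, and that all positions are on one chromosome. The function names `base_at` and `base_matrix` and the toy reads are made up for illustration. A "." means the read does not cover the position and "-" means the position falls in a deletion.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(ref_pos, read_start, cigar, seq):
    """Return the read base aligned to 1-based ref_pos.

    Returns '-' if ref_pos falls inside a deletion or skipped region,
    and '.' if the read does not cover ref_pos at all.
    """
    ref = read_start  # reference coordinate of the next reference-consuming op
    qry = 0           # 0-based index into seq
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= ref_pos < ref + length:
                return seq[qry + (ref_pos - ref)]
            ref += length
            qry += length
        elif op in "IS":           # consumes read only
            qry += length
        elif op in "DN":           # consumes reference only
            if ref <= ref_pos < ref + length:
                return "-"
            ref += length
        # H and P consume neither the reference nor the read
    return "."


def base_matrix(reads, positions):
    """Build rows of [read_name, base_at_pos1, base_at_pos2, ...].

    Reads that cover none of the positions are left out.
    """
    rows = []
    for name, start, cigar, seq in reads:
        bases = [base_at(p, start, cigar, seq) for p in positions]
        if any(b != "." for b in bases):
            rows.append([name] + bases)
    return rows


# Toy data (hypothetical reads, not taken from a real BAM file).
positions = [14370, 17330]
reads = [
    ("Read1", 14368, "5M", "ACGTA"),
    ("Read2", 14369, "2M1I3M", "TAGCCC"),
    ("Read3", 17328, "1M2D2M", "CTA"),
    ("Read4", 17329, "3S4M", "NNNGTAC"),
    ("Read5", 20000, "4M", "ACGT"),
]

table = base_matrix(reads, positions)
print("\t".join(["Read Name"] + [str(p) for p in positions]))
for row in table:
    print("\t".join(row))
```

The positions list would come from the POS column of the VCF. With 454 data, homopolymer indels are common, so it is probably worth deciding up front how to treat the "-" cases before calling haplotypes from the table.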