We are currently carrying out imputation using the 1000 Genomes data on samples genotyped on Affymetrix and Illumina platforms. However, Daniel MacArthur's recent Science paper points out that most LoF variant calls (about 59%) are likely to be false. He outlines a pipeline for filtering out false variants on page 37 of the supplement.
His paper used 185 genomes from the 1000 Genomes Project. There are now 1092 genomes available, so in theory imputation should be more accurate. My question is this: between the low-coverage release they used (2010_07) and the newest data available (2011_05_21), have any filters been implemented to remove SNVs that are likely artifacts? These would mostly be mapping/sequencing errors and functional annotation errors (see Figure S1). In other words, have any of the artifactual variants been removed, so that imputation quality will improve, or are the same problems still present in the current release?
Jorge answered a related question some time back here.