I presume the simplest way is to write a Perl script, but I thought I would check that I'm not missing a filter option before going down that route, and for future use. I've got GATK SNP and indel output. I've filtered the indels to another VCF, but I want to filter only insertions of 20 bases or more to another VCF. Is there an option for this in vcftools or other post-GATK filtering software, e.g. SnpSift? I can't seem to see one.
It works, thanks, but I want a minimum length of 20, not a maximum (could use --maxIndelSize for this too). I'll look at the doc and post what I use for filtering indels of 20 and above. Changing the "<" around doesn't produce any results, even when I go down to 5, even though I know there are indels of 5 and above.
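// Keep a variant if any ALT allele is an indel longer than 20 bp.
function accept() {
    var lengths = variant.indelLengths;
    // indelLengths is null when the variant is not an indel
    if (lengths == null) return false;
    for (var i = 0; i < lengths.size(); i++) {
        var L = lengths.get(i);
        // positive = insertion, negative = deletion
        if (L > 20 || L < -20) return true;
    }
    return false;
}
accept();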
Thanks, I'll check it out in detail tomorrow. I saw your tool for extracting zero-coverage regions from a genome, and I was musing late last night on how I could use it. I'm trying to think of how to mask regions of high coverage (i.e. misplaced reads) so that on a second round of mapping the reads are mapped to more likely locations. Or could you use it to focus single-end remapping on these regions, by stripping the reads out of the high-coverage areas where the massive coverage reflects incorrect mapping?
Quick disclaimer: I'm new to bioinformatics, so maybe there is some other way to do this filter using other tools, but I found GATK to be the easiest and most straightforward.
getIndelLengths().0
returns the first element of the list, doesn't it? What happens if there is more than one ALT allele? I would be curious about this too, if someone has the answer.
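On the multi-ALT question, here is a minimal sketch in plain JavaScript, not the GATK or htsjdk code itself. It assumes that each entry in the indel-length list is simply the ALT allele length minus the REF allele length, one entry per ALT allele, which is my understanding of getIndelLengths() but worth checking against the docs. The helper names indelLengths and hasInsertionAtLeast are made up for illustration. Under that assumption, testing only element 0 ignores every ALT allele after the first:

```javascript
// Hypothetical helper: signed length difference for each ALT allele.
// Positive = insertion, negative = deletion, 0 = same length as REF.
// Assumes plain sequence alleles (no symbolic alleles such as <DEL> or *).
function indelLengths(ref, alts) {
  return alts.map(function (alt) { return alt.length - ref.length; });
}

// Hypothetical helper: true if ANY ALT allele is an insertion of >= minLen bases.
function hasInsertionAtLeast(ref, alts, minLen) {
  return indelLengths(ref, alts).some(function (len) { return len >= minLen; });
}

// A site with two ALT alleles: a 1-base insertion and a 25-base insertion.
var ref = "A";
var alts = ["AT", "A" + "G".repeat(25)];
var lengths = indelLengths(ref, alts);

console.log("per-allele lengths:", lengths.join(","));
console.log("first allele only >= 20:", lengths[0] >= 20);
console.log("any allele >= 20:", hasInsertionAtLeast(ref, alts, 20));
```

So a first-element test would drop this site even though its second allele is a long insertion. Whether that matters depends on whether your calls contain multi-allelic indel sites.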
Got it working for insertions only, longer than 19 and shorter than 1000 bases, with this: -select 'vc.getIndelLengths().0 > 19 && vc.getIndelLengths().0 < 1000'
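For anyone who ends up scripting this instead, the same range check can be sketched in plain JavaScript on the VCF text. This is only an illustration under stated assumptions: the function name filterInsertions and the demo records are made up, the input is assumed to be uncompressed tab-separated VCF with REF in column 4 and ALT in column 5, and unlike the -select expression above it looks at every ALT allele, not just the first.

```javascript
// Hypothetical helper: keep header lines, plus records where any ALT allele
// is an insertion with minLen <= inserted bases < maxLen.
function filterInsertions(vcfText, minLen, maxLen) {
  return vcfText.split("\n").filter(function (line) {
    if (line === "") return false;            // drop blank lines
    if (line.charAt(0) === "#") return true;  // keep all header lines
    var cols = line.split("\t");
    var ref = cols[3];
    return cols[4].split(",").some(function (alt) {
      // Skip missing (.), spanning-deletion (*) and symbolic (<...>) alleles.
      if (alt === "." || alt === "*" || alt.charAt(0) === "<") return false;
      var len = alt.length - ref.length;      // positive = insertion
      return len >= minLen && len < maxLen;
    });
  }).join("\n");
}

// Made-up demo records: a 1-base insertion, a 20-base insertion, a 30-base deletion.
var demo = [
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr1\t100\t.\tA\tAT\t50\tPASS\t.",
  "chr1\t200\t.\tA\tA" + "C".repeat(20) + "\t50\tPASS\t.",
  "chr1\t300\t.\tA" + "G".repeat(30) + "\tA\t50\tPASS\t."
].join("\n");

var kept = filterInsertions(demo, 20, 1000);
console.log(kept);
```

With these demo records only the header and the record at position 200 survive, since the bounds 20 and 1000 mirror the "> 19" and "< 1000" in the expression above.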