Which of the 4 SureSelect Agilent BED files to use with GATK haplotype caller?
4.3 years ago
curious ▴ 750

I have some BAMs from whole exome sequencing.

I want to run GATK HaplotypeCaller and give it one BED file as the interval input.

SureSelect kit for the BAMs comes with 4 different .bed files:

*_Covered.bed

*_AllTracks.bed

*_Padded.bed

*_Regions.bed

Googling shows this question has been asked multiple times: What Agilent Interval Files (.Bed) Should I Use For Exome Variant Calling With Gatk?

I still don't know, but my gut instinct is to use the *_Padded.bed file because, according to Agilent, it shows:

"the genomic regions that you can expect to sequence when using the design for target enrichment. To determine these regions, the program extends the regions in the Covered BED file by 100 bp on each side."

https://earray.chem.agilent.com/suredesign/help/Target_enrichment_design_files_available_for_download.htm

Has anyone done this before and knows which file to use?

gatk variant calling sureselect • 3.2k views

Just my 2 cents: I'm using the *_Padded file to subset my VCF file. Be aware that the regions can overlap.
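
Since the regions can overlap, one option is to merge overlapping intervals before using the file, so that no position is covered twice. A minimal Python sketch, assuming a plain tab-separated BED with at least three columns; `merge_bed_intervals` is a hypothetical helper name, not part of GATK or Agilent's tooling:

```python
from collections import defaultdict


def merge_bed_intervals(lines):
    """Merge overlapping or directly adjacent BED intervals per contig.

    `lines` is an iterable of BED text lines (0-based, half-open).
    'track'/'browser' header lines, comments, and blank lines are skipped.
    """
    by_contig = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        contig, start, end = line.split("\t")[:3]
        by_contig[contig].append((int(start), int(end)))

    merged = []
    for contig, intervals in by_contig.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:  # overlaps or touches the current interval
                cur_end = max(cur_end, end)
            else:
                merged.append((contig, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((contig, cur_start, cur_end))
    return merged
```

On a coordinate-sorted file, `bedtools merge -i sorted.bed` should give the same result without any custom code.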


I wonder if that even matters for my application. As far as I can tell, the BED file is supplied as an argument to GATK HaplotypeCaller just to cut down on search time by pointing it to specific intervals. I hate making assumptions though, so I'll be on the lookout.
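
For reference, GATK4 takes intervals through `-L` (`--intervals`), and a file with a `.bed` extension is read as BED. A minimal sketch that only assembles the command line without running it; the file names are placeholders and `build_haplotypecaller_cmd` is a hypothetical helper:

```python
import shlex


def build_haplotypecaller_cmd(reference, bam, intervals, output, padding=0):
    """Return the argument list for a GATK4 HaplotypeCaller run restricted to intervals."""
    cmd = [
        "gatk", "HaplotypeCaller",
        "-R", reference,
        "-I", bam,
        "-L", intervals,
        "-O", output,
    ]
    if padding > 0:
        # --interval-padding extends every interval by this many bases on each side
        cmd += ["--interval-padding", str(padding)]
    return cmd


# Placeholder file names, for illustration only
cmd = build_haplotypecaller_cmd("ref.fasta", "sample.bam", "targets.bed", "sample.vcf.gz")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` to actually run it. Going by the Agilent description quoted in the question, the *_Covered.bed file with `padding=100` should approximate the *_Padded.bed file, but that equivalence is an untested assumption.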


As an aside, are you sure a BED file would even work? I recall running into an issue a few years ago where GATK needed an interval_list file, which was similar but not identical to the BED format.
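
The main difference between the two is the coordinate system: BED is 0-based and half-open, while a Picard-style interval_list is 1-based and inclusive and starts with a SAM-style header. A minimal sketch of the coordinate conversion only, assuming a tab-separated BED line with at least three columns; `bed_to_one_based` is a hypothetical helper. For real files, Picard's BedToIntervalList (bundled with GATK4) is probably the safer route, since it also writes the header:

```python
def bed_to_one_based(bed_line):
    """Convert one BED line (0-based, half-open) to a 1-based inclusive 'contig:start-end' string."""
    contig, start, end = bed_line.rstrip("\n").split("\t")[:3]
    # Only the start shifts by one; the half-open BED end equals the inclusive 1-based end.
    return f"{contig}:{int(start) + 1}-{int(end)}"


print(bed_to_one_based("chr1\t99\t200"))
```

So the BED interval `chr1 99 200` covers the same 101 bases as `chr1:100-200`.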


Well, not anymore! I'll take a look, thanks again. GATK is an amazing resource, kind of complicated though.


I had the same issue. Unfortunately, the BED files from this company have coordinates that do not match the reference files available online. Liftover is needed because these files are usually built for an older version of the genome, AKA hg38 vs 39 etc.
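
A quick way to spot a build or naming mismatch before running anything is to compare the BED against the contig lengths in the reference's `.fai` index (contig name in column 1, length in column 2). A minimal sketch, assuming both inputs are given as lists of text lines; `find_bed_mismatches` is a hypothetical helper. Passing this check does not prove the builds match, since intervals from another build can still fall inside the contig bounds:

```python
def find_bed_mismatches(bed_lines, fai_lines):
    """Return BED intervals whose contig is missing from, or extends past, the reference index."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])

    problems = []
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        contig, start, end = line.split("\t")[:3]
        if contig not in lengths:
            problems.append((contig, int(start), int(end), "unknown contig"))
        elif int(end) > lengths[contig]:
            problems.append((contig, int(start), int(end), "past contig end"))
    return problems
```

If every line comes back as "unknown contig", the cause is usually a naming difference such as `1` vs `chr1` rather than a different build.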


vs 39

Do you mean 19? There is no 39


I think you should use the *_Regions.bed file


curious, hi! Could I ask which argument you used in the HaplotypeCaller command to include the .bed file?
