Illumina WES intervals file for coverage analysis?
4.3 years ago
vctrm67 ▴ 50

Does anyone know where to get intervals files for Illumina WES? I was only able to find this.

For reference, I want to run coverage/depth analysis for one of the BAM files that I have. I tried looking into what preparation went into sequencing based on the BAM header but only got ID:DS-bkm-112-N_L001_RG SM:DS-bkm-112-N_L001 LB:lib_name PL:illumina.

I think I'm able to run coverage analysis without the intervals file but it would be whole genome analysis, if I'm not mistaken. Don't think this would be very accurate but better than nothing?

illumina • 3.0k views

That BAM header does not tell you what WES kit was used. It just says that the sequencing platform was Illumina. Illumina makes their own WES kits, but there are many others.


I am aware, but was hoping I was wrong...so the best I can do is just run a whole-genome based coverage like GATK's DepthOfCoverage?


Whole-genome coverage probably would not make sense. You are only expecting to capture about 2% of the genome with WES. However, you can use all known exons as your intervals. That would not be perfect, but it would be much closer to the truth.
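One way to build such an "all known exons" interval file is to pull the exon records out of a gene annotation GTF and merge overlapping ones into a BED file. The following is only a minimal sketch: the function name `gtf_exons_to_bed` is made up for illustration, and it assumes a standard 9-column, tab-separated GTF (1-based, inclusive coordinates) on stdin whose genome build and chromosome naming match the BAM.

```shell
# Sketch (hypothetical helper): GTF on stdin -> merged exon BED on stdout.
# Assumes the standard tab-separated GTF layout: column 3 = feature type,
# columns 4 and 5 = 1-based inclusive start and end.
gtf_exons_to_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       # Keep exon records only; convert to 0-based, half-open BED coordinates.
       !/^#/ && $3 == "exon" { print $1, $4 - 1, $5 }' |
    sort -k1,1 -k2,2n |
    awk 'BEGIN { FS = OFS = "\t" }
         # New chromosome, or a gap after the current interval: flush and restart.
         NR == 1 || $1 != chrom || $2 + 0 > cur_end {
           if (NR > 1) print chrom, cur_start, cur_end
           chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0
           next
         }
         # Overlapping or touching exon: extend the current interval if needed.
         $3 + 0 > cur_end { cur_end = $3 + 0 }
         END { if (NR > 0) print chrom, cur_start, cur_end }'
}
```

It could be used as `gtf_exons_to_bed < annotation.gtf > exons.bed`, where `annotation.gtf` is a placeholder for whichever annotation release matches the reference the BAM was aligned to.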


Do you know where I could get such an interval file?

Also, if I did whole-genome coverage, wouldn't that just be the same as WES but with many more blank intervals? The aggregated interval statistics would probably be off, but for the relevant intervals, would they be more or less accurate?


You should find out from the originators of the data which kit was used. Or did you download this data from SRA?


No, I received this through many different people, and it is difficult to contact the originators.


You would still have to define "relevant intervals" somehow.


Wouldn't I just look for intervals that have coverage? Wouldn't those be the "relevant intervals"?


If you look for regions that have coverage, they will be covered.

Usually, the main reason to check for coverage is to see how many regions of interest are covered.
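To make that concrete, here is a rough sketch of the kind of summary meant: given a predefined target list with a mean depth per target, count how many targets reach a minimum depth. The function name `fraction_targets_covered` and the input layout (tab-separated chrom, start, end, mean depth) are assumptions for illustration, not the output format of any particular tool.

```shell
# Sketch (hypothetical helper): report how many targets reach a minimum mean depth.
# Usage: fraction_targets_covered MIN_DEPTH < per_target_depth.tsv
# Assumed input columns (tab-separated): chrom, start, end, mean_depth.
fraction_targets_covered() {
  awk -v min="$1" 'BEGIN { FS = "\t" }
       # Count every target, and separately those at or above the threshold.
       { total++; if ($4 + 0 >= min + 0) covered++ }
       END {
         if (total == 0) exit 1   # no targets on stdin
         printf "%d/%d targets (%.1f%%) at >= %sx\n", covered, total, 100 * covered / total, min
       }'
}
```

Because the denominator is the full target list, targets with zero depth count against the sample. If the intervals were instead chosen because they already have reads, the same calculation would report close to 100% regardless of how well the capture worked.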

4.3 years ago
trausch ★ 1.9k

ExAC published an interval list of callable regions for WES. In our experience, this interval list is well covered by all exome capture kits (Illumina, Agilent, ...) and is thus good for QC and coverage analysis. Coordinates are hg19/GRCh37. Plenty of tools can then calculate the on-target rate and the fraction of targets above a certain coverage level, for instance alfred:

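# Download the exome calling regions (Picard-style interval_list, hg19/GRCh37)
wget https://storage.googleapis.com/gnomad-public/intervals/exome_calling_regions.v1.interval_list
# Drop header lines and MT, add the "chr" prefix, and convert the 1-based interval_list starts to 0-based BED starts
grep -v "^@" exome_calling_regions.v1.interval_list | grep -v "^MT" | awk 'BEGIN { FS = OFS = "\t" } { print "chr" $1, $2 - 1, $3, $5 }' | gzip -c > targets.bed.gz
# Run alfred QC on the BAM against these targets
alfred qc -b targets.bed.gz -r hg19.fa -j out.json.gz -o out.tsv.gz input.bam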

The output JSON file can be browsed interactively here.
