How do I get the lengths of different regions targeted by the Nextera expanded rapid capture exome kit for b37?
7.9 years ago
anp375 ▴ 180

I want the total lengths of regions that are exons, splicing, UTRs, and ncRNA. I can't distinguish them from the provided bed file. If there is a way using the bed file, please let me know. Thank you.

exome nextera lengths • 2.1k views
7.9 years ago
Carlos Borroto ★ 2.1k

You can use a GTF file with annotated intervals for the categories you want, for example the one from Ensembl. Use something like Unix grep to create one file per category you are interested in. Then use bedtools intersect to get the regions of your BED file that overlap each category's GTF file. Finally, use the procedure described here to find the total lengths.
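To make the logic of those steps concrete, here is a minimal pure-Python sketch of what the grep, bedtools intersect, and length-summing steps compute together. It is not the bedtools implementation. The function names and the tiny example records are hypothetical, and the feature labels (`exon`, `UTR`) are assumptions that depend on the GTF release you download. It also assumes both files use the same chromosome naming (for b37, `1` rather than `chr1`).

```python
from collections import defaultdict

def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def parse_bed(lines):
    """BED is 0-based, half-open: keep start and end as they are."""
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))
    return by_chrom

def parse_gtf(lines, feature):
    """GTF is 1-based, inclusive: subtract 1 from start to match BED."""
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[2] == feature:  # the step grep would perform
            by_chrom[fields[0]].append((int(fields[3]) - 1, int(fields[4])))
    return by_chrom

def overlap_length(targets, annotation):
    """Total bases of the target regions that fall inside the annotation."""
    total = 0
    for chrom, intervals in targets.items():
        a = merge(intervals)
        b = merge(annotation.get(chrom, []))
        i = j = 0
        while i < len(a) and j < len(b):
            lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
            if lo < hi:
                total += hi - lo
            # advance whichever interval ends first
            if a[i][1] < b[j][1]:
                i += 1
            else:
                j += 1
    return total

# Hypothetical example records, not taken from any real kit or annotation.
bed_lines = ["1\t100\t200", "1\t150\t300"]
gtf_lines = [
    '1\tsrc\texon\t151\t250\t.\t+\t.\tgene_id "g1";',
    '1\tsrc\tUTR\t261\t280\t.\t+\t.\tgene_id "g1";',
]
targets = parse_bed(bed_lines)
exon_bp = overlap_length(targets, parse_gtf(gtf_lines, "exon"))
utr_bp = overlap_length(targets, parse_gtf(gtf_lines, "UTR"))
print(exon_bp, utr_bp)
```

Merging before summing matters because overlapping transcripts annotate the same bases repeatedly; without it, the totals would count those bases more than once.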

Also, if you are only interested in the ratio for each category, you could simplify this by using bedtools jaccard. In that case you would only need your BED file and the per-category GTF files.
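As a rough illustration of what a Jaccard-style ratio measures, the sketch below computes intersection over union in base pairs for two interval sets on a single chromosome. The helper names and the example intervals are hypothetical, and this is not the bedtools code. Note that intersection over union is not the same number as the fraction of target bases that fall in a category, so the sketch prints both.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def covered(intervals):
    """Number of distinct bases covered by the intervals."""
    return sum(end - start for start, end in merge(intervals))

def jaccard(a, b):
    """Intersection over union, in base pairs, of two interval sets."""
    union = covered(a + b)
    intersection = covered(a) + covered(b) - union
    return intersection / union if union else 0.0

# Hypothetical intervals on one chromosome (0-based, half-open).
targets = [(100, 300)]
category = [(150, 250), (260, 280)]

ratio = jaccard(targets, category)
# Fraction of target bases inside the category, for comparison.
shared = covered(targets) + covered(category) - covered(targets + category)
fraction_of_targets = shared / covered(targets)
print(ratio, fraction_of_targets)
```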

Thank you, this helped a lot.

