Calculate GC content using GATK
9.0 years ago
illinois.ks ▴ 210

I am working with exome-seq data (actually, it is more like targeted sequencing data).

I have my own BED file, and I am trying to calculate the GC content of each interval in it using GATK GCContentByInterval.

However, I noticed that some of the intervals are missing from the output after running GCContentByInterval.

For example, I have 62308 intervals in my bed file.

But when I run

java -Xmx2000m -Djava.io.tmpdir=TEMP -jar xxxx/GenomeAnalysisTK.jar -T GCContentByInterval -L mybedfile.bed -R fastfile.fa -o gc.txt

My gc.txt file includes only 62181 intervals (lines) instead of 62308.

I am not sure where the missing 127 intervals went.

I googled it and found that when intervals are contiguous, GATK reports only one of them rather than both. However, that is not the case here.
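One way to double-check this is to merge the intervals outside GATK and count what is left. The sketch below is only an illustration under stated assumptions: it assumes GATK collapses overlapping (and possibly abutting) intervals given with -L before processing, which is what the search results suggested rather than something verified here. The function name `merge_bed_intervals` and its `merge_abutting` parameter are hypothetical, and the example intervals are made up. Exact duplicates count as overlapping, so they are collapsed as well.

```python
from collections import defaultdict


def merge_bed_intervals(lines, merge_abutting=True):
    """Merge overlapping (and optionally abutting) BED intervals per contig.

    BED coordinates are 0-based and half-open, so two intervals abut
    when the end of one equals the start of the next.
    """
    by_contig = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        by_contig[fields[0]].append((int(fields[1]), int(fields[2])))

    merged = []
    for contig, intervals in by_contig.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if merge_abutting:
                touches = start <= cur_end
            else:
                touches = start < cur_end
            if touches:
                # Extend the current interval instead of emitting a new one.
                cur_end = max(cur_end, end)
            else:
                merged.append((contig, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((contig, cur_start, cur_end))
    return merged


# Made-up intervals, for illustration only.
example = [
    "chr1\t100\t200",
    "chr1\t150\t250",  # overlaps the previous interval
    "chr1\t250\t300",  # abuts the merged interval
    "chr1\t400\t500",
    "chr2\t10\t20",
    "chr2\t10\t20",    # exact duplicate
]
merged = merge_bed_intervals(example)
print(len(example), "input intervals ->", len(merged), "after merging")
```

To run it on the real data, pass `open("mybedfile.bed")` instead of `example`. If the merged count comes out at 62181, overlapping, abutting, or duplicated intervals would explain the difference; if it stays at 62308, the cause lies elsewhere.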

Could somebody please let me know whether I am missing something, or whether this is a bug in GATK?

next-gen • 2.1k views