How To Handle Reads Ending With Deletions In Gatk?
10.4 years ago
Luca Beltrame ▴ 240

Hello,

before asking my question, I should point out that I'm working with publicly available data that is not my own, in order to learn and establish a proper workflow for when real data arrives in the laboratory.

I'm dealing with some exome data[1] from an Ion Torrent 318 chip and I'm trying to run the GATK RealignerTargetCreator on it to perform recalibration later on. The problem is that some reads have a deletion at the end:

read ends with deletion. Cigar: 179S54M1D5M1I9M1D

GATK therefore cannot process them. How should I handle this case? Is the workflow I used (outlined below) to blame for this?
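To see how many reads are affected before deciding what to do with them, the CIGAR strings can be checked directly. The following is only a minimal sketch: the function names `parse_cigar` and `ends_with_deletion` are made up for illustration, and it assumes that "ends with deletion" means the last CIGAR operation is `D`, which may not match GATK's internal check exactly.

```python
import re

# A whole CIGAR string: one or more <length><operation> pairs (SAM operation codes).
CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")
# A single <length><operation> pair.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) tuples."""
    if not CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def ends_with_deletion(cigar):
    """Return True if the last CIGAR operation is a deletion (D)."""
    return parse_cigar(cigar)[-1][1] == "D"


# The CIGAR from the error message above.
print(ends_with_deletion("179S54M1D5M1I9M1D"))
```

Applied to the CIGAR column of the SAM file, this would give a count of the offending reads, which helps to judge whether they can simply be set aside.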

Steps I did:

First, QC: keep reads with a Phred score of at least 20 in 80% of the bases (Python script modeled on the FASTX-Toolkit).
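For reference, the filter is roughly equivalent to the sketch below. This is not the actual script: the function names are hypothetical, and it assumes Phred+33 quality encoding and plain four-line FASTQ records.

```python
def passes_quality(qual, min_phred=20, min_fraction=0.8, offset=33):
    """Return True if at least min_fraction of the bases have Phred >= min_phred.

    Assumes Phred+33 encoding (offset=33); change offset for other encodings.
    """
    if not qual:
        return False
    good = sum(1 for ch in qual if ord(ch) - offset >= min_phred)
    return good >= min_fraction * len(qual)


def filter_fastq(lines, **kwargs):
    """Yield (header, sequence, separator, quality) tuples for reads that pass.

    Assumes every record spans exactly four lines.
    """
    stripped = (line.rstrip("\n") for line in lines)
    # Passing the same iterator four times groups the lines into records of four.
    for header, seq, plus, qual in zip(*[stripped] * 4):
        if passes_quality(qual, **kwargs):
            yield header, seq, plus, qual
```

It would be used by passing an open FASTQ file handle to `filter_fastq` and writing the returned records back out, one field per line.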

Then, alignment with bwa bwasw (Ion Torrent reads can be up to 250 bp long):

bwa bwasw -t 8 hg19.fa C30-101.filtered.fastq > C30-101.sam

Followed by conversion to BAM, addition of read groups (RG), sorting, and indexing (pysamtools).

Then GATK was invoked as

 gatk -T RealignerTargetCreator -R hg19.fa -o input.bam.list -I C30-101_RG.bam

(gatk is a small wrapper that merely hides the java -Xmx -jar ... stuff.)
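To make clear what the wrapper does, here is a rough Python equivalent. It is only an illustration: the function name, the jar location `GenomeAnalysisTK.jar`, and the 4g heap size are assumptions, not the values actually used.

```python
import shlex


def build_gatk_command(gatk_args, jar="GenomeAnalysisTK.jar", max_heap="4g"):
    """Build the full java command line that the wrapper hides.

    jar and max_heap are placeholder values; adjust them to the local setup.
    """
    return ["java", f"-Xmx{max_heap}", "-jar", jar, *gatk_args]


cmd = build_gatk_command([
    "-T", "RealignerTargetCreator",
    "-R", "hg19.fa",
    "-o", "input.bam.list",
    "-I", "C30-101_RG.bam",
])
# Show the expanded command; subprocess.run(cmd, check=True) would execute it.
print(shlex.join(cmd))
```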

[1] http://lifetech-it.hosted.jivesoftware.com/docs/DOC-2659 (registration may be required)

gatk indel analysis alignment ion-torrent • 2.4k views

Replace "179S54M1D5M1I9M1D" with "179S54M1D5M1I9M1S" (last D to S). Sorry.
