igblastn doesn't correctly end CDR3 sequence
6.0 years ago

I'm using igblastn 1.4.0 with TRAV and TRAJ databases from IMGT. It aligns the query sequences correctly (the FR3 region is correctly identified), but it fails to set the right bounds for CDR3.

This is one example from an aligned sequence to the germline:

<-------------FR3-IMGT------------->
S  A  Q  L  S  D  S  A  L  Y  Y  C  A  L  S  S  N  A  V  A  K  L  T  F  G  A  G
TCAGCGCAGCTGTCAGACTCTGCCCTGTACTACTGTGCTCTGAGTTCTAATGCCGTCGCTAAGCTCACATTCGGAGCAGGA

As you can see, igblastn correctly identifies the FR3 region up to the Cys amino acid. The CDR3 starts immediately afterwards, but igblastn tells me this:

CDR3-IMGT (germline)    37      45      9

That is, the reported CDR3 is only 9 bases long: GCTCTGAGT, or ALS. However, CDR3 should end just before either an F or a W amino acid (which is not itself included). In this case it would be F, with CDR3 being GCTCTGAGTTCTAATGCCGTCGCTAAGCTCACA.

How can I tell igblastn to do this? Is it configurable somehow?

Right now I pass the following arguments:

-germline_db_V mouse_f_orf_inframeP_TRAV_blast-edited.fasta
-germline_db_J mouse_f_orf_inframeP_TRAJ_blast-edited.fasta
-germline_db_D mouse_f_orf_inframeP_TRBD_blast-edited.fasta
-organism mouse
-domain_system imgt
-query myquery.fasta
-ig_seqtype TCR
-auxiliary_data optional_file/mouse_gl.aux
-show_translation -outfmt 3

I can't see anything wrong here. Any help would be appreciated.

5.8 years ago

The latest IgBlast binary release doesn't mark the CDR3 end (although the latest web version seems to). You can either look for the conserved [WF]GXG amino acid motif downstream of the CDR3 start yourself, or use this IgBlast wrapper.
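To illustrate the motif approach, here is a minimal Python sketch. It assumes you already have the 1-based CDR3 start position that igblastn reports (37 in the question) and that the query is in frame from that position. The names `translate`, `find_cdr3` and `J_MOTIF` are made up for this example and are not part of IgBlast. It also takes the first [WF]GXG match after the CDR3 start, which could in principle be a false hit inside a long CDR3, so treat the result as a starting point rather than a definitive answer.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Conserved J-region motif: F or W, then G, any residue, G.
J_MOTIF = re.compile(r"[FW]G.G")


def translate(nt):
    """Translate a nucleotide string in frame 1; unknown codons become 'X'."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


def find_cdr3(nt, cdr3_start):
    """Return (cdr3_nt, cdr3_aa), or None if no [WF]GXG motif is found.

    cdr3_start is the 1-based position of the first CDR3 base
    (hypothetical helper; assumes the sequence is in frame from there).
    """
    offset = cdr3_start - 1
    protein = translate(nt[offset:])
    match = J_MOTIF.search(protein)
    if match is None:
        return None
    end = offset + 3 * match.start()  # CDR3 stops right before the F/W codon
    return nt[offset:end], protein[:match.start()]


# Query sequence from the question; igblastn reported the CDR3 start at base 37.
query = ("TCAGCGCAGCTGTCAGACTCTGCCCTGTACTACTGTGCTCTGAGTTCTAATGCC"
         "GTCGCTAAGCTCACATTCGGAGCAGGA")
result = find_cdr3(query, 37)
print(result)
```

For the sequence in the question this recovers the 33-base CDR3 (ALSSNAVAKLT) that the asker expected, ending just before the F of FGAG.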