igblastn doesn't correctly end CDR3 sequence
6.0 years ago

I'm using igblastn 1.4.0 with TRAV and TRAJ databases from IMGT. It aligns the query sequences correctly (the FR3 region is identified properly), but it fails to set the right bounds for CDR3.

Here is one example, the translation of a query sequence aligned to the germline:

 S  A  Q  L  S  D  S  A  L  Y  Y  C  A  L  S  S  N  A  V  A  K  L  T  F  G  A  G

As you can see, igblastn correctly identifies the FR3 region up to the Cys residue. CDR3 starts immediately afterwards, but igblastn reports this:

CDR3-IMGT (germline)    37      45      9

That is, the reported CDR3 is only 9 bases long: GCTCTGAGT, which translates to ALS. However, CDR3 should end just before an F or a W residue (the F/W itself is not included). In this case it would be F, with CDR3 being GCTCTGAGTTCTAATGCCGTCGCTAAGCTCACA.
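As a sanity check on the two sequences above, here is a minimal sketch that translates them with the standard genetic code. The `translate` helper and the table names are my own illustrative names, not part of igblastn, and the code assumes each sequence starts in frame.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    nt_seq = nt_seq.upper()
    usable = len(nt_seq) - len(nt_seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(nt_seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


reported = "GCTCTGAGT"                          # positions 37-45 from igblastn
expected = "GCTCTGAGTTCTAATGCCGTCGCTAAGCTCACA"  # what I think CDR3 should be

print(translate(reported))  # ALS
print(translate(expected))  # ALSSNAVAKLT
```

The second translation matches the residues between the Cys and the F in the alignment shown above.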

How can I tell igblastn to do this? Is it configurable somehow?

Right now I pass the following arguments:

-germline_db_V mouse_f_orf_inframeP_TRAV_blast-edited.fasta
-germline_db_J mouse_f_orf_inframeP_TRAJ_blast-edited.fasta
-germline_db_D mouse_f_orf_inframeP_TRBD_blast-edited.fasta
-organism mouse
-domain_system imgt
-query myquery.fasta
-ig_seqtype TCR
-auxiliary_data optional_file/mouse_gl.aux
-show_translation -outfmt 3

I can't see anything wrong here. Any help would be appreciated.

5.8 years ago

The latest IgBlast binary release doesn't mark the end of CDR3 (although the latest web version seems to). You can either look for the conserved [WF]GXG amino acid motif, which marks the start of FR4, or use this IgBlast wrapper.
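For the motif approach, a minimal sketch in Python might look like the following. It assumes you already have the translated query and the 0-based index of the conserved Cys that closes FR3 (for example, derived from the FR3 end coordinate that igblastn does report). The function name `find_cdr3` is hypothetical, not part of IgBlast, and the nucleotide conversion assumes the translation starts in frame at nucleotide 1.

```python
import re

# Conserved J-region motif: F or W, then G, any residue, G.
FR4_MOTIF = re.compile(r"[FW]G.G")


def find_cdr3(aa_seq, cys_index):
    """Return (start, end) of CDR3 in aa_seq as 0-based, end-exclusive indices.

    CDR3 is taken as everything after the conserved Cys and before the
    F/W of the first [FW]GXG motif. Returns None if no motif is found.
    """
    match = FR4_MOTIF.search(aa_seq, cys_index + 1)
    if match is None:
        return None
    return cys_index + 1, match.start()


# Translated query from the question; the Cys is at 0-based index 11.
aa = "SAQLSDSALYYCALSSNAVAKLTFGAG"
span = find_cdr3(aa, 11)

if span is not None:
    start, end = span
    print(aa[start:end])            # ALSSNAVAKLT
    # Convert to 1-based inclusive nucleotide coordinates.
    print(start * 3 + 1, end * 3)   # 37 69
```

On the example from the question this gives a CDR3 spanning nucleotides 37 to 69 (33 bases), starting at the same position igblastn reports. Taking the first motif after the Cys is a simplification; a sequence with an earlier chance match to [WF]GXG would need an extra check against the J-gene alignment.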

