Hi, I am trying to figure out how a CDR3 DNA sequence from TCR sequencing is converted to the corresponding amino acid (AA) sequence. An example from a data set I have from Adaptive is:
TCTCTCACTGTGACATCTGCCCAGAAGAACGAGATGGCCGTTTTTCTCTGTGCCAGCAGTTCGACTGGGGGTCCTTATGAACAGTAC

If I convert this to AA sequence (6 possible reading frames, using the Expasy tool), I get:

5'3' Frame 1: S L T V T S A Q K N E Met A V F L C A S S S T G G P Y E Q Y
5'3' Frame 2: L S L Stop H L P R R T R W P F F S V P A V R L G V L Met N S
5'3' Frame 3: S H C D I C P E E R D G R F S L C Q Q F D W G S L Stop T V
3'5' Frame 1: V L F I R T P S R T A G T E K N G H L V L L G R C H S E R
3'5' Frame 2: Y C S Stop G P P V E L L A Q R K T A I S F F W A D V T V R
3'5' Frame 3: T V H K D P Q S N C W H R E K R P S R S S G Q Met S Q Stop E

The reported AA sequence is:

CASSSTGGPYEQYF

which matches the first of the translated AA sequences (5'3' Frame 1). All of the reported sequences start with residue C and end with F.

I was wondering why the full sequence is not reported:

Met A V F L C A S S S T G G P Y E Q Y

and why F is added at the end of each reported AA sequence. Is there any reference for this process?
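In case it helps anyone reproduce the translation without the Expasy tool, here is a minimal Python sketch of the six-frame translation described above. It assumes the standard genetic code; the names `translate`, `reverse_complement` and `six_frames` are my own helpers, not part of any Expasy or Adaptive API, and stops are printed as `*` rather than `Stop`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, offset: int = 0) -> str:
    """Translate seq starting at offset (0, 1 or 2), dropping any partial codon."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> dict:
    """Translate all three forward and all three reverse-complement frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"5'3' Frame {offset + 1}"] = translate(seq, offset)
        frames[f"3'5' Frame {offset + 1}"] = translate(rc, offset)
    return frames


dna = (
    "TCTCTCACTGTGACATCTGCCCAGAAGAACGAGATGGCCGTTTTTCTCTGTGCCAGCAGT"
    "TCGACTGGGGGTCCTTATGAACAGTAC"
)
reported_cdr3 = "CASSSTGGPYEQYF"

frames = six_frames(dna)
for name, protein in frames.items():
    print(name, protein)

# The reported CDR3 without its final F is the tail of forward frame 1.
print(frames["5'3' Frame 1"].endswith(reported_cdr3[:-1]))
```

The last line only checks the observation from the post: the 87-nucleotide read covers the reported sequence up to the final Y, and the trailing F is not encoded in the read itself.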
Thanks,
- Pankaj
Hi Pankaj, could you please try to format your question into a readable text using e.g. code format for the sequences? Thank you.