Question

[ASK] About CDR3 gene region extraction and clustering

2.3 years ago

muhamamadfeisaljatnika • 0

Dear all Biostar community members,

First of all, I apologize for asking a beginner question; this is my first time posting here. I have VH gene FASTQ data from bio-panning products to analyze. I usually use USEARCH on a Linux system to merge, process, and cluster the reads, but the result is always the full-length VH gene. I would now like to extract the HCDR3 from the whole population of each bio-panning for better understanding, but I am confused about how to extract just the HCDR3 region. Could anyone help me with this problem? I am a beginner in Python programming.

Thank you

CDR3 Antibody • 540 views

updated 2.2 years ago by Jesse ▴ 740 • written 2.3 years ago by muhamamadfeisaljatnika • 0

score 0 · Answer 1 · 2022-01-10

If you mean you have full variable region sequences (V+D+J segments included) and you just want to extract the CDRH3 region from those, I'd recommend IgBLAST. It has a command-line version that supports multiple output formats, including the standard AIRR TSV, which makes it easy to parse out the cdr3 column. The command-line version takes a little setup, though: you have to provide it with databases of V, D, and J germline sequences, plus a table of J gene attributes (the "auxiliary data") if you want the junction info. There's a guide for getting the command-line version set up.
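Since you mentioned being new to Python, here is a minimal sketch of the parsing step. It assumes you already ran IgBLAST with AIRR TSV output (I believe that is `-outfmt 19`, but check the guide for your version) and that the table has a `cdr3_aa` column, as in the AIRR Rearrangement format. The function name `count_cdr3` and the sample rows are made up for illustration. Counting identical CDR3 sequences is only a simple stand-in for clustering, not what USEARCH does.

```python
import csv
import io
from collections import Counter


def count_cdr3(handle, column="cdr3_aa"):
    """Count identical CDR3 sequences in an AIRR-style TSV file handle.

    `column` is assumed to exist in the header; use "cdr3" for nucleotides.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        # Missing or empty values mean no CDR3 was reported for this read.
        cdr3 = (row.get(column) or "").strip()
        if cdr3:
            counts[cdr3] += 1
    return counts


# Hypothetical stand-in for an IgBLAST AIRR TSV file (real files have many more columns).
example_tsv = (
    "sequence_id\tcdr3_aa\n"
    "read1\tARDYW\n"
    "read2\tARDYW\n"
    "read3\t\n"
    "read4\tARGGW\n"
)

# Print the most frequent CDR3 sequences first.
for cdr3, n in count_cdr3(io.StringIO(example_tsv)).most_common():
    print(f"{n}\t{cdr3}")
```

For a real run you would replace `io.StringIO(example_tsv)` with something like `open("your_igblast_output.tsv", newline="")` (file name hypothetical), once per bio-panning sample, and compare the resulting counts between samples.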