How to do typing of VDJ regions for the 10x TCR kit?
6 months ago

I am analyzing full-length sequencing of TCRs, or more precisely their VDJ regions. Although I have worked in the NGS field for some time, "immunoinformatics" is new to me. How do I do the typing for VDJ? It is not in the genome as one region. So far I have found the IMGT database, but their FASTA contains dots that I can't explain. I tried BLAST against the genome, and the dots do not correspond to gaps there: in the genome the sequence is all connected. So should I ignore them? Maybe they separate regions?

>AE000658|TRAV1-1*01|Homo sapiens|F|V-REGION|128090..128364|275 nt|1| | | | |275+48=323| | |
ggacaaagccttgagcag...ccctctgaagtgacagctgtggaaggagccattgtccag
ataaactgcacgtaccagacatctggg..................ttttatgggctgtcc
tggtaccagcaacatgatggcggagcacccacatttctttcttacaatgctctg......
......gatggtttggaggagaca...............ggtcgtttttcttcattcctt
agtcgctctgatagttatggttacctccttctacaggagctccagatgaaagactctgcc
tcttacttctgcgctgtgagaga
>X04939|TRAV1-1*02|Homo sapiens|(F)|V-REGION|52..320|269 nt|1| | | | |269+48=317| | |
ggacaaagccttgagcag...ccctctgaagtgacagctgtggaaggagccattgtccag
ataaactgcacgtaccagacatctggg..................ttttatgggctgtcc
tggtaccagcaacatgatggcggagcacccacatttctttcttacaatggtctg......
......gatggtttggaggagaca...............ggtcgtttttcttcattcctt
agtcgctctgatagttatggttacctccttctacaggagctccagatgaaagactctgcc


Is my approach correct: map to these reference FASTA files and assign each cell according to its V/D/J type, like TRAV1-1*02? And what is the meaning of these dots?

TCRA Immunology TCR • 607 views
6 months ago
Jesse ▴ 450

IMGT aligns its reference sequences in its own standardized way, using periods for the gaps, so you can compare between them more directly with a standardized numbering. You could ungap it and get the actual sequence ("all connected" like you said) you'd see in the genome. The gaps are only there to give us a sort of standardized numbering to make talking about the specific parts of these sequences easier.
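As a minimal sketch of the ungapping step, assuming plain Python with no external libraries: the snippet below strips the gap dots from the sequence lines only (the header also contains `..` in its coordinate field, so it is left untouched) and splits the pipe-delimited header into a few named fields. The function names `ungap_imgt_fasta` and `parse_imgt_header` are made up for illustration, and the field positions are inferred from the record posted in the question rather than from a formal specification. The record is the TRAV1-1*01 entry from the question; its header field `275+48=323` looks consistent with 275 nucleotides plus 48 gap positions.

```python
# Sketch only: helper names are hypothetical, field order is inferred
# from the example header in the question.
RECORD = """
>AE000658|TRAV1-1*01|Homo sapiens|F|V-REGION|128090..128364|275 nt|1| | | | |275+48=323| | |
ggacaaagccttgagcag...ccctctgaagtgacagctgtggaaggagccattgtccag
ataaactgcacgtaccagacatctggg..................ttttatgggctgtcc
tggtaccagcaacatgatggcggagcacccacatttctttcttacaatgctctg......
......gatggtttggaggagaca...............ggtcgtttttcttcattcctt
agtcgctctgatagttatggttacctccttctacaggagctccagatgaaagactctgcc
tcttacttctgcgctgtgagaga
"""


def ungap_imgt_fasta(text):
    """Return {header: sequence} with IMGT gap dots removed from sequences."""
    parts = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            parts[header] = []
        elif header is not None:
            # Dots are alignment gaps, not bases, so drop them.
            parts[header].append(line.replace(".", ""))
    return {h: "".join(chunks) for h, chunks in parts.items()}


def parse_imgt_header(header):
    """Split a pipe-delimited IMGT header into a few named fields."""
    fields = [f.strip() for f in header.split("|")]
    allele = fields[1]
    return {
        "accession": fields[0],
        "gene": allele.split("*")[0],  # e.g. TRAV1-1
        "allele": allele,              # e.g. TRAV1-1*01
        "species": fields[2],
        "functionality": fields[3],
        "region": fields[4],
    }


for header, seq in ungap_imgt_fasta(RECORD).items():
    info = parse_imgt_header(header)
    # The ungapped length can be compared with the "275 nt" in the header.
    print(info["allele"], info["region"], len(seq))
```

The ungapped sequences are what you would use as a plain reference for BLAST or any aligner that does not understand IMGT gaps.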

https://www.imgt.org/IMGTScientificChart/Numbering/IMGTnumbering.html

For the second aspect of your question, the V/D/J parts of your observed TCR sequences are coming from non-contiguous portions of certain genomic loci that are brought together during recombination. (Maybe that's what you mean by "not in the genome as one region"?) Like in this figure:

You'll need to figure out in software what individual V/D/J genes most likely recombined to produce the receptor sequences you see. Easier said than done, but software like cellranger (are you using cellranger for your 10x results here?) can do that for you.

I've used it for B cells, not T cells, but it looks like the same cellranger vdj functionality is there for both.
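Once a tool has made the V/D/J calls, assigning them to cells is mostly bookkeeping. Here is a rough sketch, assuming the tool can write an AIRR Rearrangement TSV with `sequence_id`, `v_call` and `j_call` columns and, for single-cell data, an optional `cell_id` column. Whether your particular pipeline emits exactly these columns is an assumption to check against its documentation. The helper names and the toy rows below are invented for illustration and are not real results.

```python
import csv
import io
from collections import defaultdict


def top_call(call):
    """Return the first allele of a possibly comma-separated call, or ''."""
    return call.split(",")[0].strip() if call else ""


def calls_by_cell(handle):
    """Map each cell ID (or sequence ID if no cell_id) to its (V, J) pairs."""
    cells = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        key = row.get("cell_id") or row["sequence_id"]
        v = top_call(row.get("v_call"))
        j = top_call(row.get("j_call"))
        if v and j:  # skip rows where either call is missing
            cells[key].append((v, j))
    return dict(cells)


# Toy input, made up purely to exercise the function.
rows = [
    ("sequence_id", "cell_id", "v_call", "j_call"),
    ("read1", "cellA", "TRAV1-1*02", "TRAJ33*01"),
    ("read2", "cellA", "TRBV7-9*01,TRBV7-9*03", "TRBJ2-1*01"),
    ("read3", "cellB", "TRAV1-1*01", "TRAJ33*01"),
    ("read4", "cellB", "", "TRAJ33*01"),
]
toy_tsv = "\n".join("\t".join(r) for r in rows) + "\n"

result = calls_by_cell(io.StringIO(toy_tsv))
for cell, pairs in sorted(result.items()):
    print(cell, pairs)
```

Taking only the first listed allele is a simplification: when a tool reports several comma-separated alleles the call is ambiguous, and you may prefer to keep all of them or collapse to the gene level.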

There are lots of other programs that do the assignment part, too. I've used IgBLAST and IMGT V-QUEST plenty for antibody sequences, though never for TCRs; apparently these work for T cells too:


Thank you. I am using a modified method (long read seq). I will check IgBLAST.

13 days ago
mizraelson ▴ 60

I would also recommend MiXCR, which has its own reference library that is superior to IMGT in many ways. It's a very easy one-line command for 10x data:

mixcr analyze 10x-vdj-tcr \
--species hsa \
sample_R1.fastq.gz \
sample_R2.fastq.gz \
sample_result


You can read more on that here: https://docs.milaboratories.com/mixcr/reference/overview-built-in-presets/#10xgenomics

Also, if you have made any custom modifications to the 10x protocol, you can contact us on GitHub (https://github.com/milaboratory/mixcr) and we can help prepare a single-line command specifically for your protocol.