filter out T cell clones
4 months ago
chi.delta ▴ 40

Dear all, I just mapped some bulk sequencing reads to reference VDJ genes of T cell receptors in order to extract the T cell clonotypes (using mixcr). The resulting clonotypes come in one txt file per sample that looks like this:

count   freq    cdr3nt  cdr3aa  v   d   j   VEnd    DStart  DEnd    JStart
76  0.05846153846153846 TGTGCCTTATCGGGGTACACCGATAAACTCATCTTT    CALSGYTDKLIF    TRDV3   .   TRDJ1   7   -1  -1  6
59  0.045384615384615384    TGTGCTGTGCGGCCTGCCGGGACTGCAAGGCAACTGACCTTT  CAVRPAGTARQLTF  TRAV20  .   TRAJ22  16  -1  -1  22
.....


However, in some cases I would like to filter out some TRD clones, for example the first one here, which contains a TRDV3 and a TRDJ gene. Can one do this easily and directly on the txt file in R? Or is there another way of doing this, instead of reading the table as a data frame, filtering, and then exporting it as txt again? I eventually import these txt files into vdjtools for further processing.

Any help or ideas would be much appreciated. Thanks a lot!

vdjtools TCR mixcr bulk • 462 views
4 months ago
Jesse ▴ 500

What mixcr command created that output? From their docs it sounds like outputs are generally just TSV files, so it'd be easy enough to do a read.table or what have you in R and go from there. But they also have a feature to convert things to AIRR format which could be handy too.

In any case, filtering the table will involve some kind of read+filter+write, whether with R or whatever else. Is something like an awk one-liner all you need?

awk '$5 !~ /TRD/ && $7 !~ /TRD/' < file > file2
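
If the column order ever differs between exports, a slightly more defensive variant could look up the v and j columns by name from the header row instead of hard-coding $5 and $7. This is only a sketch, assuming the file is tab-separated and has header fields named exactly "v" and "j" as in the example above; sample.txt and filtered.txt are made-up file names used for illustration.

```shell
# Build a tiny tab-separated example with the same columns as in the post
# (sample.txt and filtered.txt are hypothetical file names).
printf 'count\tfreq\tcdr3nt\tcdr3aa\tv\td\tj\tVEnd\tDStart\tDEnd\tJStart\n' > sample.txt
printf '76\t0.05846153846153846\tTGTGCCTTATCGGGGTACACCGATAAACTCATCTTT\tCALSGYTDKLIF\tTRDV3\t.\tTRDJ1\t7\t-1\t-1\t6\n' >> sample.txt
printf '59\t0.045384615384615384\tTGTGCTGTGCGGCCTGCCGGGACTGCAAGGCAACTGACCTTT\tCAVRPAGTARQLTF\tTRAV20\t.\tTRAJ22\t16\t-1\t-1\t22\n' >> sample.txt

# Locate the v and j columns from the header, always keep the header line,
# and keep only rows where neither the v nor the j field starts with TRD.
awk -F'\t' '
NR == 1 {
    for (i = 1; i <= NF; i++) {
        if ($i == "v") vcol = i
        if ($i == "j") jcol = i
    }
    # Assumption: both columns exist; stop early if they do not.
    if (!vcol || !jcol) { print "v/j column not found" > "/dev/stderr"; exit 1 }
    print
    next
}
$vcol !~ /^TRD/ && $jcol !~ /^TRD/
' sample.txt > filtered.txt
```

As with the one-liner, the header row is passed through unchanged, so the result should still be readable by whatever consumes the original table.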

This just worked perfectly, thanks a lot!

7 weeks ago
mizraelson ▴ 60

Hi, just to add, it's important to note that TRA and TRD clones often share the same segments (V and J). In our experience, only the C gene can reliably distinguish between the two.

Also, it's worth noting that MiXCR series 4 supports most of the features of vdjtools. You can read about the available post-analysis options in our new documentation portal: https://docs.milaboratories.com/mixcr/reference/mixcr-postanalysis/

Oh, that's good to know about the segments! (chi.delta, watch out, then, if you're trying to recognize TRD from V+J gene names like I mentioned.) Though, aren't alpha/beta/gamma/delta TCR chains assembled from totally different loci? I'm confused how a beta chain could end up using a V gene from TRD for example. This is probably where my ignorance of TCRs vs IGs is showing though.