Dear all,
I just mapped some bulk Seq reads to reference VDJ genes of T cell receptors in order to extract the T cell clonotypes (using mixcr). The resulting clonotypes come in one txt file per sample that looks like this:
However, in some cases I would like to filter out some TRD clones, for example the first one here, which contains a TRDV3 and a TRDJ gene. Can one do this easily, directly on the txt file, in R? Or is there another way of doing this besides reading the table as a data frame, filtering, and then exporting it as txt again? I eventually import these txt files into vdjtools for further processing.
Any help or ideas would be much appreciated.
Thanks a lot!
What mixcr command created that output? From their docs it sounds like outputs are generally just TSV files, so it'd be easy enough to do a read.table or what have you in R and go from there. But they also have a feature to convert things to AIRR format which could be handy too.
In any case, filtering the table will involve some kind of read+filter+write, whether with R or whatever else. Is something like an awk one-liner all you need?
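For what it's worth, here is a minimal sketch of that read+filter+write step in Python, assuming the file is plain tab-separated text with a single header line. The function name `drop_matching_clones` is made up for illustration, and the column name in the usage note is an assumption about the export header, so check it against your own files first.

```python
import re


def drop_matching_clones(in_path, out_path, column, pattern):
    """Copy a tab-separated table, skipping rows whose `column` matches `pattern`.

    Assumes one header line and no quoted fields. Returns the number of rows dropped.
    """
    regex = re.compile(pattern)
    dropped = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        header = src.readline()
        names = header.rstrip("\n").split("\t")
        idx = names.index(column)  # raises ValueError if the column is absent
        dst.write(header)
        for line in src:
            fields = line.rstrip("\n").split("\t")
            # Drop the row only when the chosen column matches the pattern.
            if idx < len(fields) and regex.search(fields[idx]):
                dropped += 1
                continue
            dst.write(line)  # kept rows are written back unchanged
    return dropped
```

A call might look like `drop_matching_clones("sample1.txt", "sample1.filtered.txt", "allCHitsWithScore", r"^TRDC")`, where the file names are placeholders and `allCHitsWithScore` is only my guess at the relevant header. Kept lines are copied verbatim, so the output should stay in the same layout as the input for whatever reads it next.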
Hi, just to add, it's important to note that TRA and TRD clones often share the same segments (V and J). In our experience, only the C gene can reliably distinguish between the two.
Oh, that's good to know about the segments! (chi.delta, watch out, then, if you're trying to recognize TRD from V+J gene names like I mentioned.) Though, aren't alpha/beta/gamma/delta TCR chains assembled from totally different loci? I'm confused how a beta chain could end up using a V gene from TRD for example. This is probably where my ignorance of TCRs vs IGs is showing though.
this just worked perfectly, thanks a lot!