2.7 years ago
amal.elzemrany
I have a TSV file of duplicated genes and their transcripts. I need to extract each duplicated gene together with its transcripts on a single row, using bash.
input:
STRG.8 STRG.8.1
STRG.8 STRG.8.2
STRG.88 STRG.88.1
STRG.88 STRG.88.2
I need the output to show each gene, the number of its duplicated transcripts, and the transcripts themselves, like this:
STRG.8 2 STRG.8.1, STRG.8.2
STRG.88 2 STRG.88.1, STRG.88.2
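One possible approach, as a minimal sketch: `awk` can collect the transcripts per gene in associative arrays and print one row per gene at the end. This assumes the input really is tab-separated with the gene in column 1 and the transcript in column 2, and that a tab-separated output is acceptable. The file name `dup_genes.tsv` and the function name `group_transcripts` are made up for illustration.

```shell
# Recreate the sample input from the question (assumed tab-separated).
# dup_genes.tsv is a hypothetical file name.
printf 'STRG.8\tSTRG.8.1\nSTRG.8\tSTRG.8.2\nSTRG.88\tSTRG.88.1\nSTRG.88\tSTRG.88.2\n' > dup_genes.tsv

# group_transcripts FILE
# Prints: gene <TAB> transcript count <TAB> comma-separated transcripts,
# keeping genes in the order they first appear in FILE.
group_transcripts() {
  awk -F'\t' '
    # First time this gene is seen: remember its position.
    !($1 in count) { order[++n] = $1 }
    {
      count[$1]++
      # Start the list with the first transcript, then append with ", ".
      list[$1] = (count[$1] == 1) ? $2 : list[$1] ", " $2
    }
    END {
      for (i = 1; i <= n; i++) {
        g = order[i]
        printf "%s\t%d\t%s\n", g, count[g], list[g]
      }
    }
  ' "$1"
}

group_transcripts dup_genes.tsv
```

The input does not need to be sorted by gene, since rows are grouped by key rather than by adjacency. If the file is actually space-separated, dropping `-F'\t'` lets `awk` split on any whitespace instead.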
Is bash your only option, or are you fine with a bit of R?