Hi, dear friends,
I ran a blastx search against the NCBI nr database (using DIAMOND, with max-target-seqs set to 1) and tabular output (outfmt 6).
I want to collect the 50 proteins that occur most frequently in my results.
Is there a command-line script or program for this task?
(I have tried cutting out the ID column, opening it in Microsoft Excel, and counting the duplicates, but opening such a file and running the duplicate count is very difficult on my Windows computer, which is not very powerful.)
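To make the counting step concrete, here is a minimal Python sketch of what I am after. It assumes the subject (protein) ID is in the second tab-separated column, as in the default outfmt 6 layout, and it counts every line, so several alignments of one query to the same protein would each be counted. The function name `top_subjects` is only an illustration, not part of any existing tool.

```python
from collections import Counter


def top_subjects(path, n=50, column=1):
    """Return the n most frequent values in one column of a tab-separated file.

    column is 0-based; 1 is the subject ID in the default outfmt 6 layout
    (an assumption about how the file was produced).
    """
    counts = Counter()
    # Read line by line so the whole file never has to fit in memory.
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) > column:
                counts[fields[column]] += 1
    # most_common returns (value, count) pairs, highest count first.
    return counts.most_common(n)
```

Calling `top_subjects` with the path to the DIAMOND output would give a list of (protein ID, count) pairs, with the most frequent protein first.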
Thank you in advance