Neuls, 8.7 years ago
I'm planning to use cd-hit to remove redundancy from a 16S rRNA dataset so that I can build a phylogenetic tree with PHYLIP later on. Maybe it is a newbie question, but I wonder whether I have to remove redundancy before or after doing the MSA with MAFFT.
Also, I wonder whether the output from cd-hit can be in PHYLIP format.
Thank you
You may want to check out dedupe.sh or clumpify.sh from the BBMap suite for this purpose. If you are looking to remove perfectly identical reads from an NGS dataset, doing it before alignment would be best.
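To make the "before alignment" step concrete, here is a minimal Python sketch of what removing perfectly identical sequences from a FASTA file amounts to. It is not how dedupe.sh, clumpify.sh, or cd-hit work internally; the function names `read_fasta` and `drop_exact_duplicates` and the three example records are made up for illustration. It only catches exact, full-length, same-strand matches (case-insensitive) and keeps the first record seen for each sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(records):
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Hypothetical example input; in practice, pass an open file handle instead.
example = """>seq1
ACGTACGT
>seq2
acgtacgt
>seq3
ACGTTTGT
""".splitlines()

unique = drop_exact_duplicates(read_fasta(example))

# Write the de-duplicated records back out as FASTA, ready for alignment.
for header, seq in unique:
    print(f">{header}\n{seq}")
```

The de-duplicated FASTA would then be the input to the alignment step, so that identical sequences are not aligned more than once.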