Can I use the CD-HIT package to compare shotgun (metagenomic) sequencing data?
1
0
7 months ago

I want to compare two genomes that I obtained from my shotgun metagenomic sequencing data. The "cd-hit-est-2d" program looks like a good fit, since I need to find out which sequences are similar or dissimilar between these two genomes. However, I read in the CD-HIT manual that it is intended for sequences without introns, e.g. ESTs. Does that mean I should not use CD-HIT for my metagenome comparison?

If CD-HIT is not suitable, please suggest another tool that I can use to compare the genomes.

shotgun CD-HIT metagenome genome • 527 views
0
7 months ago

I think a multiple-genome alignment tool like Mugsy might be more appropriate for this. Alternatives include dot plots, MUMmer, or BLAST 2 Sequences (bl2seq).

0

Actually, I used CD-HIT and it worked. It took only a minute, even though I had read that it can take a long time or run into resource limits.

My question is: should I not trust these results?

1

You can trust the result. CD-HIT is designed for sequences without introns, and prokaryotic genomes (which is presumably what you have) are essentially intronless. It would likely work even for intron-containing genomes as long as they are closely related.

0

How does one assess (meta)genome similarity using a clustering tool like CD-HIT? Do you just look at the proportion of clustered contigs w.r.t. total number of contigs?

0

CD-HIT includes a program called "cd-hit-est-2d" (a program rather than a separate algorithm) that compares two nucleotide datasets, here the two genomes, and reports which sequences in the second dataset are similar to sequences in the first and which are not.
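
On the question of turning the clustering into a similarity figure: one rough option is to count how many contigs from the second dataset were clustered with a contig from the first. Below is a minimal Python sketch of that idea. It assumes the usual .clstr layout (a ">Cluster N" header, the representative line ending in "*", and matched members ending in "at ...%"); the contig names, the sample text and the function names parse_clstr and matched_fraction are made up for illustration and are not part of CD-HIT.

```python
import re

# Hypothetical .clstr excerpt; real files come from a cd-hit-est-2d run.
SAMPLE = """\
>Cluster 0
0\t1500nt, >contigA_1... *
1\t1400nt, >contigB_7... at +/97.50%
>Cluster 1
0\t2000nt, >contigA_2... *
>Cluster 2
0\t1800nt, >contigA_3... *
1\t900nt, >contigB_2... at -/96.10%
2\t950nt, >contigB_9... at +/95.40%
"""

# Sequence name, then either "*" (representative) or "at ..." (matched member).
MEMBER = re.compile(r">(\S+?)\.\.\.\s+(\*|at\b.*)$")


def parse_clstr(lines):
    """Return a list of clusters, each with a representative and its members."""
    clusters = []
    for line in lines:
        line = line.strip()
        if line.startswith(">Cluster"):
            clusters.append({"rep": None, "members": []})
            continue
        match = MEMBER.search(line)
        if not match or not clusters:
            continue  # skip blank or unrecognised lines
        name, tag = match.groups()
        if tag == "*":
            clusters[-1]["rep"] = name
        else:
            clusters[-1]["members"].append(name)
    return clusters


def matched_fraction(clusters, total_second):
    """Fraction of second-dataset sequences that joined any cluster."""
    matched = sum(len(c["members"]) for c in clusters)
    return matched / total_second if total_second else 0.0


clusters = parse_clstr(SAMPLE.splitlines())
# Assumption for this example: the second dataset held 4 contigs in total.
fraction = matched_fraction(clusters, total_second=4)
```

With this sample, 3 of the 4 second-dataset contigs are matched, so fraction is 0.75. Treat the number with care: it counts contigs rather than bases, so it depends on how fragmented each assembly is, and as far as I recall cd-hit-est-2d by default only clusters second-dataset sequences that are no longer than the representative (see the -s2 option in the manual).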