Question

Alternative to usearch -otutab_merge

0

2.4 years ago

poet1988 ▴ 30

Robert C. Edgar has developed a great program, USEARCH (https://drive5.com/usearch/download.html). However, the 32-bit version has limitations, and the 64-bit version costs almost $1500.

usearch -otutab_merge otutable1.txt,otutable2.txt,otutable3.txt -output otutable_merged.txt

This command combines OTU tables. But because of the large amount of data, the 32-bit version cannot do this.

What does this command do? Here's an example:

Table 1

          Sample1
species1  0
species2  3

Table 2

          Sample2
species1  1
species3  2

Output (tab-delimited table)

          Sample1  Sample2
species1  0        1
species2  3        0
species3  0        2

Can you please tell me whether there are equivalent tools without this limitation? I'm sure that Python can do this, but maybe there are ready-made solutions.

1

Hi,

Have you heard of vsearch? (please see the link)

As stated in the GitHub repository:

The aim of this project is to create an alternative to the USEARCH tool developed by Robert C. Edgar (2010). The new tool should:

- have open source code with an appropriate open source license
- be free of charge, gratis
- have a 64-bit design that handles very large databases and much more than 4GB of memory
- be as accurate or more accurate than usearch
- be as fast or faster than usearch

I don't know whether the tool has the command that you're looking for, though. I looked at the workflow in their wiki, and I think they process the samples in a slightly different way.

In any case, you may consider using this tool in the future, since it is open source and the 64-bit version is free.

I hope this helps,

António

ADD REPLY • link 2.4 years ago by antonioggsousa 3.2k

0

Many thanks, António for the answer!

I looked at the output of vsearch --help and did not find a command similar to usearch -otutab_merge.

ADD REPLY • link 2.4 years ago by poet1988 ▴ 30

score 0 · Answer 1 · 2021-12-08

I wrote a simple, inelegant script (for 8 samples); maybe it will help someone.

$ python3

from functools import reduce

import pandas as pd

# Read the eight tab-delimited OTU tables (BVP1.txt ... BVP8.txt).
tables = [pd.read_csv(f"BVP{i}.txt", sep="\t") for i in range(1, 9)]

# Outer-merge the tables one after another. With no `on` argument, pandas joins
# on the columns the tables share, assumed here to be only the OTU ID column.
merged = reduce(lambda left, right: pd.merge(left, right, how="outer"), tables)

# OTUs absent from a table come out as NaN; the example above expects 0 there.
merged = merged.fillna(0)

# index=False keeps the pandas row index out of the file; float_format writes
# the counts (floats after the NaN fill) without a decimal part.
merged.to_csv("Merged_BVP.txt", sep="\t", index=False, float_format="%.0f")
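For any number of tables, the same outer merge can be sketched as a small function. This is only a sketch under a few assumptions that are not in the original script: the first column of every table holds the OTU identifiers, each sample name appears in only one table, and OTU identifiers are unique within a table. The function name `merge_otu_tables` and the `OTU` header are made up for illustration. It is run here on the two small tables from the question, held in memory instead of read from disk.

```python
import io

import pandas as pd


def merge_otu_tables(sources):
    """Merge tab-delimited OTU tables, filling absent OTUs with 0.

    `sources` is an iterable of file paths or file-like objects. The first
    column of each table is assumed to hold the OTU identifiers.
    """
    tables = [pd.read_csv(src, sep="\t", index_col=0) for src in sources]
    # Place the tables side by side, aligned on the union of OTU identifiers.
    merged = pd.concat(tables, axis=1, join="outer").sort_index()
    # An OTU missing from a table shows up as NaN; treat it as a zero count.
    return merged.fillna(0).astype(int)


# The two example tables from the question (hypothetical "OTU" header added).
table1 = io.StringIO("OTU\tSample1\nspecies1\t0\nspecies2\t3\n")
table2 = io.StringIO("OTU\tSample2\nspecies1\t1\nspecies3\t2\n")

merged = merge_otu_tables([table1, table2])
print(merged.to_csv(sep="\t"))
```

With real files, pass a list of paths instead, for example `merge_otu_tables(["BVP1.txt", "BVP2.txt"])`, and write the result with `merged.to_csv("Merged_BVP.txt", sep="\t")`. Note that this still loads every table into memory, so it is limited by available RAM rather than by a 32-bit address space.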