Question: How to retrieve rows from OTU table
3.0 years ago, mollysil wrote:

I have a text file with OTU names in the first column and the occurrence in each treatment in the following columns (34 columns in total). A sample of the table is below. The file contains ~3000 OTUs, one per row.

CM2_9   0   0   0
AF141_14    22  25  23
AF171_13    13  0   0
LIPB162_1   0   0   0

I have a separate text file with all the OTU names of interest (~500 OTUs), which looks something like this:

WSF3_2
WSF1_2
AF172_15
IO2_57

Is there a simple way to retrieve just the rows of my table that match the OTUs of interest? As output, I want a new table containing only those rows. Help please! I'm working on Linux through PuTTY. Also, do the files need to be converted to comma-delimited format? Both are tab-delimited .txt files.


Take a look at the join command in Unix if you do not want to use external programs.

written 3.0 years ago by genomax

Could you try the following solution:

$ grep -f test2.txt test1.txt

Here test2.txt contains all the OTU names of interest (~500 OTUs) and test1.txt is the complete OTU file (~3000 OTUs).

Input:

$ cat test1.txt 
CM2_9   0   0   0
AF141_14    22  25  23
AF171_13    13  0   0
LIPB162_1   0   0   0


$ cat test2.txt 
LIPB162_1
CM2_9

Output:

$ grep -f test2.txt test1.txt 
CM2_9   0   0   0
LIPB162_1   0   0   0
written 3.0 years ago by cpad0112
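
One caveat with a pattern file: grep treats each line of test2.txt as a pattern and matches it anywhere in a line, so a short OTU name can also pull in longer names that contain it (grep's -F and -w options, for fixed strings and whole words, narrow this). A minimal Python sketch of the difference between substring matching and exact matching on the first column; the names AF17_1 and AF17_13 are invented for illustration and are not from the original table, and both helper functions are hypothetical:

```python
# Hypothetical rows; AF17_1 / AF17_13 are made up to show the overmatch.
table = [
    "CM2_9\t0\t0\t0",
    "AF17_1\t22\t25\t23",
    "AF17_13\t13\t0\t0",
]
wanted = {"AF17_1"}


def filter_substring(rows, names):
    """Keep rows that contain any name anywhere in the line."""
    return [r for r in rows if any(n in r for n in names)]


def filter_exact(rows, names):
    """Keep rows whose first tab-separated field is exactly one of the names."""
    return [r for r in rows if r.split("\t", 1)[0] in names]


print(filter_substring(table, wanted))  # also picks up the AF17_13 row
print(filter_exact(table, wanted))      # only the AF17_1 row
```

Comparing the first field against a set also keeps the rows in their original table order.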
3.0 years ago, 5heikki (Finland) wrote:

Assuming tab-separated files, where otutable is the full OTU table and listfile holds the names of interest:

join -1 1 -2 1 -t $'\t' <(sort -t $'\t' -k1,1 otutable) <(sort -t $'\t' -k1,1 listfile)
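
For readers unfamiliar with join: it walks two inputs that are already sorted on the key field and emits a line for each key present in both, which is why both files are sorted first and why the result comes back ordered by OTU name rather than in the original table order. A rough Python sketch of that idea follows. The function name merge_join and the sample ID list are illustrative, unique keys are assumed, and this is not meant to mirror how join is actually implemented:

```python
def merge_join(table_rows, id_list):
    """Sort-merge join on the first field; assumes keys are unique in each input."""
    left = sorted(table_rows, key=lambda row: row[0])
    right = sorted(id_list)
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i][0] == right[j]:
            out.append(left[i])  # key present in both inputs
            i += 1
            j += 1
        elif left[i][0] < right[j]:
            i += 1  # table row has no partner in the ID list
        else:
            j += 1  # ID does not occur in the table
    return out


rows = [
    ["CM2_9", "0", "0", "0"],
    ["AF141_14", "22", "25", "23"],
    ["AF171_13", "13", "0", "0"],
    ["LIPB162_1", "0", "0", "0"],
]
ids = ["LIPB162_1", "CM2_9", "WSF3_2"]  # WSF3_2 is absent from the sample table

for row in merge_join(rows, ids):
    print("\t".join(row))
```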
3.0 years ago, st.ph.n (Philadelphia, PA) wrote:

Here's a quick Python solution, where ids.txt holds the OTUs of interest and otus.txt is your original table.

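#!/usr/bin/env python

# Read the OTU names of interest, skipping blank lines.
with open('ids.txt') as f:
    ids = [line.strip() for line in f if line.strip()]

# Index every row of the OTU table by its first (tab-separated) column.
otu = {}
with open('otus.txt') as f2:
    for line in f2:
        fields = line.rstrip('\n').split('\t')
        otu[fields[0]] = fields

# Print matching rows in the order of ids.txt; names not in the table are skipped.
for i in ids:
    if i in otu:
        print('\t'.join(otu[i]))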

Save as get_otus.py and run it as: python get_otus.py > my_otus.txt
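
On the delimiter question: nothing needs converting to comma-separated, since Python's csv module can read and write tab-delimited text directly. A sketch under the same file-name assumptions as above (ids.txt, otus.txt, my_otus.txt); the function name extract_rows is only illustrative. Rows are written in table order, and IDs that never appear in the table are returned instead of causing an error:

```python
import csv


def extract_rows(ids_path, table_path, out_path):
    """Copy rows of a tab-delimited table whose first field is listed in ids_path."""
    with open(ids_path) as f:
        wanted = {line.strip() for line in f if line.strip()}

    found = set()
    with open(table_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in reader:
            if row and row[0] in wanted:
                writer.writerow(row)
                found.add(row[0])

    # IDs from the list that were not seen in the table
    return sorted(wanted - found)


# Example call (assumes the files exist in the working directory):
# missing = extract_rows("ids.txt", "otus.txt", "my_otus.txt")
```

Checking the returned list is a quick way to spot names in the ID file that are misspelled or absent from the table.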


Magical! Thanks so much!!!

written 3.0 years ago by mollysil