Question: How to retrieve rows from OTU table
15 months ago, mollysil wrote:

I have a text file with OTU names in the first column and the occurrence in each treatment in the following columns (34 columns in total). A sample of the table is below. The file contains ~3000 OTUs (so ~3000 rows).

CM2_9   0   0   0
AF141_14    22  25  23
AF171_13    13  0   0
LIPB162_1   0   0   0

I have a separate text file with all the OTU names of interest (~500 OTUs), which looks something like this:

WSF3_2
WSF1_2
AF172_15
IO2_57

Is there a simple way to retrieve just the rows in my table that match the OTUs of interest? As output, I want a new table containing only those rows. Help please! I'm working in PuTTY (Linux). Also, do the files need to be converted to comma-delimited? Both are tab-delimited .txt files.


Take a look at the Unix join command if you do not want to use external programs.

written 15 months ago by genomax

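Could you try the following solution:

$ grep -wFf test2.txt test1.txt

test2.txt contains the OTU names of interest (~500 OTUs) and test1.txt is the complete OTU table (~3000 OTUs). -f reads the patterns from test2.txt, -F treats them as fixed strings rather than regular expressions, and -w accepts only whole-word matches, so for names made of letters, digits and underscores a short name cannot match inside a longer one.

Input:

$ cat test1.txt
CM2_9   0   0   0
AF141_14    22  25  23
AF171_13    13  0   0
LIPB162_1   0   0   0

$ cat test2.txt
LIPB162_1
CM2_9

Output:

$ grep -wFf test2.txt test1.txt
CM2_9   0   0   0
LIPB162_1   0   0   0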
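
A pattern file used with plain grep -f (no -w) matches anywhere in a line, so a name that is contained in a longer name also pulls in the longer name's row. Below is a minimal Python sketch for checking whether that can happen with your data. The function name ambiguous_ids and the AF17_* names are made up for illustration, and the check only looks at the OTU names, not at the count columns.

```python
def ambiguous_ids(table_ids, wanted_ids):
    """Map each wanted ID to the other table IDs that contain it as a substring."""
    clashes = {}
    for wanted in wanted_ids:
        # Any longer table ID containing `wanted` would also be selected
        # by an unanchored substring match.
        others = [t for t in table_ids if wanted in t and t != wanted]
        if others:
            clashes[wanted] = others
    return clashes


# Hypothetical names, chosen so that one is a prefix of another.
table_ids = ["AF17_1", "AF17_13", "CM2_9", "LIPB162_1"]
wanted_ids = ["AF17_1", "CM2_9"]

for wanted, others in ambiguous_ids(table_ids, wanted_ids).items():
    print(wanted, "is contained in:", ", ".join(others))
```

If this prints nothing for your real names, substring matching and exact matching select the same rows.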
written 15 months ago by cpad0112

15 months ago, 5heikki (Finland) wrote:

Assuming tab-separated files and a shell such as bash (for $'\t' and the <(...) process substitution):

join -1 1 -2 1 -t $'\t' <(sort -t $'\t' -k1,1 otutable) <(sort -t $'\t' -k1,1 listfile)
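
join needs both inputs sorted on the join field because it walks them together in a single pass, which is also why the result comes out ordered by OTU name rather than in the table's original order. The Python sketch below illustrates that idea only. merge_join is a hypothetical name, and unlike join it emits each table row at most once instead of pairing up repeated keys.

```python
def merge_join(table_rows, wanted_keys):
    """Return the table rows whose first field is in wanted_keys, sorted by key."""
    rows = sorted(table_rows, key=lambda r: r[0])
    keys = sorted(wanted_keys)
    out = []
    i = j = 0
    # Advance whichever side currently holds the smaller key.
    while i < len(rows) and j < len(keys):
        if rows[i][0] < keys[j]:
            i += 1          # row key not wanted
        elif rows[i][0] > keys[j]:
            j += 1          # wanted key not in the table (or already handled)
        else:
            out.append(rows[i])
            i += 1          # keep j so repeated row keys still match
    return out


table = [
    ["CM2_9", "0", "0", "0"],
    ["AF141_14", "22", "25", "23"],
    ["AF171_13", "13", "0", "0"],
    ["LIPB162_1", "0", "0", "0"],
]
for row in merge_join(table, ["LIPB162_1", "CM2_9", "WSF3_2"]):
    print("\t".join(row))
```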

15 months ago, st.ph.n (Philadelphia, PA) wrote:

Here's a quick Python solution, where ids.txt contains the OTUs of interest and otus.txt is your original file.

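#!/usr/bin/env python
# Print the rows of otus.txt whose first column appears in ids.txt.

# Read the OTU names of interest, skipping blank lines.
with open('ids.txt') as f:
    ids = [line.strip() for line in f if line.strip()]

# Map each OTU name (first tab-separated field) to its full row.
otu = {}
with open('otus.txt') as f2:
    for line in f2:
        fields = line.rstrip('\r\n').split('\t')
        otu[fields[0]] = fields

# Print rows in the order of ids.txt; names absent from the table are skipped.
for i in ids:
    if i in otu:
        print('\t'.join(otu[i]))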

Save as get_otus.py, run as python get_otus.py > my_otus.txt
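
If you would rather keep the rows in the table's original order and also learn which names were not found, a variant along these lines may help. It is only a sketch: select_rows is a hypothetical helper, and it assumes the OTU name is the first tab-separated field of every row.

```python
def select_rows(table_lines, id_lines):
    """Return (matching rows in table order, sorted list of IDs never seen)."""
    # A set gives constant-time membership tests; blank lines are ignored.
    wanted = {line.strip() for line in id_lines if line.strip()}
    rows, seen = [], set()
    for line in table_lines:
        row = line.rstrip('\r\n')
        key = row.split('\t', 1)[0]
        if key in wanted:
            rows.append(row)
            seen.add(key)
    return rows, sorted(wanted - seen)


# In-memory demo using rows from the question; WSF3_2 is not in this sample.
table = ["CM2_9\t0\t0\t0\n", "AF141_14\t22\t25\t23\n", "LIPB162_1\t0\t0\t0\n"]
ids = ["LIPB162_1\n", "\n", "CM2_9\n", "WSF3_2\n"]
rows, missing = select_rows(table, ids)
print("\n".join(rows))
print("not found:", ", ".join(missing))
```

Open file objects can be passed directly in place of the lists, for example select_rows(open('otus.txt'), open('ids.txt')).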


Magical! Thanks so much!!!

written 15 months ago by mollysil