Question: Removing unassigned OTUs
7 months ago by
rahulgenexpress10 wrote:

I have a bunch of sequences which failed to get a taxonomic assignment up to genus level. For downstream analysis I want to remove those OTUs/sequences/ASVs from the phyloseq object (OTU table). Could somebody suggest a script to remove those OTUs? Thanks in advance.

EDIT:

The input and output are provided as images as well as text. You can see that the ASVs which are not assigned up to genus level (Genus is NA), such as ASV6, ASV7, ASV8 etc., have been removed in the output. I want the same for the whole table. For certain reasons I don't want to use Excel.

Input

https://ibb.co/dLxOz8

#otuid  sample1 sample2 Phylum  Class   Order   Family  Genus   species 
ASV6    0   0   Cyano   [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV7    7549    0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV8    165 0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV11   0   0   TM7 TM7-3   I025    Rs-045  NA  NA  
ASV12   0   0   TM7 TM7-3   I025    Rs-045  NA  [Saccharibacteria]_UB2523   
ASV13   0   0   TM7 TM7-3   I025    Rs-045  [Saccharibacteria]  [Saccharibacteria]_UB2523   
ASV1    14692   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV2    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV3    7347    5107    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV4    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV5    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV9    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]  
ASV10   3685    2773    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]

Output

https://ibb.co/dQRae8

#otuid  sample1 sample2 Phylum  Class   Order   Family  Genus   species 
ASV13   0   0   TM7 TM7-3   I025    Rs-045  [Saccharibacteria]  [Saccharibacteria]_UB2523   
ASV1    14692   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV2    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV3    7347    5107    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV4    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV5    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV9    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]  
ASV10   3685    2773    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]
microbiome otu next-gen R phyloseq • 416 views
7 months ago by
genomax62k
United States
genomax62k wrote:

In that case you don't need to use R. You could use awk to print the header line plus every line whose 8th field (Genus) is not exactly NA. This assumes the file is tab-delimited.

awk -F'\t' 'NR == 1 || $8 != "NA"' input.txt > out.txt
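The awk approach above depends on Genus being the 8th field. As a minimal sketch, assuming a tab-delimited file with a header row, the column can instead be looked up by name, so the filter keeps working if samples are added or columns move. The function name filter_genus is hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash
# Sketch: drop rows whose Genus column is NA, locating the column by its
# header name. filter_genus is a hypothetical helper name, and the input
# is assumed to be tab-delimited with a single header line.
filter_genus() {
    awk -F'\t' -v col="Genus" '
        NR == 1 {
            # Find the index of the column called col in the header row.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column " col " not found" > "/dev/stderr"; exit 1 }
            print   # always keep the header
            next
        }
        # Keep only rows whose Genus field is neither NA nor empty.
        $idx != "NA" && $idx != ""
    ' "$1"
}
```

Usage would be something like filter_genus input.txt > out.txt. Because the comparison is an exact match on one field, rows where only the species is NA are kept.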
7 months ago by
zx87546.5k
London
zx87546.5k wrote:

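Assuming you have read the file into R using something like myData <- read.table(file = "myFile.txt", header = TRUE, sep = "\t", comment.char = "", check.names = FALSE), we can drop the rows where Genus is NA as below. Setting comment.char = "" stops the header line, which starts with #, from being skipped as a comment.

# keep only rows with an assigned genus
myDataClean <- myData[ !is.na(myData$Genus), ]

# then output to a tab-delimited file, without quotes or row names
write.table(myDataClean, "myFileClean.txt", sep = "\t", quote = FALSE, row.names = FALSE)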