Question: Removing unassigned OTUs
13 months ago by
rahulgenexpress10 wrote:

I have a number of sequences that could not be assigned down to genus level. For downstream analysis I want to remove those OTUs/sequences/ASVs from the phyloseq object (OTU table). Could somebody suggest a script to remove those OTUs? Thanks in advance.

EDIT:

The input and output are provided as images as well as text. You can see that the ASVs not assigned down to genus level (Genus is NA), such as ASV6, ASV7, ASV8, etc., have been removed in the output. I want the same for the whole table. For certain reasons I don't want to use Excel.

Input

https://ibb.co/dLxOz8

#otuid  sample1 sample2 Phylum  Class   Order   Family  Genus   species 
ASV6    0   0   Cyano   [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV7    7549    0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV8    165 0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV11   0   0   TM7 TM7-3   I025    Rs-045  NA  NA  
ASV12   0   0   TM7 TM7-3   I025    Rs-045  NA  [Saccharibacteria]_UB2523   
ASV13   0   0   TM7 TM7-3   I025    Rs-045  [Saccharibacteria]  [Saccharibacteria]_UB2523   
ASV1    14692   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV2    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV3    7347    5107    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV4    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV5    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV9    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]  
ASV10   3685    2773    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]

Output

https://ibb.co/dQRae8

#otuid  sample1 sample2 Phylum  Class   Order   Family  Genus   species 
ASV13   0   0   TM7 TM7-3   I025    Rs-045  [Saccharibacteria]  [Saccharibacteria]_UB2523   
ASV1    14692   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV2    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV3    7347    5107    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV4    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV5    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV9    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]  
ASV10   3685    2773    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]
13 months ago by
genomax70k
United States
genomax70k wrote:

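In that case you don't need to use R. You could use awk and skip any line where the 8th field (Genus) is NA, while keeping the header line. An exact comparison is used rather than a regex match, so genus names that merely contain the letters NA are not dropped.

awk -F'\t' 'NR == 1 || $8 != "NA"' input.txt > out.txt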
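
If the column order might differ between tables, a variant that looks up the Genus column by its header name instead of hard-coding field 8 may be safer. The following is only a sketch, assuming a tab-separated file with a header row; the file names demo_input.txt and demo_output.txt and the two sample rows are made up for illustration (modelled on the table in the question).

```shell
# Build a tiny tab-separated demo table (hypothetical rows modelled on the question).
printf '#otuid\tsample1\tsample2\tPhylum\tClass\tOrder\tFamily\tGenus\tspecies\n' > demo_input.txt
printf 'ASV11\t0\t0\tTM7\tTM7-3\tI025\tRs-045\tNA\tNA\n' >> demo_input.txt
printf 'ASV13\t0\t0\tTM7\tTM7-3\tI025\tRs-045\t[Saccharibacteria]\t[Saccharibacteria]_UB2523\n' >> demo_input.txt

# Locate the column named "Genus" in the header, then keep the header and
# every row whose value in that column is not exactly "NA".
awk -F'\t' '
    NR == 1 {
        for (i = 1; i <= NF; i++) if ($i == "Genus") col = i
        if (!col) { print "no Genus column found" > "/dev/stderr"; exit 1 }
        print
        next
    }
    $col != "NA"
' demo_input.txt > demo_output.txt

cat demo_output.txt
```

On the demo table this leaves the header and the ASV13 row. Note that rows such as ASV1, where only the species is NA, would be kept, which matches the desired output shown in the question.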
13 months ago by
zx87547.9k
London
zx87547.9k wrote:

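Assuming the file has been read into R with something like myData <- read.table(file = "myFile.txt", header = TRUE, comment.char = "", check.names = FALSE), rows with a missing Genus can be dropped as below. header = TRUE keeps the column names, comment.char = "" stops read.table from skipping the header line (it starts with #), and check.names = FALSE leaves the #otuid column name unchanged. By default read.table reads the string NA as a missing value, which is why is.na() works here.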

myDataClean <- myData[ !is.na(myData$Genus), ]

# then write the result to a tab-separated file, without row names or quotes
write.table(myDataClean, "myFileClean.txt", sep = "\t", quote = FALSE, row.names = FALSE)