How to filter hmmer output and get needed enzymes
5 months ago
Suzu • 0

This is a part of my file. You can see the output for KI0314_NODE_20043_length_7522_cov_1.691954_4.

Glyco_hydro_43 PF04616.18 KI0314_NODE_20043_length_7522_cov_1.691954_4 - 2.3e-43 148.8 4.0 3.5e-43 148.2 4.0 1.3 1 0 0 1 1 1 1 Glycosyl hydrolases family 43
GH43_C2 PF17851.5 KI0314_NODE_20043_length_7522_cov_1.691954_4 - 2.8e-31 109.0 0.5 4.5e-31 108.3 0.5 1.3 1 0 0 1 1 1 1 Beta xylosidase C-terminal Concanavalin A-like domain
Cellulase PF00150.22 KI0314_NODE_20043_length_7522_cov_1.691954_4 - 5.6e-16 59.0 4.1 1.1e-15 58.1 4.1 1.4 1 0 0 1 1 1 1 Cellulase (glycosyl hydrolase family 5)

In this file I need to remove every hit that is not a cellulase, so here I would need to delete the first two lines (Glyco_hydro_43 and GH43_C2) and keep only the Cellulase line. Do you know of any tools I can use for this instead of writing a long script?

hmmer pfam • 278 views

This sounds a little too niche for a dedicated tool to exist, but it is a pretty simple task in R that wouldn't require a long script. You could use the subset function with grepl, where the regex pattern matches the names of the enzymes you want to remove (or, negated, the ones you want to keep). If the hmmer output file is particularly large, you could also use the data.table package to speed up reading the data in.
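
For illustration, here is the same filtering idea as a minimal sketch in Python rather than R. It assumes the file is the whitespace-delimited per-target table (the sample rows look like `--tblout` output: target name, accession, query name, and so on, with the free-text description last). The function name `keep_hits` and the choice of PF00150 as the only "cellulase" accession are my assumptions, not something from the question, so adjust the set to whatever families you count as cellulases.

```python
# Hypothetical choice: treat only Pfam PF00150 (Cellulase, GH5) as a cellulase.
KEEP_ACCESSIONS = {"PF00150"}


def keep_hits(lines, keep_accessions=KEEP_ACCESSIONS):
    """Return the table rows whose target accession is in keep_accessions.

    Assumes 18 whitespace-separated columns followed by a free-text
    description, as in the sample rows from the question.
    """
    kept = []
    for line in lines:
        # Skip blank lines and '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # Split at most 18 times so the description stays in one piece.
        fields = line.split(None, 18)
        if len(fields) < 18:
            continue  # not a complete table row
        # Column 2 is the accession; drop the version suffix (".22").
        accession = fields[1].split(".")[0]
        if accession in keep_accessions:
            kept.append(line.rstrip("\n"))
    return kept


# The three rows from the question, used here as example input.
example = [
    "# comment lines like this one are skipped",
    "Glyco_hydro_43 PF04616.18 KI0314_NODE_20043_length_7522_cov_1.691954_4 - 2.3e-43 148.8 4.0 3.5e-43 148.2 4.0 1.3 1 0 0 1 1 1 1 Glycosyl hydrolases family 43",
    "GH43_C2 PF17851.5 KI0314_NODE_20043_length_7522_cov_1.691954_4 - 2.8e-31 109.0 0.5 4.5e-31 108.3 0.5 1.3 1 0 0 1 1 1 1 Beta xylosidase C-terminal Concanavalin A-like domain",
    "Cellulase PF00150.22 KI0314_NODE_20043_length_7522_cov_1.691954_4 - 5.6e-16 59.0 4.1 1.1e-15 58.1 4.1 1.4 1 0 0 1 1 1 1 Cellulase (glycosyl hydrolase family 5)",
]

for hit in keep_hits(example):
    print(hit)
```

To run it on a real file, you could pass an open file handle instead of the list, since the function only iterates over lines. Matching on the accession instead of the name avoids accidentally keeping or dropping families whose names merely contain a similar word.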
