Question: Easier way to filter contaminants based on protein similarity?
3.7 years ago by
Biogeek400 wrote:

I'm familiar with the idea of blob plots, but for filtering contaminants out of a plant I've sequenced I have so far used a manual screening protocol: I take the top hit for each query (by similarity/identity, bit score and e-value), then use the UniProt and TrEMBL species identifiers to filter out plausible contaminants such as bacteria.

Doing this manually in Excel takes a hellishly long time. Is there any way I can filter out all hits that are not in green plants? I am searching against the entire UniProt, including TrEMBL, because the plant I'm working on commonly has lots of bacteria and fungi on it. I have the UniProt/TrEMBL identifier available for each hit.
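For illustration, the manual protocol described above (best hit per query, then a taxonomy check) could be scripted roughly as follows. This is only a sketch under several assumptions: the BLAST results are in the default 12-column tabular layout (`-outfmt 6`), subject IDs look like `sp|ACCESSION|NAME_SPECIES`, and a lookup table from accession to taxonomic lineage has already been obtained separately. The function names, the toy accessions and the lineage strings are all made up for the example; "Viridiplantae" is the taxonomy name for green plants.

```python
def accession(subject_id):
    """Return the accession from an ID such as 'sp|ACC001|PROTA_ARATH'."""
    parts = subject_id.split("|")
    return parts[1] if len(parts) >= 3 else subject_id


def top_hits(blast_rows):
    """Keep the highest-bit-score hit per query.

    Each row is a list of the 12 default tabular columns:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in blast_rows:
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def split_by_clade(best, lineage_by_accession, clade="Viridiplantae"):
    """Split queries into those whose top hit is in `clade` and the rest.

    Accessions missing from the lookup table are flagged rather than kept.
    """
    keep, flagged = [], []
    for query, (subject, _score) in best.items():
        lineage = lineage_by_accession.get(accession(subject), "")
        (keep if clade in lineage else flagged).append(query)
    return keep, flagged


# Toy data (hypothetical accessions and lineages, for illustration only).
demo_rows = [
    ["contig1", "sp|ACC001|PROTA_ARATH", "92.0", "200", "16", "0",
     "1", "200", "1", "200", "1e-100", "350"],
    ["contig1", "tr|ACC002|ACC002_9BACT", "60.0", "180", "72", "0",
     "1", "180", "1", "180", "1e-30", "120"],
    ["contig2", "tr|ACC002|ACC002_9BACT", "95.0", "220", "11", "0",
     "1", "220", "1", "220", "1e-120", "410"],
    ["contig3", "tr|ACC999|ACC999_9FUNG", "88.0", "150", "18", "0",
     "1", "150", "1", "150", "1e-70", "260"],
]
demo_lineages = {
    "ACC001": "Eukaryota; Viridiplantae; Streptophyta",
    "ACC002": "Bacteria; Pseudomonadota",
}

best = top_hits(demo_rows)
keep, flagged = split_by_clade(best, demo_lineages)
print("keep:", keep)
print("flagged:", flagged)
```

Queries whose top hit cannot be found in the lineage table end up in the flagged list, so they can still be checked by hand instead of being silently kept or dropped.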

3.7 years ago by
Elisabeth Gasteiger1.7k wrote:

Have you tried to perform your BLAST searches against the Plants subsection of UniProtKB in the first place? e.g. on the UniProt website, or programmatically using one of

