I am sure I am not the first person with this problem and I do not want to reinvent the wheel, so I would like to know if someone already has a solution for the following:
After I tried to upload my transcriptome assembly, the NCBI contamination screen returned a report listing some sequences that need to be trimmed. One example from the report looks like this:
"sequence name" 350 1..109 Pseudomonas putida W619
This means that bp 1-109 are of suspicious origin and need to be trimmed from this sequence, which is 350 bp long in total. Do you know of a program or script that reads in both the contamination report (which can be edited so that it only contains lines like the example) and the FASTA file, and writes out a FASTA file with the flagged sequences trimmed?
Or do I need to write one myself? (I am not very experienced.)
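In case it helps to show what I have in mind, here is a rough Python sketch of the logic, not a tested tool. It rests on several assumptions of mine: the edited report is tab-separated with the columns name, length, span(s), source; several spans for one sequence are written like `1..109,300..350`; the report name matches the first word of the FASTA header; and for a span in the middle of a sequence, only the longest clean piece is kept. All function names are made up for illustration.

```python
import re


def parse_report(lines):
    """Map sequence name -> list of (start, end) spans, 1-based inclusive.

    Assumes tab-separated columns: name, length, span(s), source.
    """
    spans = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, span_field = fields[0], fields[2]
        for start, end in re.findall(r"(\d+)\.\.(\d+)", span_field):
            spans.setdefault(name, []).append((int(start), int(end)))
    return spans


def longest_clean_segment(seq, bad_spans):
    """Return the longest stretch of seq not covered by any flagged span."""
    best = ""
    pos = 0  # 0-based start of the current clean stretch
    for start, end in sorted(bad_spans):
        clean = seq[pos:start - 1]
        if len(clean) > len(best):
            best = clean
        pos = max(pos, end)
    tail = seq[pos:]
    if len(tail) > len(best):
        best = tail
    return best


def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_fasta(fasta_lines, report_lines):
    """Return trimmed FASTA text; sequences trimmed to nothing are dropped."""
    spans = parse_report(report_lines)
    records = []
    for header, seq in read_fasta(fasta_lines):
        words = header.split()
        name = words[0] if words else ""
        if name in spans:
            seq = longest_clean_segment(seq, spans[name])
        if seq:
            records.append(f">{header}\n{seq}")
    return "\n".join(records)
```

With real files this would be called roughly as `trim_fasta(open("assembly.fasta"), open("report.txt"))`, where both file names are placeholders. Each trimmed sequence is written on a single line, without wrapping.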
Thanks in advance!