Question: NCBI Contamination Screen
6 weeks ago, Seraph wrote:

Hi all,

I have Medusa scaffold data containing 160 scaffolds and some NNN bridges. The file is in FASTA format. I need to remove all scaffolds shorter than 200 bp and any possible vector contamination, which is why I am trying to use NCBI VecScreen. However, as a beginner, I still do not know where to find the option for removing sequences <200 bp, or how to upload my FASTA data, since I have no accession number yet.

Please let me know if you are familiar with such a tool. Many thanks!

next-gen genome • 157 views
modified 5 weeks ago • written 6 weeks ago by Seraph

If you simply want to remove sequences shorter than 200 bp (no contamination screening per se), you can use BBTools.

reformat.sh in=your.fa out=filtered.fa minlength=200
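
For anyone curious what a minimum-length filter does to a FASTA file, here is a minimal Python sketch of the same idea. It is not BBTools code and not a replacement for it. The function names (read_fasta, filter_by_length, format_fasta, filter_file) are made up for illustration, and the sketch assumes that a scaffold of exactly 200 bp is kept and that N bases in the bridges count toward the length.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_length=200):
    """Keep records with at least min_length bases (assumption: N bases count)."""
    return [(h, s) for h, s in records if len(s) >= min_length]


def format_fasta(records, width=70):
    """Return FASTA text with sequences wrapped at `width` characters."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n" if lines else ""


def filter_file(in_path, out_path, min_length=200):
    """Read in_path, write the kept records to out_path, return how many were kept."""
    with open(in_path) as fh:
        kept = filter_by_length(read_fasta(fh), min_length)
    with open(out_path, "w") as out:
        out.write(format_fasta(kept))
    return len(kept)
```

With the file names from the command above, filter_file("your.fa", "filtered.fa") would write the scaffolds of 200 bp or longer to filtered.fa. This only handles length filtering; it does no vector or contamination screening.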
written 6 weeks ago by genomax

Thanks, genomax. Can this command be used in the Mac terminal?

written 6 weeks ago by Seraph

Yes. BBTools is written in Java, so you just need a Java runtime.
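
If you are not sure whether a Java runtime is already installed, a quick check along these lines may help. This is only a sketch: the helper name java_version_text is made up for illustration, and it assumes that a non-zero exit status from `java -version` means no usable runtime (on macOS a placeholder `java` command can exist even when no runtime is installed).

```python
import shutil
import subprocess


def java_version_text():
    """Return the text printed by `java -version`, or None if no usable Java is found."""
    java = shutil.which("java")  # look for a `java` executable on PATH
    if java is None:
        return None
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a failing `java -version` means there is no working runtime.
        return None
    # `java -version` conventionally prints to stderr, so check both streams.
    return (result.stderr or result.stdout).strip()


version = java_version_text()
if version is None:
    print("No working Java runtime found on PATH")
else:
    print(version)
```

Running `java -version` directly in the terminal gives the same information.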

modified 6 weeks ago • written 6 weeks ago by genomax

Hi genomax, please check the edited question. Thanks in advance for your valuable answers.

modified 5 weeks ago • written 5 weeks ago by Seraph

Thanks, it works well.

modified 4 weeks ago • written 5 weeks ago by Seraph