16 months ago
A_heath ▴ 120
I have a custom database and I used it to run BLASTn against a bacterial genome. I would like to extract only the unmatched regions.
Is there a command line or another way to do it?
Thanks very much for your precious help!
Thank you very much shenwei356 for your reply.
So if I understood correctly, bedtools complement will output the coordinates of the regions that are not covered by a hit, and bedtools getfasta will extract the FASTA sequences?
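As a rough illustration of that two-step idea, here is a minimal sketch on toy data. The file names (genome.fa, genome.txt, hits.bed, unmatched.bed, unmatched.fa) are made up for the example, and it assumes bedtools is installed and the BED file is already sorted.

```shell
# Toy inputs: one 100 bp contig with two hits (all names are hypothetical).
printf '>contig1\n%s\n' "$(printf 'ACGT%.0s' $(seq 1 25))" > genome.fa
printf 'contig1\t100\n' > genome.txt
printf 'contig1\t10\t30\ncontig1\t50\t80\n' > hits.bed

if command -v bedtools >/dev/null 2>&1; then
    # Step 1: intervals of each contig that are NOT covered by any hit.
    bedtools complement -i hits.bed -g genome.txt > unmatched.bed
    # Step 2: pull the sequences of those intervals out of the FASTA.
    bedtools getfasta -fi genome.fa -bed unmatched.bed -fo unmatched.fa
else
    echo "bedtools not found; skipping" >&2
fi
```

With these toy hits, the uncovered intervals should be 0-10, 30-50 and 80-100 on contig1.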
edit: I managed to convert my BLASTn result file into a .bed file. Now I'm stuck with bedtools complement, as I need to input a genome file (-g). My custom database contains many contigs that I downloaded (117,136 contigs to be exact). The genome file is required to be a 2-column file with the name of each contig alongside its size in bp. Is there a way to build a genome file for bedtools with this many contigs?
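For reference, the BLAST-to-BED conversion mentioned above could look something like the sketch below. It assumes the BLASTn output is in tabular format (-outfmt 6) and that the contigs of interest are the subject sequences (columns 2, 9 and 10); if they are the query instead, columns 1, 7 and 8 would be the ones to use. The file names blast.tsv and hits.bed are placeholders.

```shell
# Toy BLAST outfmt 6 table (hypothetical); the second hit is on the minus strand.
printf 'q1\tcontig1\t99.0\t20\t0\t0\t1\t20\t11\t30\t1e-5\t40\n' > blast.tsv
printf 'q2\tcontig1\t98.0\t30\t0\t0\t1\t30\t80\t51\t1e-8\t55\n' >> blast.tsv

# Columns 9 and 10 are sstart/send (1-based, inclusive). BED expects a
# 0-based start and an exclusive end, with start < end.
awk 'BEGIN { FS = OFS = "\t" }
     { s = $9; e = $10
       if (s > e) { t = s; s = e; e = t }   # minus-strand hit: swap ends
       print $2, s - 1, e }' blast.tsv > hits.bed
cat hits.bed
```

The swap matters because BLAST reports minus-strand hits with sstart greater than send, which bedtools would reject as an invalid interval.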
In one file or many files?
For one or a few contig files:
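One possible way, sketched with plain awk (not necessarily the command originally posted here). The file names contigs.fa and genome.txt are placeholders, and the contig name is taken as the header text up to the first whitespace.

```shell
# Toy multi-FASTA with wrapped sequence lines (hypothetical file name).
printf '>contig1 some description\nACGTACGT\nACGT\n>contig2\nACG\n' > contigs.fa

# Sum the sequence-line lengths of each record and print "name<TAB>length".
awk '/^>/ { if (name != "") print name "\t" len
            name = substr($1, 2); len = 0; next }
     { len += length($0) }
     END  { if (name != "") print name "\t" len }' contigs.fa > genome.txt
cat genome.txt
```

If samtools is available, running samtools faidx contigs.fa and then cut -f1,2 contigs.fa.fai gives the same two columns.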
For many contig files:
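A hedged sketch for the many-files case, again with standard tools only (not necessarily the command originally posted here). The directory name contigs_dir and the .fasta extension are assumptions, and each file is assumed to end with a newline so that concatenation does not glue a header onto the previous sequence.

```shell
# Toy layout: one contig per file in a directory (hypothetical names).
mkdir -p contigs_dir
printf '>contigA\nACGTAC\n' > contigs_dir/a.fasta
printf '>contigB\nACGT\nAC\nG\n' > contigs_dir/b.fasta

# find ... -exec cat {} + streams every file without hitting the shell's
# "argument list too long" limit that a plain *.fasta glob can trigger.
find contigs_dir -name '*.fasta' -exec cat {} + |
  awk '/^>/ { if (name != "") print name "\t" len
              name = substr($1, 2); len = 0; next }
       { len += length($0) }
       END  { if (name != "") print name "\t" len }' |
  LC_ALL=C sort -k1,1 > genome.txt
cat genome.txt
```

The final sort is only there to give the genome file a predictable contig order, since find does not return files in any guaranteed order.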
Thank you very much, it worked great to obtain a proper genome file with 2 columns!
However, now I have another issue with bedtools complement, as it returns an error saying that the .bed file contains contigs out of order. I thought about using bedtools sort, but is this appropriate for this type of file?
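bedtools sort is meant for BED files, so it should be fine here. The usual catch is that bedtools complement expects the contigs in the BED file to appear in the same order as in the genome file. A minimal sketch of one way to make the two agree, using only the standard sort command (file names are placeholders):

```shell
# Toy inputs in deliberately scrambled order (hypothetical names).
printf 'contig2\t5\t9\ncontig1\t50\t80\ncontig1\t10\t30\n' > hits.bed
printf 'contig2\t40\ncontig1\t100\n' > genome.txt

# Sort both files by contig name under the same locale so their contig
# order matches; within a contig, sort the BED numerically by start.
LC_ALL=C sort -k1,1 -k2,2n hits.bed > hits.sorted.bed
LC_ALL=C sort -k1,1 genome.txt > genome.sorted.txt
cat hits.sorted.bed
```

The sorted pair can then be passed to bedtools complement with -i hits.sorted.bed -g genome.sorted.txt. As far as I know, recent bedtools versions also accept a -g option in bedtools sort to order the BED file by the genome file directly, but that is worth checking against the help text of the installed version.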