First of all, the goal of the analysis is to identify nucleotide sequences exclusive to Pseudomonas. I'm aware that this genus covers a wide range of organisms. Unfortunately, the person who asked me to do this did not give me proper instructions or specify a particular organism.
Since exclusive nt sequences could be found anywhere in the genome and not only in CDSs, I'm trying to figure out how to analyze the whole genome sequence for this purpose. I've done this before, but only with CDSs and proteins. My idea is to use local blastn to search the whole genome against the Pseudomonas nt database. First, the blastn search would find the hits, and I would filter out the non-hits. Second, a file containing those hits would be searched against the bacteria nr/nt database, and this time I would filter out the hits (since they imply sequence similarity with other organisms) and keep the non-hits. Correct me if this method is wrong.
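The two-step filter described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested pipeline: it assumes the genome has been cut into named fragments (the IDs `frag_1`, `frag_2`, ... are hypothetical), that both searches were saved in BLAST tabular format (`-outfmt 6`, where the first column is the query ID and the second the subject ID), and that the second database contains no Pseudomonas entries (otherwise those hits would have to be excluded first). The function names are made up for the example.

```python
def parse_tabular(text):
    """Parse BLAST tabular output (-outfmt 6) into (query_id, subject_id) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines (as written by -outfmt 7)
        fields = line.split("\t")
        pairs.append((fields[0], fields[1]))
    return pairs


def split_by_hits(query_ids, tabular_text):
    """Split query IDs into those with at least one hit and those with none."""
    hit_ids = {query_id for query_id, _ in parse_tabular(tabular_text)}
    with_hits = [q for q in query_ids if q in hit_ids]
    without_hits = [q for q in query_ids if q not in hit_ids]
    return with_hits, without_hits


def candidate_exclusive(query_ids, pseudomonas_tab, other_bacteria_tab):
    """Step 1: keep fragments that hit the Pseudomonas database.
    Step 2: of those, keep fragments with no hit in the other-bacteria database."""
    kept, _ = split_by_hits(query_ids, pseudomonas_tab)
    _, exclusive = split_by_hits(kept, other_bacteria_tab)
    return exclusive


if __name__ == "__main__":
    # Toy data standing in for real BLAST output (hypothetical IDs and values).
    step1 = "frag_1\tPs_a\t99.0\nfrag_2\tPs_b\t98.0\n"
    step2 = "frag_2\tEcoli_x\t95.0\n"
    print(candidate_exclusive(["frag_1", "frag_2", "frag_3"], step1, step2))
```

Working on fragment IDs rather than on one whole-genome query is a design choice made for the sketch; with a single unfragmented query, the same idea would have to be applied to the hit coordinates instead of to IDs.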
If the above method is okay, then the problem is that I can't figure out how to retrieve the actual nucleotide sequences that aligned with sequences in the BLAST database. Remember, this is a complete record with no annotation, since I want to analyze the whole genome and not only the coding regions. I'm currently experimenting with the -outfmt parameter, but this will take a while since each run takes quite a long time. Web BLAST is out of the question since the analysis is too CPU intensive (it hits the CPU usage limit on web BLAST).
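For reference, one way the aligned sequences might be obtained is through the `qseq` and `sseq` specifiers of the tabular format, which report the aligned part of the query and of the subject. The sketch below only builds the command line and parses the resulting table; the file names, database name, and thread count are placeholders, and the helper functions are hypothetical, not part of BLAST+.

```python
import shlex

# Column specifiers requested from blastn; qseq/sseq carry the aligned sequences.
FIELDS = [
    "qseqid", "sseqid", "pident", "length",
    "qstart", "qend", "sstart", "send",
    "evalue", "bitscore", "qseq", "sseq",
]


def build_blastn_command(query_fasta, db_name, out_path, threads=4):
    """Return the blastn argument list for a tabular search with aligned sequences."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", db_name,
        "-outfmt", "6 " + " ".join(FIELDS),
        "-out", out_path,
        "-num_threads", str(threads),
    ]


def parse_hits(text):
    """Turn tab-separated blastn output into a list of dicts keyed by FIELDS."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        hits.append(dict(zip(FIELDS, line.split("\t"))))
    return hits


if __name__ == "__main__":
    # Placeholder names; replace with the real genome file and local database.
    cmd = build_blastn_command("genome.fasta", "pseudomonas_nt", "hits.tsv")
    print(shlex.join(cmd))
    # To actually run the search (requires BLAST+ installed locally):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Each parsed row then gives the query coordinates (`qstart`, `qend`) together with the aligned strings, which can contain `-` characters where the alignment has gaps. All values are left as strings here; converting coordinates to integers is left to the caller.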
If anyone knows how to solve this, I would be grateful if you could point me in the right direction. Thanks.