Hi Everyone,
Sorry if this is a duplicate, but I can't find a satisfactory answer on this site or elsewhere.
I'm wondering what pre-processing, if any, I should perform on a reference genome fasta/gff file before mapping with BWA or Bowtie. For example, if I wanted to map reads to the orangutan genome, should I remove the entries labeled "unplaced/unlocalized genomic scaffold" from the gff and fasta files (i.e. map only to the canonical chromosomes)?
I notice that even these scaffolds have genes labeled "BestRefSeq" in the gff file, which suggests they still carry useful information.
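To make the question concrete, the filtering I have in mind would look roughly like the sketch below. It is only an illustration, not a recommendation: it assumes the fasta headers contain the words "unplaced" or "unlocalized", and that the first word of each header matches the first column of the gff. The function names and the example records are made up.

```python
# Hypothetical keywords; the actual header wording depends on the assembly release.
SKIP_KEYWORDS = ("unplaced", "unlocalized")


def filter_fasta(lines, skip_keywords=SKIP_KEYWORDS):
    """Drop fasta records whose header contains any skip keyword.

    Returns the kept lines and the set of kept sequence IDs
    (the first whitespace-delimited word of each kept header).
    """
    kept_lines, kept_ids = [], set()
    keep = False
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            keep = not any(k in header.lower() for k in skip_keywords)
            if keep:
                kept_ids.add(header.split()[0])
        if keep:
            kept_lines.append(line)
    return kept_lines, kept_ids


def filter_gff(lines, kept_ids):
    """Keep comment/directive lines and features whose seqid (column 1) was kept.

    Note: directives such as ##sequence-region for dropped scaffolds are left in.
    """
    return [
        line
        for line in lines
        if line.startswith("#") or line.split("\t", 1)[0] in kept_ids
    ]


if __name__ == "__main__":
    # Made-up records, just to show the intended behavior.
    fasta = [
        ">chrA Pongo abelii chromosome 1\n",
        "ACGT\n",
        ">scafB Pongo abelii unplaced genomic scaffold\n",
        "GGCC\n",
    ]
    gff = [
        "##gff-version 3\n",
        "chrA\tBestRefSeq\tgene\t1\t4\t.\t+\t.\tID=g1\n",
        "scafB\tBestRefSeq\tgene\t1\t4\t.\t+\t.\tID=g2\n",
    ]
    kept_fasta, ids = filter_fasta(fasta)
    print("".join(kept_fasta), end="")
    print("".join(filter_gff(gff, ids)), end="")
```

Filtering both files from the same set of kept IDs keeps the gff consistent with the fasta, so no annotation refers to a sequence that is no longer in the reference.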
I ask because someone who no doubt knows much more about this than I do told me that I SHOULD remove these scaffolds. I'm wondering, however, whether this person is wrong.
Thanks!