According to previously answered questions here, when human or mouse RNA-seq samples have a mapping rate below 70%, we should BLAST some of the unmapped reads to find the source of contamination.
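To make the step above concrete, here is a minimal sketch of pulling a small random subset of unmapped reads into FASTA so they can be submitted to BLAST. It assumes the unmapped reads are already in a plain four-line-per-record FASTQ file; the function name `sample_fastq_to_fasta` and its parameters are hypothetical, not part of any existing tool.

```python
import random


def sample_fastq_to_fasta(fastq_lines, n, seed=0):
    """Reservoir-sample up to n reads from FASTQ lines and return FASTA text.

    Assumes each record is exactly four lines: @name, sequence, '+', qualities.
    """
    rng = random.Random(seed)
    reservoir = []
    seen = 0
    it = iter(fastq_lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line, not needed for FASTA
        next(it)  # quality line, not needed for FASTA
        seen += 1
        record = (header[1:].split()[0], seq)  # drop '@' and any description
        if len(reservoir) < n:
            reservoir.append(record)
        else:
            # Replace an existing entry with probability n / seen
            j = rng.randrange(seen)
            if j < n:
                reservoir[j] = record
    return "".join(f">{name}\n{seq}\n" for name, seq in reservoir)
```

With a real file, this could be called as `sample_fastq_to_fasta(open("unmapped.fastq"), 100)` (the file name is only an example), and the resulting FASTA text pasted into a BLAST search.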
1- I want to know the reason for that. Is it useful only when we have produced the data ourselves, so that we can redo the experiment and prevent that type of contamination? Or is knowing the source of contamination essential even when we analyze others' data from GEO, for the sake of eliminating genes related to the contamination?
2- If I have a dataset with 30 samples and 15 of them have a mapping rate below 70%, is it essential to eliminate the gene counts related to the contamination, or should I simply remove those 15 samples because there is nothing I can do to rescue them? What if fewer samples (e.g. 3 samples) have a mapping rate below 70%?
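For reference, this is roughly how I would flag the affected samples. It is only a sketch: it assumes the mapped and total read counts per sample have already been collected from the aligner's summary into a dictionary, and the function name `flag_low_mapping` and the sample names are hypothetical.

```python
def flag_low_mapping(stats, threshold=70.0):
    """Return {sample: mapping %} for samples below the threshold.

    stats maps a sample name to a (mapped_reads, total_reads) tuple.
    """
    flagged = {}
    for sample, (mapped, total) in stats.items():
        # Treat a sample with no reads as 0% mapped rather than dividing by zero
        pct = 100.0 * mapped / total if total else 0.0
        if pct < threshold:
            flagged[sample] = round(pct, 1)
    return flagged


if __name__ == "__main__":
    example = {"sample_01": (900, 1000), "sample_02": (650, 1000)}
    print(flag_low_mapping(example))
```

The 70% cutoff is the one mentioned in the question, exposed as a parameter so it can be changed.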