I have a list of sequences, the majority of which map to the human genome. I want to map them onto the genome and obtain the gene symbol whenever a sequence falls within a gene. I already know of an R solution, which runs very slowly, and a solution based on BLAT, which I honestly don't know and don't have the time to learn before parsing its results (I am not very familiar with Linux either). A third solution that comes to mind is to feed my FASTA files to RNA-seq analysis tools and look at the result.

My first question is whether this approach will work (my sequences are all 60 nt). My second question: the results from my R code (which I verified using the web version of BLAT) contain symbols that are not returned by HISAT2 | StringTie, HISAT2 | htseq-count, or Salmon (I run them on Galaxy). Is that because my dataset does not meet some requirement of HISAT2/Salmon, or because I don't know the tools well enough, even though running them on Galaxy is a piece of cake? I can give examples of a couple of sequences that are not mapped by the tools on Galaxy but are mapped by R.

Command-line BLAST or BLAT is your best option here. RNA-seq tools might work, but they are built for fast quantification of reads; since you have actual sequences rather than reads, you need something more robust.
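If parsing the BLAT output is the main worry, it is less work than it sounds. Below is a minimal Python sketch, assuming you already have BLAT results in the standard tab-separated PSL format and a gene table with 0-based, half-open coordinates (BED style). The function names, the toy hits, and the gene symbols `GENE_A` and `GENE_B` are all made up for illustration; real coordinates would come from your own annotation file.

```python
from collections import defaultdict


def parse_psl(lines):
    """Yield (query, chrom, start, end, matches) from BLAT PSL lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Data rows have 21 columns and start with an integer match count;
        # anything else (e.g. the optional PSL header) is skipped.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        # PSL columns: 0=matches, 9=qName, 13=tName, 15=tStart, 16=tEnd
        yield fields[9], fields[13], int(fields[15]), int(fields[16]), int(fields[0])


def best_hits(hits):
    """Keep, for each query, the hit with the most matching bases."""
    best = {}
    for query, chrom, start, end, matches in hits:
        if query not in best or matches > best[query][3]:
            best[query] = (chrom, start, end, matches)
    return best


def annotate(best, genes):
    """Map each query to the sorted gene symbols its best hit overlaps."""
    by_chrom = defaultdict(list)
    for chrom, gene_start, gene_end, symbol in genes:
        by_chrom[chrom].append((gene_start, gene_end, symbol))
    result = {}
    for query, (chrom, start, end, _) in best.items():
        # Half-open intervals overlap when each starts before the other ends.
        result[query] = sorted(
            {sym for gs, ge, sym in by_chrom[chrom] if start < ge and end > gs}
        )
    return result


def toy_psl(matches, qname, chrom, tstart, tend):
    """Build a fake 21-column PSL row (illustration only, not real data)."""
    row = ["0"] * 21
    row[0], row[8], row[9], row[10] = str(matches), "+", qname, "60"
    row[13], row[15], row[16] = chrom, str(tstart), str(tend)
    return "\t".join(row)


# Hypothetical input: two hits for seq1 (the chr1 hit is better), one for seq2.
psl = [
    "psLayout version 3",
    "",
    toy_psl(60, "seq1", "chr1", 1000, 1060),
    toy_psl(45, "seq1", "chr2", 500, 545),
    toy_psl(58, "seq2", "chr3", 10, 70),
]
# Hypothetical gene table: (chrom, start, end, symbol).
genes = [("chr1", 900, 2000, "GENE_A"), ("chr2", 400, 600, "GENE_B")]

annotations = annotate(best_hits(parse_psl(psl)), genes)
print(annotations)
```

With real data you would replace `psl` with the lines of your PSL file and build `genes` from whatever annotation you trust. Picking the best hit by match count alone is a simplification; for 60 nt sequences you may also want to require a minimum number of matches before accepting a hit.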


