Mapping a list of sequences to gene symbols
4.8 years ago

Hi!

I have a list of sequences, the majority of which map to the human genome. I want to map them onto the genome and obtain the gene symbol whenever a sequence falls within a gene. I already know of an R solution, which is very slow to run, and a solution based on BLAT, which I honestly don't know and don't have the time to learn and then parse the results (I am not even very familiar with Linux). A third solution that comes to mind is to feed my FASTA files to RNA-seq analysis tools and see the result. First question: will this approach work (my sequences are all 60 nt)? Second question: the results from my R code (verified using the web version of BLAT) contain symbols that are not returned by HISAT2 | StringTie, HISAT2 | htseq-count, or Salmon (I run them on Galaxy). Is that because my dataset does not meet some requirement of HISAT2/Salmon, or because I don't know the tools well enough, even though running them on Galaxy is a piece of cake? I can give examples of a couple of sequences that are not mapped by the tools on Galaxy but are mapped by R.

Hisat2 salmon HTSeqCount StringTie

Command-line BLAST/BLAT is your best option here. RNA-seq tools might work, but you need something more robust than fast quantification, since you have actual sequences and not reads.
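The parsing step after BLAT need not be much work. Below is a minimal Python sketch, not a definitive pipeline: it assumes BLAT was run on the command line to produce a tab-separated PSL file, and that you have gene intervals as (chrom, start, end, symbol) tuples taken from some annotation. The function names `best_hits` and `genes_for_hits` are made up for illustration, and keeping only the hit with the most matching bases is a simplifying assumption; you may want to keep ties or filter by identity instead.

```python
# PSL column indices (0-based), as defined by the PSL format that BLAT writes.
MATCHES, Q_NAME, T_NAME, T_START, T_END = 0, 9, 13, 15, 16


def best_hits(psl_lines):
    """Return {query: (chrom, start, end)} for the hit with the most matching bases.

    Assumption: one best hit per query is enough; ties keep the first hit seen.
    """
    best = {}
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Data rows have 21 columns and start with a number; this skips
        # the optional PSL header and blank lines.
        if len(fields) < 21 or not fields[MATCHES].isdigit():
            continue
        matches = int(fields[MATCHES])
        query = fields[Q_NAME]
        if query not in best or matches > best[query][0]:
            best[query] = (
                matches,
                fields[T_NAME],
                int(fields[T_START]),
                int(fields[T_END]),
            )
    return {query: hit[1:] for query, hit in best.items()}


def genes_for_hits(hits, genes):
    """Map each query to the sorted symbols of genes overlapping its best hit.

    `genes` is a list of (chrom, start, end, symbol) tuples using 0-based,
    half-open coordinates, the same convention PSL uses for tStart/tEnd.
    """
    by_chrom = {}
    for chrom, start, end, symbol in genes:
        by_chrom.setdefault(chrom, []).append((start, end, symbol))

    result = {}
    for query, (chrom, start, end) in hits.items():
        # Linear scan per hit: fine for a sketch, slow for a full annotation.
        overlapping = {
            symbol
            for g_start, g_end, symbol in by_chrom.get(chrom, [])
            if start < g_end and g_start < end
        }
        result[query] = sorted(overlapping)
    return result
```

You would call it roughly as `genes_for_hits(best_hits(open("out.psl")), genes)`, where `out.psl` and `genes` are placeholders for your own files. For a genome-wide annotation the linear scan will be slow, so an interval tree or a tool such as `bedtools intersect` on the hit coordinates is probably a better fit.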
