best alignment method
7.7 years ago

Hi,

I used bowtie to map RNA-seq reads to the genome. First, I used the "--end-to-end" option and got around 77% of reads mapped to the genome. Then I used the "--local" option and got 92% mapped. The two methods give different downstream results, i.e. expression values. Could anybody suggest which procedure is best for the mapping?

Thank you, Debashis

rna-seq • 2.0k views

Note that bowtie2 can't handle spliced reads. Use one of the programs mentioned by WouterDeCoster (or Salmon/Sailfish/RapMap, which I would add to his list).


BBMap can hang in there as well as any of the others mentioned below. It is dead simple to use (and splice-aware) and, being written in Java, can run on PC/Mac/Unix.

  1. Based on this paper I would suggest STAR (HISAT is slightly faster than STAR and uses less memory).
  2. If you want to read more about the algorithms and how they compare, you can read this paper: Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM).
7.7 years ago

The most commonly used aligners are, in my opinion:

  • TopHat2 (but declared obsolete by its authors)
  • HISAT2 (the replacement for TopHat2, very fast)
  • STAR (fast but memory hungry, possibly the most used)
  • kallisto (uses pseudoalignment, extremely fast and as accurate as real alignment)

For all of those, convenient manuals and online help are available.


Thank you for your answer. I have also used TopHat2, but it gives only a 55% alignment rate. Can I proceed with this mapping percentage? During DEG testing, cuffdiff 1.3.0 gives 400 DEGs and cuffdiff 2.2.1 gives 0 DEGs with this 55% mapping result.


You probably need to trim the reads a bit. Having said that, STAR is faster and tends to give better results (it does local alignment by default, for whatever that's worth).
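If you want to see how much of a higher mapping rate comes from local alignment clipping the read ends, one option is to tally the SAM output yourself. Below is a minimal Python sketch, not part of any aligner: `summarize_sam` is a hypothetical helper name, and it assumes plain-text SAM lines where soft clipping shows up as `S` and a splice junction as `N` in the CIGAR string. Secondary and supplementary records are skipped so each primary record is counted once.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def summarize_sam(lines):
    """Tally primary SAM records: total, mapped, soft-clipped, spliced."""
    total = mapped = clipped = spliced = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):  # skip blank and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, cigar = int(fields[1]), fields[5]
        if flag & 0x100 or flag & 0x800:  # skip secondary / supplementary records
            continue
        total += 1
        if flag & 0x4:  # read is unmapped
            continue
        mapped += 1
        ops = {op for _, op in CIGAR_OP.findall(cigar)}
        clipped += "S" in ops  # ends were soft-clipped (typical of local mode)
        spliced += "N" in ops  # alignment skips a reference region (splice junction)
    return {"total": total, "mapped": mapped,
            "soft_clipped": clipped, "spliced": spliced}
```

A large soft-clipped fraction in the local run would suggest that the extra mapped reads carry adapter or low-quality ends, which fits the trimming suggestion above.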


The raw data were generated on a NextSeq 500 (2*150). I tried different trimming options: trimming reads from the 3' end, trimming reads from both the 5' and 3' ends, shortening the reads to 2*100, shortening the reads to 2*75, q value > 20, and q > 25. I didn't see any specific change in the mapping, and there is no change in the downstream result except the expression values.


Ah, NextSeq. In that case it's worth checking whether you have polyG tails caused by the two-color chemistry. Essentially, you want to trim long stretches of (even high quality) G nucleotides from the end of your reads.

The reason: NextSeq uses only two colors for labeling nucleotides. C is red, T is green, A is both red and green, and G is nothing at all. So if your fragment is shorter than your read length, the remainder might be called as (high quality) G nucleotides.
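To make the idea concrete, here is a minimal Python sketch of polyG tail trimming. The function name `trim_polyg_tail` and the `min_run=10` threshold are my own assumptions for illustration, not values from any particular trimmer. Note that it deliberately ignores the quality string when deciding what to cut, because these G calls can have high quality scores.

```python
def trim_polyg_tail(seq, qual, min_run=10):
    """Remove a trailing run of G from a read if it is at least min_run long.

    seq and qual are the sequence and quality strings of one FASTQ record.
    Shorter trailing G runs are left alone, since they may be real sequence.
    """
    stripped = seq.rstrip("G")
    run_length = len(seq) - len(stripped)
    if run_length < min_run:
        return seq, qual
    # Cut the quality string to the same length as the trimmed sequence.
    return stripped, qual[:len(stripped)]
```

For example, `trim_polyg_tail("ACGT" + "G" * 12, "I" * 16)` drops the twelve trailing Gs, while a read ending in just `GG` comes back unchanged.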

You could also have a look at FastQC quality metrics to get an idea about how much trimming you should perform.

And I'd also recommend STAR or kallisto.


The 55% alignment rate is a separate issue that you would need to investigate. Have you tried taking some of the reads that do not map and checking whether they are contaminants (BLAST at NCBI is generally the best way to check this)? Do you have replicate samples, or is this a one-to-one DE comparison?
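For the BLAST check, a handful of reads is enough. Here is a minimal Python sketch that picks a few reads at random and converts them to FASTA text you can paste into the NCBI BLAST web form. It assumes you have already written the unmapped reads to a FASTQ file with strict four-line records; `sample_fastq_to_fasta` and its defaults are hypothetical names chosen for this example.

```python
import random

def sample_fastq_to_fasta(fastq_lines, n=20, seed=0):
    """Pick up to n reads at random from FASTQ lines and return FASTA text.

    Assumes four lines per record (header, sequence, '+', qualities).
    """
    lines = [ln.rstrip("\n") for ln in fastq_lines if ln.strip()]
    # Keep (header without '@', sequence) for each complete record.
    records = [(lines[i][1:], lines[i + 1])
               for i in range(0, len(lines) - 3, 4)]
    rng = random.Random(seed)  # fixed seed so the same reads are picked each run
    picked = rng.sample(records, min(n, len(records)))
    # Use only the first word of the header as the FASTA name.
    return "".join(f">{name.split()[0]}\n{seq}\n" for name, seq in picked)
```

Usage would be along the lines of `print(sample_fastq_to_fasta(open("unmapped.fastq")))`, where the file name is a placeholder for wherever your unmapped reads ended up.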
