Improving the mapping rate by aligner parameters

6.4 years ago

XBria ▴ 90

Hi everyone,

Is it feasible to improve the mapping rate (that is, to decrease the multi-mapping rate, decrease the unmapped rate, or increase the uniquely mapped rate) by changing the aligner's parameters (HISAT or STAR)? If so, please share a link to an article where I can read more about it.

Thanks,

RNA-Seq • 3.2k views

Since I see you're using HISAT: spend some time learning how to use the --score-min option and how to set the penalties for mismatches and gaps. Those obviously influence the mapping rate, but loosening them makes the results dirtier. It's a trade-off, and you need to find the right spot.
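To make the trade-off concrete, here is a minimal Python sketch of how a --score-min function string translates into a minimum alignment score for a given read length. It assumes the Bowtie2-style function notation that HISAT2 uses (C, L, S, G), the HISAT2 defaults of `L,0,-0.2` for --score-min and 6 for the maximum mismatch penalty (--mp), and it ignores gap, soft-clipping and splice penalties. The helper names `min_score` and `max_full_mismatches` are made up for illustration; check the values against the manual for your HISAT2 version.

```python
import math


def min_score(func: str, read_len: int) -> float:
    """Evaluate a function string such as 'L,0,-0.2' for one read length."""
    parts = func.split(",")
    kind = parts[0]
    a = float(parts[1])
    b = float(parts[2]) if len(parts) > 2 else 0.0
    if kind == "C":   # constant: f(x) = a
        return a
    if kind == "L":   # linear: f(x) = a + b * x
        return a + b * read_len
    if kind == "S":   # square root: f(x) = a + b * sqrt(x)
        return a + b * math.sqrt(read_len)
    if kind == "G":   # natural log: f(x) = a + b * ln(x)
        return a + b * math.log(read_len)
    raise ValueError(f"unknown function type: {kind!r}")


def max_full_mismatches(func: str, read_len: int, mismatch_penalty: float = 6) -> int:
    """Rough upper bound on full-penalty mismatches that still pass the threshold.

    Assumes a perfect end-to-end alignment scores 0 and every mismatch costs
    the maximum penalty; gaps, soft clipping and splicing are not modelled.
    """
    threshold = min_score(func, read_len)
    # Small epsilon guards against floating-point noise at exact multiples.
    return max(0, math.floor(-threshold / mismatch_penalty + 1e-9))


if __name__ == "__main__":
    for func in ("L,0,-0.2", "L,0,-0.4"):
        print(func, min_score(func, 75), max_full_mismatches(func, 75))
```

For 75 bp reads the assumed default gives a threshold of -15, so roughly two high-quality mismatches are tolerated; relaxing the slope to -0.4 lowers the threshold to -30 and lets about five through, which is exactly where the "dirtier" alignments come from.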

You can also refer to this other post: A: Tophat multiple or unique mapping criteria

6.4 years ago by Matteo Schiavinato ★ 3.6k

While it is theoretically feasible, it won't convert bad data into magical alignments. Are you having trouble getting alignments with the defaults (which are generally reasonable for good data)?

6.4 years ago by GenoMax 141k

What I've done is compare the results of two different aligners, HISAT and STAR, and I see that STAR gives better results, with a higher rate of uniquely aligned reads, etc. As I understand it, using shorter reads (or other changes) might be a good option for improving HISAT or STAR performance. I do not have any issues with either aligner; I only want to improve the rate. By the way, I could not find a good reference on the internet. If you have one, please share it. Thanks a lot!

6.4 years ago by XBria ▴ 90

I think you will have to take the STAR and HISAT2 manuals and work through the various options yourself. Every dataset has its own characteristics, and yours is going to be different from others.
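One way to work through the options systematically is to generate a small grid of command lines and compare the alignment summaries afterwards. The Python sketch below only builds and prints the commands; it does not run them. It assumes the HISAT2 options `-x`, `-1`, `-2`, `-S`, `--score-min`, `--mp` and `--summary-file`; the index path, FASTQ file names and the helper `build_hisat2_commands` are hypothetical placeholders, and the parameter values are arbitrary starting points rather than recommendations.

```python
import shlex
from itertools import product


def build_hisat2_commands(index, reads_1, reads_2, score_mins, mismatch_penalties):
    """Return one HISAT2 argument list per parameter combination."""
    commands = []
    for score_min, mp in product(score_mins, mismatch_penalties):
        # Tag output files so each run keeps its own SAM and summary.
        tag = f"{score_min}_{mp}".replace(",", "_")
        commands.append([
            "hisat2",
            "-x", index,
            "-1", reads_1,
            "-2", reads_2,
            "--score-min", score_min,
            "--mp", mp,
            "--summary-file", f"summary_{tag}.txt",
            "-S", f"aligned_{tag}.sam",
        ])
    return commands


# Hypothetical inputs: replace with your own index prefix and FASTQ files.
commands = build_hisat2_commands(
    index="grch38/genome",
    reads_1="sample_1.fastq.gz",
    reads_2="sample_2.fastq.gz",
    score_mins=["L,0,-0.2", "L,0,-0.4"],
    mismatch_penalties=["6,2", "4,2"],
)

for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell or passed to `subprocess.run`. Comparing the resulting summary files side by side shows how much the overall and unique mapping rates move for each setting on your particular dataset.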

BTW: As I recall, you had 90+% alignment with STAR based on past posts. This is on par with what you can expect for an NGS RNA-seq dataset. You are never going to get 100% alignment. If you did, you should be suspicious of that result (unless the data was cherry-picked).

6.4 years ago by GenoMax 141k

Dear GenoMax, for human paired-end data with 75 bp reads (for example, ERR188044), how can we optimize the parameters when aligning with HISAT (to get higher accuracy)?