Question: Your Favorite Mapper For What Kind Of Data?
5
7.9 years ago by
Mdeng510
Germany
Mdeng510 wrote:

Hi everyone,

Maybe this question is not specific enough, or is a bit too fuzzy, of the kind where you can't find a perfect answer. But I would like to know which mapper you use for what kind of data. I tried several mappers on several data sets and the results differ quite a lot, even where a tool was recommended for that particular kind of data.

So I would like to know which mapper you are using.

Possible "combinations" I would suggest:

Illumina, SOLiD, Roche, single end, paired end, whole genome, short reads, BWA, BFAST, Bowtie, MAQ, the reference software delivered with the sequencer, something else (maybe commercial).

If you are interested in my favs:

  • BWA for SOLiD whole genome
  • MAQ, BFAST for SOLiD short reads, paired end. I am very excited to hear your answers for single end.
illumina solid bwa • 2.3k views
3
7.9 years ago by
Ian5.5k
University of Manchester, UK
Ian5.5k wrote:

With regard to single-end SOLiD sequences (50 bp) for the purpose of ChIP-seq:

Corona-Lite

I always used to use this, as I thought that keeping only uniquely mapped reads was the safest option. It also gave a good mapping rate (%) compared with the other mappers I played with. The disadvantages are that it does not produce native SAM output, the conversion process is lengthy, and the converted product is not compatible with downstream tools. It is also no longer being developed and has been replaced by BioScope, which I believe is now being replaced by LifeScope.

BFAST

I made the decision to move over to BFAST as I was impressed by what I heard about it at conferences etc. The indexes are BIG and slow to create, though. The main reason I like it is the 'postprocess' flag '-a3', which returns uniquely mapping reads plus reads that map to multiple locations where one match scores better than the rest. It also does not appear (to me) to constrain the output reads by the number of mismatches, but this is tricky to judge as read matches may contain INDELs (see below).

As a side note, it is good for resequencing projects as it is a 'gapped aligner', which means it will find matches for reads containing INDELs.
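To make the "unique, or multiple hits with one clear winner" rule above concrete, here is a minimal Python sketch. It is not BFAST's code: the `Alignment` tuple layout and the function name `pick_unique_best` are made up for illustration, and the only assumption is that every candidate location for a read carries a numeric score where higher is better.

```python
from typing import List, Optional, Tuple

# Hypothetical record for one candidate hit: (chromosome, position, score).
Alignment = Tuple[str, int, float]


def pick_unique_best(candidates: List[Alignment]) -> Optional[Alignment]:
    """Return the single best-scoring hit, or None if the top score is tied."""
    if not candidates:
        return None  # read did not map anywhere
    best_score = max(score for _, _, score in candidates)
    best = [aln for aln in candidates if aln[2] == best_score]
    # Keep the read only when exactly one location holds the top score.
    return best[0] if len(best) == 1 else None
```

A read with a single hit is kept, a read with several hits is kept only if one of them outscores all the others, and a read whose best score is shared by two or more locations is dropped.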

LIMBO

I am currently in a state of limbo, as some users/customers of our facility prefer the output of Corona-Lite to that of BFAST and vice versa. One scenario of note is matching reads to repeat regions: BFAST produces some very large peaks in these areas compared with Corona-Lite, and it is not clear at the moment which program is 'telling the truth'.

PerM

I like a lot of PerM's functionality, but it does throw away all reads that contain color miscalls (-1). I played with this tool just after it was released, and it has come on in leaps and bounds since. I probably should retry it...

2
7.9 years ago by
Vitis2.2k
New York
Vitis2.2k wrote:

I only have experience with Illumina reads. For genomic DNA reads and variant calling I use BWA or novoalign; I have also used a two-step bowtie/novoalign mapping for genomic DNA reads, and it worked very well. For mRNA-Seq expression analyses I use Bowtie (PE or SE). I'd also like to mention BLAT, although it's not really a mapper: it is very good for evaluating contigs from de novo assembly of mRNA-Seq reads, being very fast and accurate, with results that are easy to parse.
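The answer does not say how the two steps were wired together. One common reading, and it is only an assumption here, is that a fast mapper goes first and whatever it leaves unmapped is handed to a more sensitive mapper. A rough Python sketch of the hand-over, using only the standard SAM columns (FLAG bit 0x4 means "unmapped"); the function name `split_by_mapping` is made up:

```python
from typing import Iterable, List, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped


def split_by_mapping(sam_lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split first-pass SAM records into mapped lines and FASTQ for a second pass."""
    mapped: List[str] = []
    unmapped_fastq: List[str] = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        # SAM columns: QNAME=0, FLAG=1, SEQ=9, QUAL=10
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & UNMAPPED_FLAG:
            # Re-emit the read as a FASTQ record for the second mapper.
            unmapped_fastq.append(f"@{name}\n{seq}\n+\n{qual}\n")
        else:
            mapped.append(line.rstrip("\n"))
    return mapped, unmapped_fastq
```

The FASTQ records in the second list would then be written to a file and given to the second mapper. Paired-end data would need both mates kept together, which this sketch does not handle.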

1

We didn't consider splicing for our organism (for now), so I just used Bowtie to map and HTSeq to get the raw counts, followed by statistical analysis in R. When we get to the point of including splicing, we'll definitely consider TopHat. By the way, TopHat calls Bowtie, but with less flexibility: not all Bowtie parameters can be adjusted from within TopHat.
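For anyone wondering what "raw counts" means in this map-then-count workflow, here is a toy Python sketch. It is not HTSeq and does not use its API: the `Gene` tuple, the `count_reads` name, and the rule of dropping reads that fall in more than one gene are simplifying assumptions, and strand and read length are ignored.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical gene record: (chromosome, start, end), 1-based inclusive.
Gene = Tuple[str, int, int]


def count_reads(genes: Dict[str, Gene], reads: Iterable[Tuple[str, int]]) -> Counter:
    """Count reads per gene from (chromosome, mapped position) pairs."""
    counts = Counter({name: 0 for name in genes})
    for chrom, pos in reads:
        hits = [
            name
            for name, (g_chrom, start, end) in genes.items()
            if g_chrom == chrom and start <= pos <= end
        ]
        # Count only unambiguous reads; reads hitting zero or several genes are skipped.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts
```

The resulting table of integer counts per gene is the kind of input that the downstream statistics in R would start from.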

written 7.9 years ago by Vitis2.2k

Do you mean TopHat for RNA-seq?

written 7.9 years ago by Aaron Statham1.1k
1
7.9 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

You might be interested in this recent paper:

"Comparative analysis of algorithms for next-generation sequencing read alignment"

http://bioinformatics.oxfordjournals.org/content/early/2011/08/19/bioinformatics.btr477


+1 It was a shame they didn't include BFAST!

written 7.9 years ago by Ian5.5k
0
7.9 years ago by
Fabian Bull1.3k
German
Fabian Bull1.3k wrote:

TopHat for RNA-seq data. BWA for genomic.


Only BWA for all genomic data? When it comes to single-end reads, I think it doesn't perform that well.

written 7.9 years ago by Mdeng510

Sorry for that. I have only worked with PE. You may be right.

written 7.9 years ago by Fabian Bull1.3k

Is bowtie much better than BWA for SE reads? I thought they were pretty much equivalent.

written 7.9 years ago by Aaron Statham1.1k

I don't have a proper study to back this up yet, but my first observation was that BWA gives you more false positives.

written 7.9 years ago by Mdeng510