Forum: Bioinformatics terms that might be confusing for beginners
0
4 months ago by
zhangjk210
China/Wuhan/HUST
zhangjk210 wrote:

Everyone involved in bioinformatics should first understand the concept of base quality, i.e. the Phred score. But for beginners, why base quality got that name might still be confusing. Does anyone have other examples like this? Thanks.

For instance, Bowtie, TopHat, Cufflinks, StringTie, Ballgown, etc.: all of these tools are named after clothing. Why?

blog forum bioinformatics • 497 views
ADD COMMENTlink modified 4 months ago by Bastien Hervé2.8k • written 4 months ago by zhangjk210
2

Hello zhangjk21 ,

why do you find "base quality" confusing? It is one of the few terms that are quite well defined.

It becomes more complicated with:

  • coverage
  • read depth
  • read/fragment/insert size
  • duplicates
  • 1-based vs 0-based position
  • the large variety of file types
  • ...
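
To make the "1-based vs 0-based position" item concrete, here is a minimal sketch of converting between the two most common conventions: 1-based, fully closed intervals (as used by GFF, for example) and 0-based, half-open intervals (as used by BED). The function names are made up for illustration and do not come from any library.

```python
def one_based_closed_to_zero_based_half_open(start: int, end: int) -> tuple[int, int]:
    """Convert a 1-based, inclusive interval to a 0-based, end-exclusive one."""
    # Only the start shifts; the inclusive 1-based end equals the exclusive 0-based end.
    return start - 1, end


def zero_based_half_open_to_one_based_closed(start: int, end: int) -> tuple[int, int]:
    """Convert a 0-based, end-exclusive interval to a 1-based, inclusive one."""
    return start + 1, end


def length_zero_based_half_open(start: int, end: int) -> int:
    """Length of a 0-based, half-open interval."""
    return end - start


def length_one_based_closed(start: int, end: int) -> int:
    """Length of a 1-based, closed interval."""
    return end - start + 1


# The first ten bases of a sequence, written both ways.
bed_like = one_based_closed_to_zero_based_half_open(1, 10)
print(bed_like)
print(length_zero_based_half_open(*bed_like), length_one_based_closed(1, 10))
```

Both calls describe the same ten bases; forgetting which convention a file uses is the usual source of off-by-one errors.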

fin swimmer

ADD REPLYlink modified 4 months ago • written 4 months ago by finswimmer9.0k

What really confused me is not the meaning or formal definition of base quality; I know it is well defined. I'm just interested in the story of why we use the word Phred, which I cannot figure out from the definition because it is not an abbreviation of other terms. In other words, I'm more interested in the history behind the word Phred.

ADD REPLYlink written 4 months ago by zhangjk210
1

Bowtie, TopHat, Cufflinks, StringTie, Ballgown, etc.: all of these tools are named after clothing. Why?

This is simply a gimmick of the developers (they have to name their software anyway, so why not this?). They all belong to the same suite (no pun intended) of software. I believe it is actually one of the better examples of software naming.

ADD REPLYlink written 4 months ago by lieven.sterck3.5k
1

It's called the Tuxedo suite, IIRC

ADD REPLYlink written 4 months ago by RamRS20k
1

Just to go back one step: the very term, bioinformatics, is confusing. Different people have different ideas about what bioinformatics is, and so they misinterpret the skills and abilities of others who call themselves bioinformaticians. Why? Because they expect these other people to have all of the skills that they believe a bioinformatician should have, i.e., based on their own ideas about what bioinformatics is.

Bioinformatics is very broad, and there are multiple areas in which one can specialise. Then again, you have bioinformaticians who have broad / general skills but who are not true experts in any one area.

ADD REPLYlink modified 4 months ago • written 4 months ago by Kevin Blighe35k

The difference between insert size and fragment size was confusing for me initially.

ADD REPLYlink written 4 months ago by karthic100

Just to be sure:

  • Insert size = the size of the DNA after mechanical or enzymatic shearing
  • Fragment size = the size of the sheared DNA + poly-A + adapters

Or is it the other way around?

ADD REPLYlink modified 4 months ago • written 4 months ago by pegeot.henri0

To follow through :

PE reads      R1--------->                                                 <---------R2
Adapters        ~~~                                                               ~~~
fragment        ~~~===============================================================~~~
insert                 ===========================================================
inner mate                  .................................................
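
The relationships in the diagram above can be sketched as simple arithmetic. This follows the naming used in the diagram only (other replies in this thread use "fragment" and "insert" differently), and the class, field names and example lengths are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PairedEndLibrary:
    """Lengths in bases, named as in the diagram above (assumed convention)."""

    insert_len: int   # DNA between the two adapters
    adapter_len: int  # length of ONE adapter; both ends assumed equal
    read_len: int     # length of each of R1 and R2; assumed equal

    @property
    def fragment_len(self) -> int:
        # fragment = adapter + insert + adapter
        return self.insert_len + 2 * self.adapter_len

    @property
    def inner_mate_distance(self) -> int:
        # Unsequenced gap between the end of R1 and the start of R2.
        # A negative value means the two reads overlap.
        return self.insert_len - 2 * self.read_len


lib = PairedEndLibrary(insert_len=300, adapter_len=60, read_len=100)
print(lib.fragment_len, lib.inner_mate_distance)
```

With a 150-base insert and the same 100-base reads, the inner mate distance comes out negative, which is how overlapping mates show up under this convention.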
ADD REPLYlink modified 4 months ago • written 4 months ago by pegeot.henri0
1

It is the other way around, at least:

  • Fragment size = the size of the DNA after mechanical or enzymatic shearing
  • Insert size = the bit of DNA between the two adapters

see here for a nice blog post on this topic

ADD REPLYlink modified 4 months ago • written 4 months ago by lieven.sterck3.5k

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

ADD REPLYlink written 4 months ago by WouterDeCoster36k

It is explained clearly in this post here

ADD REPLYlink written 4 months ago by karthic100

In practice, fragment size can mean the length with or without the adapters; it depends on the context and on which fragments are being talked about.

ADD REPLYlink written 4 months ago by Devon Ryan87k

It’s worth bearing in mind that some things (software especially) are often named without any actual relevance to what they do. The name might be an in-joke within the group, or a reference that they like, etc.

ADD REPLYlink written 4 months ago by jrj.healey10k
1

Like northern and western blotting in experiments.

ADD REPLYlink written 4 months ago by cpad011210k
1

Actually those sort of ‘make sense’, because they followed on from Southern blotting, which was invented by Edward Southern. It seemed semi-logical to give them the names of other cardinal points.

ADD REPLYlink written 4 months ago by jrj.healey10k
1

Or Just Another Bogus Bioinformatics Algorithm!

http://www.acgt.me/blog?tag=jabba

ADD REPLYlink written 4 months ago by Daniel3.7k
2
4 months ago by
H.Hasani640
Freiburg, Germany
H.Hasani640 wrote:

Well, Wikipedia has a clear explanation for the Phred score!
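
For reference, the standard definition is Q = -10 * log10(P), where P is the estimated probability that the base call is wrong. A minimal sketch of that relationship follows; the function names are made up for illustration, and the default offset of 33 assumes the common Sanger/Illumina 1.8+ FASTQ encoding.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Error probability for a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores.

    offset=33 is an assumption (Phred+33); older Illumina files used 64.
    """
    return [ord(char) - offset for char in qual]


print(phred_to_error_prob(30))       # Q30 means a 1 in 1000 chance of a wrong call
print(decode_fastq_quality("I5!"))   # one score per base
```

So Q20 corresponds to a 1% error rate and Q30 to 0.1%, which is why Q30 is a common quality threshold.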

ADD COMMENTlink modified 4 months ago • written 4 months ago by H.Hasani640

Tells you what it is, but doesn't actually explain where the name originated from...

ADD REPLYlink written 4 months ago by jrj.healey10k
1

Phred stands for Phil's Read Editor. It is software (Phred, Phrap, and Consed) written by Phil Green and team. https://www.ncbi.nlm.nih.gov/pubmed/9521921?dopt=Abstract

What's in a name? I always like to think that it was a reference to Fred Sanger, but I am not sure.

ADD REPLYlink modified 4 months ago • written 4 months ago by b.nota5.7k
2
4 months ago by
lieven.sterck3.5k
VIB, Ghent, Belgium
lieven.sterck3.5k wrote:

I think one example that seems to be confusing (not only to beginners) is the similarity <-> homology one. I see this mistake appearing even in manuscripts by very "experienced" people.

ADD COMMENTlink written 4 months ago by lieven.sterck3.5k
2
4 months ago by
Bastien Hervé2.8k
Limoges, CBRS, France
Bastien Hervé2.8k wrote:

Mapping <=> Alignment

Alignment and mapping

ADD COMMENTlink modified 4 months ago • written 4 months ago by Bastien Hervé2.8k
1
4 months ago by
Belgium
WouterDeCoster36k wrote:

For instance, Bowtie, TopHat, Cufflinks, StringTie, Ballgown, etc.: all of these tools are named after clothing. Why?

Why not? :)

ADD COMMENTlink written 4 months ago by WouterDeCoster36k

just to elucidate my question, LOL

ADD REPLYlink written 4 months ago by zhangjk210

Well yeah, naming a tool doesn't have to make sense. You just have to remember if you google for Cufflinks that you should add "RNA". Same goes for the STRING database, though.

ADD REPLYlink written 4 months ago by WouterDeCoster36k