Is it acceptable to benchmark an assembler/aligner using a single human chromosome?
3.6 years ago

Hi Everyone

I have been reading several benchmarking papers and realized how much compute and time it must cost to process every chromosome in the human reference genome. Wouldn't it be more principled to use only the longest and the shortest chromosomes in benchmarking studies of aligners/assemblers? If not, why?

assembly sequence alignment genome next-gen • 875 views

If you only had sequence data from those two chromosomes, then yes. You could first align the reads to the full genome, then extract only the reads that align to those particular chromosomes and use that subset.

Keep in mind this would only be appropriate for the synthetic purpose of benchmarking.
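As a rough illustration of the extraction step, here is a minimal sketch that filters SAM text records by reference name. It assumes the reads were already aligned to the full genome and written as SAM; the function name `reads_on_chromosomes` and the toy records are made up for the example, and `chr1`/`chr21` are just example chromosome names. On a real, indexed BAM file, region queries with `samtools view` would be the more usual route.

```python
def reads_on_chromosomes(sam_lines, keep):
    """Yield SAM alignment lines whose reference name (column 3) is in `keep`.

    Hypothetical helper for illustration; header lines and unmapped
    reads are skipped.
    """
    keep = set(keep)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        if fields[2] in keep:
            yield line


# Toy SAM records (invented data): name, flag, rname, pos, mapq, cigar, ...
sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr7\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr21\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

subset = list(reads_on_chromosomes(sam, {"chr1", "chr21"}))
names = [line.split("\t")[0] for line in subset]
print(names)
```

The selected reads would then be converted back to FASTQ and used as the benchmark input.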


No, I have WGS data, so it covers the entire genome.


Correct. But you can pull out the reads from that data that align to just those two chromosomes and do what I mentioned above for benchmarking.

3.6 years ago
JC 13k

No. Aligners also have to deal with multi-mapping reads (reads that match more than one location in the genome), so the realistic way is to benchmark against the full genome.
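One way to see the effect is to measure how many reads map ambiguously when the full genome is the reference, and compare that with the same reads aligned to a single-chromosome reference. The sketch below is only illustrative: it treats a mapping quality (MAPQ) of 0 as a proxy for multi-mapping, which is an assumption, since MAPQ conventions differ between aligners. The function name `multimapping_fraction` and the toy records are hypothetical.

```python
def multimapping_fraction(sam_lines, mapq_threshold=0):
    """Fraction of primary mapped alignments with MAPQ <= mapq_threshold.

    Assumption: low MAPQ is used as a stand-in for "maps to several
    places"; check your aligner's documentation for its own convention.
    """
    mapped = 0
    ambiguous = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        mapped += 1
        if mapq <= mapq_threshold:
            ambiguous += 1
    return ambiguous / mapped if mapped else 0.0


# Toy records (invented data): four primary mapped reads, one with MAPQ 0.
full_genome_sam = [
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t500\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t256\tchr9\t900\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t700\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r5\t0\tchr2\t800\t30\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

fraction = multimapping_fraction(full_genome_sam)
print(fraction)
```

A read like `r2` above, whose alternative location is on another chromosome, would look uniquely mapped if only `chr1` were used as the reference, which is why a single-chromosome benchmark can flatter an aligner.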


Any thoughts on the assembler side?


What do you mean? For assemblers it is common to benchmark using different library sizes and several total numbers of reads.
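If "several total numbers of reads" is read as running the assembler on subsets of different sizes, one simple way to build those subsets is seeded random subsampling of the FASTQ records. This is a minimal sketch under that assumption; `parse_fastq` and `subsample` are hypothetical helper names, the reads are invented, and for paired-end data both files would need the same seed so that mates stay together.

```python
import random


def parse_fastq(lines):
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.strip(), seq.strip(), qual.strip()


def subsample(records, fraction, seed=0):
    """Keep each record independently with probability `fraction`.

    A fixed seed makes the subset reproducible across runs.
    """
    rng = random.Random(seed)
    return [rec for rec in records if rng.random() < fraction]


# Toy FASTQ input (invented data): ten 4-base reads.
fastq_lines = []
for i in range(10):
    fastq_lines += [f"@read{i}", "ACGT", "+", "IIII"]

records = list(parse_fastq(fastq_lines))

# Build subsets at several (arbitrary) fractions for a benchmark series.
subsets = {f: subsample(records, f, seed=42) for f in (0.0, 0.5, 1.0)}
print({f: len(s) for f, s in subsets.items()})
```

Each subset would then be written back out as FASTQ and assembled separately, so that assembly quality can be compared across input sizes.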
