Aligning against a newly assembled genome: non-unique contigs and the issue of multimapped alignments
9.5 years ago
IV ★ 1.3k

Aligning RNA-Seq reads against a newly assembled genome involves numerous issues and difficulties that we don't normally meet when analyzing data from model organisms.

For instance, many genomic regions can be duplicated across different contigs and scaffolds, reducing the number of unique regions. These duplicates are the result of poor genome assembly and are distinct from the repeated regions we normally see in complete genomes.

This in turn dramatically increases the proportion of multimapped reads in RNA-Seq analyses (in some genomes to more than 80-90%).
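To make "multimapped" concrete, here is a minimal sketch of how that proportion could be measured from SAM records. It assumes the aligner writes the optional `NH` tag (number of reported alignments for the read), which not all aligners do; the function name `multimapped_fraction` and the toy records are purely illustrative, and paired-end mates sharing a read name are counted once.

```python
def multimapped_fraction(sam_lines):
    """Fraction of mapped reads whose NH tag is greater than 1.

    Assumption: the aligner emits the optional NH:i:<n> tag.
    Records without it are treated as uniquely mapped.
    """
    hits = {}  # read name -> largest NH value seen for that read
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        if int(fields[1]) & 0x4:  # FLAG bit 0x4: read is unmapped
            continue
        nh = 1
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        name = fields[0]
        hits[name] = max(nh, hits.get(name, 1))
    if not hits:
        return 0.0
    multi = sum(1 for n in hits.values() if n > 1)
    return multi / len(hits)


# Toy records (hypothetical contig names): r1 is unique, r2 hits two
# contigs, r3 is unmapped and therefore ignored.
toy = [
    "@HD\tVN:1.6",
    "r1\t0\tctg1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tctg1\t500\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r2\t256\tctg7\t20\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(multimapped_fraction(toy))  # 0.5 for this toy input
```

In practice the lines would come from an uncompressed SAM stream rather than a hard-coded list.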

I would be really happy to hear how colleagues deal with these issues during alignment and, most importantly, in downstream analyses (e.g. expression quantification).

**Edit**

Examples:

How do you (or would you) take into account multimapped reads in gene (or exon or transcript) expression?

Do you (or would you) perform pairwise alignments between the contigs/scaffolds to check for non-redundancy?

What measures do you (or would you) take if your genome is highly repetitive?
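To illustrate the first question, one option I can imagine is fractional counting, where a read that aligns to N features contributes 1/N to each of them. The sketch below is only a toy version under that assumption, not a recommendation; the function name `fractional_counts`, the `(read, gene)` input format and the gene names are hypothetical.

```python
from collections import Counter, defaultdict


def fractional_counts(alignments):
    """Split each read's single count evenly over all features it hits.

    `alignments` is an iterable of (read_name, feature_name) pairs,
    one pair per reported alignment (illustrative input format).
    """
    features_per_read = defaultdict(list)
    for read, feature in alignments:
        features_per_read[read].append(feature)

    counts = Counter()
    for features in features_per_read.values():
        weight = 1.0 / len(features)  # 1/N for a read with N alignments
        for feature in features:
            counts[feature] += weight
    return dict(counts)


# r1 maps uniquely to geneA; r2 maps to both geneA and geneB.
example = [("r1", "geneA"), ("r2", "geneA"), ("r2", "geneB")]
print(fractional_counts(example))
```

The obvious weakness is that if geneA and geneB are really the same locus assembled twice, its expression is still split across two entries, which is part of what I am asking about.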

Thank you all,

Ioannis

RNA-Seq alignment • 2.4k views