7.3 years ago
Bioinfonext
I have found that in some plant species, a large number of genes are highly similar in their coding regions, so when reads are mapped to cDNA, the raw read counts per gene are not accurate.
Do you think the best way is to map raw reads to the extracted whole mRNA sequences, including the 5' and 3' regions but without the introns?
I ask because most of the highly similar genes may still differ in their 5' and 3' regions.
What is the best way to get accurate raw read counts in this type of plant species?
Multi-mapping reads are a common issue with short reads. How you treat them is somewhat up to you. I am sure there are threads on Biostars that have wiser answers.
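To make the effect of that choice concrete, here is a minimal Python sketch, not tied to any particular aligner. It assumes you already have, for each read, the list of genes it aligned to. The function name `count_reads`, the policy labels (`discard`, `all`, `fractional`) and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical policy labels for this sketch; they are not aligner flags.
POLICIES = ("discard", "all", "fractional")


def count_reads(hits_per_read, policy="discard"):
    """Count reads per gene under a given multi-mapping policy.

    hits_per_read: dict mapping read ID -> list of gene IDs the read aligned to.
    policy: "discard"    -> ignore reads that hit more than one gene
            "all"        -> count the read once for every gene it hits
            "fractional" -> split one count evenly across the genes it hits
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")

    counts = defaultdict(float)
    for genes in hits_per_read.values():
        if not genes:
            continue  # unmapped read
        if len(genes) == 1:
            counts[genes[0]] += 1.0  # uniquely mapped read
        elif policy == "all":
            for gene in genes:
                counts[gene] += 1.0
        elif policy == "fractional":
            share = 1.0 / len(genes)
            for gene in genes:
                counts[gene] += share
        # policy == "discard": multi-mapped read contributes nothing
    return dict(counts)


# Toy data: two unique reads and two reads shared by geneA and geneB.
toy_hits = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB"],
    "read3": ["geneB"],
    "read4": ["geneA", "geneB"],
}

for name in POLICIES:
    print(name, count_reads(toy_hits, name))
```

With highly similar gene families, the gap between the `discard` and `all` counts gives a rough idea of how much of the signal depends on multi-mapped reads.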
BBMap.sh allows you to choose how multi-mapped reads are handled. There are two extremes you can choose from.