Question: RNA-seq mapping rate
afli (China, Beijing, IGDB) wrote, 3 months ago:

Hi, I have a basic question about RNA-seq analysis. If the read alignment rate is only about 40-50% (from bowtie2, hisat2, or other alignment tools), would it be appropriate to increase the sequencing depth to get enough aligned reads for analysis? Or would such a low alignment rate introduce bias, so that we should abandon these samples? Thank you!

The sample is rice, which has a high-quality reference genome. I used bowtie2 for the alignment; the summary is:

82280146 reads; of these:
  82280146 (100.00%) were paired; of these:
    41474464 (50.41%) aligned concordantly 0 times
    12443024 (15.12%) aligned concordantly exactly 1 time
    28362658 (34.47%) aligned concordantly >1 times
    ----
    41474464 pairs aligned concordantly 0 times; of these:
      1444965 (3.48%) aligned discordantly 1 time
    ----
    40029499 pairs aligned 0 times concordantly or discordantly; of these:
      80058998 mates make up the pairs; of these:
        73562998 (91.89%) aligned 0 times
        414858 (0.52%) aligned exactly 1 time
        6081142 (7.60%) aligned >1 times
55.30% overall alignment rate
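
For readers wondering where the 55.30% comes from: for a fully paired run, bowtie2's overall rate counts individual mates, so it is one minus the fraction of mates that aligned 0 times. The sketch below redoes that arithmetic with the numbers from the summary above; the function and variable names are made up for illustration.

```python
def overall_alignment_rate(total_pairs: int, unaligned_mates: int) -> float:
    """Percent of mates that aligned at least once (all reads paired)."""
    total_mates = 2 * total_pairs
    return 100.0 * (total_mates - unaligned_mates) / total_mates


# Numbers copied from the bowtie2 summary above.
total_pairs = 82280146
unaligned_mates = 73562998              # "aligned 0 times" (mate level)
concordant_pairs = 12443024 + 28362658  # exactly 1 time + >1 times

overall = overall_alignment_rate(total_pairs, unaligned_mates)
concordant = 100.0 * concordant_pairs / total_pairs

print(f"overall alignment rate: {overall:.2f}%")
print(f"concordantly aligned pairs: {concordant:.2f}%")
```

Note that only 15.12% of pairs align concordantly exactly once; most of the aligned pairs are multi-mapping.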

The rate is low because of contamination by some bacterium. I just want to know whether reads like these are appropriate for downstream analysis.

rna-seq • 414 views

Would you mind adding the hisat2 alignment summary here?

written 3 months ago by bioExplorer

I've added the information above.

written 3 months ago by afli

It would help to know which species you're working with, since a mapping rate of 40% would be very low for human or mouse but not necessarily for a species that is less well annotated. The tissue you are working with also plays into that evaluation.

written 3 months ago by Wietje

Please be as complete as possible and add information such as:

  • organism
  • commands used
  • alignment summary data
  • read length
  • library prep method
  • ...

written 3 months ago by WouterDeCoster

Thank you for your suggestion.

written 3 months ago by afli

I don't think bowtie2 is a suitable aligner here: it is not splice-aware, and I assume rice RNA-seq data contains reads that span splice junctions.
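
As a rough illustration of switching to a splice-aware aligner, the sketch below assembles a paired-end hisat2 command line. It only builds and prints the command, it does not run it. The index prefix and file names are hypothetical placeholders, and the index is assumed to have been built beforehand with hisat2-build.

```python
import shlex


def hisat2_command(index_prefix, reads_1, reads_2, out_sam, threads=4):
    """Build a paired-end hisat2 invocation as an argument list."""
    return [
        "hisat2",
        "-p", str(threads),   # number of alignment threads
        "-x", index_prefix,   # prefix of the hisat2 index files
        "-1", reads_1,        # mate 1 reads
        "-2", reads_2,        # mate 2 reads
        "-S", out_sam,        # SAM output file
    ]


# Hypothetical file names, for illustration only.
cmd = hisat2_command(
    "rice_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample.sam"
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.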

written 3 months ago by WouterDeCoster

In case of bacterial contamination, you can use e.g. BBSplit to separate out the reads originating from the bacterium. When continuing with the "host" reads, you may want to control for the bacterial influence (either a direct effect on gene expression, or an indirect one through distortion of the fragment ratios in the library). You can include it as a factor in your DE model and check it, as Devon suggested, with a PCA or an NLDA.

written 3 months ago by michael.ante

Do the samples have a sufficient read length, i.e. > 50 bp? In my experience with downloaded data, low mapping rates can be primarily due to short reads (e.g. 36 bp or 25 bp).

written 3 months ago by ATpoint
Devon Ryan (Freiburg, Germany) wrote, 3 months ago:

As a rule of thumb, if one of your samples has a much lower alignment rate than the others, you're probably going to exclude it from downstream analyses, since it will tend to have other problems. Make a PCA plot and see whether it sticks out as an outlier. If so, exclude it. If not, then I guess you can keep it.
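
One possible way to pre-screen for a sample with "a much lower alignment rate than the others", before making the PCA, is sketched below. It is not part of the advice above: the median/MAD rule and the cutoff of 3 are assumptions chosen for illustration, and the per-sample rates are invented.

```python
from statistics import median


def flag_low_alignment(rates, n_mads=3.0):
    """Return samples whose alignment rate lies more than n_mads median
    absolute deviations below the median rate of all samples."""
    med = median(rates.values())
    mad = median(abs(r - med) for r in rates.values())
    if mad == 0:
        # No spread among most samples: flag anything below the median.
        return [s for s, r in rates.items() if r < med]
    return [s for s, r in rates.items() if (med - r) / mad > n_mads]


# Hypothetical overall alignment rates (percent) per sample.
rates = {"s1": 92.1, "s2": 90.8, "s3": 91.5, "s4": 55.3, "s5": 93.0}
flagged = flag_low_alignment(rates)
print(flagged)
```

A flagged sample is only a candidate for exclusion; whether it actually behaves as an outlier still has to be checked on the expression data, e.g. with the PCA.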

Thank you, this sounds reasonable.

written 3 months ago by afli