Question: Quantifying repetitive elements from RNA-seq (hisat2 or Salmon)
ywchen wrote, 5 months ago:

Hi everyone: I am interested in quantifying changes in the transcription of repetitive elements (LTRs here) after treatment, and I have come up with the following ideas:

  1. Map the RNA-seq data directly to the genome with hisat2 and quantify against the repetitive element annotation from RepeatMasker, then pool elements of the same class to compare them. However, I am not sure how to set the maximum number of alignments allowed per read. (Most RNA-seq analyses require uniquely mapped reads, but the value would need to be much higher here, since repetitive elements occur many times in the genome.)
  2. I have the consensus repetitive element sequences (fastq) from Repbase. Is it possible to treat these repeat elements as a "transcriptome" and use salmon (or a similar transcriptome-based tool) to map reads to them?
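For idea 1, the annotation step could look roughly like the sketch below. It assumes the standard whitespace-separated RepeatMasker `.out` layout (score, divergence columns, query sequence, begin, end, left, strand, repeat name, class/family, ...). The function names and the `counts` dictionary are hypothetical and only illustrate pooling per-element counts by class/family.

```python
from collections import Counter

def parse_repeatmasker_out(lines, wanted_class="LTR"):
    """Yield (chrom, start0, end, name, class_family, strand) for one repeat class."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines (their first field is not a number).
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        chrom, start, end = fields[4], int(fields[5]), int(fields[6])
        # RepeatMasker marks the complement strand with "C".
        strand = "-" if fields[8] == "C" else "+"
        name, class_family = fields[9], fields[10]
        # class/family looks like "LTR/ERVL-MaLR"; compare the class part only.
        if class_family.split("/")[0] == wanted_class:
            # Input coordinates are 1-based inclusive; emit a 0-based start (BED style).
            yield chrom, start - 1, end, name, class_family, strand

def sum_counts_by_family(records, counts):
    """Pool per-element read counts (keyed by (chrom, start0, end)) by class/family."""
    totals = Counter()
    for chrom, start0, end, _name, class_family, _strand in records:
        totals[class_family] += counts.get((chrom, start0, end), 0)
    return totals
```

Uncertain annotations such as `LTR?` are not matched by this exact comparison; whether to include them is a separate decision.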

I am not familiar with this area and would appreciate any suggestions. Thanks for your help!

Update: Since I am only interested in LTRs, I have modified the question. It looks possible to extract the uniquely mapped reads and combine them with the RepeatMasker annotation. Direct quantification looks likely to fail, since repetitive elements are abundant in mRNA.
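One way to pull out uniquely mapped reads is to filter on the `NH` tag, sketched below in plain Python. This assumes the aligner writes `NH:i:<n>` (the number of reported alignments for the read) on each record, which hisat2 and STAR do; the function names are illustrative only.

```python
def is_unique_alignment(sam_line):
    """Return True for a mapped SAM record whose NH tag says it aligned exactly once."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:          # not a complete alignment record
        return False
    flag = int(fields[1])
    if flag & 0x4:                # 0x4 = read unmapped
        return False
    # Optional tags start at column 12; match the whole tag so NH:i:10 is not accepted.
    return "NH:i:1" in fields[11:]

def filter_unique(sam_lines):
    """Yield header lines unchanged plus the uniquely aligned records."""
    for line in sam_lines:
        if line.startswith("@") or is_unique_alignment(line):
            yield line
```

Records without an `NH` tag are dropped by this sketch, so it is worth checking a few lines of the aligner output before relying on it.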

Devon Ryan (Freiburg, Germany) wrote, 5 months ago:

You can align with STAR and then put the output through TEtranscripts. STAR can report multiple alignments per read, and it generally produces better alignments than hisat2 (in my experience at least). Our group that works on repeat elements uses this method.
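A minimal sketch of how that pipeline might be wired up is below. It only builds the command lines and does not run them. The multimapper limit of 100, the thread count, and all file names are assumptions rather than settings given in this thread; check the flags against `STAR --help` and `TEtranscripts --help` for your installed versions.

```python
import shlex

def star_multimap_cmd(genome_dir, fastq1, fastq2, prefix, max_multimap=100, threads=8):
    """Build a STAR command that keeps reads aligning to up to max_multimap loci."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq1, fastq2,   # add --readFilesCommand zcat for .gz input
        "--outSAMtype", "BAM", "Unsorted",
        "--outFilterMultimapNmax", str(max_multimap),
        "--winAnchorMultimapNmax", str(max_multimap),
        "--outFileNamePrefix", prefix,
    ]

def tetranscripts_cmd(treatment_bams, control_bams, gene_gtf, te_gtf, project):
    """Build a TEtranscripts command that counts multimapped reads (--mode multi)."""
    return [
        "TEtranscripts",
        "-t", *treatment_bams,
        "-c", *control_bams,
        "--GTF", gene_gtf,
        "--TE", te_gtf,
        "--mode", "multi",
        "--project", project,
    ]

# Hypothetical file names, for illustration only.
star = star_multimap_cmd("star_index", "treat_R1.fastq", "treat_R2.fastq", "treat_")
te = tetranscripts_cmd(["treat_Aligned.out.bam"], ["ctrl_Aligned.out.bam"],
                       "genes.gtf", "te_annotation.gtf", "ltr_treatment")
print(shlex.join(star))
print(shlex.join(te))
```

Passing the lists to `subprocess.run(cmd, check=True)` would execute them; building lists rather than shell strings avoids quoting problems with file names.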

While you can use the consensus repeat sequences, you end up biasing the results by how close the expressed repeats are to the consensus. Consensus sequences are mostly useful for showing a profile over a single instance, where you can label the structure easily.
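The bias can be illustrated with a toy simulation. The sketch below is not how any particular tool works internally; it simply measures what fraction of a repeat copy's k-mers still appear in the consensus as the copy diverges. The sequence, the divergence rates, and `k=25` are all made-up assumptions.

```python
import random

def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def mutate(seq, rate, rng):
    """Substitute each base with probability `rate`, always to a different base."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

def shared_kmer_fraction(copy, consensus, k=25):
    """Fraction of the copy's k-mers that are also present in the consensus."""
    copy_kmers = kmers(copy, k)
    return len(copy_kmers & kmers(consensus, k)) / len(copy_kmers)

rng = random.Random(0)
consensus = "".join(rng.choice("ACGT") for _ in range(2000))  # toy "consensus"
for rate in (0.0, 0.02, 0.10, 0.20):
    copy = mutate(consensus, rate, rng)
    print(rate, round(shared_kmer_fraction(copy, consensus), 3))
```

Older, more diverged copies share fewer exact k-mers with the consensus, so reads from them are less likely to be assigned, which is one way to picture the bias described above.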


Thanks for your answer. I'm concerned about STAR's memory usage, so I may start with hisat2 and -k 100 to see whether its output can be used by TEtranscripts.

Reply written 5 months ago by ywchen
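The hisat2 alternative mentioned in the reply could be set up as in the sketch below, which again only builds the command lines. Whether TEtranscripts handles hisat2 output well is not confirmed anywhere in this thread; the file names and thread count are hypothetical.

```python
def hisat2_multimap_cmd(index_prefix, fastq1, fastq2, out_sam, k=100, threads=8):
    """Build a hisat2 command reporting up to k alignments per read (-k)."""
    return [
        "hisat2",
        "-x", index_prefix,
        "-1", fastq1,
        "-2", fastq2,
        "-k", str(k),
        "-p", str(threads),
        "-S", out_sam,
    ]

def sam_to_bam_cmd(in_sam, out_bam):
    """Build a samtools command converting SAM to BAM for downstream tools."""
    return ["samtools", "view", "-b", "-o", out_bam, in_sam]

# Hypothetical file names, for illustration only.
align = hisat2_multimap_cmd("hisat2_index/genome", "treat_R1.fastq",
                            "treat_R2.fastq", "treat.sam")
convert = sam_to_bam_cmd("treat.sam", "treat.bam")
```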