Question: Expression of transposons in transcriptomes
SeaStar30 wrote:

Hello! I have a FASTA file of transposons (names and sequences), and I would like to quantify their expression in several different transcriptomes. What kind of analysis do you suggest, and what software could I use? Thanks a lot.

Tags: alignment, sequence
written 15 months ago by SeaStar30 • modified 15 months ago by A. Domingues2.3k
A. Domingues2.3k wrote:

I have used SalmonTE in the past and had good experiences with it. It uses salmon in the background, then aggregates the counts per element, family, and class. The result tables are also ready to use with DESeq2.
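
To illustrate what aggregating counts per element, family, and class means, here is a minimal Python sketch. It is not SalmonTE's own code: the function name `aggregate_counts`, the toy element names, and the layout of the annotation table (element mapped to a family and a class) are all assumptions made for the example.

```python
from collections import defaultdict

def aggregate_counts(element_counts, annotation):
    """Sum per-element read counts up to family and class level.

    element_counts: dict of element name -> estimated read count
    annotation:     dict of element name -> (family, class); this layout
                    is hypothetical, not SalmonTE's file format
    """
    by_family = defaultdict(float)
    by_class = defaultdict(float)
    for element, count in element_counts.items():
        # Elements missing from the annotation are pooled as "unknown".
        family, te_class = annotation.get(element, ("unknown", "unknown"))
        by_family[family] += count
        by_class[te_class] += count
    return dict(by_family), dict(by_class)

# Toy input, invented for illustration only.
counts = {"L1HS": 120.0, "L1PA2": 30.0, "AluY": 50.0}
annotation = {
    "L1HS": ("L1", "LINE"),
    "L1PA2": ("L1", "LINE"),
    "AluY": ("Alu", "SINE"),
}
families, classes = aggregate_counts(counts, annotation)
print(families)
print(classes)
```

Summing at the family or class level is often more robust than per-element counts, because reads from closely related copies are hard to assign to a single element.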

written 15 months ago by A. Domingues2.3k

Hi Domingues! Thanks a lot!

Reply written 15 months ago by SeaStar30

Is it possible to build your own reference index from a FASTA file of transposable elements generated by RepeatScout, instead of using the ones in the SalmonTE database?

Reply written 15 months ago by SeaStar30

No idea. I suggest asking the developers on GitHub. They have been quite responsive whenever I had similar questions.

Reply written 15 months ago by A. Domingues2.3k • modified 15 months ago
ATpoint39k wrote:

How long are these sequences on average (OK, 500-1000 bp), and are they polyadenylated? There are two things to consider:

First, if they are not polyadenylated, they will be missed in most RNA-seq samples, since most libraries are polyA-enriched. Second, they must be roughly 200 bp or longer, because shorter sequences typically get excluded during library preparation unless it is small-RNA sequencing. Transposons are not my field, so make sure it is common to detect them in RNA-seq: some RNA species are rapidly degraded and might require special library prep techniques to preserve them, which most standard RNA-seq samples will not have used.

From the technical side, first check whether these sequences are already present in the respective reference transcriptome. If so, use a tool such as salmon to quantify your data against it. If not, add the sequences (without polyA tails) to that reference and then use salmon. Alternatively, align the data against a reference genome with a tool such as STAR or HISAT2, and make sure you have an annotation file (GTF) that includes the coordinates of these sequences. Tools such as featureCounts can then assign the aligned reads to the features in the GTF. This is all pretty much standard, so please first get some background in RNA-seq and the related analysis techniques.
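
The "check whether the sequences are already present, otherwise add them without polyA tails" step could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names `read_fasta`, `strip_polya`, and `merge_references` are made up for the example, "already present" is simplified to an identical name or identical sequence, and the rule that a run of 10 or more trailing A's counts as a polyA tail is an arbitrary threshold.

```python
import re

# Assumption: 10 or more trailing A's are treated as a polyA tail.
POLYA_TAIL = re.compile(r"A{10,}$")

def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def strip_polya(seq):
    """Remove a trailing polyA run from an upper-cased sequence."""
    return POLYA_TAIL.sub("", seq.upper())

def merge_references(transcriptome, transposons):
    """Append transposons that are not yet in the transcriptome.

    Both arguments are iterables of (name, sequence) pairs. A transposon
    is skipped if its name or its exact sequence is already present.
    """
    merged = [(name, seq.upper()) for name, seq in transcriptome]
    names = {name for name, _ in merged}
    seqs = {seq for _, seq in merged}
    for name, seq in transposons:
        trimmed = strip_polya(seq)
        if name in names or trimmed in seqs:
            continue
        merged.append((name, trimmed))
        names.add(name)
        seqs.add(trimmed)
    return merged

# Toy records, invented for illustration only.
tx = list(read_fasta([">tx1 some description", "ACGT", "ACGT"]))
te = list(read_fasta([">te1", "ACGTACGT", ">te2", "GGGGCCCC" + "A" * 12]))
for name, seq in merge_references(tx, te):
    print(f">{name}\n{seq}")
```

The merged FASTA can then be indexed and quantified with salmon, for example `salmon index -t merged.fa -i merged_index` followed by `salmon quant -i merged_index -l A -r reads.fq.gz -o quant_out` (the file and directory names here are placeholders). An exact-match check will not catch transposon copies embedded in longer transcripts, so a similarity search is a safer test in practice.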

written 15 months ago by ATpoint39k • modified 15 months ago

500-1000 bp. There are also longer elements (LINEs) and shorter ones (SINEs).

Reply written 15 months ago by SeaStar30

Thanks a lot. I have already done this kind of analysis: I produced a GFF file with the coordinates of the transposons in the genome using RepeatMasker, then used featureCounts to assign the aligned reads, as you suggested. My question was whether there is a tool that can estimate expression directly from a FASTA file, without alignments, and you have answered it. Thank you.
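
For readers following the same genome-based route, the annotation step might look something like the Python sketch below, which converts GFF lines into the simpler SAF table that featureCounts also accepts. It is an illustration, not the script used in this thread: the function name `gff_to_saf` is made up, and it assumes the repeat name sits in the ninth column as `Target "Motif:NAME" ...`, which is how RepeatMasker GFF output usually looks but should be checked against your own file.

```python
import re

# Assumption: the repeat name appears in column 9 as Target "Motif:NAME" ...
MOTIF = re.compile(r'Motif:([^"\s;]+)')

def gff_to_saf(gff_lines):
    """Convert GFF lines to SAF rows (GeneID, Chr, Start, End, Strand)."""
    rows = ["GeneID\tChr\tStart\tEnd\tStrand"]
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed records
        chrom, start, end, strand, attrs = (
            fields[0], fields[3], fields[4], fields[6], fields[8]
        )
        match = MOTIF.search(attrs)
        # Fall back to the raw attribute text if no Motif tag is found.
        name = match.group(1) if match else attrs
        rows.append("\t".join([name, chrom, start, end, strand]))
    return rows

# Toy GFF record, invented for illustration only.
example = [
    "##gff-version 2\n",
    'chr1\tRepeatMasker\tsimilarity\t100\t400\t12.5\t+\t.\tTarget "Motif:AluY" 1 300\n',
]
saf = gff_to_saf(example)
print("\n".join(saf))
```

The resulting table can be passed to featureCounts with `-F SAF`, for example `featureCounts -F SAF -a transposons.saf -o counts.txt aligned.bam` (file names are placeholders). Rows that share a GeneID are summarized together, so copies of the same element are counted as one feature. Because transposon reads often map to several places, it is worth deciding deliberately how multi-mapping reads are handled (featureCounts ignores them unless `-M` is given).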

Reply written 15 months ago by SeaStar30 • modified 15 months ago

See SalmonTE in A. Domingues's answer above; it seems to do exactly what you need.

Reply written 15 months ago by ATpoint39k