Question: Build consensus sequences from RepeatMasker output
Amaranta Remedios (Manchester) wrote, 3 months ago:

Hi,

I have a RepeatMasker output file for a new organism (a crustacean), and I want to use the transposable elements in this species to analyse piRNAs, using my own short-read sequencing data.

The problem is that I would like consensus sequences for the transposable elements in this species, rather than every position in the genome where a transposon occurs. If the same transposon exists in 100 copies in the genome, it appears 100 times in the RepeatMasker output.
Ideally I would like to end up with a multi-FASTA file like the ones in Repbase, but I am a bit lost about how to use the RepeatMasker output to achieve this.

Any suggestions would be very helpful! Thanks


I think the easiest way would be to convert the RepeatMasker coordinates to a BED file and then use bedtools to extract the sequences from the genome FASTA. Once you have the FASTA sequences, you can build a consensus.
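The coordinate conversion suggested above could be sketched roughly as follows. This assumes the standard whitespace-delimited RepeatMasker `.out` layout (three header lines, then score, divergence, deletion and insertion percentages, query name, begin, end, left, strand, repeat name, class/family, ...). The function name `rmsk_out_to_bed` is made up for illustration, and joining the name and class with `#` is only a labelling choice, not something RepeatMasker requires.

```python
def rmsk_out_to_bed(lines):
    """Yield BED6 tuples from the lines of a RepeatMasker .out file.

    Assumes the standard .out column order; header and blank lines are
    skipped because their first field is not an integer score.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        score = fields[0]
        chrom = fields[4]
        start = int(fields[5]) - 1   # .out is 1-based, BED is 0-based
        end = int(fields[6])         # BED end is exclusive, so unchanged
        strand = "-" if fields[8] == "C" else "+"  # "C" marks complement hits
        name = f"{fields[9]}#{fields[10]}"         # repeat name + class/family
        yield (chrom, start, end, name, score, strand)
```

Each tuple can be written out tab-separated, and the resulting file passed to something like `bedtools getfasta -fi genome.fa -bed repeats.bed -s -name -fo repeats.fa` (the file names here are placeholders). The exact FASTA header produced by `-name` differs between bedtools versions, so it is worth checking the output headers before relying on them.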

written 3 months ago by Asaf

Thanks for the comment. I have already extracted the FASTA sequences. I guess the way forward would be to do some sort of clustering on the sequences, but I am just not sure about that.

written 3 months ago by Amaranta Remedios

You should have the name of the repeat for each hit. You can start by grouping the sequences on that name and then get a consensus for each group.
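As a rough illustration of grouping by repeat name and then calling a consensus per group, here is a minimal sketch. It assumes each FASTA header begins with the repeat name, terminated by `#`, `::` or whitespace (adjust the pattern to your own headers), and that the sequences in a group have already been aligned to equal length with a multiple aligner before `majority_consensus` is applied. All function names are hypothetical, and a simple majority vote is only one way to call a consensus.

```python
import re
from collections import Counter, defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_repeat(records):
    """Group sequences by the repeat name at the start of each header.

    Assumption: the name ends at the first '#', '::' or whitespace.
    """
    groups = defaultdict(list)
    for header, seq in records:
        name = re.split(r"#|::|\s", header, maxsplit=1)[0]
        groups[name].append(seq)
    return dict(groups)


def majority_consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Columns where a gap ('-') is the most common symbol are dropped.
    """
    consensus = []
    for column in zip(*aligned):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":
            consensus.append(symbol)
    return "".join(consensus)
```

Genomic copies of one element are often truncated or diverged, so the alignment step matters more than the vote itself. Copies that share a name can also belong to distinct subfamilies, which is where the clustering idea mentioned earlier in the thread may still be useful.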

written 3 months ago by Asaf
bioinfo wrote, 29 days ago:

There is a script shipped in the RepeatMasker directory that I assume will solve your problem. It can be found here:

repeatMasker/util/queryRepeatDatabase.pl

You can use it to retrieve from the database all the repeats for the taxon you are interested in.

Run it as below:

util/queryRepeatDatabase.pl -species YourSpecies > YourSpecies_repetitions.lib