Question: Searching for transposable elements using Ion Torrent short single-end reads and assembled data from a plant
2.8 years ago by
mirza80
India
mirza80 wrote:

Hello everyone,

I have Ion Torrent short single-end reads and assembled data (CLC Genomics Workbench) from a plant. Can anyone tell me:

1. Should I use the reads or the assembled contigs?

2. What are the best tools for such data?

3. Where can I find links on how to use/run the suggested software?

clc transposon iontorrent • 1.3k views
modified 2.8 years ago • written 2.8 years ago by mirza80
2.8 years ago by
SES8.1k
Vancouver, BC
SES8.1k wrote:

1. Use the raw reads.

2. That depends a little on what it is you want to do and the species, but RepeatExplorer and Transposome can be used.

3. RepeatExplorer and Transposome are a good place to start.

For some context, genome assembly is complicated by repeats, and the regions that are typically missing, compressed, or misassembled are the repetitive regions. Therefore, you don't want to do a low-coverage assembly to look for repeats. If you have a high-quality draft supported by genetic and physical maps, cytogenetic data, etc., then use the assembly. If not, you are going to be telling lies!

RepeatExplorer and Transposome (developed by myself) were both designed around solving problems with plant genomes, so this is an ideal use case. RepeatExplorer underestimates the repeat abundance (sometimes by a lot), so this is something important to consider if you are thinking of making a biological or evolutionary study. On the other hand, it may be easier to use (web vs. command line) depending on your background, albeit much slower. I don't have experience running RepeatExplorer with single-end data, but I can tell you that Transposome seems to do better, in terms of biological expectations, with long reads including single-end, so this should work well. If you have any questions, feel free to ask.

written 2.8 years ago by SES8.1k

Has anyone developed a k-mer based approach to estimating TE abundance as % of the genome?

written 2.8 years ago by apelin20470

Yes, you can use k-mer frequency to look at repeat properties in the genome (uniqueness, occurrence ratios, etc.) but this is only really informative (biologically) when combined with information about what TEs are in a genome. Without that, you can't say a whole lot based on k-mers alone. This approach is super useful for comparative purposes though, and for visualizing the genomic abundance of repeats (in Fig. 2 of this paper I did this to show the genomic abundance of repeats in a BAC).
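
To make the occurrence-ratio idea concrete, here is a minimal Python sketch. It is not part of Transposome or RepeatExplorer. It counts k-mers in a few toy reads and reports the fraction of k-mer occurrences that come from k-mers seen at least `min_count` times. The function names, `k=4`, and the threshold are illustrative assumptions, and the code ignores reverse complements and sequencing depth, so the result is only a rough repetitiveness index and not an estimate of TE abundance.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (forward strand only) across all reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers that contain ambiguous bases.
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def repetitive_fraction(counts, min_count):
    """Fraction of k-mer occurrences belonging to k-mers seen >= min_count times."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c >= min_count)
    return repeated / total


# Toy input (hypothetical reads): the first read is a tandem repeat, the second is unique.
reads = ["ACGTACGTACGT", "TTGACCA"]
counts = count_kmers(reads, k=4)
print(counts.most_common(3))
print(repetitive_fraction(counts, min_count=2))
```

On real data you would likely want a much larger k and a dedicated k-mer counter, since a Python dictionary becomes memory-hungry quickly. As noted above, the numbers only become biologically meaningful once you also know which TEs are present.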

written 2.8 years ago by SES8.1k

Thanks, SES, for your reply. I went through your tool's home page, and it says that it's for paired-end reads?

written 2.8 years ago by mirza80

If you are referring to Transposome, see this page for how to use long read data (those directions should be good for Ion Torrent data assuming read lengths ~400bp).

written 2.8 years ago by SES8.1k
2.8 years ago by
mirza80
India
mirza80 wrote:

Hi,

Sorry for the late reply.

I have single-end Ion Torrent data, not paired-end data. Is it possible to run it in Transposome?

written 2.8 years ago by mirza80

Yes, sorry I thought it was clear above (in my comment). The only thing that is different is changing the default overlaps for long reads. Otherwise, you can run it just the same as with paired-end data (and the results will be comparable).

written 2.8 years ago by SES8.1k

Hello Staton,

I was trying to install your tool, Transposome, using the commands given. I successfully installed all the dependencies, but I am stuck at the last step of the installation. I am copying the commands and the error here. I'd appreciate your help.

smarla@smarla-HP-Z400-Workstation:~$ cd Transposome
smarla@smarla-HP-Z400-Workstation:~/Transposome$ perl Makefile.PL
g++ -o graph_binary.o -c graph_binary.cpp -ansi -O5 -Wall
g++ -o community.o -c community.cpp -ansi -O5 -Wall
g++ -o main_community.o -c main_community.cpp -ansi -O5 -Wall
g++ -o louvain_community graph_binary.o community.o main_community.o -ansi -lm -Wall
g++ -o graph.o -c graph.cpp -ansi -O5 -Wall
g++ -o main_convert.o -c main_convert.cpp -ansi -O5 -Wall
g++ -o louvain_convert graph.o main_convert.o -ansi -lm -Wall
g++ -o main_hierarchy.o -c main_hierarchy.cpp -ansi -O5 -Wall
g++ -o louvain_hierarchy main_hierarchy.o -ansi -lm -Wall
Generating a Unix-style Makefile
Writing Makefile for Transposome
Writing MYMETA.yml and MYMETA.json
smarla@smarla-HP-Z400-Workstation:~/Transposome$ make
Skip blib/lib/Transposome/SeqUtil.pm (unchanged)
Skip blib/lib/Transposome/Cluster.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Mapping.pm (unchanged)
Skip blib/lib/Transposome.pm (unchanged)
Skip blib/lib/Transposome/Test/TestFixture/TestConfig.pm (unchanged)
Skip blib/lib/Transposome/SeqIO.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Typemap.pm (unchanged)
Skip blib/lib/Transposome/Role/Config.pm (unchanged)
Skip blib/lib/Transposome/PairFinder.pm (unchanged)
Skip blib/lib/Transposome/SeqFactory.pm (unchanged)
Skip blib/lib/Transposome/Run/Blast.pm (unchanged)
Skip blib/lib/Transposome/SeqIO/fastq.pm (unchanged)
Skip blib/lib/Transposome/Annotation.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Methods.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Search.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Summary.pm (unchanged)
Skip blib/lib/Transposome/Role/File.pm (unchanged)
Skip blib/lib/Transposome/SeqIO/fasta.pm (unchanged)
Skip blib/lib/Transposome/Role/Types.pm (unchanged)
Skip blib/lib/Transposome/Role/Util.pm (unchanged)
Skip blib/lib/Transposome/Test/TestFixture.pm (unchanged)
cp bin/louvain_community blib/bin/louvain_community
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/louvain_community
cp bin/formatdb blib/bin/formatdb
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/formatdb
cp bin/louvain_convert blib/bin/louvain_convert
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/louvain_convert
cp bin/transposome blib/bin/transposome
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/transposome
cp bin/mgblast blib/bin/mgblast
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/mgblast
cp bin/louvain_hierarchy blib/bin/louvain_hierarchy
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/louvain_hierarchy
Manifying 1 pod document
Manifying 21 pod documents
smarla@smarla-HP-Z400-Workstation:~/Transposome$ make test
PERL_DL_NONLAZY=1 "/usr/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/00-load.t ............. 7/13 # Testing Transposome 0.09.8, Perl 5.014002, /usr/bin/perl
t/00-load.t ............. ok     
t/01-utils_config.t ..... ok   
t/02-utils_seq.t ........ ok     
t/03-utils_blast.t ...... ok   
t/04-seqio.t ............ ok     
t/05-seqio-fasta-fh.t ... ok    
t/06-seqio-fastq-fh.t ... ok    
t/07-seqstore.t ......... ok     
t/08-seqsample.t ........ ok   
t/09-megablast.t ........ ok   
t/10-pairfinder.t ....... ok       
t/11-cluster.t .......... ok    
t/12-annotation.t ....... ok     
t/13-allmethods.t ....... ok     
t/14-analysis_steps.t ... ok     
t/15-transposome_app.t .. ok   
All tests successful.
Files=16, Tests=1052, 95 wallclock secs ( 0.17 usr  0.03 sys + 14.14 cusr  3.46 csys = 17.80 CPU)
Result: PASS
smarla@smarla-HP-Z400-Workstation:~/Transposome$ make install
Manifying 1 pod document
Manifying 21 pod documents
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ERROR: Can't create '/usr/local/bin'
Do not have write permissions on '/usr/local/bin'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 at -e line 1.
make: *** [pure_site_install] Error 13
smarla@smarla-HP-Z400-Workstation:~/Transposome$

 

written 2.6 years ago by mirza80

This is weird, nothing gets installed into /usr/local/bin or under /usr/local so I'm not sure. What OS/distribution are you using?

written 2.6 years ago by SES8.1k

I am using Ubuntu 12.04 LTS

written 2.6 years ago by mirza80

I tested on Ubuntu 12.04 and it does install under /usr/local on that system if you are using the system Perl. You just need to type "sudo make install" to install it. Otherwise, I suggest setting up perlbrew (it is very easy, and there are copy-and-paste commands to do it on the Transposome wiki under "installing dependencies") so you don't need admin to do anything with Perl.

modified 2.6 years ago • written 2.6 years ago by SES8.1k

Hi,

I used "sudo make install" and here is the result:

Appending installation info to /usr/lib/perl/5.14/perllocal.pod
smarla@smarla-HP-Z400-Workstation:~/Transposome-0.09.7$

It's installed, thanks! :)

Now, I have the following questions:

1. What should be the size of my input file?

2. What's the better strategy: using raw reads or contigs?

written 2.6 years ago by mirza80

1. I would start with 100,000 reads that are sampled from the whole data set. Then, you can increase after seeing how long the analysis will take.

2. Definitely use raw reads.
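
One way to draw such a random subset without loading the whole file into memory is reservoir sampling. The sketch below is a generic Python illustration, not a Transposome feature. The names `read_fastq` and `reservoir_sample` are hypothetical, and the code assumes a plain four-line-per-record FASTQ file with no blank lines.

```python
import random


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            # End of file (assumes there are no blank lines between records).
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, not needed
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def reservoir_sample(records, n, seed=None):
    """Pick n records uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            sample.append(rec)
        else:
            # Replace a random slot with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                sample[j] = rec
    return sample


def write_fastq(records, handle):
    """Write (header, sequence, quality) tuples back out as FASTQ."""
    for header, seq, qual in records:
        handle.write(f"{header}\n{seq}\n+\n{qual}\n")
```

With a file opened as `fh`, `reservoir_sample(read_fastq(fh), 100000, seed=42)` would return 100,000 reads (or every read, if the file holds fewer). Fixing the seed keeps the subset reproducible, which helps when comparing runs at different sample sizes.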

written 2.6 years ago by SES8.1k

Thanks, SES, I appreciate it.

written 2.6 years ago by mirza80
2.8 years ago by
mirza80
India
mirza80 wrote:

Ok thanks!!

written 2.8 years ago by mirza80