Question: Transcriptome reconstruction from both short reads and long sequences
Prakki Rama (Singapore), 2.4 years ago, wrote:

Hi all, could I please know whether there is any tool that reconstructs a transcriptome against a reference genome using both short reads and long sequences from Pacific Biosciences simultaneously, in one go?

I need something like Scripture, but it is limited to short reads. I would prefer a tool that can deal with both short and long reads at the same time. Thanks in advance for your suggestions.

Tags: mapping, sam, bam
mark.ziemann (Monash University, Melbourne, Australia), 2.4 years ago, wrote:

How long are your reads?

You can use pre-aligned data (BAM files) with Scripture.

http://www.broadinstitute.org/software/scripture/Walkthrough_example

Use a dedicated long-read aligner to generate the alignment and then use Scripture to do the reconstruction. STAR might be a good option for alignment depending on error rates.


Prakki Rama replied:

Thank you. I want to use both Illumina and PacBio data. The example shows only Illumina reads mapped to the genome. So you are saying that as long as the data are pre-aligned (BAM files), it should be OK? The mean length of my long reads is 6.3 kb.

mark.ziemann replied:

Recent versions of STAR (e.g. 2.4.1c) are distributed with STARlong, which is optimised for reads >200 bp in length. There isn't anything in the manual about it, but here is a comment from the author about it. You may want to use standard STAR for the Illumina reads and STARlong for the PacBio reads.
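A minimal sketch of the STAR plus STARlong suggestion, assuming a STAR genome index already exists in a directory called star_index. All file names are hypothetical placeholders, not taken from this thread. Merging the two BAM files with samtools is an assumption about how to hand both alignments to a downstream reconstruction tool, and STARlong will probably need extra long-read tuning parameters that are not shown here. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch: align Illumina reads with STAR and PacBio reads with STARlong,
# then merge the two sorted BAM files. Paths are hypothetical placeholders.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = also execute them

run() {
    # Print the command line; execute it only when DRY_RUN=0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

align_reads() {
    # $1 = aligner binary (STAR or STARlong), $2 = genome index directory,
    # $3 = output prefix, remaining arguments = read file(s)
    aligner="$1"; index="$2"; prefix="$3"
    shift 3
    run "$aligner" --runThreadN 8 --genomeDir "$index" \
        --readFilesIn "$@" \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix "$prefix"
}

align_reads STAR     star_index illumina_ illumina_R1.fastq illumina_R2.fastq
align_reads STARlong star_index pacbio_   pacbio.fastq

# STAR names its coordinate-sorted output <prefix>Aligned.sortedByCoord.out.bam
run samtools merge merged.bam \
    illumina_Aligned.sortedByCoord.out.bam \
    pacbio_Aligned.sortedByCoord.out.bam
run samtools index merged.bam
```

Keeping the two alignments in separate BAM files until the final merge makes it easy to inspect the mapping rate of each library on its own first.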
5heikki (Finland), 2.4 years ago, wrote:

A de novo approach:

idba_tran -r "$1" -l "$2" -o "$3" --num_threads 16 --mink 20 --maxk 100 --step 5

where:
-r  paired-end reads in interleaved FASTA
-l  long reads in FASTA
-o  output directory
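
The -r option expects the paired-end reads as one interleaved FASTA file. If the reads are in two FASTQ files, something like the following could produce that input. This is only a sketch: it assumes plain 4-line FASTQ records in the same order in both files, and the function name and file names are made up for illustration. IDBA also ships an fq2fa utility with a --merge option for the same job.

```shell
# Sketch: interleave two paired FASTQ files into one FASTA stream.
# Assumes 4-line FASTQ records, with mates in the same order in both files.
interleave_fastq_to_fasta() {
    # $1 = R1 FASTQ, $2 = R2 FASTQ; interleaved FASTA is written to stdout
    awk -v mate="$2" '
        NR % 4 == 1 { h1 = substr($0, 2) }      # R1 header without the leading "@"
        NR % 4 == 2 {
            # Read the matching 4-line record from the mate file
            getline h2 < mate
            getline s2 < mate
            getline junk < mate                  # "+" separator line
            getline junk < mate                  # quality line
            print ">" h1; print $0
            print ">" substr(h2, 2); print s2
        }
    ' "$1"
}

# Hypothetical usage (placeholder file names):
# interleave_fastq_to_fasta illumina_R1.fastq illumina_R2.fastq > reads_interleaved.fa
# idba_tran -r reads_interleaved.fa -l pacbio.fa -o idba_out --num_threads 16 --mink 20 --maxk 100 --step 5
```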

Prakki Rama replied:

I have not tried this; I will have to check. Thank you.
CraigM (Ireland), 2.4 years ago, wrote:

How about MIRA?

It is an EST assembler that can perform hybrid assemblies using reads of different lengths from different platforms.

I do not have first-hand experience with this tool yet, but I believe it can do what you are looking for.

http://sourceforge.net/p/mira-assembler/wiki/Home/

 

A list of software for PacBio assembly, including hybrid assemblies, can be found here https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/Large-Genome-Assembly-with-PacBio-Long-Reads 


Prakki Rama replied:

MIRA appears to be a de novo assembler, whereas I want to use the genome as a reference. I will have to check the tool. Thanks for the suggestion.