NaS (Nanopore Synthetic-long) help
7.5 years ago
midox ▴ 290

Hello,
I am trying to set up NaS, a hybrid approach developed to take advantage of data generated with the MinION device.
I have started installing the prerequisites, but the last two points seem a little fuzzy to me.

  • Blat binary (at least v35) available through your PATH environment variable.
  • Last binary (at least 502) available through your PATH environment variable.

Do you have any idea how to do this?
Thank you.

Assembly preprocessing program • 3.4k views

hello,

I'm still working on setting up NaS, but here is a new problem:

Number of parallel task : 5
[mar. juin 16 13:54:04 CEST 2015] Create output directory : NaS_example
[mar. juin 16 13:54:04 CEST 2015] Create fasta file from fastq...
[mar. juin 16 13:54:50 CEST 2015] Alignement step in fast mode...
[mar. juin 16 13:54:55 CEST 2015] Select reads...
[mar. juin 16 13:54:55 CEST 2015] Retrieve similar reads...
[mar. juin 16 13:54:55 CEST 2015] Generate NaS reads...
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence the citation notice: run 'parallel --bibtex'.

Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence the citation notice: run 'parallel --bibtex'.

cat: NaS_example/assemblies/*/NaS_hqctg_reads_final.fa: Aucun fichier ou dossier de ce type
[mar. juin 16 13:54:56 CEST 2015] Generate statistics...

** WARNING **: Warning zero length sequence []
awk: (FILENAME=- FNR=1) Fatal: tentative de division par zéro
gawk: (FILENAME=- FNR=1) Fatal: tentative de division par zéro
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  1  CumulativeSize=  0  N50size=    minSize=  maxSize=  maxSize=  0  avgSize=    =>  NaS_example/NaS_hqctg_reads.stats

Do you know how to fix this, given that the parallel module is installed?


hello,

This message from parallel appears as long as you have not run parallel --bibtex. Run it once and the message will disappear. But it is only a notice: parallel works, and your main problem is still with blat.


hello,

Thank you for your help, I have resolved the parallel problem.

But as you said, I think there is another problem. I am not sure it comes from Blat, because I have tested blat and it works.

Number of parallel task : 5

[mer. juin 17 09:42:51 CEST 2015] Create output directory : NaS_example
[mer. juin 17 09:42:51 CEST 2015] Create fasta file from fastq...
[mer. juin 17 09:43:35 CEST 2015] Alignement step in fast mode...
[mer. juin 17 09:43:47 CEST 2015] Select reads...
[mer. juin 17 09:43:47 CEST 2015] Retrieve similar reads...
[mer. juin 17 09:43:47 CEST 2015] Generate NaS reads...
cat: NaS_example/assemblies/*/NaS_hqctg_reads_final.fa: Aucun fichier ou dossier de ce type
[mer. juin 17 09:43:48 CEST 2015] Generate statistics...

** WARNING **: Warning zero length sequence []
awk: (FILENAME=- FNR=1) Fatal: tentative de division par zéro
gawk: (FILENAME=- FNR=1) Fatal: tentative de division par zéro
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  1  CumulativeSize=  0  N50size=    minSize=  maxSize=  maxSize=  0  avgSize=    =>  NaS_example/NaS_hqctg_reads.stats

Here is my link to the NaS_example folder http://uptobox.com/csx3wu3gozu9

Can you help me with this?

Thank you

7.5 years ago

Hello,

fastalength and fastacomposition are utilities from EBI.

They are required by NaS. Copies are included in the NaS directory, but they might not be compatible with the glib version of your OS. You can compile them yourself from the source code available here (http://www.ebi.ac.uk/~guy/exonerate/).

Once the binaries are compiled, copy them (FastaToTbl, TblToFasta, fastacomposition, fastalength) into the NaS directory and run NaS again.
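
A minimal sketch for checking the copy step. It only verifies that the four helper programs named above are present and executable in a given directory. The function name check_nas_helpers and the example directory are illustrative and not part of NaS.

```shell
# Report which of the four helper programs named in this thread are
# missing or not executable in a directory (hypothetical helper, not part of NaS).
check_nas_helpers() {
    dir=$1
    status=0
    for tool in FastaToTbl TblToFasta fastacomposition fastalength; do
        if [ -x "$dir/$tool" ]; then
            echo "ok       $dir/$tool"
        else
            echo "missing  $dir/$tool"
            status=1
        fi
    done
    return "$status"
}

# Example call; adjust the directory to your own NaS checkout.
# check_nas_helpers ./NaS_v2
```

A non-zero return value means at least one helper still has to be copied or made executable.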


Thank you for your help. I recompiled them, but the problem remains:

root@midox-VGN-NS11S-S:/home/midox/Bureau/NaS-master# $(pwd)/NaS_v2/NaS --fq1 /home/midox/Bureau/NaS_example_acineto/AWK_DOSF_1_1_A5KR6.IND3_clean.10prc.fastq --fq2 /home/midox/Bureau/NaS_example_acineto/AWK_DOSF_1_2_A5KR6.IND3_clean.10prc.fastq --nano /home/midox/Bureau/NaS_example_acineto/MinION_reads_Acinetobacter_baylyi.fa --out NaS_example --nb_proc 5
Number of parallel task : 5
[lundi 8 juin 2015, 16:05:18 (UTC+0200)] Create output directory : NaS_example
[lundi 8 juin 2015, 16:05:18 (UTC+0200)] Create fasta file from fastq...
[lundi 8 juin 2015, 16:07:51 (UTC+0200)] Alignement step in fast mode...
[lundi 8 juin 2015, 16:07:58 (UTC+0200)] Select reads...
[lundi 8 juin 2015, 16:07:58 (UTC+0200)] Retrieve similar reads...
[lundi 8 juin 2015, 16:07:58 (UTC+0200)] Generate NaS reads...
cat: NaS_example/assemblies/*/NaS_hqctg_reads_final.fa: No such file or directory
[lundi 8 juin 2015, 16:08:00 (UTC+0200)] Generate statistics...

** (process:25124): WARNING **: Warning zero length sequence []
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted
gawk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  1  CumulativeSize=  0  N50size=    minSize=  maxSize=  maxSize=  0  avgSize=    =>  NaS_example/NaS_hqctg_reads.stats

I don't know what kind of problem this is.


Can you send me your NaS_example directory, please ?


here's a link to my NaS example (compressed).
http://uptobox.com/opp5tjpxxq35
thank you for your help


In your file NaS_example/tmp/blat-alignement.stderr:

/bin/bash: line 3: blat: command not found

Are you sure that BLAT is accessible through your PATH?

If you type blat in a terminal, what happens?
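
The two checks above can be sketched as small shell functions. This assumes that the log files in the NaS tmp directory end in .stderr, as the one quoted here does. The function names are illustrative.

```shell
# Print every non-empty *.stderr file in a directory, so that messages
# such as "command not found" are easy to spot.
show_stderr_logs() {
    tmpdir=$1
    for f in "$tmpdir"/*.stderr; do
        # -s is true only for files that exist and are not empty.
        if [ -s "$f" ]; then
            echo "== $f"
            cat "$f"
        fi
    done
}

# Succeed only if the given program can be found through PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Example calls:
# show_stderr_logs NaS_example/tmp
# on_path blat || echo "blat is not on PATH"
```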


When I type blat on the command line, it says "blat: command not found", even though I installed it.

I don't know what the problem with Blat is.


In the NaS script, line 42:

PATH=/env/cns/opt/454-2.9/bin/:/env/cns/src/blat/blat_v35/bin/linux/:$PATH

Replace /env/cns/src/blat/blat_v35/bin/linux/ with the path to the directory containing your blat binaries, then try to run NaS again.
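
A minimal sketch of that replacement with sed, for anyone who prefers not to edit the script by hand. The function name and the new directory are placeholders. It assumes the new path contains no "|" or "&" characters, which would confuse sed.

```shell
# Hard-coded blat directory quoted above.
OLD_DIR=/env/cns/src/blat/blat_v35/bin/linux/

# Replace OLD_DIR in the given script with new_dir, keeping a .bak copy.
replace_blat_dir() {
    script=$1
    new_dir=$2
    # Writing back into the original file keeps its executable bit.
    # "|" is the sed delimiter because the paths contain "/".
    cp -p "$script" "$script.bak" &&
        sed "s|$OLD_DIR|$new_dir|" "$script.bak" > "$script"
}

# Example call (placeholder paths):
# replace_blat_dir NaS_v2/NaS "$HOME/tools/blat/"
```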

7.5 years ago
george.ry ★ 1.2k

LAST: http://last.cbrc.jp/

BLAT: http://hgdownload.soe.ucsc.edu/admin/exe/

 

Download both and then add the directories to your PATH - probably easiest set in your bashrc file (http://unix.stackexchange.com/questions/26047/how-to-correctly-add-a-path-to-path) - or add a link to the binaries themselves into /usr/local/bin/ .
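
A minimal sketch of the PATH approach, suitable for a bashrc file. The directories in the example calls are placeholders for wherever the binaries were unpacked, and the function name is illustrative.

```shell
# Prepend a directory to PATH unless it is already listed there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Example calls; both directories are placeholders.
# prepend_path "$HOME/tools/blat"
# prepend_path "$HOME/tools/last/bin"
```

Open a new terminal afterwards (or source the file) so the change takes effect.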

7.4 years ago

In the file NaS_example/tmp/blat-alignment.stderr:

sh: -c: line 0: Erreur de syntaxe près du symbole inattendu « ( »
sh: -c: line 0: `blat -tileSize=10 -stepSize=5 -noHead /scratch/mkchouk/testdata/NaS_example_acineto/MinION_reads_Acinetobacter_baylyi.fa stdin >(cat) >&2'

To understand it, I would like to reproduce this error in my test environment. Which versions do you have of:

linux?
bash? (run bash --version)
parallel?
blat?

Also, what happens if you run NaS with the option --mode sensitive?


I resolved the problem and NaS ran successfully. Thank you, Guillaume Gautreau44.

This is the output:

Number of parallel task : 5
[mer. juin 17 15:05:00 CEST 2015] Create output directory : NaS_example
[mer. juin 17 15:05:00 CEST 2015] Create fasta file from fastq...
[mer. juin 17 15:05:40 CEST 2015] Alignement step in fast mode...
[mer. juin 17 15:05:44 CEST 2015] Select reads...
[mer. juin 17 15:05:44 CEST 2015] Retrieve similar reads...
[mer. juin 17 15:07:01 CEST 2015] Generate NaS reads...
mkdir: impossible de créer le répertoire « NaS_example/assemblies/channel_101_read_1_twodirections »: Le fichier existe
mkdir: impossible de créer le répertoire « NaS_example/assemblies/channel_103_read_6_twodirections »: Le fichier existe
mkdir: impossible de créer le répertoire « NaS_example/assemblies/channel_100_read_10_twodirections »: Le fichier existe
mkdir: impossible de créer le répertoire « NaS_example/assemblies/channel_102_read_15_twodirections »: Le fichier existe
mkdir: impossible de créer le répertoire « NaS_example/assemblies/channel_103_read_2_twodirections »: Le fichier existe
[mer. juin 17 15:07:11 CEST 2015] Generate statistics...
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  5  CumulativeSize=  38036  N50size=  9743  minSize=  1982  maxSize=  12787  avgSize=  7607.2  =>  NaS_example/NaS_hqctg_reads.stats

But in the example on GitHub, the output for the same dataset is:

NbReads= 5 CumulativeSize= 31008 N50size= 7994 minSize= 2512 maxSize= 10464 avgSize= 6201.6 => /env/cns/home/ggautrea/NaS_example/NANO_reads.stats
NbReads= 4 CumulativeSize= 34867 N50size= 9707 minSize= 4263 maxSize= 11971 avgSize= 8716.75 => /env/cns/home/ggautrea/NaS_example/NaS_hqctg_reads.stats

So is NaS working properly or not, given that we don't get the same output?


Did you remove the previous NaS_example directory before running NaS?

Maybe your version of Newbler or blat is not the same as the one used for the example.
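
A minimal sketch for starting from a clean output directory, as asked above. The function name is illustrative. The argument is whatever directory you pass to NaS with --out (NaS_example in this thread).

```shell
# Remove a previous output directory so that a new run starts clean.
prepare_outdir() {
    out=$1
    # Never call rm -rf with an empty argument.
    [ -n "$out" ] || { echo "no output directory given" >&2; return 1; }
    if [ -e "$out" ]; then
        echo "removing previous $out" >&2
        rm -rf -- "$out"
    fi
}

# Example call:
# prepare_outdir NaS_example
```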


Yes, I removed the previous NaS_example directory.

When I run NaS in sensitive mode (--mode sensitive), it runs successfully.

[mer. juin 17 15:07:11 CEST 2015] Generate statistics...
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  5  CumulativeSize=  38036  N50size=  9743  minSize=  1982  maxSize=  12787  avgSize=  7607.2  =>  NaS_example/NaS_hqctg_reads.stats

(I don't know whether these are the right results.)

But without --mode sensitive, I get the same error :(

Number of parallel task : 5
[mer. juin 17 15:53:52 CEST 2015] Create output directory : NaS_example
[mer. juin 17 15:53:52 CEST 2015] Create fasta file from fastq...
[mer. juin 17 15:54:33 CEST 2015] Alignement step in fast mode...
[mer. juin 17 15:54:37 CEST 2015] Select reads...
[mer. juin 17 15:54:37 CEST 2015] Retrieve similar reads...
[mer. juin 17 15:54:37 CEST 2015] Generate NaS reads...
cat: NaS_example/assemblies/*/NaS_hqctg_reads_final.fa: Aucun fichier ou dossier de ce type
[mer. juin 17 15:54:37 CEST 2015] Generate statistics...

** WARNING **: Warning zero length sequence []
awk: (FILENAME=- FNR=1) Fatal: tentative de division par zéro
gawk: (FILENAME=- FNR=1) Fatal: tentative de division par zéro
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  1  CumulativeSize=  0  N50size=    minSize=  maxSize=  maxSize=  0  avgSize=    =>  NaS_example/NaS_hqctg_reads.stats

In the file NaS_example/tmp/blat-alignment.stderr:

sh: -c: line 0: Erreur de syntaxe près du symbole inattendu « ( »
sh: -c: line 0: `blat -tileSize=10 -stepSize=5 -noHead /scratch/mkchouk/testdata/NaS_example_acineto/MinION_reads_Acinetobacter_baylyi.fa stdin >(cat) >&2'

I have GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)

I have blat/35 and parallel/20150522

I think the NaS problem is not resolved yet.

Thank you


I think the problem is in the NaS script, in the part of the command wrapped in (cat).
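
The >(cat) in the failing command is bash process substitution, which a strictly POSIX sh does not understand. Whether that is what happened here is not confirmed in this thread. The sketch below only checks which shells on a machine accept the syntax. The function name is illustrative.

```shell
# Return success if the given shell accepts process substitution.
supports_process_substitution() {
    "$1" -c 'cat <(echo ok)' >/dev/null 2>&1
}

for sh_name in sh bash; do
    if supports_process_substitution "$sh_name"; then
        echo "$sh_name: process substitution works"
    else
        echo "$sh_name: process substitution not available (or shell not installed)"
    fi
done
```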

7.4 years ago

I use parallel/20130122 and it works, but this problem with your version is not normal and will be fixed.

Can you send me just the file NaS_hqctg_reads.fa from sensitive mode, so I can check the quality?


I tested parallel and it works.

Here's a link to NaS_hqctg_reads.fa from sensitive mode.

http://www47.zippyshare.com/v/tpnw6Uy5/file.html


NaS doesn't work in fast mode with this version of parallel; this must be fixed.

You can download the reference assembly of Acinetobacter baylyi here: http://www.genoscope.cns.fr/externe/nas/references/acineto/

Use bwa mem with the option -x ont2d to compare the Acinetobacter NaS reads with the reference.

For your 5 reads corrected by NaS, here are the alignment statistics of the corrected reads against the reference:

Number of reads                          : 5
Number of reads (>10Kb)                  : 1
Number of bp                             : 38036
Average size of reads                    : 7607.2
N50 size of reads                        : 9743
Max size of reads                        : 12787
######
Number of aligned reads                  : 5 (100%)
Number of aligned bp                     : 38036 (100%)
Average identity percent                 : 100%
Max alignement size                      : 12787
Number of aligned reads L=100%           : 5 (100%)
Number of aligned reads ID=100%          : 5 (100%)
Number of aligned reads L=100% ; ID=100% : 5 (100%)
Number of loci                           : 5
Reference size                           : 3598621
Coverage of reference                    : 38041 (1.05%)
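
A minimal sketch of the bwa step mentioned above. The file names in the example call are placeholders (the actual name of the downloaded reference file and the location of NaS_hqctg_reads.fa are assumptions), and the function name is illustrative. It only produces a SAM file and does not reproduce the statistics table.

```shell
# Index a reference and align reads to it with the -x ont2d preset.
align_to_reference() {
    ref=$1
    reads=$2
    out=$3
    command -v bwa >/dev/null 2>&1 || { echo "bwa not found on PATH" >&2; return 127; }
    [ -r "$ref" ] && [ -r "$reads" ] || { echo "cannot read input files" >&2; return 1; }
    # bwa mem needs an index of the reference, built once with bwa index.
    bwa index "$ref" &&
        bwa mem -x ont2d "$ref" "$reads" > "$out"
}

# Example call (placeholder file names):
# align_to_reference acineto_reference.fa NaS_example/NaS_hqctg_reads.fa nas_vs_ref.sam
```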

So I can't use NaS with the parallel module parallel/20150522?

And since it works in sensitive mode, can I use that mode of NaS for my tests?

Thank you


For sensitive mode, yes you can; you get better results than the GitHub example ;)

Fast mode will be fixed soon.


OK, thank you Guillaume.


The fast mode problem with parallel is fixed; update NaS ;)


Thank you Guillaume,

NaS now works very well, but here are the results:

Number of parallel task : 5
[jeu. juin 18 15:50:05 CEST 2015] Create output directory : NaS_example
[jeu. juin 18 15:50:05 CEST 2015] Create fasta file from fastq...
[jeu. juin 18 15:50:42 CEST 2015] Alignement step in fast mode...
[jeu. juin 18 15:50:49 CEST 2015] Select reads...
[jeu. juin 18 15:50:49 CEST 2015] Retrieve similar reads...
[jeu. juin 18 15:52:35 CEST 2015] Generate NaS reads...
[jeu. juin 18 15:52:39 CEST 2015] Generate statistics...
NbReads=  5  CumulativeSize=  31008  N50size=  7994  minSize=  2512  maxSize=  10464  avgSize=  6201.6  =>  NaS_example/NANO_reads.stats
NbReads=  4  CumulativeSize=  17401  N50size=  4786  minSize=  1573  maxSize=  8242  avgSize=  4350.25  =>  NaS_example/NaS_hqctg_reads.stats

They are not like the example. Is that normal?


Is NaS usable for large genomes? I want to use it for plant genomes.


I fixed a bug: several blat processes were working on the same file at the same time, which is why your stats differed from the example. You can now update NaS again :)

NaS has been tested on genomes up to ~20 Mb, like yeast. It may work on a small plant genome like Arabidopsis thaliana, but not on a large genome with a lot of repeats. Genoscope is currently working on improving NaS to deal with larger genomes.

I advise you to split your FASTA dataset of nanopore reads into portions of ~20 Mb rather than providing all the data to NaS at once.
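
A minimal sketch of such a split with awk. It assumes a well-formed FASTA file and counts sequence bases, so the limit is in bases rather than file bytes. The function name, output naming scheme and example file names are illustrative.

```shell
# Split a FASTA file into chunks of roughly max_bases each.
# Records are never cut: a new chunk starts at the first header seen
# after the current chunk has reached the limit.
split_fasta() {
    # $1 = input FASTA, $2 = output prefix, $3 = max bases per chunk
    awk -v prefix="$2" -v max="$3" '
        /^>/ {
            if (n == 0 || size >= max) {
                if (out != "") close(out)
                n++
                out = sprintf("%s.%03d.fa", prefix, n)
                size = 0
            }
            print > out
            next
        }
        {
            size += length($0)
            print > out
        }
    ' "$1"
}

# Example call: chunks of about 20 Mb (placeholder file names).
# split_fasta nanopore_reads.fa nanopore_part 20000000
```

Each resulting file can then be given to NaS in a separate run. Because records are kept whole, a chunk can exceed the limit by up to one read.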


Hello,

Does NaS work with PacBio data?

Thank you


Yes, you can. PacBio files often have special characters (like slashes) in the FASTA sequence identifiers; try renaming them if you have problems.


Yes, I have a problem.

How can I rename them?

This is an example of my PacBio sequences:

>SRR1204085.2 length=111
TTTGTTTGTGTGTGGTTTGTCTTGTTGTTTGGTTGGGGTTTCTCTTCGGCTGGTCGGCGTCTCGTGTGTCGCCTTTCTTGTGTTTGTGCGTGTGCTTGGGTTTCCTCGCTT

NaS doesn't support spaces in FASTA sequence identifiers.

To rename your FASTA sequences, try:

cat your_sequences.fasta | NaS_v2/FastaToTbl | NaS_v2/TblToFasta > your_sequences_rename.fasta

FastaToTbl: https://github.com/institut-de-genomique/NaS/blob/master/NaS_v2/FastaToTbl
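
An alternative sketch that does not depend on the NaS helper scripts. It keeps only the first word of each header and replaces slashes with underscores. Whether NaS accepts the resulting identifiers is an assumption, and the function name is illustrative.

```shell
# Keep only the first word of each FASTA header and replace "/" with "_".
simplify_fasta_ids() {
    awk '/^>/ { id = $1; gsub("/", "_", id); print id; next } { print }' "$1"
}

# Example call:
# simplify_fasta_ids your_sequences.fasta > your_sequences_rename.fasta
```

With the record shown above, the header ">SRR1204085.2 length=111" becomes ">SRR1204085.2". Check afterwards that the shortened identifiers are still unique.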


Here are my Illumina sequences:

@SEB9BZKS1:57:D16YVACXX:1:1101:2395:1997 1:N:0:ACAGTG
NATTTCTGATCTAGAACGCATAACACATACCACATCATATTAAATGAAATTCTAAGAGTAGAAGGAGCTTATTTGAGCAC
+
#4=DDFFFHGHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIJHIJJJJJJIIJJJJJJHHEHHF

I think the /1 and /2 suffixes are missing.

Is there a way to add /1 and /2 to the read names?

Thank you


Do you have paired Illumina reads in 2 files?

Delete 1:N:0:ACAGTG at the end of the header and add /1 or /2 with an awk script.

For example:

awk 'NR%4==1{print $1 "/1"; next} {print}' r1.fastq
awk 'NR%4==1{print $1 "/2"; next} {print}' r2.fastq
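
A small worked example of the same idea, applied to the read posted above. It assumes standard 4-line FASTQ records. The function name and the temporary file are illustrative.

```shell
# Keep the first word of each FASTQ header line and append a mate suffix.
add_mate_suffix() {
    # $1 = suffix ("/1" or "/2"), $2 = FASTQ file
    awk -v suffix="$1" 'NR % 4 == 1 { print $1 suffix; next } { print }' "$2"
}

# Try it on the record quoted in this thread.
demo=$(mktemp)
printf '%s\n' \
    '@SEB9BZKS1:57:D16YVACXX:1:1101:2395:1997 1:N:0:ACAGTG' \
    'NATTTCTGATCTAGAACGCATAACACATACCACATCATATTAAATGAAATTCTAAGAGTAGAAGGAGCTTATTTGAGCAC' \
    '+' \
    '#4=DDFFFHGHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIJHIJJJJJJIIJJJJJJHHEHHF' \
    > "$demo"
add_mate_suffix /1 "$demo" | head -n 1
# prints: @SEB9BZKS1:57:D16YVACXX:1:1101:2395:1997/1
```

For real data, redirect the output of each call to a new file and pass those files to NaS with --fq1 and --fq2.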

Yes, I have 2 files of Illumina reads.


OK, so use the script in my previous message to add /1 and /2.


OK, thank you Guillaume.
