Question

pick_open_reference_otus.py in QIIME 1.9.1

7.9 years ago

narasimhulu.bioinfo • 0

I have a question about QIIME. I practiced with the Illumina test data and got exactly the expected output. Now I am trying the same workflow on my own project samples. My inputs are paired-end FASTQ files and a mapping file, and I am following these steps:

join_paired_ends.py
split_libraries_fastq.py
count_seqs.py
pick_open_reference_otus.py

I ran six samples at once. The first three steps produced their outputs quickly, but the fourth step has been running for the last 3 days.

Any suggestions would be appreciated.

next-gen • 2.6k views

As long as it is "running" and has not produced an error, leave it alone.

Example datasets (like the one you used) are just that. They are meant to test/show functionality and finish in a reasonable amount of time. But real samples will always take longer.

ADD REPLY • link 7.9 years ago by GenoMax 141k

Thank you, Agata, for the early reply. The command has now been running for seven days and still shows no error. I followed these steps and commands:

Inputs: FASTQ files (six samples), Mapping_file.txt, uc_fast_params.txt

I have paired FASTQ files, and to join each pair I used the command below.

multiple_join_paired_ends.py -i /root/qiime-deploy/QIIME -o output_folder -p uc_fast_params.txt

This command generated an output folder containing one folder per sample, and each sample folder contains three files [fastqjoin.join.fastq, fastqjoin.un1.fastq, fastqjoin.un2.fastq]. After joining the paired FASTQ files, I ran the command below.

multiple_split_libraries_fastq.py -i /root/qiime-deploy/QIIME/output_folder -o output_folder --demultiplexing_method sampleid_by_file

This command generated three output files [histograms.txt, seqs.fna, split_library_log.txt].

Next I ran count_seqs.py -i seqs.fna

which gave this output:

2498936 : seqs.fna (Sequence lengths (mean +/- std): 176.4463 +/- 19.0080)
2498936 : Total

OTU picking: I am using the open-reference OTU picking protocol, which searches reads against the Greengenes database, with the QIIME default reference database [QIIME_default_reference/gg_13_8_otus]. I ran the command below. It has been running for the past 3 days and has written some output, but the run has not finished.

pick_open_reference_otus.py -o otus -i seqs.fna -p uc_fast_params.txt

ADD REPLY • link 7.9 years ago by narasimhulu.bioinfo • 0

Is the size of the output file(s) increasing? If you look at the running processes with "top", does the OTU-picking script appear to be active?
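One way to answer the first question without watching the folder by hand is to compare the total size of the output directory at two points in time. The following is only a minimal sketch: the function names are hypothetical, and it assumes the output directory is the `otus` folder passed with `-o` in the command above. A step that computes for a long time without writing anything will show no growth, so a negative result does not prove the run is stuck.

```python
import os
import time


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip anything that vanished or is not a regular file.
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def is_growing(path, wait_seconds=60):
    """Return True if the directory grew during the waiting period."""
    before = dir_size_bytes(path)
    time.sleep(wait_seconds)
    return dir_size_bytes(path) > before


# Example call (assumed output directory name "otus"):
# print(is_growing("otus", wait_seconds=300))
```

This complements "top" rather than replacing it: "top" shows whether the process is using CPU, while the size check shows whether results are still being written.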

ADD REPLY • link 7.9 years ago by GenoMax 141k

Everything looks OK to me, except that you have more than 2 million reads! That is a lot to process, and I am not surprised that it has been computing for so long, especially since you chose open-reference OTU picking, which takes much longer than, for example, closed-reference OTU picking because of the extra de novo clustering. So I would recommend checking, as genomax2 says, with top or htop whether your program is still running. If it is, wait :)

Best, Agata

ADD REPLY • link 7.9 years ago by agata88 ▴ 870

At least in older versions of QIIME, this script defaulted to the RDP classifier for taxonomy assignment, and more often than not it simply hung without any error message (the default amount of memory assigned to it wasn't nearly enough for real data sets).

ADD REPLY • link 7.9 years ago by 5heikki 11k

Thank you for the reply.

The command [pick_open_reference_otus.py -o otus -i seqs.fna -p uc_fast_params.txt] has now finished and generated output files. I then ran this command:

biom summarize-table -i otus/otu_table_mc2_w_tax_no_pynast_failures.biom

which gave this output:

Num samples: 1
Num observations: 15397
Total count: 1685277
Table density (fraction of non-zero values): 1.000
Counts/sample summary:
 Min: 1685277.0
 Max: 1685277.0
 Median: 1685277.000
 Mean: 1685277.000
 Std. dev.: 0.000
 Sample Metadata Categories: None provided
 Observation Metadata Categories: taxonomy
Counts/sample detail:
fastqjoin.join.fastq: 1685277.0

I ran six samples, but the results above show only one sample. Please help me find where my mistake is.

ADD REPLY • link updated 7.9 years ago by GenoMax 141k • written 7.9 years ago by narasimhulu.bioinfo • 0

Did you see this in output above?

Num samples: 1

Sample Metadata Categories: None provided

Sounds to me like you either did not provide a mapping file when you ran pick_otu's or the file was not valid/validated before the run.

ADD REPLY • link 7.9 years ago by GenoMax 141k

Did you resolve this issue? Looks to me like your input is just one single fasta file and maybe that's why it's being treated as one sample.
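A quick way to test this is to count reads per sample ID directly in seqs.fna before OTU picking. The sketch below assumes the sequence labels follow the `SampleID_index` convention that split_libraries_fastq.py writes (everything before the last underscore in the first header token is the sample ID). The function name is hypothetical. Note that the only sample reported in the summary above, fastqjoin.join.fastq, is also the name of the joined file inside every per-sample folder, which would be consistent with all six samples receiving the same ID.

```python
from collections import Counter


def count_reads_per_sample(fasta_lines):
    """Count sequences per sample ID from QIIME-style FASTA header lines."""
    counts = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        parts = line[1:].split()
        if not parts:
            continue  # malformed empty header
        # Label looks like "SampleID_123"; strip the trailing read index.
        sample_id = parts[0].rsplit("_", 1)[0]
        counts[sample_id] += 1
    return counts


# Example call (assumed file name from the thread):
# with open("seqs.fna") as handle:
#     for sample, n in count_reads_per_sample(handle).items():
#         print(sample, n)
```

If this prints a single sample ID, the samples were already merged at the split-libraries step, and rerunning OTU picking alone would not separate them.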

ADD REPLY • link 7.0 years ago by menglan.xiang • 0

score 0 · Answer 1 · 2016-06-02

7.9 years ago

agata88 ▴ 870

I agree, leave it alone. pick_open_reference_otus.py is a pipeline that includes many steps (for example OTU picking, alignment, etc.) that can take a while.

Best,

Agata
