Question

Pick De Novo OTUs Qiime

1

9.0 years ago

Ruth M. ▴ 10

Hi Everyone,

I'm trying to run pick_de_novo_otus.py on 50 samples (.fa files) that I have trimmed and aligned (not using QIIME). It seemed straightforward, but I can't get the input format right. I thought the input could be a comma-delimited list, but the script reports that the file 'home/Data/KK.01.fa,KK.02.fa,KK.03B.fa,KK.04.fa,KK.05.fa' doesn't exist (command below).

pick_de_novo_otus.py -i home/Data/KK.01.fa,KK.02.fa,KK.03B.fa,KK.04.fa,KK.05.fa -o home/Data/OTU&

I also tried using -i *.fa instead, but that gave the same result. When processing a large number of samples through the OTU picking, what is the proper way to input your files?

Thank you!!!

qiime OTU • 2.9k views

updated 6.6 years ago by ankit hinsu ▴ 10 • written 9.0 years ago by Ruth M. ▴ 10

0

Thanks. I'm unable to run split_libraries_fastq, so I trimmed my sequences and merged the paired reads another way, and I now have an individual .fa file for each sample rather than a single .fna file. I ended up writing a separate pick_de_novo_otus.py call for each file in a shell script and running it with bash, and it seems to be working.

9.0 years ago by Ruth M. ▴ 10

score 0 · Answer 1 · 2016-07-23

9.0 years ago

Picasa ▴ 680

You need to run split_libraries_fastq.py first.

http://qiime.org/scripts/split_libraries_fastq.html

score 0 · Answer 2 · 2018-11-22

Hi,

By now you have probably found a solution to your problem. Nevertheless, here is one for the record.

QIIME 1 requires its input in a specific layout (called QIIME-compatible format). Their tutorial "454 tutorial for de novo OTU picking" has a demultiplexing step. This step does three things: demultiplexing, quality filtering, and combining the sequences from all samples in a QIIME-compatible way.
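To make "QIIME-compatible" a bit more concrete: as far as I understand the convention, every sequence header starts with the sample ID, an underscore, and a running number, so that one combined FASTA file still records which sample each read came from. The Python sketch below only illustrates that relabeling idea. It is not QIIME's own code, and the function name `relabel_fasta` and the example reads are made up for illustration.

```python
def relabel_fasta(lines, sample_id, start=0):
    """Yield FASTA lines with each header rewritten as '>SampleID_N original_header'.

    Illustrative only: mimics the labeling convention, not QIIME's implementation.
    """
    count = start
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Prefix the original header with the sample ID and a running index.
            yield ">%s_%d %s" % (sample_id, count, line[1:])
            count += 1
        elif line:
            # Sequence lines pass through unchanged; blank lines are dropped.
            yield line


# Hypothetical reads from one sample file (e.g. KK.01.fa).
example = [">read1", "ACGT", ">read2", "GGCC"]
relabeled = list(relabel_fasta(example, "KK.01"))
print("\n".join(relabeled))
```

Because the underscore separates the sample ID from the counter, sample IDs themselves should not contain underscores (periods, as in KK.01, are fine).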

What you need is an alternative to this step.

The answer is the "add_qiime_labels.py" script. It takes a folder containing all your input files and a mapping file that records which file belongs to which sample (the same mapping file used elsewhere in QIIME, with one extra column giving the name of the file). Running it combines all the sequences in QIIME-compatible format and produces a single FASTA file, which you can use for OTU picking.
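With 50 files, writing the mapping file by hand is tedious, so here is a minimal Python sketch that generates one from the file names. Several things in it are assumptions on my part: the extra column is called `InputFileName` (any name works as long as you pass the same one to the script), the sample ID is taken to be the file name without its extension, and the barcode and primer columns are left empty because the reads are already split by sample. Whether your QIIME version accepts empty barcode fields is worth checking.

```python
import os


def build_mapping_rows(fasta_names, filename_column="InputFileName"):
    """Return mapping-file rows (header first) for a list of per-sample FASTA files."""
    header = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
              filename_column, "Description"]
    rows = [header]
    for name in sorted(fasta_names):
        # Assumption: sample ID is the file name minus its extension,
        # e.g. "KK.01.fa" -> "KK.01".
        sample_id = os.path.splitext(name)[0]
        rows.append([sample_id, "", "", name, sample_id])
    return rows


def write_mapping(rows, path):
    """Write the rows as a tab-separated mapping file."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write("\t".join(row) + "\n")


# Example with two of the file names from the question.
rows = build_mapping_rows(["KK.02.fa", "KK.01.fa"])
for row in rows:
    print("\t".join(row))
```

After saving the rows with `write_mapping(rows, "mapping.txt")`, the two QIIME calls would look roughly like `add_qiime_labels.py -i home/Data -m mapping.txt -c InputFileName -o home/Data/labeled` followed by `pick_de_novo_otus.py -i home/Data/labeled/combined_seqs.fna -o home/Data/OTU`. The `mapping.txt` and `labeled` names are placeholders, and it is worth confirming the options and the output file name with `add_qiime_labels.py -h` on your installation.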

Hope it helps!!