Hello,
I am working with FASTQ files and I want to filter them based on their alignment to reference sequences in FASTA format.
I decided to use QIIME2 for this, so I imported both the FASTA and FASTQ files into the format QIIME2 requires (QZA artifacts).
- Command to import the FASTQ files to QZA:
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path manifest_file.tsv \
--output-path query_sequences.qza \
--input-format PairedEndFastqManifestPhred33V2
- Command to import the FASTA files to QZA:
qiime tools import \
--input-path reference_sequences.fasta \
--output-path reference_sequences.qza \
--type 'FeatureData[Sequence]'
The problem comes when I run the command to filter the query sequences:
qiime quality-control exclude-seqs \
--i-query-sequences query_sequences.qza \
--i-reference-sequences reference_sequences.qza \
--p-method blast \
--p-perc-identity 0.90 \
--p-perc-query-aligned 0.90 \
--o-sequence-hits hits.qza \
--o-sequence-misses misses.qza
This command gave the following error:
(1/1) Invalid value for '--i-query-sequences': Expected an artifact of at
least type FeatureData[Sequence]. An artifact of type
SampleData[PairedEndSequencesWithQuality] was provided.
The type SampleData[PairedEndSequencesWithQuality] is the one used to import the FASTQ files to QZA, and the type FeatureData[Sequence] is the one used to import the FASTA file to QZA.
Is there a way to use qiime quality-control exclude-seqs directly with the FASTQ files?
Thank you very much.
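According to the documentation: "The exclude-seqs method aligns a set of query sequences contained in a FeatureData[Sequence] file against a set of reference sequences." So both inputs must be artifacts of type FeatureData[Sequence] built from FASTA files: one holding the query (feature) sequences and the other holding the reference sequences. An artifact of type SampleData[PairedEndSequencesWithQuality], which is what the FASTQ import produces, cannot be passed to exclude-seqs directly.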
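If you want to run exclude-seqs on the reads themselves, one possible workaround is to strip the quality scores and import the result as FeatureData[Sequence]. The following is only a minimal sketch, not an official QIIME2 procedure. It assumes every FASTQ record is exactly four lines (no wrapped sequence lines), and the helper name fastq_to_fasta and all file names are hypothetical. Forward and reverse reads would have to be converted separately, and the read IDs probably need to be unique within each file. In a typical QIIME2 workflow, FeatureData[Sequence] comes out of a denoising or dereplication step instead, so this route may be impractical for large datasets.

```shell
# Hypothetical helper: convert 4-line FASTQ records on stdin to FASTA on stdout.
# Assumes each record is exactly four lines (header, sequence, '+', quality).
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { sub(/^@/, ">"); print $1 }  # header: "@id desc" -> ">id"
        NR % 4 == 2 { print }                     # sequence line, unchanged
    '
}

# Example usage (file names are placeholders):
#   gzip -dc sample_R1.fastq.gz | fastq_to_fasta > sample_R1.fasta
#   qiime tools import \
#     --input-path sample_R1.fasta \
#     --output-path sample_R1.qza \
#     --type 'FeatureData[Sequence]'
```

The import command in the usage comment is the same one used above for the reference FASTA, just pointed at the converted reads.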