Question

How to create a dataset from FASTQ sequence files in Ubuntu

0


23 months ago

Flo • 0

I am very new to bioinformatics and I need to analyse the microbiome in extracted DNA samples (Illumina sequencing, single-end). I have filtered the data with usearch (removed the barcodes and discarded the reads with too high an expected error) and created OTU files (I do not know if that was what I was supposed to do). I need to generate a dataset with the samples as rows and the microbial species as column names. Ideally the data would be relative abundances, but at this point I would be happy with binary data: just the presence of each microbe, in one data frame, so that I can perform a PCA on the microbial communities in R. I already have a library of FASTQ files of all the bacterial, fungal and viral DNA that I can use to detect the presence of microbiota. Unfortunately, I do not know how to do this or which commands to use. I've been working on this for the past 3 weeks and don't know how to continue.

metagenomics microbiome dataset usearch ubuntu • 571 views

updated 23 months ago by Mark ★ 1.5k • written 23 months ago by Flo • 0

score 0 · Answer 1 · 2022-05-22

0


23 months ago

Mark ★ 1.5k

Sounds like you need to use QIIME or QIIME2. They have good tutorials that will help you get started:

https://qiime2.org/
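For a rough idea of the table described in the question (samples as rows, one column per taxon, relative abundance or presence/absence), here is a minimal Python sketch. It is not the QIIME workflow. It assumes the OTU files include a tab-separated count table with OTUs as rows and samples as columns and a first column headed `#OTU ID`. The function name `otu_table_to_sample_rows` and the example counts are made up for illustration. Note that the columns are OTU IDs, not species names; assigning taxonomy is a separate step that is not shown here.

```python
import csv
import io


def otu_table_to_sample_rows(tsv_text):
    """Turn an OTU count table (OTUs as rows, samples as columns) into
    per-sample rows of relative abundances and presence/absence flags.

    Hypothetical helper: assumes a tab-separated table whose first
    column holds the OTU IDs and whose header row names the samples.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]              # every column after the OTU ID column
    otu_ids = []
    counts = {sample: [] for sample in samples}

    for row in reader:
        if not row:                   # skip blank lines
            continue
        otu_ids.append(row[0])
        for sample, value in zip(samples, row[1:]):
            counts[sample].append(int(value))

    relative = {}
    presence = {}
    for sample in samples:
        total = sum(counts[sample])
        # Relative abundance: each count divided by the sample's total reads.
        relative[sample] = [c / total if total else 0.0 for c in counts[sample]]
        # Binary data: 1 if the OTU was seen at all in this sample, else 0.
        presence[sample] = [1 if c > 0 else 0 for c in counts[sample]]

    return otu_ids, relative, presence


# Made-up example table (assumption, not real data).
EXAMPLE = (
    "#OTU ID\tsampleA\tsampleB\n"
    "Otu1\t30\t0\n"
    "Otu2\t10\t5\n"
    "Otu3\t0\t15\n"
)

otu_ids, relative, presence = otu_table_to_sample_rows(EXAMPLE)
for sample in relative:
    print(sample, dict(zip(otu_ids, relative[sample])))
```

Each key of `relative` (or `presence`) is one sample and each list position is one OTU, so writing these out as a CSV with one line per sample should give a table that R can read with `read.csv` and pass to `prcomp` for the PCA.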
