Question: Starting the assembly with .fastq file
Asked 2.3 years ago by Raghul (Italy):

Hi Members, we sent samples for sequencing and received a .fastq file and a zipped .fastq file. I want to assemble and annotate the sequences. It is a low/moderate coverage sample. What should I do next? I have a Windows computer. Is that enough? Please do not be irritated by the simplicity of the question! I have to learn and complete the project. There was also an Excel sheet containing the following information (what can I infer from it?):

Experiment Name: MISEQ RUN 75
Workflow: GenerateFASTQ
Application: FASTQ Only
Assay: TruSeq LT
Description: P178_P128_P271_P150
Chemistry: Default
Reads: 301

Thanks Raghul


A mammalian genome de novo assembly with 301 reads of MiSeq data in fa format (no qualities?)? This sounds like Mission Impossible. Could you please tell us the species name, print the first few lines of your fa and zipped fa files, and give the sizes of these files or their number of lines? Thank you.

Reply written 2.3 years ago by Petr Ponomarenko

Sorry, it is a .fastq file.

Reply written 2.3 years ago by Raghul

Answer written 2.3 years ago by Philipp Bayer (Australia/Perth/UWA):

Have you looked inside the .fa file? Normally genome sequencing reads are delivered as .fq/.fastq files; a .fa (.fasta) file is usually an assembly, not sequencing results. If the file starts with '>', then I guess it's a FASTA and you got a finished assembly; if it starts with '@', it's a FASTQ and you got sequencing reads.
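The first-character check described above can be scripted, which helps when the file is too large to open in a text editor. This is a minimal sketch, not a validator: the function name `sniff_format` is made up for illustration, and it assumes the file is either plain text or gzip-compressed (a .zip archive would need to be extracted first).

```python
import gzip


def sniff_format(path):
    """Guess 'fasta', 'fastq', or 'unknown' from the first character of a file."""
    # Gzip-compressed files start with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        magic = handle.read(2)
    opener = gzip.open if magic == b"\x1f\x8b" else open

    # Read only the first character; the rest of the file is not needed.
    with opener(path, "rt") as handle:
        first = handle.read(1)

    if first == ">":
        return "fasta"   # header lines of FASTA records start with '>'
    if first == "@":
        return "fastq"   # header lines of FASTQ records start with '@'
    return "unknown"
```

Calling `sniff_format("reads.fastq.gz")` on your own file (the file name here is a placeholder) returns one of the three strings. It only looks at the first character, so it does not check that the rest of the file is well formed.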

Are you sure you got 301 reads? That's nothing!
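One way to settle the question is to count the reads directly. The sketch below assumes a standard four-line-per-record FASTQ (no wrapped sequence lines) and decides on gzip by the `.gz` file extension; the function name `fastq_summary` is hypothetical. It also tallies read lengths, because on Illumina sample sheets the "Reads" entry usually gives the number of cycles per read, so the 301 may be a read length rather than a read count. That interpretation is an assumption about this particular sheet, which the tally would confirm or refute.

```python
import gzip
from collections import Counter


def fastq_summary(path):
    """Return (number_of_reads, Counter of read lengths) for a 4-line FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    n_reads = 0
    lengths = Counter()
    with opener(path, "rt") as handle:
        for line_index, line in enumerate(handle):
            # In a 4-line record, the sequence is the second line (index 1, 5, 9, ...).
            if line_index % 4 == 1:
                n_reads += 1
                lengths[len(line.rstrip("\r\n"))] += 1
    return n_reads, lengths
```

For example, `n, lengths = fastq_summary("reads.fastq")` followed by `print(n, lengths.most_common(3))` shows the read count and the most frequent read lengths. The file name is a placeholder for your own file.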

If you have fastq files and want to assemble and annotate them, there are myriad tutorials out there on genome assembly. Which one fits depends on the complexity of your organism, your sequencing chemistry, etc. Here is one example: https://www.ebi.ac.uk/training/online/course/ebi-next-generation-sequencing-practical-course/genome-assembly-velvet

For annotation there are different pipelines depending on, again, the complexity of your organism. For prokaryotes look at Prokka, for fungal genomes look at funannotate, and for everything else have a look at the MAKER pipeline.


Hi Philipp, sorry for the late reply. I had some network issues, as I am working in a small town in Ethiopia. I am still looking into the files. Thanks! Raghul

I have fastq files (mammalian genome), so I want an assembler for Windows. Can anybody suggest an appropriate tool? Thanks

Reply written 2.3 years ago by Raghul

"I have fastq files(mammalian genome). So I want a assembler for Windows."

I don't think there are any software packages that run on Windows and can handle a human genome. You will also need a lot of RAM (hundreds of GB) to assemble a human genome. Since you appear to have MiSeq data, it is unlikely that you have enough data to assemble a human genome.

Your best bet to start analyzing your data is to align it to an existing reference. Even for that you are going to need anywhere from 6 to more than 30 GB of free RAM. You are likely to have very low coverage (if this is a whole genome sequencing experiment).
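The coverage concern can be put into numbers with the usual estimate: coverage = (number of reads × read length) / genome size. The sketch below is only arithmetic. The function name `estimate_coverage` is made up, and the example figures (20 million reads of 301 bases, a 3 Gb genome) are hypothetical placeholders, not values from this run.

```python
def estimate_coverage(n_reads, read_length, genome_size):
    """Average depth of coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    total_bases = n_reads * read_length
    return total_bases / genome_size


# Hypothetical figures for illustration only; substitute your own counts.
coverage = estimate_coverage(n_reads=20_000_000, read_length=301,
                             genome_size=3_000_000_000)
print(f"Estimated coverage: {coverage:.2f}x")
```

With these placeholder numbers the estimate is about 2x. For paired-end data, count the reads from both files.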

Reply written 2.3 years ago by genomax