don't know exactly how to find my reference genome
3.3 years ago

Hi, I am very new to bioinformatics analysis. I am trying to map a whole genome in Galaxy, but I don't know exactly how to find my reference genome (Chlamydomonas reinhardtii). I know it is available on NCBI, but I don't know which file should be used as the reference. Is the reference genome a single file, or are there multiple files, one per chromosome? I'd be grateful if you could help me.

genome assembly sequencing • 1.3k views

Thanks Juanjo. I will do that. Much appreciated!

3.3 years ago
GenoMax 141k

You can find the representative genome page for your organism at NCBI (LINK). The actual genome sequence is in this file, which is in FASTA format. This is the file you should use as a reference.

If you need specific help with Galaxy then consider posting questions to their help forum: https://help.galaxyproject.org/


First, thanks very much for your help. Much appreciated. So, the reference genome is a single file.

Now I have another question. This .fna file contains ~500,000 letters, while the Chlamydomonas reinhardtii genome is 110 Mb. How can this reference cover the whole genome?


That's not the answer I get:

% wget -qO- https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/002/595/GCF_000002595.1_v3.0/GCF_000002595.1_v3.0_genomic.fna.gz | gunzip -c | wc -c
 122115319

Part of that will be headers, but not 600k worth of characters. Double-check that you are working with the uncompressed file and not the gzipped (.gz) file.
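If you want to separate header characters from actual sequence, here is a minimal sketch, assuming a POSIX shell with awk available. The function name fasta_stats is made up for this example. It counts the records (each line starting with ">" begins one chromosome or scaffold, all stored in the same file) and the bases on the remaining lines.

```shell
# fasta_stats: read uncompressed FASTA on stdin and print the number of
# records and the total number of sequence characters.
# The function name is hypothetical, chosen only for this example.
fasta_stats() {
    awk '
        /^>/ { records++; next }        # header line: count the record, skip it
        {
            gsub(/[ \t\r]/, "")         # drop stray whitespace and carriage returns
            bases += length($0)         # sequence line: add its length
        }
        END { printf "records=%d bases=%d\n", records, bases }
    '
}

# Possible usage with the gzipped file from the command above:
# wget -qO- https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/002/595/GCF_000002595.1_v3.0/GCF_000002595.1_v3.0_genomic.fna.gz | gunzip -c | fasta_stats
```

A record count greater than one would show that the single .fna file holds several sequences, so there is no need to look for one file per chromosome.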


Yes, you're right. Thanks very much for your kind help.

3.3 years ago
juanjo75es ▴ 130

In general, I would recommend running a BLAST search with the NCBI online BLAST tool. Take one of your contigs, select around 10,000 bp, and run a search. Sometimes there is no standard reference genome, but someone has already sequenced your species. And sometimes what you are assembling is not exactly what you think it is. The BLAST results include a link to the complete sequence, which you can download.
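One way to cut out such a query is sketched below, assuming a POSIX shell with awk and an uncompressed FASTA file of contigs. The function name first_bases and the file names in the usage line are hypothetical. It prints the header of the first record followed by its first N bases (default 10000) on one line, which can be pasted into the BLAST query box.

```shell
# first_bases: print the first N bases (default 10000) of the first FASTA
# record read from stdin, preceded by that record's header line.
# The function name is hypothetical, chosen only for this example.
first_bases() {
    awk -v n="${1:-10000}" '
        /^>/ {
            if (seen) exit              # second header: stop, END still runs
            seen = 1
            print                       # keep the first header
            next
        }
        seen && got < n {
            gsub(/[ \t\r]/, "")         # drop stray whitespace and carriage returns
            piece = substr($0, 1, n - got)
            got += length(piece)
            seq = seq piece
            if (got >= n) exit          # enough bases collected
        }
        END { if (seq != "") print seq }
    '
}

# Possible usage (file names are placeholders):
# first_bases 10000 < my_contigs.fasta > blast_query.fasta
```

If the first contig is shorter than N, the whole contig is printed.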
