Question: How to concatenate contigs from RefSeq?
11 days ago by
bitpir90
bitpir90 wrote:

Hello,

Is there a good way to concatenate the contigs from RefSeq (from a single sequencing project) to get a whole genome sequence? Are there any good bioinformatics tools out there to do this?

Thanks!

concatenate refseq ncbi • 125 views
ADD COMMENTlink modified 11 days ago by h.mon19k • written 11 days ago by bitpir90

Do you want to "concatenate" different genome assemblies to get a single whole genome assembly? Can you be more specific? For example, could you provide examples of the genomes you want to concatenate?

ADD REPLYlink written 11 days ago by h.mon19k

Absolutely! Let's say I want to combine NZ_LNUL01000001:NZ_LNUL01000133 into one whole genome sequence. These are scaffolds from a whole genome sequencing project on NCBI, and each is stored under a different accession ID. I would like to combine all these scaffolds (for some other species they are contigs) so that I get one single file, e.g. NZ_GG668845.1.

I expect the scaffolds may or may not have some overlapping sequences; I am just wondering how easily I can do this...

ADD REPLYlink written 11 days ago by bitpir90
11 days ago by
h.mon19k
Brazil
h.mon19k wrote:

If you download from the RefSeq FTP assemblies site, you don't need to concatenate contigs or scaffolds: there is a single file for the whole assembly. For the assembly in your example, the file is:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/742/375/GCF_001742375.1_ASM174237v1/GCF_001742375.1_ASM174237v1_genomic.fna.gz

In addition to the whole assembly, the ftp folder has files with protein translations, gene annotations, and so forth.
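
As a rough sketch of how that path is laid out: the directory appears to be derived from the assembly accession (its nine digits split into three groups of three) plus the assembly name. The helper below is hypothetical (the name refseq_genomic_url is mine, not an NCBI tool), and it assumes you already know both the accession and the assembly name, as in the URL above.

```shell
# Hypothetical helper: build the genomic FASTA URL for an assembly,
# assuming the directory layout seen in the example URL above.
refseq_genomic_url() {
    accession=$1              # e.g. GCF_001742375.1
    asm_name=$2               # e.g. ASM174237v1
    prefix=${accession%%_*}   # GCF
    digits=${accession#*_}    # 001742375.1
    digits=${digits%%.*}      # 001742375
    # Split the nine digits into three directory levels.
    p1=$(printf '%s' "$digits" | cut -c1-3)
    p2=$(printf '%s' "$digits" | cut -c4-6)
    p3=$(printf '%s' "$digits" | cut -c7-9)
    base="${accession}_${asm_name}"
    printf 'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/%s/%s/%s/%s/%s/%s_genomic.fna.gz\n' \
        "$prefix" "$p1" "$p2" "$p3" "$base" "$base"
}
```

You could then fetch and unpack the file with something like curl -O "$(refseq_genomic_url GCF_001742375.1 ASM174237v1)" followed by gunzip on the downloaded file, provided your curl build supports FTP.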

Just in case you have already downloaded the files, or if there is no single file with the whole assembly, you can concatenate several FASTA files into one with cat:

cat NZ_LNUL01000*.fas > NZ_GG668845.1.fas
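
Note that cat only stacks the records: the result is a multi-FASTA file with one entry per scaffold, and any overlaps between scaffolds are left untouched. If you want a quick sanity check, a minimal sketch along these lines may help. The function name concat_fasta is hypothetical, and it uses awk instead of cat only so that an input file missing its final newline cannot glue its last sequence line onto the next header.

```shell
# Hypothetical helper: concatenate FASTA files into one multi-FASTA
# and print how many records ended up in the output.
concat_fasta() {
    out=$1
    shift
    # awk '1' prints every input line followed by a newline,
    # even when a file's last line is unterminated.
    awk '1' "$@" > "$out" || return 1
    # Count header lines (those starting with '>') in the result.
    awk '/^>/ { n++ } END { print n + 0 }' "$out"
}
```

For example, concat_fasta NZ_GG668845.1.fas NZ_LNUL01000*.fas should print the number of scaffolds you expected to combine. Make sure the output name does not match the input pattern, otherwise the output file would be read as one of its own inputs.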
ADD COMMENTlink written 11 days ago by h.mon19k