Question: How to extract the largest sequence from a FASTA file while keeping all chromosomes of a multi-chromosome genome
Shelle wrote (10 weeks ago):

I have a large number of bacterial FASTA files from NCBI (in the GCF format, _genomic.fna.gz), and I am planning to extract the largest strain out of fasta files. I have noticed that for some organisms the genome consists of several chromosomes, so in these cases it is not enough to extract the largest strain, since I need all the chromosomes. Different files have different headers. The first header line of several different FASTA files is shown below:

>NZ_LS483491.1 Staphylococcus auricularis strain NCTC12101 genome assembly, chromosome: 1

>NZ_CP012214.1 Campylobacter jejuni strain CJ088CC52, complete genome

>NZ_CP016324.1 Vibrio cholerae 2740-80 chromosome 1, complete sequence

>NC_013791.2 Bacillus pseudofirmus OF4, complete genome

(This last file has a complete genome, and the others are some complete sequences of some strains.)

I am completely new to sequencing. Can anyone tell me a way to handle a large number of files with different content like this: keep all the chromosomes when the headers mention a chromosome, and otherwise extract the largest sequence when the file does not mention a chromosome in its headers?

Tags: chromosome, sequence, genome

If you wanted only the complete genomes, you should have used the solution in this thread: "How to download COMPLETE bacterial genomes from NCBI based on list of names?"

Comment written 10 weeks ago by genomax
genomax (United States) wrote (10 weeks ago):

i am planning to extract the largest strain out of fasta files.

That sentence does not quite make sense, but I am going to assume that you want the longest FASTA sequence of the lot, irrespective of the strain name. Take a look at this thread to get that information.
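For readers who want something concrete, here is a minimal Python sketch of that idea. It is not the method from the linked thread, just one possible approach using only the standard library. It assumes that the word "chromosome" in a header is a good enough signal that a record should be kept (as in the example headers in the question), and that files without such headers should be reduced to their single longest record. The function names read_fasta, select_records and write_fasta are made up for this illustration.

```python
import gzip
import re

# Assumption: a header containing the whole word "chromosome" marks a record
# that must be kept. This is a heuristic based on the example headers only.
CHROMOSOME = re.compile(r"\bchromosome\b", re.IGNORECASE)


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain or gzipped FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_records(records):
    """Keep all chromosome records if any header mentions a chromosome;
    otherwise keep only the single longest record."""
    records = list(records)
    chromosomes = [rec for rec in records if CHROMOSOME.search(rec[0])]
    if chromosomes:
        return chromosomes
    if not records:
        return []
    return [max(records, key=lambda rec: len(rec[1]))]


def write_fasta(records, path, width=80):
    """Write (header, sequence) pairs as FASTA with fixed-width lines."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for start in range(0, len(seq), width):
                out.write(seq[start:start + width] + "\n")
```

Calling select_records(read_fasta(path)) on one _genomic.fna.gz file returns a list of (header, sequence) tuples, which write_fasta can save to a new file. Looping over all downloaded files is then a matter of applying the same two calls to each path. Note that records without "chromosome" in the header (plasmids, for example) are dropped whenever at least one chromosome record is present, so it is worth checking the result on a few files by eye.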
