Question: How can I use BLAST to extract chloroplast sequences from DNA reads?
3.7 years ago by
United States
AcademicDialysis60 wrote:

I'm trying to extract the chloroplast sequences from my reads, as Whole Genome Sequencing was used to produce them.

This paper mentions that, to do this, they BLASTed their reads against all of the known genomes in the same family. For me, this family would be Fabaceae.

Does anyone know of a quicker way to do this than manually downloading every FASTA file containing Fabaceae chloroplast sequences from NCBI? Or of a better way to extract chloroplast sequences from my reads? I do know that chloroplast DNA should be more abundant in the reads than other DNA, because it is present in more copies than nuclear or mitochondrial DNA.

Info about reads: 300bp average, paired-end reads from Illumina MiSeq


Thanks in advance!

3.7 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum116k wrote:

Search NCBI for chloroplast + Fabaceae (the linked query ends in `[Filter]) AND "fabaceae"[Organism]`) and download the sequences as FASTA.

Then index the FASTA with `bwa index` and map your reads against it with `bwa mem`.
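A rough sketch of the index-and-map step in Python, in case it helps. The file names (`fabaceae_cp.fasta`, `reads_R1.fastq`, `reads_R2.fastq`, `mapped.sam`) and the helper names are made up for illustration, and it assumes `bwa` is on your PATH. The last helper pulls out the names of reads that mapped, by checking the SAM "unmapped" flag bit (0x4):

```python
import shlex
import subprocess


def bwa_commands(reference, reads_1, reads_2, threads=4):
    """Build the `bwa index` and `bwa mem` invocations as argument lists."""
    index_cmd = ["bwa", "index", str(reference)]
    mem_cmd = ["bwa", "mem", "-t", str(threads),
               str(reference), str(reads_1), str(reads_2)]
    return index_cmd, mem_cmd


def run_mapping(reference, reads_1, reads_2, sam_out, threads=4, dry_run=True):
    """Print-ready command lines; actually runs bwa only if dry_run is False."""
    index_cmd, mem_cmd = bwa_commands(reference, reads_1, reads_2, threads)
    shown = [shlex.join(index_cmd),
             shlex.join(mem_cmd) + " > " + shlex.quote(str(sam_out))]
    if not dry_run:
        subprocess.run(index_cmd, check=True)
        with open(sam_out, "w") as handle:
            # bwa mem writes SAM to stdout, so redirect it into the file
            subprocess.run(mem_cmd, stdout=handle, check=True)
    return shown


def mapped_read_names(sam_lines):
    """Collect names of reads whose SAM FLAG does not have the 0x4 bit set."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        if not int(fields[1]) & 0x4:
            names.add(fields[0])
    return names


if __name__ == "__main__":
    for cmd in run_mapping("fabaceae_cp.fasta", "reads_R1.fastq",
                           "reads_R2.fastq", "mapped.sam"):
        print(cmd)
```

With `dry_run=True` (the default here) nothing is executed, so you can check the commands first. Once you have the SAM file, `mapped_read_names(open("mapped.sam"))` gives you the read names to pull out of the FASTQ files.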



Since the closest reference is just in the same family, I don't think my consensus sequences would be very large. Should I just BLAST and then use the matching reads to do de novo assembly? Or should I still use bwa, with all of those sequences as the reference?



written 3.7 years ago by AcademicDialysis60
3.7 years ago by
5heikki8.0k wrote:

Assuming the chloroplast genome differs from the host in GC% and codon usage, the quickest way would be to bin the reads based on tetramer frequencies.
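To make the tetramer idea concrete, here is a minimal sketch in plain Python. It computes GC fraction and a normalized 4-mer frequency vector per sequence, then assigns each sequence to whichever reference profile is closest by Euclidean distance. The function names and the nearest-profile rule are only one possible choice for illustration, not a specific binning tool, and you would need to supply the reference profiles yourself (for example, from a known chloroplast genome and some nuclear sequence):

```python
import math
from collections import Counter
from itertools import product

# All 256 possible tetramers over A, C, G, T, in a fixed order
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_fraction(seq):
    """Fraction of G+C among unambiguous bases (0.0 if there are none)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def tetramer_profile(seq):
    """Normalized tetramer frequencies; windows containing N etc. are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


def nearest_label(profile, references):
    """Return the label of the reference profile closest to `profile`.

    `references` maps a label (e.g. "chloroplast") to a tetramer profile.
    """
    return min(references, key=lambda label: math.dist(profile, references[label]))
```

Usage would look like `nearest_label(tetramer_profile(read), {"chloroplast": cp_profile, "nuclear": nuc_profile})`. Keep in mind that a 300 bp read only contains 297 tetramer windows spread over 256 possible tetramers, so per-read profiles will be noisy; this kind of binning tends to behave better on longer sequences such as assembled contigs.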
