Mapping .fna reference genome to itself in order to include it as an outgroup
23 months ago
Michal • 0

Hello,

Please pardon a potentially naive question. I have 30 newly sequenced WGS data files for a bird species, all in FASTQ format. Using bowtie2, I have mapped them to a reference genome from NCBI, which is the only full genome sequence available for a member of the same family. Everything works fine up to this point, but for downstream phylogenetic analyses I would like to include that reference genome as an outgroup, since it is the only same-family species available and therefore an ideal choice.

I tried, perhaps naively, to map the reference to itself using bowtie2. First, I am not sure whether this is a correct approach. Second, it keeps failing with out-of-memory errors on my cluster. Today I realised this might be because bowtie2 is not meant to handle this kind of input: the .fna reference file contains scaffold-level FASTA records, some of which are hundreds of kb long. I imagine this is what gets the process killed with OOM.

Is there a way to solve this problem and include the reference genome in my study? Any help will be appreciated.

For extra information, this is the bowtie2 command. The index ref_genome/Carolina_wren has been built and is ready for mapping. The resources needed for the --threads option have been correctly requested in the cluster settings.

bowtie2 -x ref_genome/Carolina_wren -f -U ref_genome/GCA_013397245.1_ASM1339724v1_genomic.fna -S CW.sam --threads 96

Thanks, Michal
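
One possible workaround, offered only as a sketch and not as a confirmed fix: since bowtie2 is designed for short reads, the scaffolds could be cut into short overlapping pseudo-reads before mapping. The function names, the 150 bp fragment length, and the 75 bp step below are all illustrative assumptions, not something taken from the question or from bowtie2 itself.

```python
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def shred(seq: str, length: int = 150, step: int = 75) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) pairs of overlapping windows over seq.

    length and step are hypothetical defaults chosen for illustration.
    """
    if len(seq) <= length:
        yield 0, seq
        return
    last = len(seq) - length
    for start in range(0, last + 1, step):
        yield start, seq[start:start + length]
    # Add one final window so the end of the scaffold is always covered.
    if last % step:
        yield last, seq[last:]


def shred_fasta(src: TextIO, dst: TextIO, length: int = 150, step: int = 75) -> int:
    """Write every fragment of every record in src to dst as FASTA.

    Returns the number of fragments written.
    """
    count = 0
    for name, seq in read_fasta(src):
        for start, frag in shred(seq, length, step):
            dst.write(f">{name}_{start}\n{frag}\n")
            count += 1
    return count
```

The output could then be passed to the same bowtie2 command in place of the original .fna file (still with -f and -U). Whether self-mapping is the right way to represent the outgroup is a separate question: reads derived from the reference will match it exactly, so the result should carry the same information as the reference sequence itself.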

fasta bowtie2 mapping • 519 views