HTSEQ: alignments to combo of two bacterial genomes
18 months ago

I have RNA-seq data from a co-culture of two bacteria. I also have assembled, annotated genomes for each bacterium (A has 2 contigs; B has 3). I have run hisat2-build on a combined database (5 contigs, named A_1 to A_2 and B_1 to B_3) and have a .sam file of the aligned reads. I aim to run htseq-count on the .sam file. However, I have a separate genome feature file (GFF) for each genome. It makes sense to combine them before running htseq-count, but I'm worried that the program will throw errors because the coordinates of the two genomes overlap. Below are previews of the GFFs (with carriage returns added for readability). Will htseq-count overcount or throw an error because of the overlapping coordinates? Or will it be okay, since the first column of each GFF file holds a contig identifier that includes the taxon name (A_1 vs. B_1)?

head A_genome.gff
##gff-version 3
##sequence-region A_1 1 3350537
##sequence-region A_2 1 791720
A_1 Prodigal:002006 CDS 259 618 .   -   0   ID=n_00001;inference=ab initio prediction:Prodigal:002006;locus_tag=n_00001;product=hypothetical protein
A_1 Prodigal:002006 CDS 725 1828    .   -   0   ID=n_00002;eC_number=2.7.2.11;Name=proB_1;db_xref=COG:COG0263;gene=proB_1;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:P0A7B5;locus_tag=n_00002;product=Glutamate 5-kinase

head B_genome.gff
##gff-version 3
##sequence-region B_1 1 3532203
##sequence-region B_2 1 1915494
##sequence-region B_3 1 337275
B_1 Prodigal:002060 CDS 798 1475    .   +   0   ID=n_00001;inference=ab initio prediction:Prodigal:002060;locus_tag=n_00001;product=hypothetical protein
B_1 Prodigal:002060 CDS 1901    2113    .   -   0   ID=n_00002;inference=ab initio prediction:Prodigal:002060;locus_tag=n_00002;product=hypothetical protein
htseq-count transcriptomics RNA-seq • 519 views
You may want to consider "binning" the reads by genome first, using a tool like bbsplit.sh, and then doing the alignments. See: "BBSplit syntax for generating builds for the reference genome and how to call different builds."
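If you keep the combined alignment instead, the overlapping coordinates should not be a problem in themselves: htseq-count matches a read to a feature by sequence name as well as position, so position 798 on A_1 and position 798 on B_1 are different places, provided the names in column 1 of the GFF match the reference names in the .sam file. What the previews do show is that both files reuse the same IDs (n_00001, n_00002), so counting by ID on a plain concatenation would pool genes from A and B under one name. Below is a minimal sketch of a merge that prefixes the IDs with a genome label. The function name merge_gffs and the labels "A" and "B" are made up for illustration, the input is assumed to be tab-delimited GFF3, and stopping at a ##FASTA line assumes the files may carry an embedded sequence section.

```python
import io

# Attribute keys that get the genome label prepended (an assumption;
# extend this tuple if your files use other cross-referencing keys).
PREFIXED_KEYS = ("ID", "locus_tag")


def merge_gffs(labelled_handles, out):
    """Concatenate GFF3 files, making feature IDs unique per genome.

    labelled_handles: iterable of (label, open text handle) pairs.
    out: writable text handle that receives the merged GFF3.
    """
    out.write("##gff-version 3\n")  # emit the version header only once
    for label, handle in labelled_handles:
        for raw in handle:
            line = raw.rstrip("\r\n")
            if line.startswith("##FASTA"):
                break  # skip any embedded sequence section
            if not line or line.startswith("##gff-version"):
                continue
            if line.startswith("#"):
                out.write(line + "\n")  # keep ##sequence-region etc.
                continue
            cols = line.split("\t")
            if len(cols) != 9:
                raise ValueError(f"expected 9 tab-separated columns: {line!r}")
            attrs = []
            for field in cols[8].split(";"):
                key, sep, value = field.partition("=")
                if sep and key in PREFIXED_KEYS:
                    value = f"{label}_{value}"
                attrs.append(key + sep + value)
            cols[8] = ";".join(attrs)
            out.write("\t".join(cols) + "\n")


# Tiny in-memory demonstration with one feature per genome.
demo_a = io.StringIO(
    "##gff-version 3\n"
    "A_1\tProdigal:002006\tCDS\t259\t618\t.\t-\t0\tID=n_00001;locus_tag=n_00001\n"
)
demo_b = io.StringIO(
    "##gff-version 3\n"
    "B_1\tProdigal:002060\tCDS\t798\t1475\t.\t+\t0\tID=n_00001;locus_tag=n_00001\n"
)
demo_out = io.StringIO()
merge_gffs([("A", demo_a), ("B", demo_b)], demo_out)
print(demo_out.getvalue(), end="")
```

For real files, pass open handles for A_genome.gff and B_genome.gff and write to a new file. Since these annotations contain CDS features with an ID attribute, htseq-count would then need its feature type and ID attribute options set accordingly (-t CDS -i ID) rather than the defaults, which expect exon features with a gene_id attribute.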

Wonderful, thank you so much! I'll check that out.
