Entering edit mode
5.1 years ago
kathryn.eckartt
▴
40
I appreciate any advice you could give! I'm using R version 3.4.4, platform 64-bit Linux-gnu.
I've aligned RNAseq reads to a reference genome using QuasR and created a gene model from a GFF file:
qAlign("sampleTable.txt","refGenome.fa,splicedAlignment=TRUE)
library(GenomicFeatures)
geneModel<-("path/geneModel.gff3.gz","gff3")
genes<-genes(geneModel)
I've confirmed my sorted bam file has alignments, but when I try to obtain gene counts using GenomicAlignments summarizeOverlaps() all counts are listed as 0 in my final matrix.
library(GenomicAlignments)
sortBam("alignment.bam","alignment.bam.bai",destination="Sorted")
indexBam("Sorted.bam")
myBam<-BamFile("Sorted.bam",yieldSize=10000)
geneCounts<-summarizeOverlaps(genes,myBam,singleEnd=FALSE)
myGeneGR<-rowRanges(geneCounts)
myGeneCounts<-genecounts[genes$gene_id==myGeneGR$gene_id,]
library(SummarizedExperiments)
myGeneCounts_Matrix<-assay(myGeneCounts)
Yuhp, they are both of the same organism from the same assembly. Since the organism is a bacteria and there is only 1 chromosome the gene_id is the common identifier between files.
The gene id is in common between your gff and your reference?? Not the chromosome name?
If I understand what you're asking, it's the gene_id that's common. The gff lists
and the reference genome for the single chromosome lists
After aligned, sorted bam file has gene_ids and so does the gene model made from the gff with makeTxDbFromGFF.
So your reference does not have gene_IDs? The entry under chromosome in your gff should be "Assembly_species_complete_genome". Is that what it is?
Yuhp, that's correct- the reference doesn't have a gene-id. There isn't a Chromosome heading in the gff file, that is just what each line starts with.