Hi! I am using goseq on some bovine data and I am using Ensembl IDs for bovine genome UMD3.1, which is not supported by goseq, so I had to input lenght and GO information manually. I tried obtaining length information 2 ways: 1) from featCounts files results; 2) from biomaRt.
Both gave me weirdly shaped plots. Not sure if I'm doing something wrong. The graphs are in totally opposite direction then I expected them to be ( instead of going up and stabilizing as I expected, they are going down).
Below are the codes of how I obtained length info and plots, as well as the plots
1) Using featureCounts results:
featCountsData <- read.table("charolais_145A_TGACCA.featCounts.txt", header=TRUE, comment.char = "#", sep="\t") featCountsLengthData <- as.numeric(featCountsData[,"Length"]) names(featCountsLengthData) <- featCountsData$Geneid LenData <- names(geneLengthData) %in% names(genes) LenData <- geneLengthData[LenData] pwf <- nullp(DEgenes = genes, bias.data = LenData, plot.fit = TRUE)
2) Using Biomart:
txsData <- makeTxDbFromBiomart(biomart ="ensembl", dataset = "btaurus_gene_ensembl") txsByGene <- transcriptsBy(txsData,"gene") geneLengthData <- median(width(txsByGene)) BiomartLenData <- names(geneLengthData) %in% names(genes) BiomartLenData <- geneLengthData[testLenData] pwf <- nullp(DEgenes = genes, bias.data = BiomartLenData, plot.fit = TRUE)
PLOT 1: Length info obtained from featureCounts:
PLOT 2: Length info obtained from biomaRt
Any help is appreciated!