Problem with tximport and Plasmodium falciparum
6 weeks ago
bioinfo ▴ 50

Hello,

I aligned my samples with kallisto against a Plasmodium falciparum transcriptome. The file I used to build the reference is Plasmodium_falciparum.ASM276v2.cdna.all.fa.gz, which I downloaded from here: http://ftp.ensemblgenomes.org/pub/protists/release55/fasta/plasmodium_falciparum/cdna/Plasmodium_falciparum.ASM276v2.cdna.all.fa.gz

However, I am having issues with tximport.

The error that I get is:

Error in .local(object, ...) :
None of the transcripts in the quantification files are present
in the first column of tx2gene. Check to see that you are using
the same annotation for both.

Example IDs (file): [CAX64123, CAX64256, CZT99967, ...]

This can sometimes (not always) be fixed using 'ignoreTxVersion' or 'ignoreAfterBar'.


I understand that the problem seems to be with the mart object I created, and that I may be getting a different annotation version. However, I think the problem is the external gene name: the mart lists it as an attribute, but when I add it to t2g the column is empty. Has anyone had this issue before?

My script is below:

library(tximport)
# Connect to the Ensembl Protists BioMart and fetch the transcript-to-gene mapping
mart <- biomaRt::useMart("protists_mart", dataset = "pfalciparum_eg_gene", host = "https://protists.ensembl.org")
t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id", "external_gene_name"), mart = mart)
t2g <- dplyr::rename(t2g, gene_symbol = external_gene_name)
# Move the last column (gene_symbol) to the front
t2g <- t2g[, c(ncol(t2g), 1:(ncol(t2g) - 1))]

# One kallisto output directory per sample
accessions <- list.dirs(full.names = FALSE)[-1]
kallisto.dir <- paste0(accessions)
kallisto.files <- file.path(kallisto.dir, "abundance.tsv") # can also be abundance.h5
names(kallisto.files) <- accessions
tx.kallisto <- tximport(kallisto.files, type = "kallisto", tx2gene = t2g)


Thank you

tximport • 571 views
4 weeks ago
bioinfo ▴ 50

I figured out that most of the external gene name column was empty, which is why it was not working. I ended up just using the ensembl_gene_id and it works fine now.
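For anyone hitting the same thing, here is a minimal sketch of how one might check for empty gene names and fall back to a two-column table (transcript ID first, gene ID second, which is the column order tximport documents for tx2gene). The data frame is toy data standing in for the getBM() result; the gene IDs and the symbol are made-up placeholders, not real annotation.

```r
# Toy stand-in for the getBM() result. The transcript IDs are the examples
# from the error message; gene IDs and the symbol are hypothetical.
t2g <- data.frame(
  ensembl_transcript_id = c("CAX64123", "CAX64256", "CZT99967"),
  ensembl_gene_id       = c("GENE_A", "GENE_B", "GENE_C"),
  external_gene_name    = c("", "", "sym1"),
  stringsAsFactors      = FALSE
)

# Fraction of rows with a missing or empty gene name
empty_frac <- mean(is.na(t2g$external_gene_name) | t2g$external_gene_name == "")
print(empty_frac)

# Keep only transcript ID (column 1) and gene ID (column 2) for tximport
tx2gene <- t2g[, c("ensembl_transcript_id", "ensembl_gene_id")]
print(tx2gene)
```

If most names are empty, as they were here, summarizing to ensembl_gene_id is the safer choice.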

6 weeks ago
ATpoint 70k

Try removing the dot and everything after it from the t2g names:

gsub("\\..*", "", t2g[,1])
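As a small illustration of what that pattern does, the sketch below runs it on toy IDs. The ".1" and ".2" version suffixes are hypothetical, added only to show the effect; IDs without a dot pass through unchanged.

```r
# Toy transcript IDs; the version suffixes here are made up for illustration
tx_ids <- c("CAX64123.1", "CAX64256.2", "CZT99967")

# Drop the first dot and everything after it
stripped <- gsub("\\..*", "", tx_ids)
print(stripped)
# [1] "CAX64123" "CAX64256" "CZT99967"
```

Note that gsub() returns a new vector, so the result has to be assigned back (for example to t2g[,1]) to take effect.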


Thank you. It does not seem to be that. I changed my script a bit; it now looks like this:

mart <- biomaRt::useMart("protists_mart", dataset = "pfalciparum_eg_gene", host = "https://protists.ensembl.org")
t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id", "external_gene_name"), mart = mart)
#t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id"), mart = mart)
t2g <- dplyr::rename(t2g, gene_symbol = external_gene_name)

# One kallisto output directory per sample
accessions <- list.dirs(full.names = FALSE)[-1]
kallisto.dir <- paste0(accessions)
kallisto.files <- file.path(kallisto.dir, "abundance.tsv") # can also be abundance.h5
names(kallisto.files) <- accessions
tx.kallisto <- tximport(kallisto.files, type = "kallisto", tx2gene = t2g)


If I use the script as shown above, the counts in the tx.kallisto object contain just one number. If I comment out the second line (the getBM call that includes external_gene_name) and use the third line instead, I do get a file with the Ensembl gene IDs. It seems that the external gene name is causing the problem.
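One way to narrow this kind of mismatch down might be to check how many of the quantification IDs appear in each column of t2g before calling tximport. The sketch below uses toy data: quant_ids stands in for the target_id column of a kallisto abundance.tsv, and the gene IDs and symbol are made-up placeholders. The columns are laid out as in the first script, with gene_symbol moved to the front.

```r
# Stand-in for the target_id column of abundance.tsv (IDs from the error message)
quant_ids <- c("CAX64123", "CAX64256", "CZT99967")

# Toy t2g with gene_symbol first; gene IDs and the symbol are hypothetical
t2g <- data.frame(
  gene_symbol           = c("", "", "sym1"),
  ensembl_transcript_id = c("CAX64123", "CAX64256", "CZT99967"),
  ensembl_gene_id       = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors      = FALSE
)

# Fraction of quantification IDs found in each column of t2g
match_rate <- vapply(t2g, function(col) mean(quant_ids %in% col), numeric(1))
print(match_rate)
```

The column with a match rate near 1 is the one that should come first in tx2gene; if that is not the first column, the table needs reordering before it is passed to tximport.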


You are not doing what I suggested above.


I figured out that most of the external gene name column was empty, which is why it was not working. I ended up just using the ensembl_gene_id and it works fine now. Thank you.