Question: Rsubread: ERROR: cannot finish the SAM file
0
gravatar for TheCheshireCat
5 months ago by
TheCheshireCat0 wrote:

Hello everyone, I want to get the read counts for RNA-Seq data by using the package Rsubread. I got my RNA-Seq data from sra and all other files (genes, genome,...) from NCBI assembly. I tried the following code on 4 different organisms and for 2 it worked perfectly fine but for the other two I got the following error:

ERROR: cannot finish the SAM file! Please check the disk space in the output directory. No output file was generated.

Fehler in .load.delete.summary(output_file[i]) : ERROR: Summary file Bacillus_amyloliquefaciens_samfile.sam.summary was not generated! The program terminated wrongly!

There is enough disk space available. Here is my code:

rna_counts <- function(path, organism, genome_gff, genome_fna, feature_table, reads_1, reads_2=NULL){
  require(Rsubread)
  require(rtracklayer)
  require(biomaRt)
  RNAseqDATADIR <- path
  setwd(RNAseqDATADIR)
  dir(RNAseqDATADIR)

  REF_GENOME <- paste0("ref_data/",genome_fna)
  RSUBREAD_INDEX_PATH <- "ref_data"
  RSUBREAD_INDEX_BASE <- organism
  buildindex(basename=file.path(RSUBREAD_INDEX_PATH,RSUBREAD_INDEX_BASE), reference=REF_GENOME)

  inputfilefwd <- file.path(RNAseqDATADIR, paste0("raw_data/", reads_1))
  if (!is.na(reads_2)) inputfilervs <- file.path(RNAseqDATADIR, paste0("/raw_data/", reads_2))
  else inputfilervs <- NA

  align(index=file.path(RSUBREAD_INDEX_PATH,RSUBREAD_INDEX_BASE), readfile1=inputfilefwd, readfile2=inputfilervs, output_file=paste0(organism,"_samfile.sam"), output_format="SAM",nthreads = 4)

  outputsamfile <- paste0(path,"/", organism, "_samfile.sam")
  propmap <- propmapped(outputsamfile)

  feature_table <- read.table(paste0(path,"/ref_data/",feature_table),header=TRUE,sep = c("\t","\n"))
  GENOME_GFF <- readGFF(paste0(path,"/ref_data/",genome_gff))

  saf_file = gff_to_saf(GENOME_GFF)       # own function for creating a saf file
  if is.na(reads_2)) paired = FALSE else paired = TRUE
  feature_counts <-featureCounts(outputsamfile, annot.ext=saf_file, isPairedEnd=paired)
  return(feature_counts)
}

Input:

genome_gff  = "GCA_000221645.1_ASM22164v1_genomic.gff"

genome_fna = "GCA_000221645.1_ASM22164v1_genomic.fna"

feature_table = GCA_000221645.1_ASM22164v1_feature_table.txt"

reads_1 =   "SRR1916792_1.fastq"

reads_2="SRR1916792_2.fastq"

During the calculations there were also following print outs which look a bit buggy:

|| 311% completed, 17 mins elapsed, rate=7,8k fragments per second ||
|| 363% completed, 17 mins elapsed, rate=8,8k fragments per second ||
||365% completed, 18 mins elapsed, rate=8,7k fragments per second ||

Thanks for your help!

rna-seq rsubread • 225 views
ADD COMMENTlink written 5 months ago by TheCheshireCat0
1

Since you have enough space, I wonder if the problem is that something is wrong with your output directory. Maybe its not being named properly?

ADD REPLYlink written 5 months ago by swbarnes25.2k

It is correct since there is a temporary SAM file in the directory while it is calculating and it works till a cerain point where it throws the error message. For the one organism it was always at 51% (tried it multiple times) and for the other 365% (which is a bit strange).

ADD REPLYlink written 5 months ago by TheCheshireCat0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1075 users visited in the last hour