Forge a BSgenome data package
6.1 years ago
gtho123 ▴ 220

My supervisor has requested that I create coverage plots to visualize BAM alignments of RNA-Seq data. I thought a good way to do this would be to use Gviz. We work on the model legume Medicago truncatula, which does not have a BSgenome package, so I thought I'd try to make one.

Following the vignette, I have placed all the chromosomes in their own FASTA files and gzipped them. I then created a seed file like so:

Package: BSgenome.Mtruncatula.JCVI.v4
Title: Full genome sequences for Medicago truncatula A17 (JCVI version 4)
Description: Full genome sequences for Medicago truncatula A17 (Barrell medic) as provided by JCVI (v4, 2014) and stored in Biostrings objects. See Tang et al. (2014) BMC Genomics 15:312
Version: 4.0
organism: Medicago truncatula A17
common_name: Barrell medic
provider: JCVI
provider_version: v4
release_date: 2014
release_name: Mt4.0
organism_biocview: Medicago_truncatula
BSgenomeObjname: Mtruncatula
seqs_srcdir: /home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz
seqnames: c(paste0("Medtr4_0_", "chr",1:8), paste0("Medtr4_0_", "scaffold",sprintf("%04d", 1:2179)))

However, when I run forgeBSgenomeDataPkg() I get this error:

Creating package in ./BSgenome.Mtruncatula.JCVI.v4
Error in getSeqSrcpaths(seqname, prefix = prefix, suffix = suffix, seqs_srcdir = seqs_srcdir) :
  file(s) not found: /home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz/Medtr4_0_chr1.fa

This is weird because I can look at this folder and it is there:

[Image: BSgenome screenshot]

How can I get this to work? Suggestions for easier ways to generate coverage plots are also welcome.

Bioconductor BSgenome Biostrings Gviz • 3.2k views
Hello, I want to know if your problem has been solved, because I have encountered the same problem.

In the seed file (a text file in the seqs_srcdir folder), seqs_srcdir should be a directory, not a tar.gz file. In the example above, seqs_srcdir points to a tar.gz file, which causes the error.

6.1 years ago

You've set the sequences directory (seqs_srcdir) to a single .tar.gz file. It should be a folder containing the extracted FASTA files.
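To check this before forging, something like the sketch below may help. It uses base R only. `missing_seq_files()` is a hypothetical helper written for this post, not part of BSgenome, and the extraction directory in the commented-out lines is an assumed location. The `.fa.gz` suffix is also an assumption based on the files having been gzipped; the error above shows that `.fa` was being looked for, so the seed file would probably need a matching `seqfiles_suffix` entry if the files really end in `.fa.gz`.

```r
# Hypothetical helper (not part of BSgenome): return the sequence files that
# are expected under seqs_srcdir but are not present on disk.
missing_seq_files <- function(seqs_srcdir, seqnames, suffix = ".fa") {
  if (!dir.exists(seqs_srcdir)) {
    stop("seqs_srcdir is not a directory: ", seqs_srcdir)
  }
  paths <- file.path(seqs_srcdir, paste0(seqnames, suffix))
  paths[!file.exists(paths)]
}

# Same sequence names as in the seed file above.
seqnames <- c(paste0("Medtr4_0_", "chr", 1:8),
              paste0("Medtr4_0_", "scaffold", sprintf("%04d", 1:2179)))

# Assumed layout: extract the archive into a plain directory, then check it.
# The internal layout of the archive is unknown, so the files may end up in a
# subfolder of seqs_dir; adjust the path accordingly.
# tarball  <- "/home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz"
# seqs_dir <- "/home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0"
# untar(tarball, exdir = seqs_dir)
# missing_seq_files(seqs_dir, seqnames, suffix = ".fa.gz")

# Small demonstration on a temporary directory with two placeholder files.
demo_dir <- tempfile("seqs_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("Medtr4_0_chr1.fa.gz", "Medtr4_0_chr2.fa.gz")))
demo_missing <- missing_seq_files(demo_dir, seqnames[1:3], suffix = ".fa.gz")
print(basename(demo_missing))
```

If the helper returns an empty character vector for the real directory, every file named in seqnames is where the seed file says it should be, and seqs_srcdir can be set to that directory.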


