Forge a BSgenome data package
6.1 years ago
gtho123 ▴ 220

My supervisor has asked me to create coverage plots to visualize BAM alignments of RNA-Seq data. I thought a good way to do this would be to use Gviz. We work on the model legume Medicago truncatula, which does not have a BSgenome package, so I thought I'd try to make one.

Following the vignette, I placed each chromosome in its own FASTA file and gzipped them. I then created a seed file like so:

Package: BSgenome.Mtruncatula.JCVI.v4
Title: Full genome sequences for Medicago truncatula A17 (JCVI version 4)
Description: Full genome sequences for Medicago truncatula A17 (Barrell medic) as provided by JCVI (v4, 2014) and stored in Biostrings objects. See Tang et al. (2014) BMC Genomics 15:312
Version: 4.0
organism: Medicago truncatula A17
common_name: Barrell medic
provider: JCVI
provider_version: v4
release_date: 2014
release_name: Mt4.0
source_url: ftp://ftp.jcvi.org/pub/data/m_truncatula/Mt4.0/Assembly/JCVI.Medtr.v4.20130313.fasta
organism_biocview: Medicago_truncatula
BSgenomeObjname: Mtruncatula
seqs_srcdir: /home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz
seqnames: c(paste0("Medtr4_0_", "chr",1:8), paste0("Medtr4_0_", "scaffold",sprintf("%04d", 1:2179)))

However, when I run forgeBSgenomeDataPkg() I get this error:

Creating package in ./BSgenome.Mtruncatula.JCVI.v4
Error in getSeqSrcpaths(seqname, prefix = prefix, suffix = suffix, seqs_srcdir = seqs_srcdir) :
  file(s) not found: /home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz/Medtr4_0_chr1.fa

This is weird because I can look at this folder and it is there:

[Image: Bsgenome screenshot]

How can I fix this? Any easier methods for generating coverage plots are also welcome.

Bioconductor BSgenome Biostrings Gviz • 3.2k views

Hello, I want to know if your problem has been solved, because I have encountered the same problem.


In the seed file (a txt file in the seqs_srcdir folder), seqs_srcdir should be a directory and not a tar.gz file. In the example above, seqs_srcdir points to a tar.gz file, which causes the error.
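
To see why the error path looks the way it does, here is a rough sketch in base R. It is not the actual BSgenome code; `expected_paths()` is a hypothetical helper that only mimics what the error message suggests, namely that each sequence name is joined onto seqs_srcdir with a ".fa" suffix.

```r
# Hypothetical helper (NOT part of BSgenome): rebuilds the file paths the way
# the error message suggests, i.e. <seqs_srcdir>/<prefix><seqname><suffix>.
expected_paths <- function(seqs_srcdir, seqnames, prefix = "", suffix = ".fa") {
  file.path(seqs_srcdir, paste0(prefix, seqnames, suffix))
}

# Same sequence names as in the seed file above
seqnames <- c(paste0("Medtr4_0_", "chr", 1:8),
              paste0("Medtr4_0_", "scaffold", sprintf("%04d", 1:2179)))

# seqs_srcdir exactly as written in the seed file (an archive, not a folder)
bad_dir <- "/home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz"

paths <- expected_paths(bad_dir, seqnames)
paths[1]  # the archive name ends up being treated as a folder in the path
```

The first path is the one reported in the error: the archive name is used as if it were a directory, so nothing can be found inside it.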

6.1 years ago

You've set the sequences directory as a single .tar.gz file. It should be a folder, unzipped.
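
Something along these lines might help before re-running the forge step. It is only a sketch: `check_seq_files()` is a hypothetical helper (not a BSgenome function), and the extraction folder `Medtr4_0` is an assumed name. Note also that the error shows the forge step looking for plain `.fa` files; since you gzipped yours, you may need to set the seed file's `seqfiles_suffix` field to `.fa.gz` (check the BSgenomeForge vignette for your version).

```r
# Hypothetical helper (NOT part of BSgenome): lists the expected sequence
# files that are missing from a directory, and fails early if the path
# given is not a directory at all (e.g. a .tar.gz archive).
check_seq_files <- function(seqs_srcdir, seqnames, suffix = ".fa") {
  if (!dir.exists(seqs_srcdir)) {
    stop("seqs_srcdir is not a directory: ", seqs_srcdir)
  }
  expected <- file.path(seqs_srcdir, paste0(seqnames, suffix))
  expected[!file.exists(expected)]
}

# Paths from the question; the target folder name is an assumption.
archive  <- "/home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0.tar.gz"
seqs_dir <- "/home/gthomson/Documents/Scratch/Alignment_visualisation/Medtr4_0"

seqnames <- c(paste0("Medtr4_0_", "chr", 1:8),
              paste0("Medtr4_0_", "scaffold", sprintf("%04d", 1:2179)))

# Only runs where the archive actually exists, so the sketch is safe to source.
if (file.exists(archive)) {
  untar(archive, exdir = seqs_dir)  # unpack into a real folder
  # Assumes the files are gzipped FASTA; use suffix = ".fa" otherwise.
  missing_files <- check_seq_files(seqs_dir, seqnames, suffix = ".fa.gz")
  print(head(missing_files))        # empty if everything is in place
}
```

If the archive has a top-level folder inside it, the files will land one level deeper, so point seqs_srcdir in the seed file at whichever folder directly contains the FASTA files.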
