How To Read A Fasta File With The Bio3D Package
2
1
Entering edit mode
13.2 years ago
laemtao ▴ 40

Problem: I'm using the bio3d package in R to read a fasta file with about 4 sequences. When I try to read the fasta file, I get the error message that the '>' character cannot be found. This would mean two things to me, either the fasta file is corrupted with invisible ascii characters or the permissions on the file are wrong. I checked both conditions and I am still not able to read my file.

> foo <-system.file("~/Homology/seq_temp.fa", package="bio3d")
> aln <- read.fasta(foo)
1: attributes(aln)
2: 
Error in read.fasta(foo) : 
  read.fasta: no '>' id lines found, check file format

The example xray.fa is working correctly:

> foo <-system.file("examples/hivp_xray.fa", package="bio3d")
> aln <- read.fasta(foo)
> attributes(aln)
$names
[1] "id"  "ali"

$class
[1] "fasta"

I'm not too sure what the problem can be. Writing a fasta file (unaligned or aligned) is pretty fool proof, I have no idea what is causing this error.

Thank you in advance.

r bioconductor fasta • 7.2k views
ADD COMMENT
0
Entering edit mode

Please paste the first few lines of your fasta file here, e.g. use "head -5 ~Homology/seq_temp.fa". Somebody here might spot something obviously wrong with it.

ADD REPLY
0
Entering edit mode

Also, can you please rename the question to something like a question - the title looks awful and is giving me a headache :)

ADD REPLY
0
Entering edit mode

I am not receiving any output, I know I should be:

head -5 ~Homology/seq_temp.fa
head - 5 ~ Homology/seq_temp.fa
ADD REPLY
0
Entering edit mode

Regardless, here is the first sequence from my fasta file:

>gi|86159715|ref|YP_466500.1| FAD-dependent pyridine nucleotide-disulphide oxidoreductase [Anaeromyxobacter dehalogenans 2CP-C]
MRVAIIGSGPAGFYAAEALLKRTDTAVDVDMFDRLPTPFGLVRGGVAPDHQRIKAVTRVFASTAARPTFR
FLGNVRLGRDVTVDDLRRHYHQIVYATGSESDRRLGIPGEGIERCTPATVFVGWYNGHPDYRHARFDLSV
RRAAVVGNGNVAVDVARILLRTRAELERTDIAAHALEALRESQVREVYLLGRRGPAQAAFSPAELRELGT
[abridged]
ADD REPLY
0
Entering edit mode

post your file, your file is empty possibly!

ADD REPLY
0
Entering edit mode

this is not the point, check my answer again!

ADD REPLY
0
Entering edit mode

Thanks, it works! I can start plotting now!

ADD REPLY
3
Entering edit mode
13.2 years ago
Michael 54k

Use the Biostrings package functions read.DNAStringset, read.AAStringset or readFASTA instead.

Edit: Nope, you simply copy pasted from the example: try

foo <- file("~/Homology/seq_temp.fa")

system.file is only for files in the R-installation!

ADD COMMENT
0
Entering edit mode

Thank you, I was not familiar with Biostrings. I've installed it and I am receiving the following output:

ADD REPLY
0
Entering edit mode
>moo = system.file("/Homology/seq_temp.fa", package="Biostrings")
moo = readFASTA(moo)
Error in readFASTA(moo) : no FASTA sequences found
In addition: Warning messages:
1: In readFASTA(moo) :
  use 'strip.descs=FALSE' for compatibility with old version
  of readFASTA(), or 'strip.descs=TRUE' to remove the "&gt;"
  at the beginning of the description lines and to get
  rid of this warning (see '?readFASTA' for more details)
2: In file(file, "r") :
  file("") only supports open = "w+" and open = "w+b": using the former
ADD REPLY
0
Entering edit mode

your file is definitely broken!

ADD REPLY
0
Entering edit mode

read ?system.file

ADD REPLY
0
Entering edit mode

As Michael mentioned stop using the system.file() command, use file() instead.

ADD REPLY
0
Entering edit mode

Thanks, it works! I can start plotting now!

ADD REPLY
0
Entering edit mode

That's great. Laemtao, if you are happy with the answer, please accept it by clicking the check-mark on the side. Thanks.

ADD REPLY
0
Entering edit mode
13.1 years ago

The R system.file() command is the problem here. That is used in the bio3d documentation to indicate that the file ("examples/hivp_xray.fa") is somewhere in the bio3d package directory, wherever that may be in your system.

When reading your own files, all you need is the actual path:

>foo <-"~/Homology/seq_temp.fa"
>aln <- read.fasta(foo)
ADD COMMENT

Login before adding your answer.

Traffic: 2389 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6