8.9 years ago
mwanerhi erfgtr
I want to extract a set of protein sequences from a FASTA file, using a list of their IDs stored in a second file. I modified a Perl script to do this, but it fails with errors:
#!/usr/bin/perl

if ($#ARGV != 1) {
    print "usage: extracting sequences.pl IDFILE fasta_library > outputfile\n";
    exit;
}
$home/samuel/Desktop/Darwin/Research/Masters/Analysis/BASYS/L111/L111ids.txt =$ARGV [0];
$home/samuel/Desktop/Darwin/Research/Masters/Analysis/BASYS/L111/L111.faa =$ARGV [1];

use strict;
use Bio::DB::Fasta;

my $database;
my $fasta_library = $ARGV[1]; #path to fastalibrary in the second argument
my %records;

open IDFILE, "<$ARGV[0]" or die $!; #first argument is the path of the file containing all the IDs you need to extract
open OUTPUT, <STDOUT>;

# creates the database of the library, based on the file
$database = Bio::DB::Fasta->new("$fasta_library") or die "Failed to creat Fasta DP object on fasta library\n";

# now, it parses the file with the fasta headers you want to get
while (<IDFILE>) {

    my ($id) = (/^>*(\S+)/); # capture the id string (without the initial ">")
    my $header = $database->header($id);
    #print "$header\n";
    print ">$header\n", $database->seq( $id ), "\n";
    print OUTPUT ">$header\n", $database->seq( $id ), "\n";
}

#remove the index file that is useless for user
unlink "$fasta_library.index";

#close the filehandles
close IDFILE;
close OUTPUT;
exit;
Here is the error it shows:
:~/Desktop/Darwin/Research/Masters/Analysis/BASYS/L111$ perl extractingsequences.pl
Can't modify concatenation (.) or string in scalar assignment at extractingsequences.pl line 22, near "];"
BEGIN not safe after errors--compilation aborted at extractingsequences.pl line 25.
What am I doing wrong?
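For comparison, here is a minimal sketch of the extraction described above. The error most likely comes from the two assignments whose left-hand side is a file path rather than a variable name: Perl reads `$home/samuel/...` as division and `.txt` as string concatenation, neither of which can be assigned to. The sketch takes both paths from `@ARGV` only. It deliberately does not use `Bio::DB::Fasta`; it streams the FASTA file with core Perl instead, and it assumes each record's ID is the first whitespace-delimited word of its header line. The helper names `read_ids` and `extract_records` are hypothetical, chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one ID per line. A leading ">" and anything after the first
# whitespace are ignored, matching the regex used in the question.
sub read_ids {
    my ($fh) = @_;
    my %wanted;
    while (my $line = <$fh>) {
        my ($id) = $line =~ /^>*(\S+)/ or next;    # skip blank lines
        $wanted{$id} = 1;
    }
    return \%wanted;
}

# Stream the FASTA input and copy every record whose ID (assumed to be
# the first word of the header) is in %$wanted.
# Returns the number of records written.
sub extract_records {
    my ($wanted, $in, $out) = @_;
    my ($keep, $count) = (0, 0);
    while (my $line = <$in>) {
        if ($line =~ /^>(\S+)/) {                  # header line starts a new record
            $keep = exists $wanted->{$1};
            $count++ if $keep;
        }
        print {$out} $line if $keep;               # header and sequence lines alike
    }
    return $count;
}

if (@ARGV == 2) {
    # Paths come from the command line; nothing is hard-coded in the script.
    my ($id_file, $fasta_file) = @ARGV;
    open my $ids,   '<', $id_file    or die "Cannot open $id_file: $!";
    open my $fasta, '<', $fasta_file or die "Cannot open $fasta_file: $!";
    extract_records(read_ids($ids), $fasta, \*STDOUT);
}
else {
    print STDERR "usage: $0 IDFILE FASTA > outputfile\n";
}
```

Note that the command shown in the question passes no arguments at all. Assuming the two input files sit in the current directory, an invocation would look like `perl extractingsequences.pl L111ids.txt L111.faa > L111_subset.faa`, where the output file name is only an example.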