I wrote a script that works very well: it reads IDs from one file, compares them against a genome sequence file, and prints each match to another file.
I have run this script on several ID files without problems, until now. My regular expression does not seem to match one specific ID, and I don't know why.
This is the regex:

    $key =~ m/^>([A-Z]+[0-9]+[A-Z]+(\-[A-Z])*).+$/o;
    my $header_sub = $1;
And the header whose ID does not match is:
>YER062C GPP2 SGDID:S000000864, Chr V from 280682-279930, Genome Release 64-2-1, reverse complement, Verified ORF, "DL-glycerol-3-phosphate phosphatase involved in glycerol biosynthesis; also known as glycerol-1-phosphatase; induced in response to hyperosmotic or oxidative stress, and during diauxic shift; GPP2 has a paralog, GPP1, that arose from the whole genome duplication"
I have tried several things, including deleting the ID and typing it again. I checked every stage of my script, and it is here, at the match step, that the ID "disappears".
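To isolate the match step, here is a minimal sketch that runs the same pattern against a shortened copy of the header, outside the rest of the script. The helper name `extract_id` is hypothetical and not part of the original script. The sketch also drops `/o` and checks the match result before reading `$1`, since `$1` keeps its value from an earlier successful match when the current match fails.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns the captured ID,
# or undef when the header line does not match the pattern.
sub extract_id {
    my ($key) = @_;
    # Same pattern as in the question, without /o. The match result is tested
    # first, so a stale $1 from a previous match is never returned.
    if ($key =~ m/^>([A-Z]+[0-9]+[A-Z]+(\-[A-Z])*).+$/) {
        return $1;
    }
    return;
}

# Shortened copy of the header from the question (an assumption: the real
# $key may contain characters that are not visible here).
my $header = '>YER062C GPP2 SGDID:S000000864, Chr V from 280682-279930';

my $id = extract_id($header);
print defined $id ? "matched: $id\n" : "no match\n";
```

On this literal string the pattern does match and captures `YER062C`, so if the real script still fails, the value actually held in `$key` may differ from the text shown above. That is an assumption to check, for example by printing `$key` between delimiters just before the match.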
I would be very grateful for any help!