Hi everyone, I have been trying to edit the sequence titles within a fasta file and have come up short. Please can someone help? Sorry if this is a duplicate. I've tried searching around and there are lots of similar questions, but I don't know how to adapt them to my specific situation.
I am doing Illumina full genome sequencing of influenza viruses. The script that processes the data outputs a consensus sequence fasta file. Within that file there are 8 sequences. The file has a name based on the sample number submitted to the sequencing guys, and the sequences inside the file have titles that are actually the reference used to map the data against. I want to rename the file and all 8 sequences with the same name, but the 8 sequence titles also need to say which gene segment they are.
I have started writing a script but am now stuck. Please can anyone point me in the right direction?
This is the script I started. I'm not sure how it will display.
#!/bin/bash
set -e
# Script to copy, rename, and edit fasta files

# Echo usage information if the script is not given exactly one input file
if [ $# -ne 1 ]; then
    echo "Use: FluSeqOutputRename.sh <a fasta file you want renamed>"
    exit 1
fi

# Needs an input file
fastafile=$1

# Get path from file
DIR=$(dirname "$fastafile")
echo "$DIR"

echo "This script will edit $fastafile"
echo "Re-naming the file and editing the fasta headers"

# Check how many sequences are in the fasta file (count lines starting with '>')
# '|| true' stops set -e from aborting when grep finds no header lines
NoSeqs=$(grep -c '^>' "$fastafile" || true)
echo "There are $NoSeqs gene segments"
if [ "$NoSeqs" -ne 8 ]; then
    echo "There are NOT 8 gene segments"
    echo "Please Start Again"
    exit 1
fi

# I might be able to just do it with 1 input IF I can replace the / with - later
read -r -p "Please enter the virus name: " virname
NewFileName=$(echo "$virname" | tr '-' '_' | tr '/' '-')
echo "$virname"
echo "$NewFileName"

read -r -p "Your file will be named $NewFileName is this correct y or n?: " correct
if [ "$correct" = n ]; then
    echo "Please Start again"
    exit 1
fi
echo "I am continuing to copy and edit your file"

# Now need to copy the file and rename it
cp -i -v "$fastafile" "$DIR/$NewFileName.fasta"
#fastacopy="$DIR"/"$flutype-"$species-"$country"-"$identifier"-"$year".fasta
fastacopy="$DIR/$NewFileName.fasta"
echo "The new file is $fastacopy"

# I can make a new text file which is the header lines I want but does this even help?
echo "$virname" PB2 > "$DIR"/SequenceNames.txt
echo "$virname" PB1 >> "$DIR"/SequenceNames.txt
echo "$virname" PA >> "$DIR"/SequenceNames.txt
echo "$virname" HA >> "$DIR"/SequenceNames.txt
echo "$virname" NP >> "$DIR"/SequenceNames.txt
echo "$virname" NA >> "$DIR"/SequenceNames.txt
echo "$virname" MP >> "$DIR"/SequenceNames.txt
echo "$virname" NS >> "$DIR"/SequenceNames.txt
But that's as far as I can get. Any help gratefully received! Thank you, James
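For reference, one possible way to put SequenceNames.txt to use is to have awk swap each header line in the fasta file for the matching line of the names file. This is only a sketch under two assumptions: the 8 sequences always appear in the file in the same order as the names were written (PB2, PB1, PA, HA, NP, NA, MP, NS), and the names file is not empty. The function name rename_headers is made up for illustration.

```shell
#!/bin/bash
# Sketch: replace the Nth header in a fasta file with the Nth line of a names file.
# rename_headers is a hypothetical helper, not part of the script above.
# Assumes the sequences are in the same order as the lines in the names file.
rename_headers() {
    local names="$1" fasta="$2"
    awk '
        NR == FNR { name[NR] = $0; next }        # first file: store the new header names
        /^>/      { print ">" name[++n]; next }  # header line: swap in the next stored name
                  { print }                      # sequence line: print unchanged
    ' "$names" "$fasta"
}

# Possible use at the end of the script above (writes the edited copy):
# rename_headers "$DIR/SequenceNames.txt" "$fastafile" > "$fastacopy"
```

The output is written to the new file rather than edited in place, so the original consensus file stays untouched. Many tools read a fasta header only up to the first space, so joining the virus name and gene with an underscore instead of a space in SequenceNames.txt may be safer.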