Help with Output of FASTA Files From Excel
7 months ago

Hi All,

I am going to create a custom Bowtie2 index for an sgRNA library. I have the sequences of the sgRNA "barcodes" in an Excel file, as shown below:

>ACTL6A_1
GGATAGTTTCCAAGCTATTT

>ACTL6A_3
TTTGCTAATGGTCGTTCTAC

>ACTL6A_5
GTTGAAGGACATAGCCATCG

>ACTL6A_7
ACTGCAATTCCAGTCCACGA


This goes on for 7000 sgRNA sequences. I would like to output these as individual FASTA files: one sequence per file, with each file named after the sgRNA barcode identifier in its FASTA header. For example, file 1 would contain:

>ACTL6A_1
GGATAGTTTCCAAGCTATTT


and be named ACTL6A_1.fa. Can someone help me figure out how to do this using terminal commands?

Any help would be greatly appreciated.

Thanks,

Joe

sequence • 360 views

Why do you need to output them as individual files? Just copy and paste the data into a programmer's editor (Notepad or Notepad++ on Windows, or TextEdit in plain-text mode on macOS). Save the file as plain text and use it as input for Bowtie2 indexing. Aligner indexing programs take a multi-FASTA file as input.


I am assuming that I need a separate file for each sgRNA to get Bowtie2 to add the FASTA header information to the alignments in the BAM file. I will then use featureCounts on that BAM file to count how many reads from the FASTQ sequencing file aligned to each sgRNA.


Your assumption is crazy. Bowtie will take the name of each sequence from the header of each sequence (the part after the ">"), not from the file name!

And you hardly need featureCounts to count up how many reads aligned to each sequence. samtools idxstats will do that.
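As a rough sketch of that route (not a tested pipeline): the function below aligns single-end reads, sorts and indexes the BAM, and prints the mapped-read count per sgRNA. The function names and the three arguments (index prefix, FASTQ file, output BAM) are hypothetical, and it assumes bowtie2 and samtools are on your PATH.

```shell
# Keep column 1 (reference name) and column 3 (mapped reads) from
# idxstats output, whose columns are: name, length, mapped, unmapped.
mapped_counts() {
  cut -f 1,3
}

# Hypothetical helper; arguments: <index_prefix> <reads.fastq> <out.bam>
count_reads_per_sgrna() {
  # Align unpaired reads and write a coordinate-sorted BAM.
  bowtie2 -x "$1" -U "$2" | samtools sort -o "$3" -
  # idxstats needs the BAM to be indexed.
  samtools index "$3"
  samtools idxstats "$3" | mapped_counts
}
```

You would call it as `count_reads_per_sgrna sgrna_library reads.fastq aligned.sorted.bam` (placeholder file names). The last line of the output, whose name is `*`, covers reads that were not assigned to any reference sequence.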


Thanks for insulting me.


"a separate file for each sgRNA to get Bowtie2 to add the FASTA header information to the alignment in the BAM file."

No. That information comes from the fasta headers in the multi-fasta file.


Thanks. I think I understand now. When I previously built a custom Bowtie2 index, I loaded each chromosome as a separate .fa file. I did not know that I could have used a single multi-FASTA file instead.


That is correct. You can use a multi-fasta file for index creation with all aligners.
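For Bowtie2 that could look roughly like the sketch below. The file names and the index prefix `sgrna_library` are placeholders, and the two records are just the first two from the question. It strips carriage returns and blank lines that a spreadsheet copy-paste may leave behind, counts the records as a sanity check, and builds the index only if `bowtie2-build` is installed.

```shell
# Placeholder input: the pasted text saved as plain text.
cat > sgrna_library.fa <<'EOF'
>ACTL6A_1
GGATAGTTTCCAAGCTATTT

>ACTL6A_3
TTTGCTAATGGTCGTTCTAC
EOF

# Drop Windows carriage returns and empty lines.
tr -d '\r' < sgrna_library.fa | awk 'NF' > sgrna_library.clean.fa

# Sanity check: one header line per sgRNA.
n_records=$(grep -c '^>' sgrna_library.clean.fa)
echo "records: $n_records"

# Build the index from the single multi-FASTA file (skipped if Bowtie2 is absent).
if command -v bowtie2-build >/dev/null 2>&1; then
  bowtie2-build sgrna_library.clean.fa sgrna_library >/dev/null
fi
```

The reference names in the resulting alignments then come from the text after each ">" in that one file.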

7 months ago
Aimin Li ▴ 30

I just tried doing this with the shell command 'awk', FYI:

https://www.cnblogs.com/emanlee/p/13905310.html
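
In case the link goes away, here is a minimal sketch of one way to do the split with awk. It is not necessarily the same as the command in the linked post. The input name `sgrna_library.fa` is a placeholder, and the two records are the first two from the question.

```shell
# Placeholder input: the multi-FASTA text saved as plain text.
cat > sgrna_library.fa <<'EOF'
>ACTL6A_1
GGATAGTTTCCAAGCTATTT

>ACTL6A_3
TTTGCTAATGGTCGTTCTAC
EOF

awk '
  # Strip a trailing carriage return in case the file has Windows line endings.
  { sub(/\r$/, "") }
  # Header line: close the previous file and name the next one after the identifier.
  /^>/ { if (out) close(out); out = substr($1, 2) ".fa" }
  # Write every non-blank line to the current output file.
  NF && out { print > out }
' sgrna_library.fa
```

This writes ACTL6A_1.fa and ACTL6A_3.fa into the current directory, each holding its header and sequence. Closing each file before opening the next keeps awk from running out of open file handles across 7000 records. Note that a repeated identifier would overwrite the earlier file.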