Help with Output of FASTA Files From Excel
3.5 years ago

Hi All,

I am going to create a custom Bowtie2 index for an sgRNA library. I have the sequences of the sgRNA "barcodes" in an Excel file, as shown below:

>ACTL6A_1
GGATAGTTTCCAAGCTATTT

>ACTL6A_3
TTTGCTAATGGTCGTTCTAC

>ACTL6A_5
GTTGAAGGACATAGCCATCG

>ACTL6A_7
ACTGCAATTCCAGTCCACGA

This goes on for 7000 sgRNA sequences. I would like to output these as individual FASTA files: one FASTA sequence per file, with each file named after the sgRNA barcode identifier in the FASTA header. So, for example, file 1 would contain

>ACTL6A_1
GGATAGTTTCCAAGCTATTT

and be named ACTL6A_1.fa. Can someone help me figure out how to do this using terminal commands?

Any help would be greatly appreciated.

Thanks,

Joe

Why do you need to output them as individual files? Just copy and paste the data into a programmer's editor (Notepad or Notepad++ on Windows, or textpad on macOS), save the file as plain text, and use it as input for Bowtie2 indexing. Aligner indexing programs take a multi-FASTA file as input.
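
Text pasted out of Excel often carries Windows line endings, blank rows, or stray leading spaces. A minimal clean-up sketch, assuming the pasted text was saved as pasted.txt (a placeholder name, as is sgRNA_library.fa; the two records are taken from the question):

```shell
# Stand-in for the text pasted from Excel (placeholder file name),
# written with CRLF line endings, a blank row, and a stray leading space.
printf '>ACTL6A_1\r\nGGATAGTTTCCAAGCTATTT\r\n\r\n >ACTL6A_3\r\nTTTGCTAATGGTCGTTCTAC\r\n' > pasted.txt

# Strip carriage returns, trim leading/trailing whitespace, drop empty lines.
tr -d '\r' < pasted.txt \
    | awk '{ gsub(/^[ \t]+|[ \t]+$/, "") } NF' > sgRNA_library.fa

# Sanity check: the header count should equal the number of sgRNAs.
grep -c '^>' sgRNA_library.fa
```

For the full library, the last command should report 7000 if every sgRNA made it into the file.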

I am assuming that I need a separate file for each sgRNA to get Bowtie2 to add the FASTA header information to the alignments in the BAM file. That is what I will use with featureCounts to count how many reads from the FASTQ sequencing file aligned to each sgRNA.

Your assumption is crazy. Bowtie will take the name of each sequence from the header of each sequence (the part after the ">"), not from the file name!

And you hardly need featureCounts to count up how many reads aligned to each sequence. samtools idxstats will do that.
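
A rough sketch of that workflow, assuming single-end reads and that bowtie2 and samtools are on the PATH. The file names and the two helper functions (align_to_library, guide_counts) are made up for illustration:

```shell
# Sketch only: all file names and function names below are placeholders.

# Build one index from the multi-FASTA file, align single-end reads,
# then sort and index the resulting BAM.
align_to_library() {
    fasta=$1; reads=$2; prefix=$3
    bowtie2-build "$fasta" "$prefix"
    bowtie2 -x "$prefix" -U "$reads" | samtools sort -o "$prefix.bam" -
    samtools index "$prefix.bam"
}

# `samtools idxstats` prints: name, length, mapped reads, unmapped reads.
# Keep name and mapped count; drop the "*" row that collects unplaced reads.
guide_counts() {
    awk -F'\t' '$1 != "*" { print $1 "\t" $3 }'
}

# Example (requires bowtie2 and samtools):
#   align_to_library sgRNA_library.fa reads.fastq.gz sgRNA_library
#   samtools idxstats sgRNA_library.bam | guide_counts > guide_counts.tsv
```

The result is one line per sgRNA with its mapped-read count, which is the table featureCounts would otherwise be used to produce.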

Thanks for insulting me.

"a separate file for each sgRNA to get Bowtie2 to add the FASTA header information to the alignment in the BAM file."

No. That information comes from the FASTA headers in the multi-FASTA file.
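
To preview the names that will end up in the BAM, one option is to list the headers and check for duplicates. This is a sketch: list_names is a made-up helper, sgRNA_library.fa is a placeholder, and it assumes the aligner keeps the header text up to the first whitespace as the reference name.

```shell
# Two records from the question, written to a placeholder file.
printf '>ACTL6A_1\nGGATAGTTTCCAAGCTATTT\n>ACTL6A_3\nTTTGCTAATGGTCGTTCTAC\n' > sgRNA_library.fa

# Print each header without the ">" and without anything after the first space.
list_names() {
    awk '/^>/ { print substr($1, 2) }' "$1"
}

list_names sgRNA_library.fa

# Any name printed by this command occurs more than once in the file.
list_names sgRNA_library.fa | sort | uniq -d
```

If the second command prints nothing, every sgRNA name is unique and the per-sequence counts will be unambiguous.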

Thanks. I think I understand now. The last time I created a custom Bowtie2 index, I had to load each chromosome as a separate .fa file. I did not know that I could, in theory, have supplied a single multi-FASTA file instead.

That is correct. You can use a multi-FASTA file for index creation with all aligners.

3.5 years ago
Aimin Li ▴ 30

FYI, I have just tried doing this with the shell command 'awk':

https://www.cnblogs.com/emanlee/p/13905310.html
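
This is not necessarily what the linked post does, but a minimal awk sketch of the split asked for in the question might look like the following. The input file name and the split_fasta directory are placeholders, and the input is assumed to have each header starting at the beginning of a line.

```shell
# Two records from the question, separated by a blank line as in the post.
printf '>ACTL6A_1\nGGATAGTTTCCAAGCTATTT\n\n>ACTL6A_3\nTTTGCTAATGGTCGTTCTAC\n' > sgRNA_library.fa

mkdir -p split_fasta

# On each header line, close the previous output file and start a new one
# named after the identifier; write non-empty lines to the current file.
awk '
    /^>/     { if (out) close(out); out = "split_fasta/" substr($1, 2) ".fa" }
    out && NF { print > out }
' sgRNA_library.fa

ls split_fasta
```

Closing each file before opening the next keeps awk from running out of file handles across 7000 records. If two records share an identifier, the later one overwrites the earlier file.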
