FASTQ --> FASTA Conversion
5.0 years ago
Khosh.0 ▴ 10

Hey everyone,

I am new to the world of bioinformatics. I am currently working with my professor and have been asked to convert some FASTQ files to FASTA. There are several files and each of them is split into two (R1 and R2). Correct me if I am wrong, but I believe these are called Illumina sequences? Do I have to merge these files, or can I convert each of them to FASTA separately?

What would you suggest as a good way to convert these files to FASTA? I am using a Mac. I am still learning, so my skills are not where I would like them to be (yet). Any pointers would be appreciated.

Thank you

FASTQ FASTA Conversion • 8.0k views
Converting fastq to fasta is one thing (a format conversion), but what is your ultimate aim? You referred to "merging" them. That is a concept one step above fastq/fasta conversion, and it will only work if your reads (R1/R2) overlap.

This is a broad overview of what a sequenced fragment looks like: C: How to quantify the overlapping reads in paired-end DNA sequencing to check the

The reads will merge only if R1 and R2 together are longer than the insert, so that they overlap.
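
To make the overlap condition concrete, here is a small Python sketch of the arithmetic. It assumes "insert size" means the full length of the sequenced fragment (adapters excluded) and that R1 and R2 are read inward from opposite ends; the function name and the example numbers are made up for illustration.

```python
def expected_overlap(r1_len: int, r2_len: int, fragment_len: int) -> int:
    """Number of bases covered by both reads; 0 means nothing to merge on."""
    # A read cannot cover more of the fragment than the fragment contains.
    r1_covered = min(r1_len, fragment_len)
    r2_covered = min(r2_len, fragment_len)
    # The reads start at opposite ends, so whatever exceeds the fragment
    # length is covered twice, i.e. it is the overlap.
    return max(0, r1_covered + r2_covered - fragment_len)


print(expected_overlap(150, 150, 250))  # 50 bases of overlap, reads can merge
print(expected_overlap(150, 150, 400))  # 0, the reads never meet
```

With 2 x 150 bp reads, a 250 bp fragment leaves 50 shared bases, while a 400 bp fragment leaves a gap between the reads, so merging is not possible. A plain format conversion does not depend on any of this.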

I just need to do a format conversion of the files so that my professor can open them, so I just need a good method of converting them to FASTA. There seem to be many different ways of doing this. What is a good method for a novice? Can I use Python?

Thanks

You can use the following program from the BBMap suite.

reformat.sh in=your.fq.gz out=your.fa

This will convert your fastq files into fasta format.

Another question, sorry. How does this work in Python exactly? Am I supposed to run this in the Mac Terminal from within Python?

This is not Python at all. The BBMap suite is written in Java. What you are running above is a shell script that runs the actual Java command line (which is more complicated to write out).

I am not very familiar with Java. I am a Biology student so my knowledge of programming is lacking. I am learning as I go. Is there a simpler way of doing this in Python? Or any program that would suit me better? Thanks again.

This is about the simplest way you can do this. All you need is Java installed.

So all I need to do is install Java and run that command line and it'll output a FASTA file?

Yes, you need to install Java. You will also need to download the BBMap software from the link above and then run the command as noted. BBMap itself needs no installation: just uncompress the file you download and add the directory containing the *.sh scripts to your $PATH.

An even simpler option would be to use the sed program, available on any Linux system. No additional programs are needed.

sed -n '1~4s/^@/>/p;2~4p' file.fq > file.fa
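
Since Python was asked about: the sed one-liner keeps line 1 of every four-line record (swapping the leading @ for >) and line 2 (the sequence), and drops the + line and the quality line. A rough Python sketch of the same idea follows. The function name is made up, the ".gz" check is an assumption about how the files are named, and it assumes strictly four-line FASTQ records (no wrapped sequences). Unlike sed, it raises an error on a malformed record instead of silently skipping it.

```python
import gzip
from itertools import islice


def fastq_to_fasta(fastq_path, fasta_path):
    """Copy the header and sequence of each FASTQ record into a FASTA file.

    Returns the number of records written.
    """
    # Assumption: a ".gz" suffix means the input is gzip-compressed.
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    n_records = 0
    with opener(fastq_path, "rt") as fq, open(fasta_path, "w") as fa:
        while True:
            # Read the next four lines: header, sequence, "+", qualities.
            record = list(islice(fq, 4))
            if not record or not "".join(record).strip():
                break  # end of file (or only blank lines left)
            if len(record) < 4 or not record[0].startswith("@"):
                raise ValueError(f"malformed FASTQ record {n_records + 1}")
            header, sequence = record[0], record[1]
            # FASTA headers start with ">" instead of "@".
            fa.write(">" + header[1:].rstrip("\n") + "\n")
            fa.write(sequence.rstrip("\n") + "\n")
            n_records += 1
    return n_records
```

R1 and R2 can be converted independently by calling the function once per file, for example fastq_to_fasta("sample_R1.fastq.gz", "sample_R1.fasta") and then the same for R2 (file names here are placeholders).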

Ok, I've installed Java and I checked in the terminal to ensure that I have Java by typing in (java -version). I've also downloaded bbmap and have unzipped it. I have the bbmap folder on my desktop. How do I activate bbmap in the terminal so that I can run the script reformat.sh in=your.fq.gz out=your.fa?

Change to the BBMap directory on your desktop in a terminal window, then run the following (using real directory/file paths):

./reformat.sh in=full_path_to_your_fastq_file out=where_you_want_to_write_results/your_file.fa

I'm working on a Mac. Does this also work on a Mac? I also have a Windows machine, but I am currently working on the Mac.

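It will work fine on macOS. You could also use the sed method I posted above on a Mac, but note that the 1~4 step address is a GNU sed extension, so it needs GNU sed rather than the sed that ships with macOS.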

5.0 years ago

Welcome to the world of bioinformatics, Khosh.0.

We understand what it is like to be a beginner. However, you should really start learning how to look for what you want. Since this is just the beginning, you will eventually find that questions like these have been asked 10,000 times on different portals, including biostars.org.

It's really easy for any one of us to give you a quick answer; however, we encourage you to first start looking by yourself. The best practice is to use the 'quick search' utility right below the Biostars logo on the Biostars home page.

Another way is to look at the "Similar posts" section on the right-hand side of your post.