Question: FastTree Installation for Windows 10
0
7 days ago by
keeley.mazurkiewicz50 wrote:

I am trying to run octoFLU with Python, but it requires FastTree to be installed. Since I am using Windows 10, I need help installing FastTree so that Python can find it. I have never used any of these programs before, and I cannot find any resources to help me troubleshoot.

ubuntu fasttree windows python • 135 views
ADD COMMENTlink modified 7 days ago by jared.andrews076.9k • written 7 days ago by keeley.mazurkiewicz50
4
7 days ago by
jared.andrews076.9k
Memphis, TN
jared.andrews076.9k wrote:

Use the Windows Subsystem for Linux and conda to install python and fasttree easily and with essentially zero configuration required.

ADD COMMENTlink written 7 days ago by jared.andrews076.9k

I had to enable the Windows Subsystem for Linux when I installed Anaconda, Docker, Ubuntu, and Sublime Text Editor. Conda commands don't work in Anaconda. I have tried them for BLAST and MAFFT.

ADD REPLYlink written 7 days ago by keeley.mazurkiewicz50

Gonna need a bit more info than that. Don't work how?

ADD REPLYlink written 7 days ago by jared.andrews076.9k
1
(base) C:\Users\mazur\Desktop\octoFLU-master>conda install fasttree
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - fasttree

Current channels:

  - https://conda.anaconda.org/conda-forge/win-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://conda.anaconda.org/bioconda/win-64
  - https://conda.anaconda.org/bioconda/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch
  - https://conda.anaconda.org/default/win-64
  - https://conda.anaconda.org/default/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

We tried this for MAFFT and BLAST as well. It didn't work for those either. I got MAFFT to work somehow. I think I had to do a Google search for mafft.bat and dig deep for that file after uninstalling MAFFT. Or maybe I found some code that got it to transfer from Ubuntu to Anaconda. I honestly don't remember. BLAST was an easy download from NCBI.

ADD REPLYlink modified 4 days ago by h.mon31k • written 7 days ago by keeley.mazurkiewicz50
1

I am doing a bioinformatics project, specifically in phylogenetic analysis. The pipeline that I want to run is called octoFLU.

Website: https://github.com/flu-crew/octoFLU

I spoke to Tavis Anderson from the USDA about this pipeline. He requested that I install Ubuntu, Sublime Text Editor, Docker, and Anaconda. After our meeting, I found out that I needed to install dendropy, smof, blastn, mafft, and fasttree. I managed to install everything except FastTree. I have been having issues since I use Windows 10, but after talking to a professor at the Department of Computer Information Technology at Purdue University, it seems that this pipeline is just very difficult. He tried installing MAFFT on Windows and on Unix machines with no success. I managed somehow to install MAFFT, but FastTree is the elephant in the room.

The GitHub page links to this website for FastTree installation: http://www.microbesonline.org/fasttree/

For Windows, FastTree is a Windows command-line executable (no SSE). When I downloaded it, it was an application. Its location on my computer is C:\Users\mazur\Desktop\FastTree.exe. When I run it as administrator, it says "Windows Users: Please remember to run this inside a command shell." I don't know what that means. I think Ubuntu "recognizes" it, but Anaconda does not.

ADD REPLYlink written 7 days ago by keeley.mazurkiewicz50
2

Okay, so you are running Ubuntu (installed from Windows Store) on the Subsystem for Linux and you've installed anaconda on Ubuntu, not on Windows, correct? It looks like you might have installed it on Windows based on your channel setup. If so, that's your issue. I was able to install fine on Ubuntu running on the Windows Subsystem for Linux with the following commands (excludes anaconda installation):

Ensure the necessary channels are added:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

Additionally, you probably want to create a new conda environment for this rather than installing into the base conda environment. You can then activate that environment and install the necessary software into it:

conda create -n ft python=3.8 
conda activate ft  
conda install -c bioconda mafft fasttree smof blast dendropy

Those exact commands worked just fine for me. I'm on a Windows 10 PC, but all of that should be run in the Ubuntu terminal. All of your additional analyses should be run from the Ubuntu terminal as well. If you're going to be doing bioinformatics on a Windows machine, you will very much need to become familiar with that setup.
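
A quick way to confirm the install worked is to ask the shell whether each program can be found on the PATH while the ft environment is active. This is only a minimal sketch: missing_commands is a hypothetical helper written for illustration, and the program names in the example are the ones that appear in the octoFLU.sh variables quoted later in this thread.

```shell
# Print the name of every command that is NOT found on the current PATH.
# Prints nothing when all of the given commands are available.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Example (run in the Ubuntu terminal after `conda activate ft`):
#   missing_commands blastn makeblastdb smof mafft FastTreeMP
```

If the example prints nothing, every program was found. Any name it prints is one the active environment does not provide under that exact spelling.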

ADD REPLYlink modified 7 days ago • written 7 days ago by jared.andrews076.9k

I followed the instructions here (https://www.digitalocean.com/community/tutorials/how-to-install-anaconda-on-ubuntu-18-04-quickstart) to install Anaconda in Ubuntu, but now I cannot find my octoFLU-master folder in Ubuntu. The folder can be found in Anaconda on Windows, though, when I use cd C:\Users\mazur\Desktop\octoFLU-master. Is there any way you can walk me through the steps? I think Zoom or Microsoft Teams would be better since I can share my screen with you.

ADD REPLYlink modified 7 days ago • written 7 days ago by keeley.mazurkiewicz50
1

I have the commands to both create and activate the new environment above:

conda create -n ft python=3.8 
conda activate ft

Anyway, once you install conda on Ubuntu, my commands above will be all you need. Installing conda on Ubuntu is as easy as:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

And follow the prompts to finish the install. Then my commands above will install all the dependencies you need.

Alternatively, you can just use the docker image they provide, which shouldn't require you to install anything other than docker. The octoFLU Github page has pretty explicit directions for how to use it.
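
On not finding the octoFLU-master folder from Ubuntu: WSL normally mounts Windows drives under /mnt, so C:\Users\mazur\Desktop\octoFLU-master should be reachable as /mnt/c/Users/mazur/Desktop/octoFLU-master. The mapping can be sketched as below. Note that win_to_wsl_path is a hypothetical helper written for illustration, and it assumes the default /mnt mount point. Inside WSL, the built-in wslpath utility performs the same conversion.

```shell
# Convert a Windows path such as C:\Users\name\Desktop into the location
# where WSL mounts it by default: /mnt/<lowercase drive letter>/...
win_to_wsl_path() {
    # First character is the drive letter; lowercase it.
    drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
    # Everything after "C:" with backslashes turned into forward slashes.
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Example:
#   cd "$(win_to_wsl_path 'C:\Users\mazur\Desktop\octoFLU-master')"
```

The single quotes around the Windows path matter, because they stop the shell from treating the backslashes as escape characters.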

ADD REPLYlink written 7 days ago by jared.andrews076.9k

I don't understand how you're able to get this to run and I can't. I have installed Anaconda and the blast, mafft, etc. programs in Ubuntu, but I cannot get the code of octoFLU to work. I have never done this before and all the Python books and experts I have asked couldn't figure this out. I am wasting time trying to run this code and it is only on the same dataset. I still need to run this program on my own data... and somehow figure out how to convert my dataset file from .txt to .fasta

ADD REPLYlink written 7 days ago by keeley.mazurkiewicz50

Have you considered opening an issue on their Github page? Did you try using their docker image?

What format is your data in? Converting to FASTA format is typically trivial if you already have sequence information.

ADD REPLYlink written 7 days ago by jared.andrews076.9k

I met with the co-author of octoFLU and we couldn't get the code to work. I don't know Ubuntu. We only used Anaconda, but stopped because it wouldn't recognize FastTree. We tried Docker, but after it ran the code, we couldn't find the results. My dataset is in Notepad, so it is a text file.

ADD REPLYlink written 6 days ago by keeley.mazurkiewicz50
2

The person who wrote the software couldn't get it to run? You have tried to run it after installing fasttree and all other dependencies in Ubuntu as described? What error are you running into with octoFLU specifically? We need specific commands and error messages to have any chance of helping you.

When using docker, did you follow their instructions for getting the results copied outside of docker?

Yes, it's a text file, but what is the actual format of your data file? What are the first few lines of the file?

ADD REPLYlink written 6 days ago by jared.andrews076.9k
2

Well, I am assuming the issues were due to running Anaconda on Windows. I am following your instructions to a t. So I have some questions for you.

1) I created the new environment ft. If I were to close Ubuntu, how would I activate that environment... or is it not a permanent environment variable?

2) I only know how to open octoFLU.py script not the octoFLU.sh file on my Windows 10 PC. Can I open octoFLU.sh in Sublime Text Editor, so I can edit the paths in octoFLU.sh to connect blastn, makeblastdb, smof, mafft, and fasttree? *I determined their locations/paths by using the which command in Ubuntu.

3) I am lost on what to do after that, probably for two reasons: (1) I have never done any computer programming before and (2) we were doing the Windows script, not the Linux script. Note: the author uses only Mac, so we were troubleshooting as we walked through the installation of the prerequisite programs.

4) Text document to FASTA file - I used EMBOSS SEQRET Converter to convert my text formatted genetic sequences to FASTA format, so it is a text file of a list of FASTA formatted sequences. I did 10 files, one for each gene segment plus I divided the HA and NA genes by subtype. So PB2, PB1, PA, H1, H3, NP, N1, N2, M, NS are all separate files.

ADD REPLYlink modified 6 days ago • written 6 days ago by keeley.mazurkiewicz50
1

1.) My command above will activate the environment - conda activate ft. You should see (ft) next to your command line prompt after doing so. I highly recommend reading the conda manual, as it will help you determine how to manage environments. It's pretty straightforward.

2.) The sh file is just text, so you can open it with Sublime Text, Notepad, etc. FASTA files are just text as well. A file extension only suggests that a file has a specific format; it does not guarantee that the contents actually are in that format. You can open a file in Notepad, type whatever you want, name it whatever you want, and you'll still be able to open it just fine in Notepad. The default file extension is .txt just so that programs and users are aware it's a text file before opening it, but there's nothing enforcing that it actually is. You should be fine to edit that file in Sublime.

4.) Okay, so your files are already in FASTA format! All you have to do is rename them so that the program will recognize them as such. Just replace the .txt extension with .fa.

3.) I feel your frustration. But you're close! Once you rename your input files, you should be able to run the octoFLU.sh script - bash octoFLU.sh your_data/your_sample.fasta. If that doesn't work for whatever reason, give us the exact command you use and the error it spits out. You will have to run this on each of your input files if you don't concatenate them all together.
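
The rename step can be sketched as a small shell helper. This is only an illustration under stated assumptions: looks_like_fasta and rename_txt_to_fa are hypothetical names, and "FASTA format" is checked only loosely, by requiring that the first non-empty line starts with ">".

```shell
# Succeed if the first non-empty line of the file starts with ">",
# which is how a FASTA header line begins.
looks_like_fasta() {
    first=$(grep -v '^[[:space:]]*$' "$1" | head -n 1)
    case "$first" in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Rename every FASTA-looking .txt file in a directory to .fa.
# Files that do not start with a FASTA header are left untouched.
rename_txt_to_fa() {
    dir="$1"
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue    # the pattern matched no files
        if looks_like_fasta "$f"; then
            mv -- "$f" "${f%.txt}.fa"
        else
            echo "skipping $f: first line is not a FASTA header" >&2
        fi
    done
}
```

With the files renamed, something like `for f in your_data/*.fa; do bash octoFLU.sh "$f"; done` would run the pipeline once per file, where your_data is a placeholder directory name.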

ADD REPLYlink modified 6 days ago • written 6 days ago by jared.andrews076.9k

Ok. I ran the pipeline in Ubuntu, but the output cannot be found. I did not get an error message though.

ADD REPLYlink written 6 days ago by keeley.mazurkiewicz50

Can you post the exact command you used?

ADD REPLYlink modified 6 days ago • written 6 days ago by jared.andrews076.9k

octoFLU-master is on my Desktop. I edited the 7th comment in octoFLU.sh to connect the paths.

# ===== Connect your reference here
REFERENCE=reference_data/reference.fa

# ===== Connect your programs here, assuming installed on your system
BLASTN=/home/misneach/miniconda3/envs/ft/bin/blastn
MAKEBLASTDB=/home/misneach/miniconda3/envs/ft/bin/makeblastdb
SMOF=/home/misneach/miniconda3/envs/ft/bin/smof
MAFFT=/home/misneach/miniconda3/envs/ft/bin/mafft
FASTTREE=/home/misneach/miniconda3/envs/ft/bin/FastTreeMP
NN_CLASS=treedist.py

I saved it under the same file name in the octoFLU-master folder on my Desktop. Then I ran conda activate ft, cd octoFLU, and bash octoFLU.sh sample_data/query_sample.fasta. It ran, but the results aren't in my octoFLU-master folder. We had the same issue when we ran the script in Docker.

ADD REPLYlink modified 4 days ago by genomax89k • written 5 days ago by keeley.mazurkiewicz50

The output folder is called query_sample.fasta_Final_Output, but it is empty.

ADD REPLYlink written 5 days ago by keeley.mazurkiewicz50

Can you paste the output it spat out on the console as it ran?
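
While collecting that output, one thing worth ruling out is a program path in octoFLU.sh that does not point to an executable file. A minimal sketch follows, assuming the assignments have the NAME=/absolute/path form shown above. check_tool_paths is a hypothetical helper, not part of octoFLU, and it deliberately ignores relative entries such as REFERENCE and NN_CLASS.

```shell
# Report every NAME=/absolute/path assignment in a script whose path
# is not an existing, executable file. Prints nothing if all are fine.
check_tool_paths() {
    # Keep only lines that look like BLASTN=/some/absolute/path
    grep -E '^[A-Z_]+=/' "$1" | while IFS='=' read -r name path; do
        if [ ! -f "$path" ] || [ ! -x "$path" ]; then
            printf 'MISSING %s -> %s\n' "$name" "$path"
        fi
    done
}

# Example:
#   check_tool_paths octoFLU.sh
```

An empty result only shows that the configured programs exist and are executable. It says nothing about whether the pipeline itself then ran correctly.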

ADD REPLYlink written 4 days ago by jared.andrews076.9k
1

Thank you so much for all your help! I met with the other author of the code and we were able to locate the file and run my data set this morning. Seriously without your help I would have definitely lost my marbles.

ADD REPLYlink written 4 days ago by keeley.mazurkiewicz50
1

Hurrayyy. I was getting a bit worried. If you found my answer helpful, consider accepting it so that others will recognize the question has been addressed appropriately without reading through this quite long thread.

ADD REPLYlink written 4 days ago by jared.andrews076.9k

Additionally, can you point out where the data ended up in relation to your working directory so that others that run into the same issue will have an idea as to how it can be resolved?

ADD REPLYlink written 4 days ago by jared.andrews076.9k