Supplemental Methods: Software Installations

1 Install Conda (MacOSX)

First check if you have conda on your system by opening terminal and typing conda then pressing enter. If the command was not found you need follow up with this installation. If you already have it installed then run the command conda update conda.

conda
conda update conda

These are the instructions for installing Miniconda, a mini version of Anaconda that includes just conda, its dependencies, and Python. You can also install the full anaconda following this link.

Use this link to find the correct Miniconda you need to download for your system such as MacOSX, Windows, and Linux.

  1. Once you have downloaded it you will need to open terminal.
  2. Now that terminal is open you will need to use this command: miniconda-filename.sh. The “miniconda-filename.sh” should be replaced with the actual name of the file you downloaded. Your file name will look something like Miniconda2-latest-MacOSX-x86_64.sh
  3. Continue to press enter, and type yes when it is needed.
  4. When it is all completed we recommend running conda update conda.
  5. Then check which conda you have by running conda --version.

Use this command if you do not prefer conda’s base environment to be activated on startup in terminal.

conda config --set auto_activate_base false

Here is a link to some conda commands that may come in handy.

2 Install PanACoTa

We create a new Python environment (here it’s version 3.6 of Python) on the Mac Terminal:

conda create -y --name PanACoTa python=3.6
conda activate PanACoTa

2.1 Dependencies

We install all dependencies in the new PanACoTa environment on the Mac Terminal:

conda install -c conda-forge -c bioconda -c defaults prokka
conda install -c conda-forge -c bioconda mmseqs2
  • For prepare module: mash (to filter genomes) From https://github.com/marbl/Mash/releases We get mash-OSX64-v2.2.tar, unzip and move the ‘mash’ binary to /usr/local/bin
  • For annotate module: prokka and/or prodigal (to uniformly annotate your genomes)
  • For pangenome module: mmseqs (to generate pangenomes)
  • For align module: mafft (to align persistent genome) From https://mafft.cbrc.jp/alignment/software/macstandard.html. We download and install mafft-7.453-signed.pkg
  • For tree module: IQ Tree, FastTreeMP, FastME or Quicktree (to infer a phylogenetic tree) From http://www.iqtree.org We get the Stable release 1.6.12 (August 15, 2019) and move the ‘iqtree’ binary to /usr/local/bin

2.2 PanACoTA

PanACoTA Documentation:

By cloning the GitHub repository, you will then be able to update the code to new versions very easily and quickly. Here is how to clone the repository and install PanACoTA from it on the Mac Terminal:

git clone https://github.com/gem-pasteur/PanACoTA.git
cd PanACoTA
./make --user

To update source code to the new version: Here we use Version 1.0.1

cd PanACoTA
git pull 
./make upgrade

3 Install Prokka

Guide to installing Prokka using this link: https://github.com/tseemann/prokka

Install using conda through terminal. You can try installing it into your specific anvi’o environment but that did not work for us. So, we installed Prokka on our base python environment.

Run the command:

conda install -c conda-forge -c bioconda -c defaults prokka

Then check if you have prokka installed by running the command prokka -v

4 Install anvi’o v7.1

Follow instructions here: https://merenlab.org/2016/06/26/installation-v2/#3-install-anvio

If you are using Windows WSL2 Linux through Ubuntu 20.04 LTS and get to the step of doing the anvi-self-test, and the html site cannot be reached. Then try and use 127.0.0.1:8080 or http://localhost:8080/ to get the interactive display to work.

5 Install PPanGGOLiN

We create a new Python environment (here it’s version 3.6 of Python) on terminal:

conda create -y --name PPanGGOLiN python=3.6
conda activate PPanGGOLiN

For basic info on PPanGGOLiN go here: https://github.com/labgem/PPanGGOLiN/wiki

The public release doesn’t include the “Regions of Genome Plasticity” analysis, so we want the version from GitHub.

First we install on terminal the required dependencies:

conda activate PPanGGOLiN
conda install --file other/requirements.txt

Also on terminal we install PPanGGOLiN:

pip install git+https://github.com/labgem/PPanGGOLiN.git#egg=PPanGGOLiN
#pip install --upgrade git+git://github.com/labgem/PPanGGOLiN.git

6 Install GET_HOMOLOGUES

This should be done in your MacOSX terminal or in your Linux WSL2 subsystem if you are using Windows.

For more info on get_homologues go here: http://eead-csic-compbio.github.io/get_homologues/manual/

Run the following commands below to clone the current get_homologues software from Github onto your system:

cd # path of directory of where you want get_homologues to be
git clone https://github.com/eead-csic-compbio/get_homologues.git

Run this perl script and install and necessary packages it may ask for.

perl install.pl

Once you have finished these steps you can update get_homologues anytime by:

cd # this should be the path to your get_homologues
git pull

Test if it is on your system by running:

./get_homologues.pl -v 

The output of this will be the version and credits of get_homologues.

7 Install GET_PHYLOMARKERS

This should be done in your MacOSX terminal or in your Linux WSL2 subsystem if you are using Windows.

For more info on get_phylomarkers go here: https://vinuesa.github.io/get_phylomarkers/

Run the following commands below to clone the current get_pylomarkers software from Github onto your system:

cd # path of directory of where you want get_homologues to be
git clone https://github.com/vinuesa/get_phylomarkers.git

After get_phylomarkers has been installed and you are within its directory use the command below. This will install the Rpackages that get_phylomarkers needs. If you are getting errors then you may need to install xquartz and updated R cran packages manually. We had to do both on our MacOSX system. Go to xquartz or cran-r websites by clinking on the hyperlinks and then download and install packages you need.

./install_R\_deps.R

Test get_phylomarkers:

cd # into a directory that holds .faa and corresponding .fna files
your/path/to/run_get_phylomarkers_pipeline.sh -R 1 -t DNA

It should run smoothly and create an output folder that has get_phylomarkers and the date as the title.

You may want to add get_phylomarkers to your $PATH. We did this on MacOSX.

nano ~/.zshrc
export PATH=$PATH:/Users/yourname/bin 

After you added your path exit nano and then close terminal. Start a new window and check if your path was added by running.

echo $PATH

8 Install IQ-TREE

This is the official website to install IQ-TREE: http://www.iqtree.org/doc/Quickstart

We installed and used IQ-TREE on macOS. Download the correct package from: http://www.iqtree.org/#download

Go into IQ-TREE folder by entering (assuming you downloaded version 1.5.0 into your downloads folder):

 cd Downloads/iqtree-1.5.0-MacOSX

Now you can try an example run by entering:

 bin/iqtree -s example.phy