Tatsuya Ota
, most recently at the Hayama Information Network Center at the Graduate University of Advanced Studies, Hayama, Japan (ota
(at) soken.ac.jp
)
has written a package,
DISPAN, (Genetic Distance and Phylogenetic Analysis),
which computes for gene frequency data the heterozygosity, gene diversity, Nei's standard genetic
distance or the DA distance, and their standard errors.
It also constructs phylogenies using the
neighbor-joining (NJ) method or the UPGMA method. These trees can also be
bootstrapped. A tree editor allows the user to rearrange the tree and print it out.
The package consists of two programs, GNKDST and TREEVIEW. The first is a
rewrite of a program by A. K. Roychoudhury, Y. Tateno, D. Graur, N. Saitou,
and R. Schwartz; the second was written by Koichiro Tamura. DISPAN
is distributed as DOS executables (which can run under Windows in a Command
tool window). The package and its Readme file are available
at the IUBIO software
server
at http://iubio.bio.indiana.edu/soft/molbio/ibmpc/
and at a
web page describing it at
http://www.bio.psu.edu/People/Faculty/Nei/Lab/dispan2.htm
at the software pages of
Masatoshi Nei's laboratory at Molecular Evolution and Phylogenetics
at Pennsylvania State University.
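To make the distance computation concrete, here is a minimal sketch of Nei's standard genetic distance computed from allele frequencies. It is not DISPAN's code: the function name and the input layout (one list of allele frequencies per locus) are assumptions made for illustration, and no sample-size bias correction or standard error is computed.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy)).

    freqs_x and freqs_y hold one list of allele frequencies per locus,
    with alleles in the same order in both populations (assumed layout).
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("both populations need the same non-empty set of loci")
    n_loci = len(freqs_x)
    # Average over loci of the sums of squared (or cross-multiplied) frequencies.
    jx = sum(sum(p * p for p in locus) for locus in freqs_x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in freqs_y) / n_loci
    jxy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(freqs_x, freqs_y)
    ) / n_loci
    if jxy <= 0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))
```

For example, nei_standard_distance([[0.5, 0.5]], [[1.0, 0.0]]) compares a population with two equally frequent alleles at one locus against a population fixed for the first allele.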
http://www.sanger.ac.uk/resources/software/quicktree/
Travis Wheeler
http://nimbletwist.com/software/ninja/
It is
also available as part of the Mesquite package of Java programs.
Sudhir Kumar,
(S.Kumar (at) asu.edu
),
of the Center for Evolutionary Functional Genomics at Arizona State
University, Tempe, Arizona
has written PHYLTEST, version 2.0. It is a DOS
executable program for testing phylogenetic hypotheses about four
clusters of DNA sequences. It implements comparison of three alternative
phylogenetic trees for four monophyletic clusters of sequences, the
four-cluster analysis: Rzhetsky, A, S. Kumar, and M. Nei. 1995.
Four-cluster analysis: a simple method to test phylogenetic hypotheses.
Molecular Biology and Evolution 12: 163-167.
It can also carry out the interior branch test of whether an interior
branch length is significantly longer than zero (Rzhetsky, A. and M. Nei. 1992.
A simple method for estimating and testing minimum-evolution trees.
Molecular Biology and Evolution 9: 945-967), as
well as the estimation of average pairwise distances (and standard errors)
within and between clusters of sequences and
relative rate tests and the computation of the time of divergence.
PHYLTEST is distributed from
the IUBIO software
server at http://iubio.bio.indiana.edu/soft/molbio/ibmpc/
. The "readme" file for it is distributed there
and is also available at
Masatoshi Nei's lab software web page
at Pennsylvania State University at
http://www.bio.psu.edu/People/Faculty/Nei/Lab/phyltest2.htm
.
It is distributed as a self-extracting archive, containing the executables and
examples, with a Readme file. The
program can be run under DOS or in the Command tool of Windows.
TREECON
version 1.3b is a software package developed by Yves Van de Peer of the Bioinformatics and Evolutionary Genomics group at the Department of Plant Systems Biology, University of Ghent, Belgium (yves.vandepeer
(at) psb.ugent.be
) for the
construction and drawing of phylogenetic trees based on distance data.
Several equations are included to convert dissimilarity into evolutionary
distance and several methods (such as neighbor-joining) are included for
inferring the tree topology. It also includes bootstrap analysis. It also
has good facilities for rerooting and drawing trees. The
program is available for free for academic use; for other use you
are asked to contact its author. It runs on PCs under Windows.
It is described in several papers:
http://bioinformatics.psb.ugent.be/software_details.php?id=3
, and it can be
downloaded from there, and an online manual is also viewable there.
Andrey Rzhetsky
(andrey.rzhetsky
(at) dbmi.columbia.edu
) of the Department
of Biomedical Informatics at Columbia University, New York
and Masatoshi Nei of the Institute of Molecular and Evolutionary Genetics at Pennsylvania State
University have produced
METREE version 1.2, a program for carrying out the minimum-evolution
distance matrix method. METREE runs on
DOS systems and on Windows (under a Command Tool window). It computes
minimum evolution distance matrix trees from DNA and amino acid sequence data
and tests the statistical significance of
topological differences and of the branch lengths. Different distance
matrix measures may be used. The package is menu driven and the TREEVIEW
program written by Koichiro Tamura for
visualizing and printing out the final tree is also included. The method is
described in the paper by A. Rzhetsky and M. Nei. 1992. A simple method for
estimating and testing minimum-evolution trees. Molecular Biology and
Evolution 9: 945-967, and the program is described in
a paper by A. Rzhetsky and M. Nei. 1994. METREE: a program package for
inferring and testing minimum-evolution trees. Computer Applications in
the Biological Sciences (CABIOS) 10: 409-12.
METREE is distributed from
the IUBIO
server from http://iubio.bio.indiana.edu/soft/molbio/ibmpc/
.
A Readme file is also available at its listing at
the
software page of Masatoshi Nei's laboratory at
http://www.bio.psu.edu/People/Faculty/Nei/Lab/software.htm
Richard Desper, most recently of Ziheng Yang's lab at the Department of Biology, University College, London, U.K., and Olivier Gascuel of the LIRMM (Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier), Montpellier, France (gascuel (at) lirmm.fr) have written FastME, a fast program for the minimum evolution distance matrix method. It is described as faster than neighbor-joining methods, more accurate than them, and as accurate as least squares methods. It can analyze multiple data sets as part of bootstrapping analyses. Its methods are described in two papers:
http://www.atgc-montpellier.fr/fastme/binaries.php
.
Olivier Gascuel
(gascuel (at) lirmm.fr
) of
the Laboratoire d'Informatique, de Robotique et de Micro-Electronique de
Montpellier (LIRMM) of the Universite de Montpellier II, France has written
BIONJ, an improved version of Neighbor-Joining
based on a simple model of sequence data. It follows the same
agglomerative scheme as NJ but uses a simple, first-order
model of the variances and covariances of evolutionary distance
estimates. This model is appropriate when these estimates are
obtained from aligned sequences. It retains the speed advantages of
Neighbor-Joining while using a slightly different criterion to select
pairs of taxa to join, one which will perform better when distances
between taxa are large. It is described in the paper: Gascuel, O. 1997.
BIONJ: An improved version of the NJ algorithm based on a simple model
of sequence data. Molecular Biology and Evolution 14: 685-695.
C source code and Windows, Linux, and Mac OS X executables of BIONJ are
available at
its web page at
http://www.atgc-montpellier.fr/bionj/binaries.php
.
It is also available as a web server.
William J. Bruno of the Los Alamos National
Laboratory (billb (at) lanl.gov
) has released
nneighbor, a modification of the
PHYLIP Neighbor-Joining distance
matrix program that avoids negative branch lengths (its name means
Non-Negative Neighbor). The program is available as generic C code.
It is available at
one of Bruno's
web pages at
http://www.t10.lanl.gov/billb/related_links.html
.
William J. Bruno, Nicholas D. Socci, and Aaron L. Halpern
of the Los Alamos National Laboratory (billb (at) lanl.gov
) have
produced weighbor (Weighted nEIGHBOR-joining or perhaps
WEIGHted neighBOR-joining), version 1.2.1,
a distance matrix program for
performing a weighted version of the Neighbor-Joining method. The
weighting used is for nucleotide sequences and more correctly reflects the
uncertainty of the longer distances in the tree than does ordinary
Neighbor-Joining. It is thus closer to approximating maximum likelihood
and will be more accurate than Neighbor-Joining on large trees.
It is described in a paper:
Bruno, W. J., N. D. Socci, and A. L. Halpern 2000. Weighted
neighbor joining: a likelihood-based approach to distance-based
phylogeny reconstruction. Molecular Biology and Evolution 17:
189-197. Weighbor
is available as C source code and as Windows, IRIX, Solaris, and Linux
executables (plus some older executables for DOS and Mac OS) from
its web site
at http://www.t10.lanl.gov/billb/weighbor/index.html
.
It is also available as a web server at the Institut Pasteur
in Paris.
Paul Lewis, (plewis (at) uconnvm.uconn.edu
),
of the Department of Ecology and Evolutionary
Biology, University of Connecticut, and Dmitri Zaykin, then of North Carolina
State University,
have written GDA version 1.1,
a set of programs to carry out many of the statistical methods for
analyzing gene frequencies and sequence data that are described in
Bruce Weir's book Genetic Data Analysis II
(Sinauer Associates, Sunderland, Massachusetts, 1996). The programs run under
Windows and include the calculation of UPGMA and Neighbor-Joining phylogenies.
The program is described in a
Web site
maintained by Paul Lewis at
http://hydrodictyon.eeb.uconn.edu/people/plewis/software.php
There is also a link there to a command-line-only version of GDA by
Chris Basten that runs under Mac OS X.
The relevant feature for the purposes of this listing is the ability of
the programs to compute a number of distances.
Mark Miller
(MarkPerryMiller (at) gmail.com
) of the
Forest and Rangeland Ecosystem Science Center of the U.S. Geological Survey
has written TFPGA (Tools For Population Genetics
Analyses), a Windows program for the analysis of allozyme and
molecular population genetic data. It can calculate genetic distances.
In addition, this program calculates descriptive statistics,
and F-statistics, and performs tests for Hardy-Weinberg equilibrium, exact tests for genetic
differentiation, Mantel tests, and UPGMA cluster analyses. Additional features include the ability
to analyze hierarchical data sets as well as data from either codominant markers such as allozymes
or dominant markers such as AFLPs or RAPDs. It is available from
his software web page at
http://www.marksgeneticsoftware.net/
as a Windows executable.
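Several of the programs in this listing offer UPGMA clustering of a distance matrix. The following is a minimal sketch of that method, not code from TFPGA or any other package listed here: the function name, the nested-tuple tree representation, and the returned list of merge heights are all assumptions made for illustration.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA (average linkage weighted by cluster size).

    names: list of taxon labels; dist: symmetric matrix (list of lists).
    Returns (tree, merges): tree is a nested tuple of labels, merges is a
    list of (subtree, height) pairs in the order the clusters were joined.
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    merges = []
    while len(clusters) > 1:
        # Join the pair of clusters separated by the smallest distance.
        (i, j), dij = min(d.items(), key=lambda item: item[1])
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        updated = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            # Distance to the new cluster is the size-weighted average.
            updated[(k, next_id)] = (size_i * dik + size_j * djk) / (size_i + size_j)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(updated)
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        merges.append(((tree_i, tree_j), dij / 2))  # node height = half the distance
        next_id += 1
    return next(iter(clusters.values()))[0], merges
```

Because UPGMA assumes a molecular clock, each internal node is placed at half the distance between the two clusters it joins.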
François Bonhomme
of the Institut des Sciences de l'Evolution of the Université de Montpellier, France, along with K. Belkhir, P. Borsa, N. Raufaste and L. Chikhi (the program support email address is genetix
(at) univ-montp2.fr
)
has released Genetix version 4.05. This is a Windows
executable program that does a wide variety of population genetic
procedures. The part relevant to the present list is that it
computes the Nei and the Cavalli-Sforza genetic distances, both with and
without bias correction. It also calculates F statistics and linkage
disequilibrium, and performs permutation tests on the results.
One advantage (or limitation, depending on your perspective) is that
the interface is in French.
Genetix is available from
its web site
(in French) at http://www.univ-montp2.fr/~genetix/genetix/genetix.htm
.
Steven Kalinowski
http://www.montana.edu/kalinowski/Software/TreeFit.htm
Immanuel Yap, now of the Department of
Plant Breeding and Genetics at Cornell University, Ithaca, New York
(noelyap
(at) ascus.plbr.cornell.edu)
and Rebecca Nelson
http://archive.irri.org/science/software/winboot.asp
.
María Jesús Martín and Joaquín Dopazo
, then of the R&D Department of TDI (TDI-EMBNet), Spain (Dopazo is now at the Bioinformatics Department at the Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain: jdopazo (at) cipf.es or dopazo (at) tdi.es
) have developed OSA (Optimal Sequence
Analysis), version 2.0. It finds, within large sequences, those regions with an information
content similar to that of the whole sequence and it selects, among
them, the shortest ones. This program was formerly called ORF.
The algorithm used is based on comparing pairwise genetic distances, calculated
for windows of variable size and position, to the
distance matrix obtained for the whole sequence. Either uncorrected
genetic distances or Jukes-Cantor distances can be used.
Two methods are used to set cutoff levels: simulation-based significance
values or bootstrapping. A variety of options for search among possible
windows are available. The method has been described in a paper:
M. J. Martín, F. Gonzalez-Candelas, F. Sobrino and J. Dopazo. 1995.
A method for determining the position and size of optimal sequence regions for
phylogenetic analysis. Journal of Molecular Evolution 41: 1128-1138.
OSA uses aligned sequences in a number of common formats as input.
It runs on UNIX-based machines. It is available as Gnu Pascal source code,
and executable versions
for the Solaris and IRIX operating systems are also available.
The program can analyze up to 50 sequences of a maximum length of 10,000 bp.
It can be obtained by ftp
from ftp.ebi.ac.uk
in directory pub/software/unix/osa
,
where the source code, a documentation file,
and the Solaris and Irix executables are available.
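The two kinds of distance mentioned above, uncorrected and Jukes-Cantor, can be sketched as follows. This is an illustration of the standard formulas, not OSA's code; the function names are hypothetical, and skipping columns that contain anything other than A, C, G, or T is an assumption about how gaps and ambiguity codes might be handled.

```python
import math


def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of compared sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:  # skip gaps and ambiguity codes
            compared += 1
            if a != b:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor(p):
    """Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3) for a proportion p."""
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)
```

A windowed analysis of the kind described would apply these functions to slices of the alignment and compare the resulting matrix with the one from the whole sequence.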
Johannes Schaefer and Michael Schoeniger, then of the
Lehrstuhl für Theoretische Chemie of the
Technische Universität München
have written DISTREE. It
computes pairwise distances of
aligned nucleotide sequences utilizing various models of base
substitution. Moreover it provides the user with information
on the goodness of fit of the models to the given set of
sequence data. Each of the models is implemented in two
variants, assuming identical and gamma distributed
substitution rates across sequence sites.
It is available as a DOS executable with C source code, or as source code for Unix
systems.
DISTREE is distributed through
the EBI software site archive
at
http://mirror.pscigrid.gov.ph/ebi-software/software/dos/distree/
.
Mikael Thollesson
(lddist (at) artedi.ebc.uu.se), of the Department of Molecular Evolution, Evolutionary Biology Centre, Uppsala University, Sweden has written LDDist version 1.3.2, which calculates LogDet distances from DNA and protein sequences. It also accommodates rate variation from site to site, either by excluding invariant sites or by allowing different rates for different sites to be preassigned. LDDist is described in a paper: Thollesson, M. 2004. LDDist: a Perl module for calculating LogDet pair-wise distances for protein and nucleotide sequences. Bioinformatics 20: 416-418. LDDist is, as this says, written in Perl and C++. With it is distributed PLD.pl, a companion script that serves as a front-end and example of how to use LDDist. They are distributed in source code from its web site at http://artedi.ebc.uu.se/molev/software/LDDist.html
William J. Bruno and Lars Arvestad
(billb (at) t10.lanl.gov
) of the Theoretical Biology and
Biophysics Group at Los Alamos National Laboratory,
have released DISTANCE, version 1.0.
It estimates the most general reversible substitution matrix
corresponding to a given collection of aligned DNA sequences.
This matrix can then be used to calculate evolutionary distances between pairs
of sequences. The method is described in a paper:
Arvestad, L. and W. J. Bruno. 1997. Estimation of reversible substitution
matrices from multiple pairs of sequences. Journal of Molecular Evolution
45: 696-703. The program is written in C, and distributed from
its web site at
http://www.t10.lanl.gov/evolution/
, along with Sun SPARC
binaries.
Joyce Miller Hersh
(msmead (at) doctorbeer.com
),
formerly of the Whitehead Institute at MIT (and more recently a high-tech
patent attorney) wrote RESTSITE, version 1.2,
a package of DOS programs for computing distances between species based on
restriction sites or restriction fragments. The programs also include
NJTREE and UPGMA which can infer phylogenies by the Neighbor-Joining and
UPGMA distance matrix methods. The programs are written in Microsoft C:
source code is available too. The programs, documentation, and source code are distributed by
its Web site, http://www-genome.wi.mit.edu/~jmiller/restsite.htm
.
The programs and their methods were described in two papers:
Doug McElroy
(Doug.McElroy (at) wku.edu
) of
Western Kentucky University distributes REAP, the
Restriction Enzyme Analysis Package, written by him, Paul Moran, Eldredge
Bermingham, and Irv Kornfield. REAP can calculate distances from restriction sites,
restriction fragments data, and from nucleotide sequences (the Kimura
2-parameter distance). REAP is a package of DOS executables available
from McElroy's web site
at http://bioweb.wku.edu/faculty/mcelroy/
.
It is described in the paper:
McElroy, D., P. Moran, E. Bermingham, and I. Kornfield. 1992. REAP: An integrated environment
for the manipulation and phylogenetic analysis of restriction data.
Journal of Heredity 83: 157-158.
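For readers unfamiliar with the Kimura 2-parameter distance mentioned above, here is a minimal sketch of the standard formula, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. It is not REAP's code; the function name is hypothetical, and ignoring columns with gaps or ambiguity codes is an assumption.

```python
import math

# Transitions are purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def kimura_2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    n = ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p = ts / n  # proportion of transitions
    q = tv / n  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # sequences too divergent for the correction
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

Unlike the Jukes-Cantor distance, this correction lets transitions and transversions occur at different rates.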
Peter Rice, Alan Bleasby, and Jon Ison
http://emboss.sourceforge.net/what/
MacVector, Inc., PMB 150, PO Box 582, 1939 High House
Rd., Cary, NC 27519 and PO Box 582, Cambridge, U.K. CB1 0FH
(info (at) macvector.com)
sells MacVector version 10.0.2, a
sequence analysis program for Mac OS and Mac OS X systems. The features that
are relevant for this listing are its ability to do
alignment and produce a guide tree
using ClustalW, and
either UPGMA or Neighbor-Joining distance matrix methods. It has many other
features including sequence search, gene finding, motif searching,
protein secondary structure
and hydrophobicity prediction, and prediction of restriction digests and
primer sites. Version 7.2 onwards can run natively on Mac OS X systems.
It can be ordered through
its web page
at http://www.macvector.com
. Its
price for academic use was formerly $2,500, and for commercial use $5,000.
Currently they do not give prices on their web page, but they have said to
me that the above is slightly more expensive than what they charge now.
Soll Technologies, Inc., (sales
(at) solltechnologies.com)
321 Lexington Ave., Iowa City, Iowa 52246,
USA distributes DENDRON, a
computer-assisted system for Windows for analyzing DNA fingerprinting gels. It
reads and compares gel images. One feature is an average-linkage clustering
algorithm that can produce trees from the gel images. For information and
pricing, contact Soll Technologies. The DENDRON
web page is
at http://www.solltechnologies.com/products.html
.
Philipp Schlüter
http://www.famd.me.uk/famd.html
James McInerney
of the Department of Biology of the National University of Ireland, Maynooth, County Kildare, Ireland (james.o.mcinerney (at) may.ie) has written GCUA (General Codon Usage Analysis). It computes codon usage and amino acid usage statistics, and also performs correspondence analysis/principal components analysis on both codon usage and amino acid usage statistics. Its relevance to the present list is that it also produces a distance matrix, based on Relative Synonymous Codon Usage (RSCU) statistics, whose format is PHYLIP/PAUP* 4.0-compatible. Although McInerney cautions that this matrix should not be used for phylogenetic inference, I wonder whether this distance does not have some phylogenetic information. The program is described in the paper: McInerney, J. O. 1998. GCUA (General Codon Usage Analysis). Bioinformatics 14 (4): 372-373. It is available as Mac OS X, Mac OS, Windows, IBM AIX, Digital Unix, and Linux binaries. The code isn't available, he says "because it is so embarassingly poor". It is available at his software downloads site at http://bioinf.nuim.ie/downloads.html
.
Earlier binaries, version 1.1 for Digital Unix, SunOS, Mac OS and Irix
and version 1.2 for Linux, Digital Unix, Mac OS and SunOS
can be retrieved via anonymous ftp
from ftp.nhm.ac.uk
in directory pub/gcua
David T. Pride (dpride (at) partners.org),
formerly of Vanderbilt University (currently an internal medicine specialist
in Berkeley, California), has written Swaap version
1.02. Swaap performs sliding window analyses on nucleotide sequences, computing
a large variety of statistics on the sequences. The relevant feature for
this listing is the ability to compute four different distance measures
between sequences, either on full sequences or on sliding windows.
Swaap is distributed as a Windows executable from
the Swaap
and Swaap PH web site
at http://www.bacteriamuseum.org/SWAAP/SwaapPage.htm#Swaap
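The idea of computing a distance on sliding windows can be sketched as follows. This is not Swaap's code, and the listing does not say which four distance measures Swaap implements; the simple proportion of differing sites is used here as an assumed stand-in, and the function name and parameters are hypothetical.

```python
def sliding_window_p_distance(seq_a, seq_b, window, step=1):
    """Proportion of differing sites in each window along two aligned sequences.

    Returns a list of (window_start, proportion) pairs. Only full-length
    windows are reported.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        chunk_a = seq_a[start:start + window]
        chunk_b = seq_b[start:start + window]
        diffs = sum(1 for a, b in zip(chunk_a, chunk_b) if a != b)
        results.append((start, diffs / window))
    return results
```

Plotting the second element of each pair against the first shows where along the alignment two sequences diverge most.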
David T. Pride (dpride (at) partners.org),
formerly of Vanderbilt University (currently an internal medicine specialist
in Berkeley, California), has written Swaap PH version
1.02. Swaap PH computes many different kinds of statistics on nucleotide
frequencies and oligonucleotide frequencies in sliding windows along
nucleotide sequences. It can compute distances based on these frequencies.
Swaap PH is a Windows executable available from
the Swaap and Swaap PH web site
at http://www.bacteriamuseum.org/SWAAP/SwaapPage.htm#Swaap
Mathieu Blanchette
of the McGill University Centre for Bioinformatics (blanchem (at) mcb.mcgill.ca
) and David Sankoff
of the Department of Mathematics and Statistics of the University of
Ottawa, Canada
have produced DERANGE2, a program to reconstruct the
history of two gene maps using weighted inversions, transpositions
and inverted transpositions. It can thus construct a set of distances
based on the gene orders (not the sequences of the genes themselves).
It is available as standard C source code and can readily be compiled on
Unix systems. It is available
by anonymous ftp
from ftp.ebi.ac.uk
in directory pub/software/unix
.
Laurent Excoffier
of the Computational and Population Genetics Lab of the Institute of Zoology, University of Bern, Switzerland (laurent.excoffier (at) zoo.unibe.ch) has produced MINSPNET, a program that produces a minimum spanning tree and network from a distance matrix. It is available as a Windows executable. It can be obtained from a web page which lists software from that lab at http://cmpg.unibe.ch/software.htm.
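As an illustration of what a minimum spanning tree of a distance matrix is, here is a minimal sketch using Prim's algorithm. It is not MINSPNET's code and its function name is hypothetical; it returns a single tree and makes no attempt to report the alternative equal-length connections that a minimum spanning network would include.

```python
import math


def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix (list of lists).

    Returns a list of (i, j, distance) edges connecting all the items.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each item to the tree
    parent = [-1] * n       # which tree member provides that connection
    best[0] = 0.0           # start growing the tree from item 0
    edges = []
    for _ in range(n):
        # Pick the item outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges
```

The total length of the returned edges is the smallest possible for any tree that connects all the items directly to one another.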
Francis Yeh (francis.yeh (at) ualberta.ca
) of the
Department of Renewable Resources at the University of Alberta, Canada, has
released POPGENE version 1.32, a free program for the analysis of genetic
variation among and within populations using co-dominant and dominant markers.
The feature that is relevant to the present list is that it can compute
a number of genetic distances for gene frequencies.
It is distributed as a Windows executable from
its home page at
http://www.ualberta.ca/~fyeh/index.htm
.
F. James Rohlf
has written NTSYSpc (Numerical Taxonomy System, Version 2.2), a clustering program that includes calculation of various kinds of distance measures, hierarchical clustering methods such as UPGMA, and Neighbor-Joining and consensus trees. It can also do a variety of other things including ordination, scatter diagrams, and elliptic Fourier transforms (for shape analysis). NTSYSpc 2.1 is a Windows95 executable which will also run on Windows NT. It is available for $350 ($250 for educational and government institutions). 10-user site licenses are also available. It is distributed by Exeter Software (the biological software company, not the warehouse-inventory-software house of the same name). Their e-mail address is sales (at) exetersoftware.com
. Their
toll-free telephone number is 800-842-5892, their not-so-free
phone number is +1-631-689-7838, and their fax number is +1-631-689-0103.
Their mailing address is
47 Route 25A, Suite 2, Setauket, NY 11733-2870 USA .
Further information is available on their
Web page
at http://www.exetersoftware.com/cat/ntsyspc/ntsyspc.html
.
Warren Kovach
of Kovach Computing Services, Anglesey, Wales (info (at) kovcomp.co.uk
) has produced MVSP,
a comprehensive multivariate statistical package for the PC platform.
It can do many kinds of analyses (principal components, clustering, etc.)
but the features relevant to this listing are clustering with a variety
of methods and a variety of distance measures, including Li and Nei's
restriction sites distance. MVSP may be ordered from Kovach Software
through its
web site
at http://www.kovcomp.com/mvsp/
.
MVSP 3.1 for Windows
costs UK £85 or US$ 150 for an academic license.
A version on CD with a printed manual is £20 ($35) more.
Commercial licenses are £115 ($185).
Version 2.2 for DOS costs UK £65 or US$ 100.
Free evaluation versions, which work for a limited period, can be
downloaded from
the Kovach Computing download web page at
http://www.kovcomp.co.uk/downl2.html#mvsp
. An evaluation
version of version 2.2 for
DOS is also available for downloading by ftp from
garbo.uwasa.fi
in directory pc/stat/
.
MVSP is also distributed by Exeter Software at
its web site
at http://www.ExeterSoftware.com/cat/kovach/mvsp.html
Version 3.1 costs $185 for an academic license, $265 for a commercial license.
There are discounts for multi-user licenses.
Other vendors include Rockware and
GeoMem.
János Podani of the Department of Plant Taxonomy and Ecology,
Eötvös Loránd University, Budapest, Hungary (podani (at) ludens.elte.hu)
has developed SYN-TAX 2000, a general package for
clustering. It can calculate a wide variety of distance coefficients from
numerical data, and can perform hierarchical clustering, nonhierarchical
clustering, and ordination. This includes, in addition to many clustering
methods, minimum spanning trees and additive trees by Neighbor-Joining.
SYN-TAX 2000 is available as commercial software from Exeter Software at
its web site there
at http://www.ExeterSoftware.com/cat/syntax/syntax.html
. It costs $350 for an educational license, $450 for a commercial
license. Podani also maintains his own SYN-TAX web site at http://ramet.elte.hu/~podani/SYN2000.html
where there are descriptions, screen shots, some free upgrades of
certain program components, and also an older DOS executable
version, 5.1, and a Macintosh version, SYN-TAX 5.02. There is a demo
version available for the DOS version, and both the DOS and Mac versions
are sold, each for $150 (for educational use $200), and both together for
$300. Over the years various versions of SYN-TAX have been described by papers.
The most recent description in a journal is: Podani, J. 1993. SYN-TAX 5.0: Computer programs for multivariate data analysis in ecology and systematics.
Abstracta Botanica 17: 289-302.
John Archer and David Robertson
http://www.manchester.ac.uk/bioinformatics/ctree
B. McCune and M. J.
Mefford
http://home.centurytel.net/~mjm/pcordwin.htm
.
It is available at a price of $299 for a regular user, or $199 for a student
license. A license for each additional simultaneous user is $100 (or $50).
Simon Goodman, then of the Institute of Cell, Animal, and
Population Biology of the University of Edinburgh produced
RSTCALC, version 2.2. It is primarily
intended to perform analyses of population structure, genetic
differentiation and gene flow using microsatellite data.
It calculates estimates of the Rst measure of differentiation among a number of
populations, but in
addition you can also use RSTCALC to obtain estimates of the delta-mu^2 distance measure.
Its calculations are described in a paper:
Goodman, S. J. 1997. Rst Calc: a collection of computer programs for
calculating estimates of genetic differentiation from microsatellite data and determining their
significance. Molecular Ecology 6: 881-885.
The program runs on Windows and is available from
its web site
http://www.biology.ed.ac.uk/research/institutes/evolution/software/rst/rst.html
as a Windows executable.
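The delta-mu^2 distance mentioned above is the squared difference between the mean allele sizes of two populations, averaged over loci. A minimal sketch follows; it is not RSTCALC's code, and the function name and the input layout (for each locus, a list of allele sizes in repeat units, one per sampled gene copy) are assumptions made for illustration.

```python
def delta_mu_squared(pop_a, pop_b):
    """Average over loci of (mean allele size in A - mean allele size in B)^2.

    pop_a and pop_b each hold one list of allele sizes per locus, with
    loci in the same order in both populations (assumed layout).
    """
    if len(pop_a) != len(pop_b) or not pop_a:
        raise ValueError("both populations need the same non-empty set of loci")
    total = 0.0
    for sizes_a, sizes_b in zip(pop_a, pop_b):
        if not sizes_a or not sizes_b:
            raise ValueError("every locus needs at least one sampled allele")
        mean_a = sum(sizes_a) / len(sizes_a)
        mean_b = sum(sizes_b) / len(sizes_b)
        total += (mean_a - mean_b) ** 2
    return total / len(pop_a)
```

Under a stepwise mutation model this quantity is expected to grow roughly linearly with the time since two populations diverged, which is why it is used as a distance for microsatellite data.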
Daniel Montagnon (Daniel.Montagnon (at)
wanadoo.fr) of the Institut d'Embryologie, Faculté de
Médecine, Strasbourg, France has written
YCDMA (Y Chromosome Data MAnagement), version 1.2. This is
a data management program for microsatellite data. It can do a wide variety
of management tasks, maintaining and manipulating databases of genotypes,
calculating gene frequencies, and converting file formats.
For the purposes of this listing, its relevant feature is the calculation of
a variety of gene frequency genetic distances between populations, and
a squared copy number microsatellite genetic distance. YCDMA is written in
Microsoft Visual Basic. It is available as a Windows executable from
its web site
at http://perso.wanadoo.fr/daniel.montagnon/YCDMAAng.htm
.
http://web.unife.it/progetti/genetica/Giorgio/giorgio_soft.html
Stephane Guindon and Olivier Gascuel
http://www.lirmm.fr/~guindon/gamma.html
It is also available as a web server here.
Gaston Gonnet and Chantal Korostensky
of the Computational Biochemistry Research Group at ETH in Zürich, Switzerland, have made available Darwin, Data Analysis and Retrieval With Indexed Nucleotide/peptide sequences, version 2.1. It is an environment which enables the user to carry out a variety of kinds of analysis with sequences, including phylogeny methods. These seem to include distance matrix, split decomposition, and a form of likelihood method. Darwin is available as executables for Solaris, Intel-compatible Linux, Irix, and HP/Compaq/Digital Alpha machines. These are available free if the user registers by filling out a form at the download page at the Darwin web page. The executables can then be transferred to the user by ftp or by e-mail of encoded files. It is described in the paper: Gonnet, G. H., M. T. Hallett, C. Korostensky, and L. Bernardin. 2000. Darwin v. 2.0: an interpreted computer language for the biosciences. Bioinformatics 16: 101-103. Details and distribution policies are explained further at Darwin's web page at http://cbrg.inf.ethz.ch/darwin
.
Darwin is also made available as
a server.
Vladimir Makarenkov
(makarenkov.vladimir (at) uqam.ca) of the Département d'Informatique of the Université du Québec à Montréal and the Département de Sciences Biologiques of the Université de Montréal, and Philippe Casgrain (casgrain (at) magellan.umontreal.ca
) of the
Département de Sciences Biologiques of the
Université de Montréal have released
T-REX (Tree and Reticulogram rEconstruXion), version 4.0a1.
This program performs several methods of fitting an additive distance
(distance in a nonclocklike tree) to a given dissimilarity. The
methods available include Sattath and Tversky's ADDTREE method,
Nei and Saitou's Neighbor-Joining method, Gascuel's UNJ Unweighted
Neighbor-Joining method, his BIONJ method, the
Circular order reconstruction method of Makarenkov and Leclerc
(1997), and Yushmanov (1984),
and the MW weighted least-squares method by Makarenkov (1997) and
Makarenkov and Leclerc (1998). A number of methods for fitting trees to
distance matrices that have missing values are also available.
Nucleotide sequence distance can be computed from sequences using
many of the widely-used distances.
The program can also carry out bootstrap and jackknife resampling to
assess strength of support for features of the trees.
It also allows construction and plotting
of "reticulograms" that show departures from treelike structure, and
interactive manipulation of the tree and reticulogram diagrams.
It is described in the paper: Makarenkov, V. 2001. T-Rex: reconstructing and
visualizing phylogenetic trees and reticulation networks.
Bioinformatics 17: 664-668.
Executables for Windows (the 4.0a1 version) and for Macintosh (the version
1.2a4 executable for PowerMacs) and an executable for a 32-bit DOS version
are available at
the T-REX web site at
http://www.labunix.uqam.ca/~makarenv/trex.html
. C++ source
code is also available there. A web
server for T-REX with more tree construction and manipulation methods is
also available.
http://www.bio.umontreal.ca/casgrain/en/labo/permute/index.html
Jérôme Goudet, of the Department of Ecology and Evolution of the
University of Lausanne, Switzerland (jerome.goudet (at) unil.ch)
has written FSTAT, version 2.9.3.2, a program to
estimate and test gene diversity statistics from codominant markers.
For our purposes, the important feature is its ability to calculate the
Nei and Cockerham/Weir families of distance measures. It can
convert data in its own format to and from the format of Genepop.
Version 2.9.3.2 is a Windows executable; an earlier version, 1.2, which
is a DOS executable is also available. Both can be downloaded
from its web site
at http://www2.unil.ch/popgen/softwares/fstat.htm
.
Michel Raymond and François Rousset
of the Equipe Génétique et Environnement of the Institut des
Sciences de l'Evolution at the University of Montpellier II, France
(Raymond (at) isem.univ-montp2.fr and Rousset (at) isem.univ-montp2.fr)
have written and distributed Genepop version 4.0,
a program to carry out a variety of population genetics tests. It can
test assumptions of Hardy-Weinberg and linkage equilibrium,
run a log-likelihood (G) based test of differentiation between populations,
use Slatkin's rare allele method to estimate the number of migrants per generation,
and calculate allele frequencies. For our purposes the relevant feature is
its ability to calculate Fst and Rst measures of population differentiation,
which are genetic distances. It is described in a paper: Raymond, M. and
F. Rousset. 1995. GENEPOP (version 1.2) population genetic software for exact
tests and ecumenicism. Journal of Heredity 86: 248-249.
Genepop is a DOS executable that can run under Windows in a Command Tool
window. It can be downloaded
from its web page
at http://kimura.univ-montp2.fr/~rousset/Genepop.htm
.
An older version, 3.4, can be downloaded by ftp from the University of
Montpellier at ftp://ftp.cefe.cnrs.fr/PC/MSDOS/GENEPOP/
.
A web server for Genepop 3.4 is also available in Australia at
the John Curtin University of Technology.
Laurent Excoffier
of the Computational and Population Genetics Lab of the Institute of Zoology,
University of Bern, Switzerland (laurent.excoffier (at) zoo.unibe.ch),
Stephan Schneider, and David Roessli have released Arlequin version 3.5.1,
a program for population genetics analysis. It can perform many kinds of
population genetic tasks, including estimation of gene frequencies, testing of
linkage disequilibrium, and analysis of diversity between populations.
For the purposes of this list, the relevant feature is its ability to compute
a variety of genetic distance measures, including that of Jukes and Cantor,
the Kimura 2-parameter distance, and the Tamura-Nei distance, each of these
with or without correction for gamma-distributed rates of evolution.
It can also compute a Minimum Spanning Tree network. It is available as
binaries for Windows, for either 32-bit or 64-bit processors. A special version
to compute some summary statistics is also included. An archive including the
binaries and a PDF documentation file is available at its web site at
http://cmpg.unibe.ch/software/arlequin3/
.
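The Jukes-Cantor and Kimura 2-parameter distances named above can be
illustrated with a short sketch. This is not Arlequin's code: the function
names, the handling of gaps (sites with anything other than A, C, G or T are
skipped), and the gamma_shape parameter are assumptions made for illustration.
The formulas are the standard textbook ones, and the gamma correction is shown
only for the Jukes-Cantor case.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (sites compared, transitions, transversions) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return n, transitions, transversions


def jukes_cantor(seq1, seq2, gamma_shape=None):
    """Jukes-Cantor distance; optionally gamma-corrected with the given shape."""
    n, ts, tv = count_differences(seq1, seq2)
    if n == 0:
        raise ValueError("no comparable sites")
    p = (ts + tv) / n
    x = 1.0 - 4.0 * p / 3.0
    if x <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    if gamma_shape is None:
        return -0.75 * math.log(x)
    return 0.75 * gamma_shape * (x ** (-1.0 / gamma_shape) - 1.0)


def kimura_2p(seq1, seq2):
    """Kimura 2-parameter distance from transition (P) and transversion (Q) fractions."""
    n, ts, tv = count_differences(seq1, seq2)
    if n == 0:
        raise ValueError("no comparable sites")
    P, Q = ts / n, tv / n
    a, b = 1.0 - 2.0 * P - Q, 1.0 - 2.0 * Q
    if a <= 0.0 or b <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

For example, jukes_cantor("ACGTACGTAC", "ACGTACGTAT") corrects an observed
difference of one site in ten; passing a finite gamma_shape gives a larger
distance than the uncorrected call for the same pair.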
Naoko Takezaki
of the Life Science Research Center of Kagawa University, Japan (takezaki (at) med.kagawa-u.ac.jp
),
Masatoshi Nei of the Institute of Molecular and Evolutionary Genetics
of the Department of Biology, Pennsylvania State University, University Park,
Pennsylvania, and Koichiro Tamura of Tokyo Metropolitan University, Tokyo,
Japan
have released POPTREE2, which
computes various genetic distance measures and constructs
trees of populations or closely related species from gene
frequency data by using the Neighbor-Joining method and UPGMA.
POPTREE2 can compute Nei's genetic distance and his Da
genetic distance, as well as Latter's Fst* distance and the
(Δμ)² and Dsw measures of microsatellite genetic distance.
It can also perform bootstrapping, and compute heterozygosity and Gst measures
of the extent of genetic variation in a population and genetic differentiation
among subdivided populations.
The program uses a Windows graphical user interface, and
trees can be displayed in a publishable form and changed by the user.
POPTREE2 is described in a paper: Takezaki, N., M. Nei, and K. Tamura. 2010.
POPTREE2: Software for constructing population trees from allele frequency
data and computing other population statistics with Windows interface.
Molecular Biology and Evolution 27: 747-752.
It is available from its web site
at http://www.med.kagawa-u.ac.jp/~genomelb/takezaki/poptree2/index.html
.
A source code Unix version (POPTREE version 1) and an
executable DOS version (which is called poptrfdos) are also available there.
It is also available
from the IUBIO archive
at http://iubio.bio.indiana.edu/soft/molbio/evolve
.
POPTREE was also formerly called njbafd, and
under that name its earlier version is also available at the same IUBIO site.
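As an illustration of the kind of gene-frequency distances POPTREE2 computes,
here is a minimal sketch of Nei's standard genetic distance, the DA distance,
and the (Δμ)² microsatellite distance. It is not POPTREE2's code. The data
layout (one dict of allele frequencies per locus) and the function names are
assumptions, and no sample-size bias correction is applied.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard distance D = -ln(Jxy / sqrt(Jx * Jy)).

    pop_x and pop_y are lists with one dict per locus mapping
    allele -> frequency. Sums over loci are used in place of averages,
    because the number of loci cancels in the ratio.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        for allele in set(fx) | set(fy):
            x, y = fx.get(allele, 0.0), fy.get(allele, 0.0)
            jx += x * x
            jy += y * y
            jxy += x * y
    if jxy == 0.0:
        raise ValueError("no shared alleles: distance is infinite")
    return -math.log(jxy / math.sqrt(jx * jy))


def da_distance(pop_x, pop_y):
    """DA distance: 1 minus the average over loci of sum_i sqrt(x_i * y_i)."""
    total = 0.0
    for fx, fy in zip(pop_x, pop_y):
        total += sum(math.sqrt(fx[a] * fy[a]) for a in set(fx) & set(fy))
    return 1.0 - total / len(pop_x)


def delta_mu_squared(pop_x, pop_y):
    """(Δμ)²: average over loci of the squared difference in mean repeat number.

    Here each locus dict maps repeat count -> frequency.
    """
    total = 0.0
    for fx, fy in zip(pop_x, pop_y):
        mean_x = sum(size * f for size, f in fx.items())
        mean_y = sum(size * f for size, f in fy.items())
        total += (mean_x - mean_y) ** 2
    return total / len(pop_x)
```

With a single locus where one population has alleles A and B at frequency 0.5
each and the other is fixed for A, the standard distance is half the natural
logarithm of 2 and the DA distance is 1 minus the square root of 0.5.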
Olivier Hardy and Xavier Vekemans
http://ebe.ulb.ac.be/ebe/Software.html
Julio Rozas, J. C. Sánchez-DelBarrio, X. Messeguer and Ricardo Rozas of the Departament de Genètica, Universitat
de Barcelona, Spain (jrozas (at) ub.edu
) have released
DnaSP version 5.10.00, a software package for the analysis of
nucleotide polymorphism from aligned DNA sequence data. DnaSP can estimate
several measures of DNA sequence variation within and between populations
(in noncoding, synonymous or nonsynonymous sites), as well as linkage
disequilibrium, recombination, gene flow and gene conversion parameters.
It can also carry out several tests of neutrality.
Additionally, it can estimate the confidence intervals of some test statistics
by the coalescent. The results of the analyses are displayed in tabular and graphic form.
For the purposes of this web site, the relevant features are the
calculation of measures of population divergence, which include the
Jukes-Cantor method, which can be used as a
distance in phylogeny reconstruction. DnaSP is described in the papers:
It is distributed as a Windows executable from
its web site
at http://www.ub.es/dnasp/
.
Jianzhi George Zhang, now of the Laboratory of Genomic and Molecular Evolution
in the Department of Ecology and Evolutionary Biology of the University of
Michigan, Ann Arbor, Michigan
(jianzhi (at) umich.edu)
wrote Bn-Bs, a program
to estimate branch lengths in terms of synonymous and nonsynonymous
substitutions per site, while the tree topology is given. The program uses the
modified Nei-Gojobori method to estimate pairwise
synonymous and nonsynonymous distances among present-day sequences and then
estimates branch lengths and their variances by using the ordinary
least-squares method. The method is described in the paper:
Zhang J., H. F. Rosenberg, and M. Nei. 1998. Positive Darwinian selection
after gene duplication in primate ribonuclease genes. Proceedings of the
National Academy of Sciences, USA 95: 3708-3713.
It is available as C source code and as DOS
executables from the software web site of Masatoshi Nei's lab in
which the work was done. A zip archive of the files can be downloaded from
the link there. A documentation file can also be read there.
Jianzhi George Zhang, now of the Laboratory of Genomic and Molecular Evolution
in the Department of Ecology and Evolutionary Biology of the University of
Michigan, Ann Arbor, Michigan
(jianzhi (at) umich.edu)
released HON-new, a program to compute the amounts of
conservative and radical amino acid substitution between pairs of DNA
sequences of coding region exons. The program uses a classification of
amino acids into categories. Three types of amino acid classifications
(by charge, by polarity and one of Miyata and Yasunaga) are provided. One can
also define the conservative and radical amino acid categories oneself.
The method is modified from the original method of
Hughes, Ota, and Nei (1990) by taking into account transition bias. It is
described in the paper: Zhang J. 2000. Rates of conservative and radical
nonsynonymous nucleotide substitutions in mammalian nuclear genes. Journal
of Molecular Evolution 50: 56-68. It is available in C source
code and as a Windows executable at the Nei laboratory software web site at
http://homes.bio.psu.edu/people/faculty/nei/software.htm
.
Kevin Thornton
http://molpopgen.org/software/lseqsoftware.html
Daniel Montagnon (Daniel.Montagnon (at)
wanadoo.fr) of the Institut d'Embryologie, Faculté de
Médecine, Strasbourg, France has written NSA
(Nucleotide Sequences Analyzer), version 3.3. It is a general program
for reading in sequences and writing them out in a variety of data
formats, with the ability to select particular sets of sites and sequences.
For our purposes, the relevant feature is the ability to calculate
a number of different nucleotide sequence distances, as well as some
simple protein sequence distances. These include the Jukes-Cantor,
Kimura, and Tamura-Nei distances, as well as a simple protein distance
based on the fraction of similar amino acids. These can also have a
correction for a gamma distribution of rates across sites. The program
is written in Visual Basic, and is available as a Windows executable from
its web site
at
http://perso.wanadoo.fr/daniel.montagnon/NSAAng.htm
.
John Brzustowski
(jbrzusto (at) ualberta.ca
),
wrote qclust, a program to carry out a number of
clustering methods including Neighbor-Joining. The neighbor-joining method
has been improved over our own Neighbor program, so as to be able to handle
large numbers of taxa much more quickly. The program is available
along with another program, calcdist which calculates
distances from 0/1 data. The programs are available
as C source and as DOS executables from
its web
page at http://www.biology.ualberta.ca/jbrzusto/dosclust.html
.
A more interactive version of the program is also available as Java from
a web page
at http://www2.biology.ualberta.ca/jbrzusto/cluster.php
.
(Brzustowski has declared that both of these programs are unsupported software,
and he will not answer questions about them).
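For readers unfamiliar with the Neighbor-Joining method mentioned here, the
following is a minimal sketch of the textbook algorithm. It is not qclust's
faster implementation, nor the code of any program in this list. The input
layout (a dict keyed by frozenset pairs of names) and the internal node labels
node0, node1, ... are assumptions made for illustration.

```python
def neighbor_joining(names, dist):
    """Textbook Neighbor-Joining on a distance matrix.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns a list of (node, neighbor, branch_length) edges of an unrooted tree.
    """
    d = dict(dist)          # working copy; new internal nodes are added to it
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # r[a] is the sum of distances from a to every other active node
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # choose the pair minimising Q(a, b) = (n - 2) d(a, b) - r[a] - r[b]
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[frozenset((a, b))] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        dab = d[frozenset((a, b))]
        # branch lengths from a and b to the new internal node u
        len_a = 0.5 * dab + (r[a] - r[b]) / (2.0 * (n - 2))
        len_b = dab - len_a
        u = "node%d" % next_id
        next_id += 1
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # distances from u to each remaining node
        for k in nodes:
            if k != a and k != b:
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - dab
                )
        nodes = [k for k in nodes if k != a and k != b] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges
```

This version recomputes all row sums and scans all pairs at every step, so it
takes time proportional to the cube of the number of taxa. Faster programs
avoid much of that repeated work.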
http://pubmlst.org/software/analysis/start2/
. A previous
version, S.T.A.R.T., is available at
another page at the same web site.
http://profdist.bioapps.biozentrum.uni-wuerzburg.de/
.
Olivier Langella
(Olivier.Langella (at) pge.cnrs-gif.fr
)
of the Laboratoire PGE, CNRS UPR9034, Gif sur Yvette, France, distributes
Populations, version 1.2.30. It can calculate a wide
variety of distances from multiple-allele diploid or haploid genotypes
and from microsatellite data, and can
also infer phylogenies by distance methods including Neighbor-Joining and
UPGMA. It can bootstrap the data across loci and/or across individuals
when constructing phylogenies. The trees can be trees of populations or
trees of individuals.
Populations is available as a free download from its web site at
http://bioinformatics.org/~tryphon/populations/
, as source code and as executables for Windows.
Patrick Meirmans
http://www.bentleydrummer.nl/software
Allen Rodrigo, Alexei Drummond, and Matthew Goode
of the Computational and Evolutionary Biology Laboratory, School of Biological
Sciences, University of Auckland, New Zealand (a.rodrigo (at) auckland.ac.nz
and m.goode (at) auckland.ac.nz) have released Pebble, version 1.0
(Phylogenetics, Evolutionary Biology, and Bioinformatics in a moduLar
Environment). This is a graphical user interface around a functional
programming language for evolutionary inferences. The system is written in
Java using the PAL project classes as its components. This alpha release
provides the basic user interface and some component packages. The following
analyses and tools are available in vCEBL 0.3a:
http://www.cebl.auckland.ac.nz/software2.php
. It requires
Java VM 1.1.1 or higher. It can also be obtained there as an applet for your
browser, with some features lacking.
Le Sy Vinh
(Vinh (at) cs.uni-duesseldorf.de) of the Bioinformatics Institute of the
University of Düsseldorf, Germany and Arndt von Haeseler
(arndt.von.haeseler (at) univie.ac.at) of the Centre for Integrative
Bioinformatics Vienna (CIBIV) have released STC (Shortest Triplet Clustering).
This method constructs k-representative sets from triplets of species. The
resulting clustering method is O(n²) in speed and can handle thousands of
species with good accuracy. It is described in a paper: Vinh, L. S. and
A. von Haeseler. 2005. Shortest triplet clustering: reconstructing large
phylogenies using representative sets. BMC Bioinformatics 6: 92. The program
is available as Linux and as Windows executables at its web site at
http://www.cibiv.at/software/stc/
Naoko Takezaki
of the Life Science Research Center of Kagawa University, Japan (takezaki
(at) med.kagawa-u.ac.jp
) has written sendbs.
It computes average nucleotide substitutions within and between populations. The
method is described in the paper by M. Nei and L. Jin (1989, Molecular
Biology and Evolution 6: 290-300). However, sendbs differs from
their method by using a bootstrap across sites to obtain standard errors of
the distances.
It also constructs a tree of populations using a neighbor-joining method.
It is distributed as source code for Unix, and also as a DOS executable,
by ftp from the Indiana ftp server
and through the software page of Masatoshi Nei's lab
at Pennsylvania State University at
http://www.bio.psu.edu/People/Faculty/Nei/Lab/software.htm
.
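The idea of bootstrapping across sites to get a standard error for a
between-population distance can be sketched as follows. This is not the sendbs
code. It assumes the net divergence d_A = d_XY - (d_X + d_Y)/2 of Nei and Jin,
uses the uncorrected proportion of differing sites as a stand-in for whatever
substitution correction the program applies, and the function names,
replicate count and seed are hypothetical.

```python
import math
import random
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(1 for a, b in zip(seq1, seq2) if a != b) / len(seq1)


def mean_within(pop):
    """Average pairwise distance within one population (0 if fewer than 2 sequences)."""
    pairs = list(combinations(pop, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_x, pop_y):
    """Average distance over all pairs with one sequence from each population."""
    total = sum(p_distance(a, b) for a in pop_x for b in pop_y)
    return total / (len(pop_x) * len(pop_y))


def net_divergence(pop_x, pop_y):
    """d_A = d_XY - (d_X + d_Y) / 2."""
    return mean_between(pop_x, pop_y) - 0.5 * (mean_within(pop_x) + mean_within(pop_y))


def bootstrap_se(pop_x, pop_y, replicates=200, seed=1):
    """Standard error of d_A from resampling alignment columns with replacement."""
    rng = random.Random(seed)
    n_sites = len(pop_x[0])
    estimates = []
    for _ in range(replicates):
        # the same resampled columns are applied to every sequence
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        rx = ["".join(s[i] for i in cols) for s in pop_x]
        ry = ["".join(s[i] for i in cols) for s in pop_y]
        estimates.append(net_divergence(rx, ry))
    mean = sum(estimates) / replicates
    variance = sum((e - mean) ** 2 for e in estimates) / (replicates - 1)
    return math.sqrt(variance)
```

Resampling whole alignment columns keeps the sequences aligned with each other
in every replicate, so each replicate is a valid data set of the same length
as the original.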
Applied Maths BVBA
http://www.applied-maths.com/gc/gc.htm
. A detailed
brochure is available for downloading there. Gelcompar II is commercial
software. For price and ordering information contact them by phone at
+32 9 2222 100, fax them at +32 9 2222 102, e-mail them at
info (at) applied-maths.com, or use the information request form at
their web pages. Their U.S. Sales Office is at Applied Maths Inc.,
512 East 11th Street, Suite 207, Austin, Texas 78701.
phone +1 512-482-9700, fax +1 512-482-9708 (email is info-us (at) applied-maths.com). (One company vending Gelcompar II sells the whole
package for $20,000, though if only the basic module and the cluster analysis
module are ordered the price is $5,400).
Andrey Rzhetsky, now of the Department of Human Genetics at the University of
Chicago (arzhetsk (at) medicine.bsd.uchicago.edu)
has written Statio, a program for testing stationarity of nucleotide
composition or amino acid composition in pairs of sequences. The program reads
a pair of sequences and then tests stationarity under a number of possible
models of DNA evolution or protein evolution. The method is described in
a paper: Rzhetsky, A. and M. Nei. 1995. Tests of applicability of several
substitution models for DNA sequence data. Molecular Biology and Evolution 12(1): 131-151.
It can be downloaded as a set
of MSDOS executables from the Nei lab software web site
at
https://homes.bio.psu.edu/people/faculty/nei/software.htm
.
Probal Chaudhuri
A trial Windows version is available from
its web site
at http://www.isical.ac.in/~probal/main.htm
. It is described as
available as C++ source code, Windows executables, Linux executables and
Mac OS X universal executables.
Mike Sanderson
(sanderm (at) email.arizona.edu) of the Department of Ecology and Evolutionary
Biology, University of Arizona, Tucson, Arizona has written r8s, version 1.71,
a program that adjusts branch lengths in a phylogeny and infers divergence
times by smoothing rates of evolution to approximate a molecular clock
(allowing a "relaxed" clock). The program takes a tree with branch lengths as
input, smooths it, and infers divergence times. Sanderson's main approaches to
smoothing divergence times are described in his papers:
http://loco.biosci.arizona.edu/r8s/index.html
.
Torsten Eriksson
of the Bergius Botanical Garden, Stockholm, Sweden (torsten (at) bergianska.se
)
has released the r8s bootstrap kit. This is a number
of Perl scripts and three general command blocks for PAUP* and r8s which enable bootstrapping
analyses with r8s. It is available from his
software web site
at http://www.bergianska.se/index_forskning_soft.html
.
Kai Chan (kaichan (at) stanford.edu)
of the Department of Biological Sciences, Stanford University, Stanford,
California, and Brian Moore
(brian.moore (at) yale.edu) of the Department of Ecology and
Evolutionary Biology, Yale University, New Haven, Connecticut
have released SymmeTREE version 1.1. It is a program
to test whether branches of a tree have diversified at different rates,
and along which branches the significant shifts of diversity have occurred.
This is evaluated using the species diversity of different parts of the tree.
The program is described in a paper: Chan, K. M. A. and B. R. Moore. 2004.
SYMMETREE: whole-tree analysis of differential diversification rates.
Bioinformatics Advance Access publication November 30, 2004.
The program is available as executables for Windows, Mac OS X, and Linux
and as source code for other flavors of Unix. It is distributed from
its web site at
http://www.phylodiversity.net/bmoore/software.html
.
Galina Glazko, now of the Department of Biomedical Informatics of the University of
Arkansas Medical School, Little Rock, Arkansas
(GVGlazko (at) uams.edu) and
Masatoshi Nei of the Institute of Molecular Evolutionary Genetics at
Pennsylvania State University, University Park, Pennsylvania have released
TIMER, which estimates divergence times using a linearized
tree approach. It can use DNA or protein sequences at multiple loci. It
constructs a phylogeny using the Neighbor-Joining method, and then estimates
branch lengths and divergence times for the individual loci as well as for the
full set of loci. It can carry out the Two-Cluster Test for constancy of
rate of divergence at an individual node in the tree. The methods are
explained in a paper: Nei, M., P. Xu, and G. Glazko. 2001. Estimation of
divergence times from multiprotein sequences for a few mammalian species and
several distantly related organisms. Proceedings of the National Academy of
Sciences 98: 2497-2502. TIMER is available as a Windows executable
at the Nei lab software web site
at https://homes.bio.psu.edu/people/faculty/nei/software.htm