Character simulations and randomizations
Mesquite can simulate and randomize characters to build statistical tests. On this page we give an overview of these features. A more in-depth account of simulation of DNA sequence evolution is given separately.
Contents
- Using results of simulations & randomizations
- Simulations of character evolution
- Viewing results of simulations
- Randomizing characters
Using results of simulations & randomizations
The simulated or randomized characters can be used or stored in several ways:
- The characters can be stored into matrices in the current file by choosing options in the Make New Matrix from submenu of the Characters menu. For instance, if you choose Simulated Matrices on Current Tree, the matrix simulated will be stored in the file.
- The characters may be used directly, at that moment, in calculations. For example, if you make a Bar & Line Chart for Characters, and choose Simulated Characters as your source of characters, the characters will be simulated and used in the chart without being stored in the file.
- A series of many data files can be saved, each one with a different replicate of the simulated or randomized data matrix. This is available through the Save Multiple Matrices submenu of the Characters menu.
- A series of many data files can be saved in combination with scripting files to instruct programs such as Swofford's PAUP to run the files. This can be done using the Batch Architect, which is described in the page on DNA simulations and in some of the Studies.
To replicate the results of a simulation or randomization, you can use the Set Seed menu item to set the random number seed used. If you use the same conditions, including the same seed, the simulations and randomizations should be reproducible.
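The role of the seed can be illustrated outside Mesquite with a minimal Python sketch. This is not Mesquite code: the function name `simulate_binary_states` and the seed values are hypothetical, and the random number generator is Python's, not Mesquite's. It only shows the general principle that the same seed under the same conditions yields the same sequence of random draws.

```python
import random

def simulate_binary_states(n_taxa, seed):
    """Draw one 0/1 state per taxon from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; the global one is untouched
    return [rng.randint(0, 1) for _ in range(n_taxa)]

# Same seed and same conditions: the two replicates are identical.
first = simulate_binary_states(10, seed=12345)
second = simulate_binary_states(10, seed=12345)
print(first == second)
```

Changing any condition (here, the number of taxa or the order of the draws) changes the result even with the same seed, which is why the text above asks for the same conditions as well as the same seed.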
Simulations of character evolution
Stochastic models can be used to simulate character evolution along the branches of a phylogenetic tree by selecting Simulated Characters (to generate characters one at a time), Simulated Matrices on Current Tree (to generate whole matrices, on a current tree in a Tree Window), or Simulated Matrices on Trees (to generate whole matrices, each one on a different tree from a source of trees). These options are available whenever characters or matrices might be called for, for instance when making a chart of characters or matrices.
The following are the character types and models that can be simulated:
- Evolve Categorical characters. The following models are also discussed in the section on likelihood reconstructions.
- Mk1 model — Single parameter model analogous to Jukes-Cantor. Rates of change equal for all types of state-to-state changes.
- AsymmMk model — Two parameter asymmetrical model with differing rates of forward and backward changes. Forward changes are those in which state number increases (e.g., state 0 to state 1); backward changes are those in which state decreases (e.g., state 1 to state 0). One can specify the forward and backward rates directly, or alternatively, one can specify an overall rate of change in combination with a bias of forward versus backward. This model will generally be appropriate only for binary characters. You can also specify whether the states at the root are assumed to be at the equilibrium frequencies implied by the model. If so, then Mesquite chooses an ancestral state in the simulations according to the equilibrium frequencies implied by the bias in gains versus losses. Otherwise, Mesquite chooses a state with equal probabilities.
- Evolve DNA characters
- See page on DNA simulations for details.
- Evolve Continuous characters
- Brownian motion model — Model with a single parameter, the rate of change.
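The models listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Mesquite's implementation: the toy tree `TREE`, the function names, and the parameter values are all hypothetical, and Mesquite's exact parameterization of the rates may differ. The binary model uses the standard two-state transition probability, in which the chance of ending in the other state after time t is that state's equilibrium frequency times 1 - exp(-(forward + backward) t). Setting the forward and backward rates equal gives the binary case of a one-rate model like Mk1.

```python
import math
import random

# Hypothetical toy tree: each internal node maps to (child, branch length) pairs.
TREE = {
    "root": [("n1", 1.0), ("C", 2.0)],
    "n1": [("A", 1.0), ("B", 1.0)],
}

def change_probability(state, forward, backward, t):
    """Probability that a binary character differs from `state` after time t."""
    total = forward + backward
    # Equilibrium frequency of the state the character would change to.
    target_freq = (forward if state == 0 else backward) / total
    return target_freq * (1.0 - math.exp(-total * t))

def evolve_binary(tree, forward, backward, rng, root_at_equilibrium=True):
    """Simulate one binary character; returns the state at every node."""
    # Root state: equilibrium frequency of state 1, or equal probabilities.
    freq_of_1 = forward / (forward + backward) if root_at_equilibrium else 0.5
    states = {"root": 1 if rng.random() < freq_of_1 else 0}
    stack = ["root"]
    while stack:
        node = stack.pop()
        for child, t in tree.get(node, []):
            p = change_probability(states[node], forward, backward, t)
            states[child] = 1 - states[node] if rng.random() < p else states[node]
            stack.append(child)
    return states

def evolve_brownian(tree, rate, rng, root_value=0.0):
    """Simulate one continuous character under Brownian motion."""
    values = {"root": root_value}
    stack = ["root"]
    while stack:
        node = stack.pop()
        for child, t in tree.get(node, []):
            # Change along a branch is normal with mean 0 and variance rate * t.
            values[child] = values[node] + rng.gauss(0.0, math.sqrt(rate * t))
            stack.append(child)
    return values

rng = random.Random(1)
binary = evolve_binary(TREE, forward=0.3, backward=0.1, rng=rng)
continuous = evolve_brownian(TREE, rate=1.0, rng=rng)
print({k: binary[k] for k in "ABC"})
print({k: round(continuous[k], 3) for k in "ABC"})
```

Both functions return values for internal nodes as well as terminal taxa. The values at internal nodes are the states that actually occurred in the simulated history, as opposed to states inferred afterwards from the terminal taxa.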
Viewing results of simulations
Simulated characters can be used in many calculations, but if you want to visualize directly the results of a simulation you can use the Trace Character History feature available in the Analysis:Tree menu of the Tree Window. By default Trace Character History shows a reconstruction of ancestral states. Thus, if the character is simulated, the states at nodes shown would not be the "true" ancestral states that occurred during the simulation, but rather states inferred from the states given to the terminal taxa by the simulation. However, once Trace Character History is active, its Trace menu has a Character History Source menu item. Choose Simulate Ancestral States and specify the simulation. The states indicated at the nodes will then be the true ancestral states in the simulation. You can set the Seed to make the simulation equivalent to simulations done in other contexts.
Randomizing characters
Existing characters can be randomized as follows:
- Reshuffle Character — Supplies replicate reshufflings of a single chosen character. In each reshuffling, the character states are randomly scrambled among taxa, keeping the frequencies of different character states fixed.
- Reshuffle States within Characters — Supplies matrices, each of which is a reshuffling of an existing matrix. The first character of the matrix is a reshuffling of the first character of the original matrix; the second character is a reshuffling of the second original character; and so on. You can think of this as reshuffling within each column of the matrix.
- Reshuffle Within Characters (Taxa Partitioned) — As for Reshuffle States within Characters, but respecting taxa partitions. Each character of the matrix is a reshuffling of the respective character of the original matrix, but done only within groups of the current taxa partition. Thus, if taxa have a current partition that divides the taxa into Group A and Group B, then the reshuffling within a character first shuffles all states of the character among Group A taxa. Then, it shuffles states of the character within Group B taxa.
- Reshuffle States Within Taxa — Supplies matrices, each of which is a reshuffling of an existing matrix. Instead of reshuffling within each column, this reshuffles within each row. Thus, the states of each taxon are reshuffled among characters. This might be used, for instance, to generate DNA data with no phylogenetic signal that preserves the base composition of each taxon.
- Reshuffle Within Taxa (Char. Partitioned) — As for Reshuffle States within Taxa, but respecting character partitions. Thus, if the characters have a current partition that divides them into (for example) groups 28S and COI, then the reshuffling within a taxon first shuffles all states of a taxon among 28S sites, then shuffles all states of the taxon among COI sites.
- Bootstrap resample — Supplies matrices, each of which is a bootstrap resampled version of an existing matrix. Characters are sampled randomly from the original matrix and moved into the resampled matrix until it contains as many characters as were in the original matrix. Some of the original characters may by chance be sampled more than once; some may not be sampled at all.
- Rarefy characters — Supplies matrices, each of which is derived from an existing matrix by randomly deleting entire characters.
- Sprinkle missing — Supplies matrices, each of which is derived from an existing matrix by randomly assigning "missing" (? or unassigned) to cells of the matrix with a particular probability.
- Add noise (for continuous matrices only) — Available in the Character Matrix Editor under Alter/Transform, this adds noise to the states of all or selected cells of the matrix.
- Random Fill — Available in the Alter/Transform submenu of the Matrix menu of the Character Matrix Editor, it can be used to fill all or selected cells of the matrix with randomly chosen states.
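Several of the randomizations listed above can be sketched in Python to make the row and column logic concrete. This is an illustration only, not Mesquite's code: the function names are hypothetical, and a matrix is assumed to be a list of rows (one per taxon), each row a list of states (one per character).

```python
import random

def reshuffle_within_characters(matrix, rng):
    """Shuffle each column independently among taxa."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]

def reshuffle_within_characters_partitioned(matrix, groups, rng):
    """As above, but states move only among taxa with the same group label.

    groups[i] is the (hypothetical) partition label of taxon i.
    """
    result = [list(row) for row in matrix]
    for label in sorted(set(groups)):
        rows = [i for i, g in enumerate(groups) if g == label]
        for c in range(len(matrix[0])):
            states = [matrix[i][c] for i in rows]
            rng.shuffle(states)
            for i, s in zip(rows, states):
                result[i][c] = s
    return result

def reshuffle_within_taxa(matrix, rng):
    """Shuffle each row independently among characters."""
    return [rng.sample(row, len(row)) for row in matrix]

def bootstrap_resample(matrix, rng):
    """Sample whole characters with replacement until the width matches."""
    n_chars = len(matrix[0])
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[i] for i in picks] for row in matrix]

def sprinkle_missing(matrix, probability, rng):
    """Replace each cell by '?' with the given probability."""
    return [["?" if rng.random() < probability else cell for cell in row]
            for row in matrix]

rng = random.Random(42)
original = [list("ACGT"), list("AAGT"), list("CCGA"), list("CAGA")]
by_column = reshuffle_within_characters(original, rng)
by_group = reshuffle_within_characters_partitioned(original, ["A", "A", "B", "B"], rng)
by_row = reshuffle_within_taxa(original, rng)
boot = bootstrap_resample(original, rng)
for row in by_row:
    print("".join(row))
```

Reshuffling within characters preserves the state frequencies of each column, while reshuffling within taxa preserves the composition of each row, which is the property the DNA base composition example above relies on.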