Customization API
The customization API provides functions that one can use to define
custom behavior in the program, whereby specific parts of the workflow
can be given new implementations (e.g., the function used to calculate
the score of a grid point). The user-defined implementation is given a
label, which can be used in the input file to request it. The
customization API is provided in the module gmak.api.
The functions in the customization API are named with the pattern
add_SPEC_custom_OBJ. OBJ indicates the modified part of the
program workflow. SPEC is used to particularize some contexts when
OBJ is not sufficient. Roughly speaking, calling such a function
includes a new implementation for the OBJ part of the workflow.
The label of this new implementation, as well as the actual
implementation, are provided in the function parameters. This
frequently involves passing user-defined functions as arguments.
These user-defined functions have signature
requirements that are specific to SPEC and OBJ. To properly
document them, dummy implementations are provided in the
gmak.api_signatures module and referred to in this documentation.
Here, the functions of these modules are documented in sections separated by context. Each section also includes a brief explanation of how the corresponding functions fit in the program workflow. Auxiliary functions and classes that are involved in the customization API but are not part of it are documented in the last section.
Usage
The module gmak.api is made available by installing gmak.
However, simply using it in a Python script or module does not
immediately affect the program. To make the modifications take effect,
the script or module should be saved in the working directory (i.e.,
the directory where the program is deployed) with the name
custom.py, and the command-line option --custom must be passed
to the program. It will then be interpreted at runtime, incorporating
the user-defined implementations into the program.
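As a rough illustration of the steps above, a custom.py file could look like the sketch below. The label my_mae and the scoring formula are hypothetical. The try/except guard around the gmak import is there only so that the sketch also runs where gmak is not installed; a real custom.py would not need it.

```python
# custom.py: a minimal sketch of a customization file (hypothetical content).


def calc_score(estimates, errors, weis, refs):
    """Mean absolute deviation between estimates and reference values."""
    deviations = [abs(estimates[prop] - refs[prop]) for prop in estimates]
    return sum(deviations) / len(deviations)


try:
    # gmak.api is available once gmak is installed.
    from gmak.api import add_custom_score
except ImportError:
    # Guard used only so that this sketch can run without gmak.
    add_custom_score = None

if add_custom_score is not None:
    # "my_mae" is a hypothetical label: the type name to request in the input file.
    add_custom_score("my_mae", calc_score)
```

The file only registers the implementation. It takes effect when the program is run with the --custom option from the directory that contains it.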
Note
We recommend using the template file provided with the
program as a starting point to write the custom.py file.
Note
Customization examples are provided in the
gmak.api_examples package. They are discussed in the Example
Applications part of the documentation.
Systems
In gmak, the type of a system determines the construction of its
topology. There are two aspects to this. First, a suitable
representation for the topology must be chosen, so that it is passed
around in the program and used as input for the simulation and
property-calculation routines. This representation will be referred to
as the topology-output object. Secondly, the way to apply the values
of the parameters modified by the program to these topology-output
objects must be specified.
To add a custom system type to the program, one should use the
function add_custom_system(). It receives as
parameters the name of the custom system type and two user-implemented
functions which are chained together. The first one creates a
topology-output object for some given system, grid-shift iteration and
grid point. The second one applies the values of the parameters
modified by the program to the topology-output object returned by the
first one.
Alternatively, one can customize the behavior of the default gmx
system type, used for GROMACS-based systems. While nonbonded and
macro-type parameters (see Interaction Parameters) are
handled automatically by the program, other types of parameters
(falling into the umbrella category of custom parameters) are not. For
those, the user can register a writer function with
add_gmx_custom_parameter_writer().
- gmak.api.add_custom_system(type_name, topo_out_creator, topo_out_writer)
Adds a custom system type to the program. In the input file, it can be referenced with the type type_name.
- Parameters
type_name (str) – Name of the type of the custom system.
topo_out_creator (callable) – A function that creates a TopologyOutput object for a given system, grid-shift iteration and grid point (see General Workflow and topo_out_creator()).
topo_out_writer (callable) – A function that applies the interaction-parameter values to the topology-output object (see topo_out_writer())
- gmak.api.add_gmx_custom_parameter_writer(funct)
Adds to the program a function to apply the values of custom parameters (see Custom Parameters) to a GROMACS topology file.
- Parameters
funct (callable) – The function used to apply the values of custom parameters (see gmx_custom_parameter_writer())
- gmak.api_signatures.topo_out_creator(workdir, name, grid, state, system_pars)
Returns a TopologyOutput that encodes a topology.
This instance can be initialized with a simple TopologyOutput() statement, and, after that, the user can set the instance attributes as desired. The returned instance is the same one passed as a parameter in topo_out_writer(). It is also passed to the simulator() and component_calculator() functions.
- Parameters
workdir (str) – The directory where the topology files should be written
name (str) – The name of the current system
grid (int) – The current grid-shift iteration (see General Workflow)
state (int) – The linear index of the current grid point (see Grid Indexing)
system_pars (InputParameters) – The system input parameters defined in the input file
- Returns
The topology-output object
- Return type
TopologyOutput
Note
If your custom system type is to be used with other GROMACS-compatible objects (e.g., with the GROMACS-compatible General Protocol), you should use GmxTopologyOutput and set this function to:
from gmak.gmx_system import GmxTopologyOutput
def topo_out_creator(workdir, name, grid, state, system_pars):
    return GmxTopologyOutput(workdir, name, grid, state)
Note
For even more flexibility, you can create your own customized topology-output class, inheriting from TopologyOutput (as done, e.g., for GmxTopologyOutput). In this way, you can provide custom methods as well as data to the topology.
- gmak.api_signatures.topo_out_writer(params, topo_out, system_pars)
Applies the interaction-parameter values to the topology-output object. If any topology files need to be written, this should also be done in this function.
- Parameters
params (list of InteractionParameter) – The list of interaction parameters to be applied to the topology
topo_out (TopologyOutput) – The TopologyOutput instance returned by the topo_out_creator() function.
system_pars (InputParameters) – The system input parameters defined in the input file
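The chaining of the two functions can be sketched as follows. This is only an illustration under stated assumptions: TopologyOutput and FakeParameter are local stand-ins so that the code runs without gmak, the value attribute of the parameter is assumed (only name, particles and type are documented here), and the file layout is made up.

```python
import os
import tempfile


class TopologyOutput:
    """Stand-in for gmak.systems.TopologyOutput (a plain attribute container)."""


class FakeParameter:
    """Hypothetical stand-in for InteractionParameter; `value` is an assumed attribute."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def topo_out_creator(workdir, name, grid, state, system_pars):
    # Create the container and attach a file path as a user-defined attribute.
    topo = TopologyOutput()
    topo.path = os.path.join(workdir, f"{name}_{grid}_{state}.dat")
    return topo


def topo_out_writer(params, topo_out, system_pars):
    # Write one "name value" line per interaction parameter (made-up format).
    with open(topo_out.path, "w") as stream:
        for param in params:
            stream.write(f"{param.name} {param.value}\n")


# Standalone usage with made-up values.
workdir = tempfile.mkdtemp()
topo = topo_out_creator(workdir, "liquid", 0, 12, None)
topo_out_writer([FakeParameter("@charge", 0.42)], topo, None)
with open(topo.path) as stream:
    written = stream.read()
```

In a real custom.py, the two functions would be registered with add_custom_system("my_system", topo_out_creator, topo_out_writer), where my_system is a hypothetical type name.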
- gmak.api_signatures.gmx_custom_parameter_writer(param, istream, ostream)
Example of the signature of a function used to apply the values of custom parameters to a GROMACS topology file. Such a function is called for all custom parameters, so if any distinction needs to be made between them, it should be done in the body of this function.
- Parameters
param (InteractionParameter) – The custom interaction parameter.
istream (io.StringIO) – A readable input stream with the content of an intermediate topology file to which the value of param is not yet applied. This file is obtained from the template file given in the $system block by expanding the #include directives in it and applying to the resulting content the values of the interaction parameters specified before param in the input file.
ostream (io.StringIO) – A writable stream in which the user must put the content of the input stream with the value of param properly applied to it.
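A writer of this kind could be sketched as below. The placeholder convention ({name} in the template), the FakeParameter stand-in and its value attribute are assumptions made for illustration. They are not part of gmak.

```python
import io


class FakeParameter:
    """Hypothetical stand-in for InteractionParameter; `value` is an assumed attribute."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def gmx_custom_parameter_writer(param, istream, ostream):
    # Hypothetical convention: the template marks the spot for a parameter as {name}.
    placeholder = "{" + param.name + "}"
    for line in istream:
        # Lines without the placeholder are copied unchanged.
        ostream.write(line.replace(placeholder, str(param.value)))


# Standalone usage with made-up topology content.
istream = io.StringIO("; made-up content\nscale {fudge_qq}\n")
ostream = io.StringIO()
gmx_custom_parameter_writer(FakeParameter("fudge_qq", 0.5), istream, ostream)
result = ostream.getvalue()
```

Because the same function is called for every custom parameter, replacing only the placeholder of the current parameter leaves the placeholders of the remaining ones intact for later calls.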
Protocols
A protocol type determines how the protocol simulations are carried out for a given system. This
includes the choice of the software package used to carry out the
simulations, the deployment of replicas, the sequence of simulations,
etc. To add a custom protocol type to the program, one should use the
function add_custom_protocol(). Support for
extending the simulations is also provided.
Up to four functions can be implemented by the user and supplied to
the protocol-customization function. The first and mandatory one is
the simulator() function, which is
responsible for carrying out the simulations. If extending the
simulations is desired, this function has to handle not only setting
up the initial simulation, but also extending it when appropriate.
These two situations can be distinguished based on the value of the
ext parameter, which evaluates to True for extension
simulations.
The next two functions are optional and should be specified only when
extending the simulations is desired. The first one of those is
calc_initial_len(), which returns the
length of the initial simulation (prior to extending it). It is a
function, and not a variable, to allow for recycling the protocol type
for different applications. The second one is
calc_extend(), which contains the logic
for determining the length of an extended simulation.
Finally, get_last_frame(), if provided,
makes the protocol type followable. This function returns the
configuration file that the user wants to use as a starting point for
the following protocol. Typically, this will be the last frame of the
production run of the protocol.
- gmak.api.add_custom_protocol(type_name, simulator, calc_initial_len=None, calc_extend=None, get_last_frame=None)
Adds a custom protocol type to the program. In the input file, it can be referenced with the type type_name.
- Parameters
type_name (str) – Name of the type of the custom protocol.
simulator (callable) – The simulator function (see simulator())
calc_initial_len (callable) – (optional) The function to calculate the initial length of the production run (see calc_initial_len()). This is used only when support for extensions is desired. Defaults to None, indicating that extensions are not desired.
calc_extend (callable) – (optional) The function to calculate the new length of the production run (see calc_extend()). This is used only when support for extensions is desired. Defaults to None, indicating that extensions are not desired.
get_last_frame (callable) – (optional) The function to extract the configuration file to be used as a starting point for the following protocol (see get_last_frame()). This is used only when a followable protocol is desired. Defaults to None.
- gmak.api_signatures.simulator(length, topology, coords, ext, protocol_pars, workdir)
The main simulator function.
- Parameters
length (int or float) – The length of the current production run
topology (TopologyOutput) – The topology considered in the simulations
coords (str) – The path of the initial configuration file for the simulations
ext (bool) – Indicates whether this is an extension or the first simulation.
protocol_pars (InputParameters) – The protocol input parameters defined in the input file
workdir (str) – The directory where the simulations are run.
- Returns
Output files of the simulation
- Return type
dict
- gmak.api_signatures.calc_initial_len(protocol_pars)
Calculates the initial length of the production runs.
- Parameters
protocol_pars (InputParameters) – The protocol input parameters defined in the input file
- Returns
The initial length of the production run
- Return type
int or float
- gmak.api_signatures.calc_extend(errs_tols, last_length, protocol_pars)
Returns the new length of the simulation based on the uncertainties and tolerances of the properties involving the protocol. Return None to indicate that no more extensions are needed.
- Parameters
errs_tols (dict) – The dictionary of uncertainties and tolerances. The keys are the property names. The values are dictionaries {'tol': TOL, 'err': ERR}, where TOL (float) is the tolerance and ERR (float) is the uncertainty for the property.
last_length (int or float) – The current length of the production run.
protocol_pars (InputParameters) – The protocol input parameters defined in the input file
- Returns
The new length of the production run, or None
- Return type
int, float or None
- gmak.api_signatures.get_last_frame(protocol_output)
Returns the path of the configuration file to be followed. It typically corresponds to the last frame of the production run.
- Parameters
protocol_output (dict) – The output files of the simulation (returned by simulator())
- Returns
The path of the configuration file to be followed
- Return type
str
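The four protocol functions described in this section could fit together as in the sketch below. It is only an illustration: the nsteps input parameter, the dictionary keys and the file names are hypothetical, the 50% extension rule is an arbitrary choice, the simulator body is a placeholder that runs no simulation, and SimpleNamespace stands in for InputParameters.

```python
import os
from types import SimpleNamespace


def calc_initial_len(protocol_pars):
    # `nsteps` is a hypothetical input parameter of the protocol block.
    return protocol_pars.nsteps


def calc_extend(errs_tols, last_length, protocol_pars):
    # Extend by 50% while any property is less precise than its tolerance.
    needs_more = any(entry["err"] > entry["tol"] for entry in errs_tols.values())
    if not needs_more:
        return None
    return int(1.5 * last_length)


def simulator(length, topology, coords, ext, protocol_pars, workdir):
    # Placeholder: a real implementation would start (ext is False) or
    # continue (ext is True) a simulation of the given length here.
    return {
        "trajectory": os.path.join(workdir, "traj.xyz"),
        "last_frame": os.path.join(workdir, "last.xyz"),
    }


def get_last_frame(protocol_output):
    # Relies on the key chosen in simulator() above.
    return protocol_output["last_frame"]


# Standalone usage with made-up values.
pars = SimpleNamespace(nsteps=1000)
initial = calc_initial_len(pars)
output = simulator(initial, None, "start.xyz", False, pars, "run")
extended = calc_extend({"density": {"tol": 0.1, "err": 0.3}}, initial, pars)
converged = calc_extend({"density": {"tol": 0.1, "err": 0.05}}, extended, pars)
```

In a real custom.py, these would be registered with add_custom_protocol("my_protocol", simulator, calc_initial_len, calc_extend, get_last_frame), where my_protocol is a hypothetical type name.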
Properties
Properties are customized at the levels of component and composite properties (see Properties). For this reason, setting up a custom property means setting up custom composite and component properties. For the trivial case where the composite property is associated with a single component property, the API provides a way to automatically derive the former from the latter.
To add a custom component-property type to the program, one should use
the function add_custom_component_property(). It
receives as arguments a calculator function and a boolean parameter
that indicates whether the latter function returns (a) samples of the
property for each simulation frame; or (b) the expected value and
uncertainty of the property. In the former case, the program
automatically takes care of carrying out the statistical treatment of
the sampled values in order to produce the expected value and
uncertainty. In the latter case, this is the responsibility of the
user implementation. The calculator function is called individually
for each sampled grid point and protocol after the corresponding
simulations are performed.
To add a custom composite-property type to the program, one should use
the function add_custom_composite_property(). It
receives as argument a calculator function, which takes the expected
values and uncertainties of the corresponding component properties for
a particular system and grid point and returns the corresponding
expected value and uncertainty of the composite property.
Note
There is no need to define a custom property type for a
component property that can be extracted with the gmx energy
program. A component property of type gmx_PROPNAME in the
input file is automatically interpreted as one that is obtained
with gmx energy with PROPNAME as the input string.
- gmak.api.add_custom_component_property(type_name, component_calculator, is_timeseries)
Adds a custom component-property type to the program. In the input file, it can be referenced with the type type_name.
- Parameters
type_name (str) – Name of the type of the custom component property.
component_calculator (callable) – The function used to calculate the custom component property (see component_calculator())
is_timeseries (bool) – True indicates that the component property is obtained as a timeseries; False, as a tuple (EA, dEA) with the expected value and statistical uncertainty.
- gmak.api.add_custom_composite_property(type_name, composite_calculator=None)
Adds a custom composite-property type to the program. In the input file, it can be referenced with the type type_name.
- Parameters
type_name (str) – Name of the type of the custom composite property.
composite_calculator (callable) – (optional) The calculator function (see composite_calculator()). If it is not supplied, the program implicitly assumes that the property has only one component and identifies the values and errors of the composite property with those of the component property.
- gmak.api_signatures.component_calculator(topology, protocol_output, property_pars)
The function used to calculate the custom component property.
- Parameters
topology (TopologyOutput) – The topology considered in the simulations
protocol_output (dict) – The output files of the simulation. For custom protocols, it is the dict returned by the simulator() function. For GROMACS-based protocols, it is the simulation-output dict described in GROMACS-compatible General Protocol.
property_pars (InputParameters) – The property input parameters defined in the input file
- Returns
A tuple (EA, dEA) with the expected value and uncertainty of the property, or a list with the values of the property for each frame of the simulation. In the latter case, the program will automatically calculate the corresponding average and statistical uncertainty.
- Return type
tuple or list
- gmak.api_signatures.composite_calculator(values, errs, property_pars)
The function used to calculate the custom composite property.
- Parameters
values (list) – The list with the expected values of each component property. Each member is a tuple (PTYPE, VALUE) where PTYPE (str) is the type of component property and VALUE (float) is the expected value.
errs (list) – The list with the uncertainties of each component property. Each member is a tuple (PTYPE, VALUE) where PTYPE (str) is the type of component property and VALUE (float) is the estimated uncertainty.
property_pars (InputParameters) – The property input parameters defined in the input file
- Returns
A tuple (EA, dEA) with the expected value and uncertainty of the composite property
- Return type
tuple
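The two calculator signatures could be implemented along the lines of the sketch below. The energies key, the one-value-per-line file format, the property type names and the choice of a sum as the composite property are all hypothetical. Combining the uncertainties in quadrature assumes uncorrelated components.

```python
import math
import os
import tempfile


def component_calculator(topology, protocol_output, property_pars):
    # Hypothetical: the simulator stored, under the key "energies", a text
    # file with one value per frame. Returning the list of samples matches
    # the timeseries case (is_timeseries=True).
    with open(protocol_output["energies"]) as stream:
        return [float(line) for line in stream if line.strip()]


def composite_calculator(values, errs, property_pars):
    # Sum of the components; uncertainties combined in quadrature,
    # which assumes that the components are uncorrelated.
    total = sum(value for _, value in values)
    error = math.sqrt(sum(err ** 2 for _, err in errs))
    return total, error


# Standalone usage with made-up data.
workdir = tempfile.mkdtemp()
energy_file = os.path.join(workdir, "energies.dat")
with open(energy_file, "w") as stream:
    stream.write("1.0\n2.5\n")
samples = component_calculator(None, {"energies": energy_file}, None)
total, error = composite_calculator(
    [("my_energy", 1.0), ("my_other", 2.0)],
    [("my_energy", 3.0), ("my_other", 4.0)],
    None,
)
```

In a real custom.py, the registration would read add_custom_component_property("my_energy", component_calculator, True) and add_custom_composite_property("my_sum", composite_calculator), with my_energy and my_sum as hypothetical type names.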
Surrogate Model
The surrogate-model type defines the function used to estimate the expected values and uncertainties of a component property for all grid points based on these quantities for the sampled grid points.
To add a custom surrogate-model type to the program, one should use
the function add_custom_surrogate_model().
- gmak.api.add_custom_surrogate_model(type_name, compute, corners=False)
Adds a custom surrogate-model type to the program. In the input file, it can be referenced with the type type_name.
- gmak.api_signatures.compute(EA_s, dEA_s, I_s, gridshape, X_ki)
Computes the expected values and uncertainties of a component property for the entire grid.
- Parameters
EA_s (numpy.ndarray) – A 1D array of shape (NSAMP,) with the average values of the property for each sampled point.
dEA_s (numpy.ndarray) – A 1D array of shape (NSAMP,) with the uncertainties of the property for each sampled point.
I_s (list) – A list of length NSAMP with the linear indexes of the sampled grid points.
gridshape (tuple) – A tuple with the grid dimensions.
X_ki (numpy.ndarray) – A 2D array with the parameter-space values for each grid point (more specifically, the parameters of the Main Variation). The first index is the linear index, and the second index is the coordinate of the parameter-space point. For example, X_ki[1,0] is the value of the first parameter of the Main Variation for the grid point with linear index equal to 1.
- Returns
A tuple (EA_k, dEA_k) containing the estimated values and uncertainties, respectively, of the property for each grid point. EA_k and dEA_k are 1D arrays indexed by the linear index of the grid point.
- Return type
tuple
Note
The function gmak.cartesiangrid.flat2tuple() can be used to convert a linear to a tuple index if desired.
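As an illustration of the compute() signature, the sketch below implements a nearest-neighbor surrogate: each grid point takes the value and uncertainty of the closest sampled point in parameter space. This is a deliberately simple stand-in chosen for brevity, not one of the surrogate models shipped with the program, and the input data are made up.

```python
import numpy as np


def compute(EA_s, dEA_s, I_s, gridshape, X_ki):
    # Parameter-space coordinates of the sampled points, shape (NSAMP, NDIM).
    X_s = X_ki[I_s]
    # Squared distance between every grid point and every sampled point,
    # shape (NGRID, NSAMP).
    d2 = ((X_ki[:, None, :] - X_s[None, :, :]) ** 2).sum(axis=2)
    # Position (within the sample arrays) of the closest sample per grid point.
    nearest = d2.argmin(axis=1)
    return EA_s[nearest], dEA_s[nearest]


# Standalone usage: a 1D grid with 5 points, sampled at linear indexes 0 and 3.
X_ki = np.arange(5, dtype=float).reshape(5, 1)
EA_k, dEA_k = compute(
    np.array([10.0, 20.0]), np.array([1.0, 2.0]), [0, 3], (5,), X_ki
)
```

A function like this would be registered with add_custom_surrogate_model("my_nearest", compute), where my_nearest is a hypothetical type name.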
Score Function
The score-function type defines how the estimates of the values and uncertainties of the composite properties are used to calculate a score for the grid point.
To add a custom score-function type, one should use the function
add_custom_score().
- gmak.api.add_custom_score(type_name, calc_score, calc_score_err=None)
Adds a custom score-function type to the program. In the input file, it can be referenced with the type type_name.
- Parameters
type_name (str) – Name of the type of the custom function
calc_score (callable) – The score function (see calc_score())
calc_score_err (callable) – (optional) The score uncertainty function (see calc_score_err()). Defaults to None, which means that the uncertainties are not calculated.
- gmak.api_signatures.calc_score(estimates, errors, weis, refs)
Calculates the score for a grid point.
- Parameters
estimates (dict) – The dictionary of composite-property estimates for a grid point. The keys are the property names. Each key maps to the expected value of the property at a given grid point.
errors (dict) – The dictionary of composite-property uncertainties for a grid point. The keys are the property names. Each key maps to the value of the uncertainty of the property at a given grid point.
weis (dict) – The dictionary of composite-property weights for a grid point. The keys are the property names. Each key maps to the weight of the property in the optimization.
refs (dict) – The dictionary of reference values of the composite properties for a grid point. The keys are the property names. Each key maps to the reference value of the property in the optimization.
- Returns
The value of the score
- Return type
float
- gmak.api_signatures.calc_score_err(estimates, errors, weis, refs)
Calculates the score uncertainty for a grid point.
- Parameters
estimates (dict) – The dictionary of composite-property estimates for a grid point. The keys are the property names. Each key maps to the expected value of the property at a given grid point.
errors (dict) – The dictionary of composite-property uncertainties for a grid point. The keys are the property names. Each key maps to the value of the uncertainty of the property at a given grid point.
weis (dict) – The dictionary of composite-property weights for a grid point. The keys are the property names. Each key maps to the weight of the property in the optimization.
refs (dict) – The dictionary of reference values of the composite properties for a grid point. The keys are the property names. Each key maps to the reference value of the property in the optimization.
- Returns
A tuple (MIN, MAX) that represents a confidence interval for the score
- Return type
tuple
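A possible pair of score functions is sketched below: a weighted root-mean-square deviation from the reference values, and a crude interval obtained by shrinking or growing every deviation by its uncertainty. Both formulas and the property names are illustrative choices and are not necessarily those used by the built-in score types. The interval is a rough bound rather than a statistically derived confidence interval.

```python
import math


def calc_score(estimates, errors, weis, refs):
    # Weighted root-mean-square deviation from the reference values.
    num = sum(weis[p] * (estimates[p] - refs[p]) ** 2 for p in estimates)
    den = sum(weis[p] for p in estimates)
    return math.sqrt(num / den)


def calc_score_err(estimates, errors, weis, refs):
    # Re-evaluate the score with every absolute deviation reduced
    # (not below zero) or increased by the uncertainty of the property.
    low, high = {}, {}
    for p in estimates:
        dev = abs(estimates[p] - refs[p])
        low[p] = refs[p] + max(dev - errors[p], 0.0)
        high[p] = refs[p] + dev + errors[p]
    return calc_score(low, errors, weis, refs), calc_score(high, errors, weis, refs)


# Standalone usage with made-up values.
estimates = {"dens": 1003.0, "dhvap": 40.0}
errors = {"dens": 1.0, "dhvap": 1.0}
weis = {"dens": 1.0, "dhvap": 1.0}
refs = {"dens": 1000.0, "dhvap": 44.0}
score = calc_score(estimates, errors, weis, refs)
score_min, score_max = calc_score_err(estimates, errors, weis, refs)
```

The registration would then read add_custom_score("my_wrmsd", calc_score, calc_score_err), with my_wrmsd as a hypothetical type name.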
Grid Shifting
The grid-shifting type defines how the new origin of the grid is calculated in the grid-shifting procedure.
To add a custom grid-shifting type, one should use the function
add_custom_gridshifter().
- gmak.api.add_custom_gridshifter(type_name, calculator)
Adds a custom grid-shifting procedure to the program. In the input file, it can be referenced with the type type_name.
- Parameters
type_name (str) – Name of the type of the custom grid-shifting procedure.
calculator (callable) – The custom grid-shifting function (see calculator()).
- gmak.api_signatures.calculator(tuple_indexes, scores, propnames, averages, uncertainties)
The grid-shifting procedure.
- Parameters
tuple_indexes (list) – The tuple indexes of all grid points, ordered by linear index.
scores (list) – A list with the scores of the grid points, ordered by linear index.
propnames (list) – The names of the composite properties.
averages (numpy.ndarray) – A 2D array with the estimated values of the composite properties for all grid points. The first index is the linear index of the grid point. The second index corresponds to the property index in propnames.
uncertainties (numpy.ndarray) – A 2D array with the estimated uncertainties of the composite properties for all grid points. The first index is the linear index of the grid point. The second index corresponds to the property index in propnames.
- Returns
The shifting tuple, or None to complete the run. For example, a return value of (-5, 5) indicates that the origin is shifted by -5 grid cells in the first coordinate and +5 grid cells in the second.
- Return type
tuple or None
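A simple grid-shifting rule could be sketched as follows: shift the grid so that the best-scoring point ends up at the central cell, and finish the run when it is already there. The sketch assumes that lower scores are better and infers the grid dimensions from the tuple indexes. It is an illustration, not the procedure built into the program.

```python
def calculator(tuple_indexes, scores, propnames, averages, uncertainties):
    # Grid dimensions inferred from the largest index along each axis.
    ndim = len(tuple_indexes[0])
    shape = [max(idx[d] for idx in tuple_indexes) + 1 for d in range(ndim)]
    center = [n // 2 for n in shape]
    # Tuple index of the grid point with the lowest score
    # (assumption: lower scores are better).
    best_linear = min(range(len(scores)), key=lambda k: scores[k])
    best = tuple_indexes[best_linear]
    # Number of grid cells from the central cell to the best point.
    shift = tuple(b - c for b, c in zip(best, center))
    # Best point already at the center: signal that the run is complete.
    if all(s == 0 for s in shift):
        return None
    return shift


# Standalone usage: a made-up 3x3 grid.
indexes = [(i, j) for i in range(3) for j in range(3)]
scores_edge = [5.0, 4.0, 3.0, 4.0, 2.0, 3.0, 1.0, 2.0, 3.0]  # best at (2, 0)
scores_mid = [5.0, 4.0, 3.0, 4.0, 0.5, 3.0, 1.0, 2.0, 3.0]  # best at (1, 1)
shift_edge = calculator(indexes, scores_edge, [], None, None)
shift_mid = calculator(indexes, scores_mid, [], None, None)
```

Such a function would be registered with add_custom_gridshifter("my_shifter", calculator), where my_shifter is a hypothetical type name.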
Logging
- class gmak.logger.Logger(fn, indentString=' ')
- indent(number=1)
Indents the stream by a certain number of indentation strings.
- Parameters
number (int) – The number of indentation strings (default is 1).
- putMessage(msg, dated=True)
Writes a message to the log-file stream with proper indentation.
- gmak.logger.globalLogger = <gmak.logger.Logger object>
The global Logger instance that has access to the log-file stream.
Tip
In any customized function, one can request that a message is written in the log file, at runtime, with:
from gmak.logger import globalLogger as log
log.putMessage("Write this message!")
Utility Functions and Classes
- gmak.cartesiangrid.flat2tuple(gridshape, idx)
For a grid with dimensions gridshape, convert a linear index idx to a tuple index.
- class gmak.custom_attributes.CustomizableAttributesMixin.InputParameters
An instance of this class stores the input parameters supplied in a block of the input file as attributes. Input parameters that correspond to a single value token are stored as objects with type appropriate to the token value. Input parameters that correspond to a list of multiple value tokens are stored as a list of objects, each one processed as above.
For example, the block
$coords
name liquid
type gmx_liquid
coords molecule.gro
nmols 1500
box 3.0 3.0 3.0
$end
corresponds to an object coords_attr for which
>>> coords_attr.name
'liquid'
>>> coords_attr.type
'gmx_liquid'
>>> coords_attr.coords
'molecule.gro'
>>> coords_attr.nmols
1500
>>> coords_attr.box
[3.0, 3.0, 3.0]
Note the automatic type conversion for the parameters nmols and box, from string and list of string to int and list of float, respectively.
Note
Since the parameter is used as the name of an instance attribute, the use of only lowercase characters and underscores for custom parameters is highly recommended to avoid run-time errors.
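The conversion rules described above can be mimicked with the sketch below, which may help when testing customized functions outside the program. It is not the gmak implementation: parse_block and convert_token are hypothetical helpers, and SimpleNamespace merely stands in for InputParameters.

```python
from types import SimpleNamespace


def convert_token(token):
    # Try int first, then float; otherwise keep the string.
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def parse_block(lines):
    # Each line holds a parameter name followed by one or more value tokens.
    attrs = {}
    for line in lines:
        if not line.strip():
            continue
        key, *tokens = line.split()
        values = [convert_token(token) for token in tokens]
        # A single token is stored as a scalar, several tokens as a list.
        attrs[key] = values[0] if len(values) == 1 else values
    return SimpleNamespace(**attrs)


# Standalone usage with the body of the $coords block shown above.
coords_attr = parse_block([
    "name liquid",
    "type gmx_liquid",
    "coords molecule.gro",
    "nmols 1500",
    "box 3.0 3.0 3.0",
])
```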
- class gmak.systems.TopologyOutput
This class encodes a topology. It is intended to be used as a generalization of the path of the topology file, when more flexibility is needed in constructing it. It is essentially a container for data that the user wants to transfer around and that is identified by the program as representing the topology of a system. Data are stored in objects of this type via instance attributes, which can be freely set by the user. These objects are created individually for each system, grid-shift iteration and grid point (see General Workflow).
- class gmak.gmx_system.GmxTopologyOutput(workdir, name, grid, state)
Bases: gmak.systems.TopologyOutput
Automatically sets a topology-file path with extension .top based on the directory path workdir, the system name name, the grid-shift iteration grid and the grid-point linear index state.
- property path
The path of the topology file.
- class gmak.interaction_parameter.InteractionParameter
The class representing an interaction parameter that is set by the program. It contains its type and value and, when applicable, the types of the particles involved in the interaction.
- property name
The string that identifies the interaction parameter. It is the same one used in the $variation block of the input file.
- Type
str
- property particles
The particles involved in the interaction
- Type
None, InteractionAtom or InteractionPair
- property type
The type of interaction parameter
- class gmak.interaction_parameter.InteractionParameterType(value)
Bases: enum.Enum
An enumeration that identifies the type of interaction parameter.
- MacroParameter = 1
The type assigned to parameters whose name starts with the @ character. For the GROMACS-based system type, it identifies parameters that are applied to the topology file via simple textual replacement. For custom system types, this has no special behavior.
- LJ_V = 2
The type assigned to \(\sigma\) or \(C_6\) Lennard-Jones parameters (see Lennard-Jones parameters for naming conventions).
- LJ_W = 3
The type assigned to \(\epsilon\) or \(C_{12}\) Lennard-Jones parameters (see Lennard-Jones parameters for naming conventions).
- LJ_14_V = 4
The type assigned to \(\sigma\) or \(C_6\) Lennard-Jones parameters when applied to 1,4 (third-neighbor) pairs (see Lennard-Jones parameters for naming conventions).
- LJ_14_W = 5
The type assigned to \(\epsilon\) or \(C_{12}\) Lennard-Jones parameters when applied to 1,4 (third-neighbor) pairs (see Lennard-Jones parameters for naming conventions).
- CustomParameter = 6
The type assigned to parameters which are not of the types above. For the GROMACS-based system type, parameters with this type are not handled automatically and require that the user register a writer function with add_gmx_custom_parameter_writer().
- class gmak.interaction_parameter.InteractionAtom(name, pairs_include=None, pairs_exclude=None)
The class representing an interaction atom.
- property name
The name of the atom type. It is set automatically from the name of the corresponding InteractionParameter.
- Type
str
- property pairs_exclude
The regular expression controlling the standard pairtypes that are not affected by the atom.
- Type
- class gmak.interaction_parameter.InteractionPair(name_i, name_j)
The class representing an interaction pair.
Two pairs can be compared for equality regardless of the order of their atom types:
>>> InteractionPair("CH3", "CH1") == InteractionPair("CH1", "CH3")
True
>>> InteractionPair("CH3", "CH1") == InteractionPair("OA", "CH3")
False
- derives_from_atom(atom)
Verifies if the pair derives from an atom based on the properties pairs_include and pairs_exclude of the latter.
- Parameters
atom (InteractionAtom) – The atom from which the pair may derive
- Returns
True if the pair derives from the atom; False otherwise.
- Return type
bool
- property name_i
The name of the first atom type of the pair. It is set automatically from the name of the corresponding InteractionParameter.
- Type
str
- property name_j
The name of the second atom type of the pair. It is set automatically from the name of the corresponding InteractionParameter.
- Type
str