Customization API

The customization API provides functions that one can use to define custom behavior in the program, whereby specific parts of the workflow can be given new implementations (e.g., the function used to calculate the score of a grid point). The user-defined implementation is given a label, which can be used in the input file to request it. The customization API is provided in the module gmak.api.

The functions in the customization API are named with the pattern add_SPEC_custom_OBJ. OBJ indicates the part of the program workflow that is modified. SPEC narrows down the context when OBJ alone is not sufficient. Roughly speaking, calling such a function registers a new implementation for the OBJ part of the workflow. The label of this new implementation, as well as the implementation itself, are provided in the function parameters. This frequently involves passing user-defined functions as parameters. These user-defined functions have signature requirements that are specific to SPEC and OBJ. To properly document them, dummy implementations are provided in the gmak.api_signatures module and referred to in this documentation.

Here, these modules are documented in sections separated by context. Each section also includes a brief explanation of how the corresponding functions fit in the program workflow. Auxiliary functions and classes that are involved in the customization API but not part of it are documented in the last section.

Usage

The module gmak.api is made available by installing gmak. However, simply using it in a Python script or module does not immediately affect the program. To make the modifications take effect, the script or module should be saved in the working directory (i.e., the directory where the program is deployed) with the name custom.py, and the command-line option --custom must be passed to the program. It will then be interpreted at runtime, incorporating the user-defined implementations into the program.

Note

We recommend using the template file provided with the program as a starting point to write the custom.py file.

Note

Customization examples are provided in the gmak.api_examples package. They are discussed in the Example Applications part of the documentation.

Systems

In gmak, the type of a system determines the construction of its topology. There are two aspects to this. First, a suitable representation for the topology must be chosen, so that it is passed around in the program and used as input for the simulation and property-calculation routines. This representation will be referred to as the topology-output object. Secondly, the way to apply the values of the parameters modified by the program to these topology-output objects must be specified.

To add a custom system type to the program, one should use the function add_custom_system(). It receives as parameters the name of the custom system type and two user-implemented functions which are chained together. The first one creates a topology-output object for some given system, grid-shift iteration and grid point. The second one applies the values of the parameters modified by the program to the topology-output object returned by the first one.

Alternatively, one can customize the behavior of the default gmx system type, used for GROMACS-based systems. While nonbonded and macro-type parameters (see Interaction Parameters) are handled automatically by the program, other types of parameters (falling into the umbrella category of custom parameters) are not. For those, the user can register a writer function with add_gmx_custom_parameter_writer().

gmak.api.add_custom_system(type_name, topo_out_creator, topo_out_writer)

Adds a custom system type to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom system.
topo_out_creator (callable) – A function that creates a TopologyOutput object for a given system, grid-shift iteration and grid point (see General Workflow and topo_out_creator()).
topo_out_writer (callable) – A function that applies the interaction-parameter values to the topology-output object (see topo_out_writer())

gmak.api.add_gmx_custom_parameter_writer(funct)

Adds to the program a function to apply the values of custom parameters (see Custom Parameters) to a GROMACS topology file.

Parameters: funct (callable) – The function used to apply the values of custom parameters (see gmx_custom_parameter_writer())

gmak.api_signatures.topo_out_creator(workdir, name, grid, state, system_pars)

Returns a TopologyOutput that encodes a topology.

This instance can be initialized with a simple TopologyOutput() statement, and, after that, the user can set the instance attributes as desired. The returned instance is the same one passed as a parameter in topo_out_writer(). It is also passed to the simulator() and component_calculator() functions.

Parameters

workdir (str) – The directory where the topology files should be written
name (str) – The name of the current system
grid (int) – The current grid-shift iteration (see General Workflow)
state (int) – The linear index of the current grid point (see Grid Indexing)
system_pars (InputParameters) – The system input parameters defined in the input file

Returns

The topology-output object

Return type

TopologyOutput

Note

If your custom system type is to be used with other GROMACS-compatible objects (e.g., with the GROMACS-compatible General Protocol), you should use GmxTopologyOutput and set this function to:

from gmak.gmx_system import GmxTopologyOutput

def topo_out_creator(workdir, name, grid, state, system_pars):
    return GmxTopologyOutput(workdir, name, grid, state)

Note

For even more flexibility, you can create your own customized topology-output class, inheriting from TopologyOutput (as done e.g. for GmxTopologyOutput). In this way, you can provide custom methods as well as data to the topology.

gmak.api_signatures.topo_out_writer(params, topo_out, system_pars)

Applies the interaction-parameter values to the topology-output object. If any topology files need to be written, this should also be done in this function.

Parameters

params (list of InteractionParameter) – The list of interaction parameters to be applied to the topology
topo_out (TopologyOutput) – The TopologyOutput instance returned by the topo_out_creator() function.
system_pars (InputParameters) – The system input parameters defined in the input file
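
As a rough illustration of how the two functions chain together, the sketch below stores the parameter values in a JSON file. The file layout, the attribute name path and the type label json_system are assumptions made for this example only, and a stand-in TopologyOutput class is defined when gmak cannot be imported.

```python
import json
import os

try:
    from gmak.systems import TopologyOutput
except ImportError:
    class TopologyOutput:
        """Stand-in used only when gmak is not installed."""


def topo_out_creator(workdir, name, grid, state, system_pars):
    # Attributes can be set freely; 'path' is a name chosen for this sketch.
    topo = TopologyOutput()
    topo.path = os.path.join(workdir, f"{name}_{grid}_{state}.json")
    return topo


def topo_out_writer(params, topo_out, system_pars):
    # Apply the parameter values and write the (hypothetical) topology file.
    values = {p.name: p.value for p in params}
    os.makedirs(os.path.dirname(topo_out.path) or ".", exist_ok=True)
    with open(topo_out.path, "w") as handle:
        json.dump(values, handle, indent=2)


try:
    from gmak.api import add_custom_system
    # 'json_system' is a hypothetical label to be used in the input file.
    add_custom_system("json_system", topo_out_creator, topo_out_writer)
except ImportError:
    pass  # gmak not installed: the functions can still be tested alone
```

The same topology-output object is later handed to the simulator and property calculators, so anything they need (here, the file path) should be stored on it.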

gmak.api_signatures.gmx_custom_parameter_writer(param, istream, ostream)

Example of the signature of a function used to apply the values of custom parameters to a GROMACS topology file. Such a function is called for all custom parameters, so if any distinction needs to be made between them, it should be done in the body of this function.

Parameters

param (InteractionParameter) – The custom interaction parameter.
istream (io.StringIO) – A readable input stream with the content of an intermediate topology file to which the value of param is not yet applied. This file is obtained from the template file given in the $system block by expanding the #include directives in it and applying to the resulting content the values of the interaction parameters specified before param in the input file.
ostream (io.StringIO) – A writable stream in which the user must put the content of the input stream with the value of param properly applied to it.
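
A minimal sketch of such a writer is given below. It assumes a purely hypothetical convention in which the template topology marks the affected lines with a trailing comment "; NAME", where NAME is the parameter name, and the value to be replaced is the last field before that comment. Lines tagged for other parameters are copied unchanged.

```python
def gmx_custom_parameter_writer(param, istream, ostream):
    # Hypothetical convention: target lines end with "; <parameter name>".
    tag = "; " + param.name
    for line in istream:
        text = line.rstrip("\n")
        fields = text[: -len(tag)].split() if text.endswith(tag) else []
        if fields:
            # Replace the last field before the tag with the new value.
            fields[-1] = f"{param.value:.6f}"
            line = " ".join(fields) + " " + tag + "\n"
        ostream.write(line)


try:
    from gmak.api import add_gmx_custom_parameter_writer
    add_gmx_custom_parameter_writer(gmx_custom_parameter_writer)
except ImportError:
    pass  # gmak not installed: the function can still be tested alone
```

Because the function is called once for every custom parameter, matching on the parameter name inside the body is what keeps the parameters from interfering with each other.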

Protocols

A protocol type determines how the protocol simulations are carried out for a given system. This includes the choice of the software package used to carry out the simulations, the deployment of replicas, the sequence of simulations, etc. To add a custom protocol type to the program, one should use the function add_custom_protocol(). Support for extending the simulations is also provided.

Up to four functions can be implemented by the user and supplied to the protocol-customization function. The first and mandatory one is the simulator() function, which is responsible for carrying out the simulations. If extending the simulations is desired, this function has to handle not only setting up the initial simulation, but also extending it when appropriate. These two situations can be distinguished based on the value of the ext parameter, which evaluates to True for extension simulations.

The next two functions are optional and should be specified only when extending the simulations is desired. The first one of those is calc_initial_len(), which returns the length of the initial simulation (prior to extending it). It is a function, and not a variable, to allow for recycling the protocol type for different applications. The second one is calc_extend(), which contains the logic for determining the length of an extended simulation.

Finally, get_last_frame(), if provided, makes the protocol type followable. This function returns the configuration file that the user wants to use as a starting point for the following protocol. Typically, this will be the last frame of the production run of the protocol.

gmak.api.add_custom_protocol(type_name, simulator, calc_initial_len=None, calc_extend=None, get_last_frame=None)

Adds a custom protocol type to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom protocol.
simulator (callable) – The simulator function (see simulator())
calc_initial_len (callable) – (optional) The function to calculate the initial length of the production run (see calc_initial_len()). This is used only when support for extensions is desired. Defaults to None, indicating that extensions are not desired.
calc_extend (callable) – (optional) The function to calculate the new length of the production run (see calc_extend()). This is used only when support for extensions is desired. Defaults to None, indicating that extensions are not desired.
get_last_frame (callable) – (optional) The function to extract the configuration file to be used as a starting point for the following protocol (see get_last_frame()). This is used only when a followable protocol is desired. Defaults to None.

gmak.api_signatures.simulator(length, topology, coords, ext, protocol_pars, workdir)

The main simulator function.

Parameters

length (int or float) – The length of the current production run
topology (TopologyOutput) – The topology considered in the simulations
coords (str) – The path of the initial configuration file for the simulations
ext (bool) – Indicates whether this is an extension or the first simulation.
protocol_pars (InputParameters) – The protocol input parameters defined in the input file
workdir (str) – The directory where the simulations are run.

Returns

Output files of the simulation

Return type

dict

gmak.api_signatures.calc_initial_len(protocol_pars)

Calculates the initial length of the production runs.

Parameters: protocol_pars (InputParameters) – The protocol input parameters defined in the input file
Returns: The initial length of the production run
Return type: int or float

gmak.api_signatures.calc_extend(errs_tols, last_length, protocol_pars)

Returns the new length of the simulation based on the uncertainties and tolerances of the properties involving the protocol. Return None to indicate that no more extensions are needed.

Parameters

errs_tols (dict) – The dictionary of uncertainties and tolerances. The keys are the property names. The values are dictionaries {'tol': TOL, 'err': ERR}, where TOL (float) is the tolerance and ERR (float) is the uncertainty for the property.
last_length (int or float) – The current length of the production run.
protocol_pars (InputParameters) – The protocol input parameters defined in the input file

Returns

The new length of the production run, or None

Return type

int or float or None
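
The extension logic could be sketched as follows. The input parameters initial_length and max_length are hypothetical names that a user might define in the protocol block, the growth factor of 1.5 is arbitrary, and the returned value is assumed to be the new total length of the production run.

```python
def calc_initial_len(protocol_pars):
    # 'initial_length' is a hypothetical input parameter of the protocol.
    return protocol_pars.initial_length


def calc_extend(errs_tols, last_length, protocol_pars):
    # Stop when every property has an uncertainty within its tolerance.
    converged = all(v["err"] <= v["tol"] for v in errs_tols.values())
    # Also stop when the (hypothetical) maximum length has been reached.
    if converged or last_length >= protocol_pars.max_length:
        return None
    # Otherwise grow the run by 50%, capped at the maximum length.
    return min(last_length * 1.5, protocol_pars.max_length)
```

Capping the length guards against a run that is extended indefinitely when a tolerance cannot be met.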

gmak.api_signatures.get_last_frame(protocol_output)

Returns the path of the configuration file to be followed. It typically corresponds to the last frame of the production run.

Parameters: protocol_output (dict) – The output files of the simulation (returned by simulator())
Returns: The path of the configuration file to be followed
Return type: str
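
To show how the output dict links simulator() and get_last_frame(), here is a toy sketch in which no simulation engine is actually called. The dict keys log and last_frame, the file names and the type label toy are assumptions of this example; a real implementation would invoke the simulation software where indicated.

```python
import os
import shutil


def simulator(length, topology, coords, ext, protocol_pars, workdir):
    os.makedirs(workdir, exist_ok=True)
    output = {
        "log": os.path.join(workdir, "run.log"),
        "last_frame": os.path.join(workdir, "last_frame.gro"),
    }
    # Placeholder: a real implementation would start (ext is False) or
    # extend (ext is True) the simulation with the chosen engine here.
    action = "extend" if ext else "start"
    with open(output["log"], "a" if ext else "w") as log:
        log.write(f"{action} length={length}\n")
    if not ext:
        # Stand-in for the final configuration written by the engine.
        shutil.copyfile(coords, output["last_frame"])
    return output


def get_last_frame(protocol_output):
    # Receives the dict returned by simulator().
    return protocol_output["last_frame"]


try:
    from gmak.api import add_custom_protocol
    add_custom_protocol("toy", simulator, get_last_frame=get_last_frame)
except ImportError:
    pass  # gmak not installed: the functions can still be tested alone
```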

Properties

The customization of properties occurs on the levels of component and composite properties (see Properties). For this reason, to set up custom properties, one must actually set up custom composite and component properties. For the trivial case where the composite property is associated with a single component property, the API provides a way to automatically derive the former from the latter one.

To add a custom component-property type to the program, one should use the function add_custom_component_property(). It receives as arguments a calculator function and a boolean parameter that indicates whether the latter function returns (a) samples of the property for each simulation frame; or (b) the expected value and uncertainty of the property. In the former case, the program automatically takes care of carrying out the statistical treatment of the sampled values in order to produce the expected value and uncertainty. In the latter case, this is the responsibility of the user implementation. The calculator function is called individually for each sampled grid point and protocol after the corresponding simulations are performed.

To add a custom composite-property type to the program, one should use the function add_custom_composite_property(). It receives as argument a calculator function, which takes the expected values and uncertainties of the corresponding component properties for a particular system and grid point and returns the corresponding expected value and uncertainty of the composite property.

Note

There is no need to define a custom property type for a component property that can be extracted with the gmx energy program. A component property of type gmx_PROPNAME in the input file is automatically interpreted as one that is obtained with gmx energy with PROPNAME as the input string.

gmak.api.add_custom_component_property(type_name, component_calculator, is_timeseries)

Adds a custom component-property type to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom component property.
component_calculator (callable) – The function used to calculate the custom component property (see component_calculator())
is_timeseries (bool) – True indicates that the component property is obtained as a timeseries; False, as a tuple (EA, dEA) with the expected value and statistical uncertainty.

gmak.api.add_custom_composite_property(type_name, composite_calculator=None)

Adds a custom composite-property type to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom composite property.
composite_calculator (callable) – (optional) The calculator function (see composite_calculator()). If it is not supplied, the program implicitly assumes that the property has only one component and identifies the values and errors of the composite property with those of the component property.

gmak.api_signatures.component_calculator(topology, protocol_output, property_pars)

The function used to calculate the custom component property.

Parameters

topology (TopologyOutput) – The topology considered in the simulations
protocol_output (dict) – The output files of the simulation. For custom protocols, it is the dict returned by the simulator() function. For GROMACS-based protocols, it is the simulation-output dict described in GROMACS-compatible General Protocol.
property_pars (InputParameters) – The property input parameters defined in the input file

Returns

A tuple (EA, dEA) with the expected value and uncertainty of the property, or a list with the values of the property for each frame of the simulation. In the latter case, the program will automatically calculate the corresponding average and statistical uncertainty.

Return type

tuple or list
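
A timeseries-type calculator might look like the sketch below. It assumes a custom protocol whose simulator() returns a dict with a hypothetical key timeseries, pointing to a two-column text file (time, value) in which comment lines start with # or @. The type label my_timeseries is also illustrative.

```python
def component_calculator(topology, protocol_output, property_pars):
    samples = []
    # 'timeseries' is a hypothetical key of the simulator output dict.
    with open(protocol_output["timeseries"]) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines and comment/header lines.
            if not fields or fields[0].startswith(("#", "@")):
                continue
            samples.append(float(fields[1]))  # second column: property value
    return samples


try:
    from gmak.api import add_custom_component_property
    # is_timeseries=True: the program does the statistical treatment.
    add_custom_component_property("my_timeseries", component_calculator, True)
except ImportError:
    pass  # gmak not installed: the function can still be tested alone
```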

gmak.api_signatures.composite_calculator(values, errs, property_pars)

The function used to calculate the custom composite property.

Parameters

values (list) – The list with the expected values of each component property. Each member is a tuple (PTYPE, VALUE) where PTYPE (str) is the type of component property and VALUE (float) is the expected value.
errs (list) – The list with the uncertainties of each component property. Each member is a tuple (PTYPE, VALUE) where PTYPE (str) is the type of component property and VALUE (float) is the estimated uncertainty.
property_pars (InputParameters) – The property input parameters defined in the input file

Returns

A tuple (EA, dEA) with the expected value and uncertainty of the composite property

Return type

tuple
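
As an illustration, the sketch below combines two component properties into their difference and propagates the uncertainties in quadrature, which assumes the two components are statistically independent. The component type names gas_energy and liquid_energy and the label energy_difference are hypothetical.

```python
import math


def composite_calculator(values, errs, property_pars):
    # Index the (PTYPE, VALUE) tuples by component type. This assumes that
    # each component type appears only once in the composite property.
    val = dict(values)
    err = dict(errs)
    ea = val["gas_energy"] - val["liquid_energy"]
    # Quadrature sum of the uncertainties (independent components assumed).
    dea = math.sqrt(err["gas_energy"] ** 2 + err["liquid_energy"] ** 2)
    return ea, dea


try:
    from gmak.api import add_custom_composite_property
    add_custom_composite_property("energy_difference", composite_calculator)
except ImportError:
    pass  # gmak not installed: the function can still be tested alone
```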

Surrogate Model

The surrogate-model type defines the function used to estimate the expected values and uncertainties of a component property for all grid points based on these quantities for the sampled grid points.

To add a custom surrogate-model type to the program, one should use the function add_custom_surrogate_model().

gmak.api.add_custom_surrogate_model(type_name, compute, corners=False)

Adds a custom surrogate-model type to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom surrogate model
compute (callable) – The surrogate-model function (see compute())
corners (bool) – Indicates whether the surrogate model requires the corners of the grid to be simulated (e.g., interpolation does). Defaults to False.

gmak.api_signatures.compute(EA_s, dEA_s, I_s, gridshape, X_ki)

Computes the expected values and uncertainties of a component property for the entire grid.

Parameters

EA_s (numpy.ndarray) – A 1D array of shape (NSAMP,) with the average values of the property for each sampled point.
dEA_s (numpy.ndarray) – A 1D array of shape (NSAMP,) with the uncertainties of the property for each sampled point.
I_s (list) – A list of length NSAMP with the linear indexes of the sampled grid points.
gridshape (tuple) – A tuple with the grid dimensions.
X_ki (numpy.ndarray) – A 2D array with the parameter-space values for each grid point (more specifically, the parameters of the Main Variation). The first index is the linear index, and the second index is the coordinate of the parameter-space point. For example, X_ki[1,0] is the value of the first parameter of the Main Variation for the grid point with linear index equal to 1.

Returns

A tuple (EA_k, dEA_k) containing the estimated values and uncertainties, respectively, of the property for each grid point. EA_k and dEA_k are 1D arrays indexed by the linear index of the grid point.

Return type

tuple

Note

The function gmak.cartesiangrid.flat2tuple() can be used to convert a linear to a tuple index if desired.
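
A very simple surrogate model that fits the compute() signature is a nearest-neighbor estimate: each grid point receives the value and uncertainty of the closest sampled point in parameter space. The sketch below is one possible implementation with NumPy. It uses plain Euclidean distances with no rescaling of the parameters, which may be inadequate when they have very different magnitudes, and the label nearest is hypothetical.

```python
import numpy as np


def compute(EA_s, dEA_s, I_s, gridshape, X_ki):
    EA_s = np.asarray(EA_s, dtype=float)
    dEA_s = np.asarray(dEA_s, dtype=float)
    X_ki = np.asarray(X_ki, dtype=float)
    # Parameter-space coordinates of the sampled points: (NSAMP, NDIM).
    X_s = X_ki[list(I_s)]
    # Squared distances between all grid points and all sampled points:
    # shape (NGRID, NSAMP).
    d2 = ((X_ki[:, None, :] - X_s[None, :, :]) ** 2).sum(axis=2)
    # Position (within the sampled set) of the closest sampled point.
    nearest = d2.argmin(axis=1)
    return EA_s[nearest], dEA_s[nearest]


try:
    from gmak.api import add_custom_surrogate_model
    add_custom_surrogate_model("nearest", compute, corners=False)
except ImportError:
    pass  # gmak not installed: the function can still be tested alone
```

Sampled points are their own nearest neighbors, so their simulated values are reproduced exactly.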

Score Function

The score-function type defines how the estimates of the values and uncertainties of the composite properties are used to calculate a score for the grid point.

To add a custom score-function type, one should use the function add_custom_score().

gmak.api.add_custom_score(type_name, calc_score, calc_score_err=None)

Adds a custom score-function type to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom score function
calc_score (callable) – The score function (see calc_score())
calc_score_err (callable) – (optional) The score uncertainty function (see calc_score_err()). Defaults to None, which means that the uncertainties are not calculated.

gmak.api_signatures.calc_score(estimates, errors, weis, refs)

Calculates the score for a grid point.

Parameters

estimates (dict) – The dictionary of composite-property estimates for a grid point. The keys are the property names. Each key maps to the expected value of the property at a given grid point.
errors (dict) – The dictionary of composite-property uncertainties for a grid point. The keys are the property names. Each key maps to the value of the uncertainty of the property at a given grid point.
weis (dict) – The dictionary of composite-property weights for a grid point. The keys are the property names. Each key maps to the weight of the property in the optimization.
refs (dict) – The dictionary of reference values of the composite properties for a grid point. The keys are the property names. Each key maps to the reference value of the property in the optimization.

Returns

The value of the score

Return type

float

gmak.api_signatures.calc_score_err(estimates, errors, weis, refs)

Calculates the score uncertainty for a grid point.

Parameters

estimates (dict) – The dictionary of composite-property estimates for a grid point. The keys are the property names. Each key maps to the expected value of the property at a given grid point.
errors (dict) – The dictionary of composite-property uncertainties for a grid point. The keys are the property names. Each key maps to the value of the uncertainty of the property at a given grid point.
weis (dict) – The dictionary of composite-property weights for a grid point. The keys are the property names. Each key maps to the weight of the property in the optimization.
refs (dict) – The dictionary of reference values of the composite properties for a grid point. The keys are the property names. Each key maps to the reference value of the property in the optimization.

Returns

A tuple (MIN, MAX) that represents a confidence interval for the score

Return type

tuple
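
One possible pair of score functions is sketched below: a weighted root-mean-square deviation from the reference values, and a crude interval obtained by shifting every estimate by its uncertainty towards and away from its reference. It assumes that each entry of refs is a single number and that the weights do not sum to zero. The label wrmsd is hypothetical.

```python
import math


def calc_score(estimates, errors, weis, refs):
    # Weighted root-mean-square deviation from the reference values.
    wsum = sum(weis[p] for p in estimates)
    acc = sum(weis[p] * (estimates[p] - refs[p]) ** 2 for p in estimates)
    return math.sqrt(acc / wsum)


def calc_score_err(estimates, errors, weis, refs):
    # Crude bounds: move every estimate by its uncertainty towards the
    # reference (best case) and away from it (worst case).
    best, worst = {}, {}
    for p in estimates:
        dev = abs(estimates[p] - refs[p])
        err = abs(errors[p])
        best[p] = refs[p] + max(dev - err, 0.0)
        worst[p] = refs[p] + dev + err
    return (calc_score(best, errors, weis, refs),
            calc_score(worst, errors, weis, refs))


try:
    from gmak.api import add_custom_score
    add_custom_score("wrmsd", calc_score, calc_score_err)
except ImportError:
    pass  # gmak not installed: the functions can still be tested alone
```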

Grid Shifting

The grid-shifting type defines how the new origin of the grid is calculated in the grid-shifting procedure.

To add a custom grid-shifting type, one should use the function add_custom_gridshifter().

gmak.api.add_custom_gridshifter(type_name, calculator)

Adds a custom grid-shifting procedure to the program. In the input file, it can be referenced with the type type_name.

Parameters

type_name (str) – Name of the type of the custom grid-shifting procedure.
calculator (callable) – The custom grid-shifting function (see calculator()).

gmak.api_signatures.calculator(tuple_indexes, scores, propnames, averages, uncertainties)

The grid-shifting procedure.

Parameters

tuple_indexes (list) – The tuple indexes of all grid points, ordered by linear index.
scores (list) – A list with the scores of the grid points, ordered by linear index.
propnames (list) – The names of the composite properties.
averages (numpy.ndarray) – A 2D array with the estimated values of the composite properties for all grid points. The first index is the linear index of the grid point. The second index corresponds to the property index in propnames.
uncertainties (numpy.ndarray) – A 2D array with the estimated uncertainties of the composite properties for all grid points. The first index is the linear index of the grid point. The second index corresponds to the property index in propnames.

Returns

The shifting tuple, or None to complete the run. For example, a return value of (-5, 5) indicates that the origin is shifted by -5 grid cells in the first coordinate and +5 grid cells in the second.

Return type

tuple or None
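
A simple grid-shifting rule is to recenter the grid on the best-scoring point and to stop once that point is already at the center. The sketch below implements this under two assumptions: lower scores are better, and shifting the origin by the returned tuple moves the grid so that the best point ends up at its center. The label center_on_best is hypothetical.

```python
def calculator(tuple_indexes, scores, propnames, averages, uncertainties):
    # Linear index of the best grid point (lowest score assumed best).
    best = min(range(len(scores)), key=lambda k: scores[k])
    best_idx = tuple_indexes[best]
    ndim = len(best_idx)
    # Grid dimensions recovered from the largest index along each axis.
    sizes = [max(t[d] for t in tuple_indexes) + 1 for d in range(ndim)]
    center = [(n - 1) // 2 for n in sizes]
    shift = tuple(int(b - c) for b, c in zip(best_idx, center))
    # A zero shift means the best point is already central: finish the run.
    if all(s == 0 for s in shift):
        return None
    return shift


try:
    from gmak.api import add_custom_gridshifter
    add_custom_gridshifter("center_on_best", calculator)
except ImportError:
    pass  # gmak not installed: the function can still be tested alone
```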

Logging

class gmak.logger.Logger(fn, indentString=' ')

indent(number=1)

Indents the stream by a certain number of indentation strings.

Parameters: number (int) – The number of indentation strings (default is 1).

putMessage(msg, dated=True)

Writes a message to the log-file stream with proper indentation.

Parameters

msg (str) – The message.
dated (bool) – If True, write a timestamp at the end of the message.

unindent(number=1)

Unindents the stream by a certain number of indentation strings.

Parameters: number (int) – The number of indentation strings (default is 1).

gmak.logger.globalLogger = <gmak.logger.Logger object>: The global Logger instance that has access to the log-file stream.

Tip

In any customized function, one can request that a message is written in the log file, at runtime, with:

from gmak.logger import globalLogger as log
log.putMessage("Write this message!")

Utility Functions and Classes

gmak.cartesiangrid.flat2tuple(gridshape, idx)

For a grid with dimensions gridshape, convert a linear index idx to a tuple index.

Parameters

gridshape (tuple) – A tuple with the grid dimensions.
idx (int) – The linear index to be converted.

Returns

The tuple index corresponding to the linear index idx.

Return type

tuple

class gmak.custom_attributes.CustomizableAttributesMixin.InputParameters

An instance of this class stores the input parameters supplied in a block of the input file as attributes. Input parameters that correspond to a single value token are stored as objects with type appropriate to the token value. Input parameters that correspond to a list of multiple value tokens are stored as a list of objects, each one processed as above.

For example, the block

$coords
name liquid
type gmx_liquid
coords molecule.gro
nmols 1500
box 3.0 3.0 3.0
$end

corresponds to an object coords_attr for which

>>> coords_attr.name
'liquid'
>>> coords_attr.type
'gmx_liquid'
>>> coords_attr.coords
'molecule.gro'
>>> coords_attr.nmols
1500
>>> coords_attr.box
[3.0, 3.0, 3.0]

Note the automatic type conversion for the parameters nmols and box, from string and list of string to int and list of float, respectively.

Note

Since the parameter is used as the name of an instance attribute, the use of only lowercase characters and underscores for custom parameters is highly recommended to avoid run-time errors.

class gmak.systems.TopologyOutput: This class encodes a topology. It is intended to be used as a generalization of the path of the topology file, when more flexibility is needed in constructing it. It is essentially a container for data that the user wants to transfer around and that is identified by the program as representing the topology of a system. The way to store data in objects of this type is via instance attributes, which can be freely set by the user. These objects are created individually for each system, grid-shift iteration and grid point (General Workflow).

class gmak.gmx_system.GmxTopologyOutput(workdir, name, grid, state)

Bases: gmak.systems.TopologyOutput

Automatically sets a topology-file path with extension .top based on the directory path workdir, the system name name, the grid-shift iteration grid and the grid-point linear index state.

property path: The path of the topology file.

class gmak.interaction_parameter.InteractionParameter

The class representing an interaction parameter that is set by the program. It contains its type and value and, when adequate, the types of the particles involved in the interaction.

property name

The string that identifies the interaction parameter. It is the same one used in the $variation block of the input file.

Type: str

property particles

The particles involved in the interaction

Type: None, InteractionAtom or InteractionPair

property type

The type of interaction parameter

Type: InteractionParameterType

property value

The value of the interaction parameter

Type: float

class gmak.interaction_parameter.InteractionParameterType(value)

Bases: enum.Enum

An enumeration that identifies the type of interaction parameter.

MacroParameter = 1: The type assigned to parameters of which the name starts with the @ character. For the GROMACS-based system type, it identifies parameters that are applied to the topology file via simple textual replacement. For custom system types, this has no special behavior.

LJ_V = 2: The type assigned to $\sigma$ or $C_6$ Lennard-Jones parameters (see Lennard-Jones parameters for naming conventions).

LJ_W = 3: The type assigned to $\epsilon$ or $C_{12}$ Lennard-Jones parameters (see Lennard-Jones parameters for naming conventions).

LJ_14_V = 4: The type assigned to $\sigma$ or $C_6$ Lennard-Jones parameters when applied to 1,4 (third-neighbor) pairs (see Lennard-Jones parameters for naming conventions).

LJ_14_W = 5: The type assigned to $\epsilon$ or $C_{12}$ Lennard-Jones parameters when applied to 1,4 (third-neighbor) pairs (see Lennard-Jones parameters for naming conventions).

CustomParameter = 6: The type assigned to parameters which are not of the types above. For the GROMACS-based system type, parameters with this type are not handled automatically and require that the user add_gmx_custom_parameter_writer().

class gmak.interaction_parameter.InteractionAtom(name, pairs_include=None, pairs_exclude=None)

The class representing an interaction atom.

property name

The name of the atom type. It is set automatically from the name of the corresponding InteractionParameter.

Type: str

property pairs_exclude

The regular expression controlling the standard pairtypes that are not affected by the atom.

Type: str

property pairs_include

The regular expression controlling the standard pairtypes affected by the atom.

Type: str

class gmak.interaction_parameter.InteractionPair(name_i, name_j)

The class representing an interaction pair.

Two pairs can be compared for equality regardless of the order of their atom types:

>>> InteractionPair("CH3", "CH1") == InteractionPair("CH1", "CH3")
True
>>> InteractionPair("CH3", "CH1") == InteractionPair("OA", "CH3")
False

derives_from_atom(atom)

Verifies if the pair derives from an atom based on the properties pairs_include and pairs_exclude of the latter.

Parameters: atom (InteractionAtom) – The atom from which the pair may derive
Returns: True if the pair derives from the atom; False otherwise.
Return type: bool

property name_i

The name of the first atom type of the pair. It is set automatically from the name of the corresponding InteractionParameter.

Type: str

property name_j

The name of the second atom type of the pair. It is set automatically from the name of the corresponding InteractionParameter.

Type: str

to_tuple()

Returns the component atom types as a tuple.

Return type: tuple

>>> ip = InteractionPair("CH3", "CH1")
>>> ip.to_tuple()
('CH3', 'CH1')
>>> "CH1" in ip.to_tuple()
True
>>> "CH3" in ip.to_tuple()
True
>>> "OW" in ip.to_tuple()
False