Output Files
Basic file structure
The file structure below depicts the output files of a gmak job.
Tokens that start with the % character may represent a single
value (e.g., %workdir) or several possible values (e.g.,
%protocol), each yielding a separate file or directory. An
asterisk marks files that are created only under special
circumstances. The tokens and files are explained in more detail
below.
%workdir
├── %system
│   ├── ...
├── gmak_%jobid.log
├── state_%jobid.bin
├── grid_%iter
│   ├── %protocol
│   │   ├── %surrogatemodel-in-protocol
│   │   │   └── estimated_properties
│   │   │       ├── %componentprop-in-surrogatemodel_dEA_k.dat
│   │   │       ├── %componentprop-in-surrogatemodel_EA_k.dat
│   │   └── simu
│   │       ├── %sampledgridpoint
│   │       │   ├── %componentprop-in-protocol.xvg
│   │       │   ├── *filtered_%componentprop-in-protocol.xvg
│   │       │   └── %coords-in-protocol
│   ├── parameters_%variation.dat
│   ├── samples_0.dat
│   └── step_0
│       ├── %compositeprop_EA_k.dat
│       ├── %compositeprop_dEA_k.dat
│       ├── %compositeprop_diff.dat
│       ├── full_data.dat
│       ├── *full_data.dat.cis
│       ├── optimizer_data.dat
│       └── *optimizer_data.dat.cis
Tokens
%workdir: The name of the work directory of the job (the workdir parameter in the input file).
%system: The name of a system (defined in a system block of the input file). In the file structure, it corresponds to a directory that contains the topology files of the corresponding system for all grid-shift iterations and grid points.
%jobid: The PID of the gmak job.
%iter: A grid-shift iteration.
%protocol: The name of a protocol (defined in a protocol block of the input file).
%surrogatemodel-in-protocol: The name of a surrogate model associated with a protocol, i.e., used to calculate at least one component property associated with the protocol (defined in a compute block of the input file).
%componentprop-in-surrogatemodel: The name of a component property associated with a surrogate model (defined in a compute block of the input file).
%sampledgridpoint: The linear index of a sampled grid point.
%componentprop-in-protocol: The name of a component property associated with a protocol.
%coords-in-protocol: The path of the initial coordinates associated with a protocol (defined in a coordinates block of the input file).
%compositeprop: The name of a composite property (defined in a compute block of the input file).
%variation: The name of a variation (defined in a variation block of the input file).
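The tokens map directly onto path components, so the location of a given output file can be assembled from their values. The Python sketch below is only an illustration: the helper names (grid_dir, simulation_dir, composite_property_files) are hypothetical and not part of gmak, and the token values in the example are made up.

```python
from pathlib import Path


def grid_dir(workdir, iteration):
    """Directory of one grid-shift iteration: %workdir/grid_%iter."""
    return Path(workdir) / f"grid_{iteration}"


def simulation_dir(workdir, iteration, protocol, gridpoint):
    """Directory %workdir/grid_%iter/%protocol/simu/%sampledgridpoint."""
    return grid_dir(workdir, iteration) / protocol / "simu" / str(gridpoint)


def composite_property_files(workdir, iteration, prop):
    """Paths of the three per-property files written under step_0."""
    step = grid_dir(workdir, iteration) / "step_0"
    return {
        "expected": step / f"{prop}_EA_k.dat",
        "uncertainty": step / f"{prop}_dEA_k.dat",
        "difference": step / f"{prop}_diff.dat",
    }


# Made-up token values, for illustration only.
print(simulation_dir("run", 0, "liquid", 12))
print(composite_property_files("run", 0, "density")["expected"])
```

Building paths this way only encodes the layout shown in the file structure; it does not check that the files exist.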
Files
gmak_%jobid.log: The log file of the job.
state_%jobid.bin: The state of the job, stored as a binary file. It contains data that can be used to restart a run or to perform post-processing analysis.
%componentprop-in-surrogatemodel_dEA_k.dat: The statistical uncertainties of the component property for all grid points, as estimated by the surrogate model.
%componentprop-in-surrogatemodel_EA_k.dat: The expected values of the component property for all grid points, as estimated by the surrogate model.
%componentprop-in-protocol.xvg: The values of the component property, as estimated from the protocol simulations. This can be a timeseries or a file containing the expected value and uncertainty of the property.
*filtered_%componentprop-in-protocol.xvg: If the component property is timeseries-based, this file contains the subsampled data obtained after statistical processing to remove autocorrelation.
%coords-in-protocol: The path of the initial coordinates associated with a protocol (defined in a coordinates block of the input file).
parameters_%variation.dat: The values of the force-field parameters of the variation for each grid point.
samples_0.dat: The linear indexes of the sampled grid points. The suffix _0 has no special meaning in the current version of the program.
%compositeprop_EA_k.dat: The expected values of the composite property for all grid points.
%compositeprop_dEA_k.dat: The uncertainties in the estimate of the composite property for all grid points.
%compositeprop_diff.dat: The difference between the expected values and the reference value of the composite property for all grid points.
full_data.dat: A summary file containing the expected values and uncertainties of all composite properties for all grid points, as well as the value of the score function.
*full_data.dat.cis: If a score-uncertainty function is provided, this file contains the corresponding confidence intervals of the score for all grid points.
optimizer_data.dat: The summary file full_data.dat, ordered from smallest to largest value of the score function.
*optimizer_data.dat.cis: The file full_data.dat.cis, ordered from smallest to largest value of the score function.
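Because the files marked with an asterisk are written only under special circumstances, a post-processing script should probably check for them rather than assume they exist. A minimal Python sketch, assuming the step_0 layout shown in the file structure above; the function name optional_step_files is hypothetical and not part of gmak:

```python
from pathlib import Path

# Files in step_0 that are created only under special circumstances.
OPTIONAL_STEP_FILES = ("full_data.dat.cis", "optimizer_data.dat.cis")


def optional_step_files(step_dir):
    """Map each optional file name to True if it exists in step_dir."""
    step_dir = Path(step_dir)
    return {name: (step_dir / name).is_file() for name in OPTIONAL_STEP_FILES}
```

A caller could, for example, skip the confidence-interval analysis whenever the entry for full_data.dat.cis is False.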
GROMACS-compatible Systems
For GROMACS-compatible systems, the %system directory contains
files %system_%iter_%sampledgridpoint.top, where the tokens
%system, %iter and %sampledgridpoint are explained above.
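When scanning a %system directory, the topology file names can be split back into their tokens. The sketch below is an illustration under stated assumptions: it assumes that %iter and %sampledgridpoint are written as non-negative decimal integers, and the function name parse_topology_name is hypothetical.

```python
import re

# Pattern for %system_%iter_%sampledgridpoint.top. The system name is matched
# greedily, so it may itself contain underscores.
# Assumption: %iter and %sampledgridpoint are decimal integers.
_TOP_NAME = re.compile(r"^(?P<system>.+)_(?P<iter>\d+)_(?P<gridpoint>\d+)\.top$")


def parse_topology_name(filename):
    """Return (system, iteration, grid point), or None if the name does not match."""
    match = _TOP_NAME.match(filename)
    if match is None:
        return None
    return match["system"], int(match["iter"]), int(match["gridpoint"])
```

Anchoring the two integers at the end of the name keeps the parse unambiguous unless a system name itself ends in digits separated by underscores.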
GROMACS-compatible General protocol
For the GROMACS-compatible general protocol, the %sampledgridpoint
directories contain additional directories corresponding to each
simulation in the sequence of simulations of the protocol. These
directories are identified by an index, %stage, and each one
contains the output files of the simulations, e.g., %stage.tpr,
%stage.xtc, %stage.edr, etc.
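The per-stage layout can be inspected with a short script. The following Python sketch rests on two assumptions that the text does not confirm: that each %stage directory is named by a non-negative integer index, and that the files of interest are those whose name without extension equals that index. The function name stage_outputs is hypothetical.

```python
from pathlib import Path


def stage_outputs(gridpoint_dir):
    """Map each stage index (as a string) to the sorted names of the files
    in that stage directory whose stem equals the index (e.g. 0.tpr, 0.edr)."""
    # Assumption: stage directories are named by an integer index.
    stage_dirs = [
        d for d in Path(gridpoint_dir).iterdir()
        if d.is_dir() and d.name.isdigit()
    ]
    result = {}
    for stage_dir in sorted(stage_dirs, key=lambda d: int(d.name)):
        stage = stage_dir.name
        result[stage] = sorted(
            f.name for f in stage_dir.iterdir()
            if f.is_file() and f.stem == stage
        )
    return result
```

Files that sit directly in the grid-point directory, such as the .xvg property files, are ignored because only subdirectories are visited.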
GROMACS-compatible Alchemical protocol
The GROMACS-compatible Alchemical protocol implements a sequence of
GROMACS-compatible General (sub)protocols, one for each state in the
alchemical transformation. Each of these subprotocols corresponds to
a protocol directory named %protocol-%state, where %protocol is the
name of the alchemical protocol and %state is the corresponding
state of the alchemical transformation.
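Given the %protocol-%state naming rule, the directories that belong to one alchemical protocol can be selected by prefix. A minimal Python sketch; the function name alchemical_state_dirs and the example names are hypothetical, and the state label is kept as a string because its exact form is not specified here.

```python
def alchemical_state_dirs(dir_names, protocol):
    """Map each %state label to its directory name for one alchemical protocol.

    dir_names is any iterable of directory names found under grid_%iter.
    """
    prefix = f"{protocol}-"
    return {
        name[len(prefix):]: name
        for name in dir_names
        if name.startswith(prefix) and len(name) > len(prefix)
    }


# Made-up names, for illustration only.
print(alchemical_state_dirs(["solv-0", "solv-1", "liquid"], "solv"))
```

Prefix matching would also pick up an unrelated protocol whose own name starts with the alchemical protocol name followed by a hyphen, so the result should be checked against the protocols declared in the input file.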
Log file
The progress of the job is reported in real time in the log file. It records the beginning and end of all simulations, the results of checking whether they need to be extended, the extended lengths (if any), the result of the grid-shifting procedure, and, for some surrogate models, complementary information about the fitting of the model.
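Since the log is written while the job runs, it can be useful to locate it and look at its most recent lines. The Python sketch below relies only on the gmak_%jobid.log naming shown in the file structure; the function names find_logs and tail are hypothetical, and nothing is assumed about the format of the log lines themselves.

```python
import re
from pathlib import Path

_LOG_NAME = re.compile(r"^gmak_(?P<jobid>.+)\.log$")


def find_logs(workdir):
    """Return {jobid: path} for every gmak_%jobid.log file in workdir."""
    logs = {}
    for path in Path(workdir).glob("gmak_*.log"):
        match = _LOG_NAME.match(path.name)
        if match and path.is_file():
            logs[match["jobid"]] = path
    return logs


def tail(path, n=10):
    """Return the last n lines of a text file (reads the whole file)."""
    if n <= 0:
        return []
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()[-n:]
```

For example, tail(find_logs("run")["1234"]) would return the last ten lines of the log of the job with PID 1234, where both the directory name and the PID are made up.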
Stdout and Stderr
In general, gmak itself does not write to the stdout and stderr
streams, except for error messages. Logging is done entirely in the
log file, and output data is written to the output files. Other
processes started by the program (e.g., GROMACS binaries), however,
may write to those streams.