Core of runs (core package)

CoreRun

This class is responsible for a single optimization run of GADMA.

class gadma.core.core_run.CoreRun(index, shared_dict, settings)

Bases: object

Class of the main run in GADMA. Its run() method starts the launch. A run creates a new directory named after its index in the output directory; all logs, code and pictures are saved there. AIC and CLAIC are calculated here.

Parameters
  • index (int) – Index of the run; serves as its id.

  • shared_dict (gadma.core.shared_dict.SharedDictForCoreRun) – Dictionary in which callbacks save results. Results are saved under a key equal to index. Used for cooperation between processes.

  • settings (gadma.cli.settings_storage.SettingsStorage) – Settings of the run. The output directory and other run information are taken from these settings.

EVAL_FILENAME = 'eval_file'
REPORT_FILENAME = 'GADMA_GA.log'
SAVE_FILENAME = 'save_file'
base_callback(x, y)

Base callback:

  1. Updates values of best solution in shared_dict.

  2. If a new best value is received, draws the model and generates its code in the output_dir of this run.

Parameters
  • x – Vector of values for model parameters.

  • y – Value of the log-likelihood for these values.

callback(x, y)

Main callback passed to optimizers. It is a combination of three callbacks:

  1. base_callback()

  2. draw_iter_callback()

  3. code_iter_callback()

Parameters
  • x – Vector of values for model parameters.

  • y – Value of the log-likelihood for these values.
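
How the three callbacks might be combined can be sketched as follows. This is a minimal illustration, not GADMA's implementation: the class CallbackSketch, its iteration counter and the events list are hypothetical, and drawing and code generation are only recorded, not performed. Only the callback names and the two "every n iteration" settings come from the descriptions in this section.

```python
class CallbackSketch:
    """Hypothetical stand-in for the callback wiring of a run."""

    def __init__(self, draw_every_n, code_every_n):
        # Stand-ins for settings.draw_models_every_n_iteration and
        # settings.print_models_code_every_n_iteration.
        self.draw_every_n = draw_every_n
        self.code_every_n = code_every_n
        self.iteration = 0
        self.best = None   # (x, y) of the best solution seen so far
        self.events = []   # records what would be drawn or generated

    def base_callback(self, x, y):
        # Keep the solution with the highest log-likelihood.
        if self.best is None or y > self.best[1]:
            self.best = (list(x), y)
            self.events.append(("new_best", self.iteration))

    def draw_iter_callback(self, x, y):
        if self.draw_every_n and self.iteration % self.draw_every_n == 0:
            self.events.append(("draw", self.iteration))

    def code_iter_callback(self, x, y):
        if self.code_every_n and self.iteration % self.code_every_n == 0:
            self.events.append(("code", self.iteration))

    def callback(self, x, y):
        # Combination of the three callbacks, called once per iteration.
        self.iteration += 1
        self.base_callback(x, y)
        self.draw_iter_callback(x, y)
        self.code_iter_callback(x, y)


run = CallbackSketch(draw_every_n=2, code_every_n=3)
for y in [-120.0, -100.0, -110.0]:
    run.callback([1.0, 0.5], y)
```

After the three calls, run.events holds two "new_best" records (iterations 1 and 2), one "draw" record (iteration 2) and one "code" record (iteration 3).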

code_iter_callback(x, y)

Generates code for the best model of the current iteration in the code directory. The code directory is located in the output directory of this run. Generation happens every self.settings.print_models_code_every_n_iteration iterations.

Parameters
  • x – Vector of values for model parameters.

  • y – Value of the log-likelihood for these values.

property data

Returns current data.

draw_iter_callback(x, y)

Draws the best model of the current iteration in the pictures directory. The pictures directory is located in the output directory of this run. Drawing happens every self.settings.draw_models_every_n_iteration iterations.

Parameters
  • x – Vector of values for model parameters.

  • y – Value of the log-likelihood for these values.

draw_model_in_output_dir(x, y, best_by='log-likelihood', final=True)

Draws a picture of the demographic model with parameters x in the output directory of the run.

Parameters
  • x – Vector of values for model parameters.

  • y – Value of the log-likelihood for these values.

  • best_by – The function (log-likelihood, AIC, CLAIC) by which this solution is best.

  • final – If True, the solution is final and is saved under the final name.

get_run_options()

Returns an iterator over run options. Each element has four options:

  1. File to restore the optimization from. If None, nothing is resumed.

  2. Structure of the demographic model for the run.

  3. Bool points_only: if True, the resumed run uses the old points only as initial points and the optimization starts from the beginning.

  4. Function of x transformation, for the case when restored vectors must be transformed before they are used in the optimization.

get_save_file()

Returns the filename in which to save the optimization run. If the demographic model has no structure, returns self.save_file; otherwise appends a suffix describing the structure to self.save_file.

intermediate_callback(x, y)

Almost final callback that is called after each global + local optimization.

Saves AIC and CLAIC values in the shared_dict if needed. If the vector is the new best by AIC or CLAIC, code and a picture are generated.

Parameters
  • x – Vector of values for model parameters.

  • y – Value of the log-likelihood for these values.
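
As an illustration of the "new best by AIC" check, the sketch below uses the standard definition AIC = 2k - 2 ln L, where k is the number of model parameters; lower values are better. Whether GADMA computes AIC exactly this way is not stated here, and the names aic, best_aic and history are hypothetical. CLAIC is not covered.

```python
def aic(log_likelihood, n_params):
    """Standard Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


best_aic = None
history = []
# Two hypothetical (x, y) results: parameter vector and log-likelihood.
for x, y in [([1.0, 0.5], -100.0), ([1.0, 0.5, 0.1, 2.0], -99.5)]:
    value = aic(y, len(x))
    if best_aic is None or value < best_aic:
        best_aic = value
        # At this point code and a picture would be generated.
        history.append((x, value))
```

The second result has a slightly higher log-likelihood but two more parameters, so its AIC (207.0) is worse than the first one (204.0) and it does not become the new best.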

property model

Returns current demographic model.

run(initial_kwargs={})

Main method of the class to run optimization. Runs run_without_increase() or run_with_increase() according to settings.

run_with_increase(initial_kwargs={})

Runs a launch with an increasing demographic model structure. The structure of the model is increased up to the final structure, then the final solution is returned. Calls run_without_increase() and increase_structure() in a loop.

Parameters

initial_kwargs – Initial kwargs for optimization.
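
The loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not GADMA's code: structures are represented as hypothetical lists of integers, the initial structure is assumed not to exceed the final one, and optimize and increase are placeholder functions standing in for run_without_increase() and increase_structure().

```python
def run_with_increase_sketch(initial, final, optimize, increase):
    """Optimize, then grow the structure, until the final structure is reached."""
    structure = list(initial)
    result = optimize(structure)                # role of run_without_increase()
    while structure != list(final):
        structure = increase(structure, final)  # role of increase_structure()
        result = optimize(structure)
    return structure, result


visited = []


def optimize(structure):
    # Placeholder for one global + local optimization.
    visited.append(tuple(structure))
    return -100.0 + sum(structure)


def increase(structure, final):
    # Add one to the first entry that is still below its final value.
    grown = list(structure)
    for i, (cur, end) in enumerate(zip(grown, final)):
        if cur < end:
            grown[i] += 1
            break
    return grown


structure, result = run_with_increase_sketch([1, 1], [2, 2], optimize, increase)
```

Here the optimization is run three times, for the structures (1, 1), (2, 1) and (2, 2), and the result of the last run is returned.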

run_without_increase(initial_kwargs={})

Runs one launch without any increase of the demographic model structure. Performs one global + local optimization for the current model.

Parameters

initial_kwargs – Initial kwargs for optimization.

SharedDict

class gadma.core.shared_dict.SharedDict(multiprocessing=True)

Bases: object

Wrapper class around multiprocessing.Manager.dict that can be used in multiprocessing applications. It serves as shared memory for the processes that GADMA runs in parallel.

All models in this dict are divided among several processes and several groups. When a new model is added, it is assigned its own process and group. When the dict returns models, it sorts them by a value defined by a key function, which can be specific to each group.

Parameters

multiprocessing – If False then a regular dict is used internally.
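
The idea of a dict keyed by (process, group) can be sketched as follows. This is a hypothetical, much simplified illustration and not the actual class: MiniSharedDict, its attributes, the best-first ordering and the (name, value) model tuples are all assumptions, and the key function is required here instead of falling back to default_key().

```python
import multiprocessing as mp


class MiniSharedDict:
    """Hypothetical, simplified illustration of the idea behind SharedDict."""

    def __init__(self, multiprocessing=True):
        if multiprocessing:
            # A Manager dict proxy can be shared between processes.
            self.manager = mp.Manager()
            self.dict = self.manager.dict()
        else:
            self.dict = {}

    def add_model_for_process(self, process, group, model, key):
        models = list(self.dict.get((process, group), []))
        models.append(model)
        models.sort(key=key, reverse=True)  # best model first
        # Reassign the list: in-place changes to a value stored in a
        # Manager dict proxy are not propagated.
        self.dict[(process, group)] = models

    def update_best_model_for_process(self, process, group, model, key):
        current = self.dict.get((process, group), [])
        if not current or key(model) > key(current[0]):
            self.dict[(process, group)] = [model]

    def get_best_model_in_group(self, group, key):
        candidates = [models[0] for (proc, grp), models in self.dict.items()
                      if grp == group and models]
        return max(candidates, key=key) if candidates else None


shared = MiniSharedDict(multiprocessing=False)
key = lambda model: model[1]  # compare models by their second element
shared.update_best_model_for_process(0, "log-likelihood", ("a", -120.0), key)
shared.update_best_model_for_process(0, "log-likelihood", ("b", -100.0), key)
shared.update_best_model_for_process(1, "log-likelihood", ("c", -110.0), key)
best = shared.get_best_model_in_group("log-likelihood", key)
```

With multiprocessing=False the sketch runs in a single process, which is convenient for experimenting; best is then the model ("b", -100.0).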

add_model_for_process(process, group, model, key=None)

Adds an additional model for a process. After this call the dict holds a list of models under (process, group).

Parameters
  • process – Name of process.

  • group – Name of model group.

  • model – Model.

  • key – Key function; if None then default_key() is used.

default_key(group)

Returns the key function used for sorting. Sorted elements are compared by the value that this function returns for each element.

Parameters

group – Name of group of models.

get_available_groups()

Returns all available groups across processes.

get_best_model_for_process_in_group(process, group, key=None)

Returns best model for process, group.

get_best_model_in_group(group, key=None)

Returns best model for group.

get_models_for_process_in_group(process, group, key=None)

Returns models for (process, group).

get_models_in_group(group, key=None)

Returns dict of sorted pairs (process, model) for group (across all processes).

static get_value(model, key)

Returns value of key from model.

Parameters
  • model – Model.

  • key – Key function.

update_best_model_for_process(process, group, model, key=None)

Updates the best model for a process. Models are compared by the value of the key function: if the new model has a greater value, it is saved in the dictionary as the best for this process in this group.

Parameters
  • process – Name of process.

  • group – Name of model group.

  • model – Model.

  • key – Key function; if None then default_key() is used.

class gadma.core.shared_dict.SharedDictForCoreRun(multiprocessing=True)

Bases: gadma.core.shared_dict.SharedDict

Class for shared dict in gadma.core.core_run.CoreRun.

Process is the name of the process or the index of the CoreRun. Group is the name of the fitness function: log-likelihood, AIC, CLAIC. Model is a tuple of demographic model, engine and fitness for this engine.

add_model_for_process(process, group, engine, x, y)

Adds an additional model for a process. After this call the dict holds a list of models under (process, group).

Parameters
  • process – Name of process.

  • group – Name of model group.

  • model – Model.

  • key – Key function; if None then default_key() is used.

construct_model(group, engine, x, y)

Constructs a model for group, engine, x and y. The model is a tuple (engine, x, fitness), where engine contains the demographic model and fitness is a dict with the available fitnesses (logLL, AIC, CLAIC).

Parameters
  • group – Name of fitness (log-likelihood, AIC, CLAIC)

  • engine – Engine with dem. model and data.

  • x – Values of dem. model parameters.

  • y – Value of fitness defined by group.

default_key(group)

For groups other than log-likelihood the sort must be reversed, so the key function multiplies the fitness by -1. The fitness can also be a dict holding values for several groups (for example, the group is log-likelihood but the fitness also contains an AIC or CLAIC value), so this function extracts the correct fitness.

Parameters

group – Name of fitness function (log-likelihood, AIC, CLAIC).
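
A key function with this behaviour can be sketched as below. The sketch assumes that models are stored as (engine, x, fitness) tuples and that a fitness dict is keyed by the group names listed above; the function name default_key_sketch and the string engines are hypothetical placeholders.

```python
def default_key_sketch(group):
    """Return a sort key for models stored as (engine, x, fitness)."""
    # Larger is better for log-likelihood; smaller is better otherwise.
    sign = 1 if group == "log-likelihood" else -1

    def key(model):
        engine, x, fitness = model
        if isinstance(fitness, dict):
            # Extract the value that belongs to the requested group.
            fitness = fitness[group]
        return sign * fitness

    return key


models = [
    ("engine_a", [1.0], {"log-likelihood": -100.0, "AIC": 204.0}),
    ("engine_b", [2.0], {"log-likelihood": -99.5, "AIC": 207.0}),
]
best_by_ll = max(models, key=default_key_sketch("log-likelihood"))
best_by_aic = max(models, key=default_key_sketch("AIC"))
```

Because of the sign flip, the same "greater key is better" comparison selects engine_b by log-likelihood and engine_a by AIC.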

get_models_in_group(group, key=None, align_y_dict=False)

Returns models for specified group.

Parameters
  • group – Name of fitness (log-likelihood, AIC, CLAIC).

  • key – Key function.

  • align_y_dict – If True, every fitness is a dict with keys for all available groups, with None where the value for a group is not available.
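
The effect of align_y_dict can be illustrated with a small sketch. It assumes that every fitness is already a dict keyed by group name; the helper name align_fitness and the string engines are hypothetical.

```python
def align_fitness(models):
    """Give every fitness dict the same keys, filling gaps with None."""
    groups = []
    for _, _, fitness in models:
        for group in fitness:
            if group not in groups:
                groups.append(group)
    return [(engine, x, {g: fitness.get(g) for g in groups})
            for engine, x, fitness in models]


models = [
    ("engine_a", [1.0], {"log-likelihood": -100.0}),
    ("engine_b", [2.0], {"log-likelihood": -99.5, "AIC": 207.0}),
]
aligned = align_fitness(models)
```

After alignment the first model carries {"log-likelihood": -100.0, "AIC": None}, so all fitness dicts can be printed or compared column by column.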

update_best_model_for_process(process, group, engine, x, y)

Updates the best model for a process. Models are compared by the value of the key function: if the new model has a greater value, it is saved in the dictionary as the best for this process in this group.

Parameters
  • process – Name of process.

  • group – Name of model group.

  • model – Model.

  • key – Key function; if None then default_key() is used.

Core

gadma.core.core.job(index, shared_dict, settings)

Function for one parallel run of GADMA. Creates a gadma.core.core_run.CoreRun object and calls its run() method.

Parameters
gadma.core.core.main()

Main function that is called from the command line. Creates parallel runs of GADMA and holds the base pool of processes. Periodically prints progress for each run, saves plots and pictures of the best model, and generates code.

Functions for drawing and code generation in main function of core

gadma.core.draw_and_generate_code.draw_plots_to_file(x, engine, settings, filename, fig_title)

Draws plots of data (SFS) and model from engine with parameters x.

Parameters
  • x (list or dict) – Values of the parameters in model.

  • engine (gadma.engines.engine.Engine) – Engine with specified model and data.

  • filename (str) – File name to save picture.

  • fig_title (str) – Title of the schematic model plot.

Note

Prints a warning if the schematic model plot was not drawn.

gadma.core.draw_and_generate_code.generate_code_to_file(x, engine, settings, filename)

Generates code of the demographic model to a file. Settings are required to get the engine arguments in the evaluation() function.

Parameters
gadma.core.draw_and_generate_code.print_runs_summary(start_time, shared_dict, settings)

Prints best demographic model by logLL among all processes.

Parameters