estimation module

class estimation.ChoiceSetGenerator(schedules: List, params_file: str, n_alt: int = 10, mh_params: List = {'n_burn': 50000, 'n_iter': 200000, 'n_skip': 10, 'uniform': False}, activities: Optional[List] = ['home', 'work', 'education', 'shopping', 'errands_services', 'business_trip', 'leisure', 'escort'], operators: Optional[List] = ['Block', 'Assign', 'AddAnchor', 'Swap', 'InflateDeflate', 'MetaOperator'], proba_operators: Optional[List] = None, modes: Optional[List] = ['driving', 'pt', 'cycling'], locations: Optional[List] = None, variables: List = ['start_time', 'duration', 'participation'], outfile: str = 'choice_set.joblib', seed: int = 42, **kwargs)[source]

Bases: object

Generates choice sets of Schedule objects for given individuals.

  • schedules: List of Schedule objects

  • params_file: location of the parameter file for the target distribution

  • n_alt: number of alternatives in the choice set

  • mh_params: dictionary of parameters for the random walk (keys: n_burn, n_iter, n_skip, uniform)

  • activities: list of activities

  • operators: list of operators

  • proba_operators: probabilities of operators

  • modes: list of modes

  • locations: list of locations

  • variables: list of variables for target distribution

  • outfile: location of file to save result

  • generate_set: generates choice set for a given individual.

  • run: Runs the metropolis_hastings algorithm for the full dataset.

  • run_parallel: Runs the metropolis_hastings algorithm for the full dataset, using parallel processing.

  • compute_sample_correction: Returns the corrective term for the utility function

  • train_test_sets: Creates train and test DataFrames to use in Biogeme.
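To make the roles of the n_burn, n_iter and n_skip entries of mh_params more concrete, here is a minimal, generic sketch of a random-walk Metropolis-Hastings sampler with burn-in and thinning. It is not this module's implementation: the function name, the 1-D state, the uniform proposal and the convention that n_iter counts the burn-in iterations are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List


def metropolis_hastings_sketch(
    log_target: Callable[[float], float],
    x0: float,
    n_burn: int,
    n_iter: int,
    n_skip: int,
    step: float = 1.0,
    seed: int = 42,
) -> List[float]:
    """Random-walk Metropolis-Hastings on a 1-D state (illustrative only)."""
    rng = random.Random(seed)
    x = x0
    samples: List[float] = []
    for it in range(n_iter):
        # Symmetric proposal, so the acceptance ratio is target(y) / target(x).
        y = x + rng.uniform(-step, step)
        log_ratio = log_target(y) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = y
        # Assumption: drop the first n_burn states, then keep every n_skip-th one.
        if it >= n_burn and (it - n_burn) % n_skip == 0:
            samples.append(x)
    return samples


# Hypothetical target: a standard normal, known only up to a constant.
draws = metropolis_hastings_sketch(
    lambda x: -0.5 * x * x, x0=0.0, n_burn=1000, n_iter=6000, n_skip=10
)
print(len(draws))  # 500 retained states
```

With the defaults shown in the class signature (n_burn=50000, n_iter=200000, n_skip=10), the same convention would retain far more states than n_alt, so the module presumably applies its own selection rule on top; that rule is not documented here.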

compute_sample_correction(original_probas: List, unique_probas: List, k: int = 1) List[source]

Returns the corrective term for the utility function, to estimate the model on the sampled choice set passed as input. See Ben-Akiva & Lerman (1985) “Discrete choice analysis”, p.266

Parameters:
  • original_probas (-) –

  • unique_probas (-) –

  • k (int) – proportionality constant for sample correction
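As a rough illustration of the kind of term involved: when alternatives are drawn with replacement by importance sampling, a commonly used correction adds ln(k_j / q_j) to the utility of alternative j, where k_j is the number of times j was drawn and q_j is its sampling probability. The sketch below computes that quantity. It is an assumption about the general technique, not a description of what compute_sample_correction does; the function name and its inputs are hypothetical, and the role of the k argument above is not modelled.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List


def sample_correction_sketch(
    sampled: List[Hashable], probas: Dict[Hashable, float]
) -> Dict[Hashable, float]:
    """Return ln(k_j / q_j) for each distinct alternative j in the sample.

    k_j is how many times j appears in `sampled`; q_j = probas[j] is its
    sampling probability (hypothetical inputs, for illustration only).
    """
    counts = Counter(sampled)
    return {alt: math.log(counts[alt] / probas[alt]) for alt in counts}


# Hypothetical draws: alternative "a" was sampled twice, "b" once.
corr = sample_correction_sketch(["a", "b", "a"], {"a": 0.5, "b": 0.25})
```

Here both alternatives end up with the same correction, ln 4, because "a" was drawn twice with probability 0.5 and "b" once with probability 0.25.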

generate_set(schedule: Schedule) Tuple[List, List, List][source]

Generates choice set for a given individual.

Parameters:

schedule (-) –

Returns:

Choice sets, accepted operators and probabilities.

run() None[source]

Runs the metropolis_hastings algorithm for the full dataset and saves the choice sets, accepted operators and acceptance probabilities to file.

run_parallel(n_cpus: Optional[int] = None, verbose: int = 5) None[source]

Runs the metropolis_hastings algorithm for the full dataset using parallel processing. Saves the choice sets, accepted operators and acceptance probabilities to file.

Parameters:
  • n_cpus (Optional[int]) – number of CPUs to use for the parallel process

  • verbose (int) – frequency of progress outputs

train_test_sets(k: int = 1, train_ratio: float = 0.7) Tuple[DataFrame, DataFrame, List][source]

Creates train and test DataFrames to use in Biogeme.

  • k: proportionality constant for sample correction

  • train_ratio: train test split (default: 70% of observations will be used for the train set)

  • returns: train dataset in wide and long format, test dataset
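The effect of train_ratio can be pictured with a small, generic sketch that shuffles observation identifiers with a fixed seed and cuts the list at the requested ratio. This is only an illustration in plain Python: the helper name is hypothetical, and the actual method builds pandas DataFrames and may split differently.

```python
import random
from typing import List, Sequence, Tuple


def split_ids_sketch(
    ids: Sequence[int], train_ratio: float = 0.7, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Shuffle identifiers reproducibly and split them by train_ratio."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    # Number of observations assigned to the train set.
    cut = int(round(train_ratio * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset of 10 individuals, identified by 0..9.
train_ids, test_ids = split_ids_sketch(range(10))
print(len(train_ids), len(test_ids))  # 7 3
```

Splitting on individual identifiers rather than on rows keeps all alternatives of one individual's choice set on the same side of the split, which matters if the data are later reshaped to long format.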