learning¶
Classes to facilitate machine learning with Dex-Net, including wrappers for GQ-CNN training datasets and Multi-Armed Bandit methods for grasp robustness evaluation.

class
dexnet.learning.
Model
¶ A predictor of some value of the input data

predict
(x)¶ Predict the function of the data at some point x. For probabilistic models this returns the mean prediction

snapshot
()¶ Returns a concise description of the current model for debugging and logging purposes

update
()¶ Update the model based on current data


class
dexnet.learning.
DiscreteModel
¶ Maintains a prediction over a discrete set of points

max_prediction
()¶ Returns the index (or indices), posterior mean, and posterior variance of the variable(s) with the maximal mean predicted value

num_vars
()¶ Returns the number of variables in the model

sample
()¶ Sample discrete predictions from the model. For deterministic models, returns the deterministic prediction


class
dexnet.learning.
Snapshot
(best_pred_ind, num_obs)¶ Abstract class for storing the current state of a model

class
dexnet.learning.
BernoulliSnapshot
(best_pred_ind, means, num_obs)¶ Stores the current state of a Bernoulli model

class
dexnet.learning.
BetaBernoulliSnapshot
(best_pred_ind, alphas, betas, num_obs)¶ Stores the current state of a Beta Bernoulli model

class
dexnet.learning.
GaussianSnapshot
(best_pred_ind, means, variances, sample_vars, num_obs)¶ Stores the current state of a Gaussian model

class
dexnet.learning.
BernoulliModel
(num_vars, mean_prior=0.5)¶ Standard bernoulli model for predictions over a discrete set of candidates

num_vars
¶ int
– the number of variables to track

prior_means
¶ (float) prior on mean probability of success for candidates

static
bernoulli_mean
(p)¶ Mean of the Bernoulli distribution with success probability p

static
bernoulli_variance
(p, n)¶ Uses Wald interval for variance prediction

max_prediction
()¶ Returns the index (or indices), posterior mean, and posterior variance of the variable(s) with the maximal mean probability of success

predict
(index)¶ Predicts the probability of success for the variable indexed by index

sample
()¶ Samples probabilities of success from the given values

snapshot
()¶ Return copies of the model params

update
(index, value)¶ Update the model based on an observation of value at index index
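
The Bernoulli quantities described above can be sketched in plain Python. This is a minimal illustration, not the Dex-Net implementation: the helper name update_running_mean and the running-mean update rule are assumptions, and the variance shown is the squared standard error p * (1 - p) / n that the Wald interval is built on.

```python
def bernoulli_mean(p):
    """Mean of a Bernoulli variable with success probability p."""
    return p


def bernoulli_variance(p, n):
    """Squared standard error used by the Wald interval: p * (1 - p) / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return p * (1.0 - p) / n


def update_running_mean(mean, num_obs, value):
    """Hypothetical helper: fold one observation into a running mean.

    Returns the new mean and the new observation count.
    """
    new_num_obs = num_obs + 1
    new_mean = mean + (value - mean) / new_num_obs
    return new_mean, new_num_obs
```

Starting from a prior mean of 0.5 counted as one observation, a single success moves the estimate to 0.75 under this assumed update rule.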


class
dexnet.learning.
BetaBernoulliModel
(num_vars, alpha_prior=1.0, beta_prior=1.0)¶ Beta-Bernoulli model for predictions over a discrete set of candidates

num_vars
¶ int – the number of variables to track

alpha_prior
¶ float – prior alpha parameter of the Beta distribution

beta_prior
¶ float – prior beta parameter of the Beta distribution

static
beta_mean
(alpha, beta)¶ Mean of the beta distribution with params alpha and beta

static
beta_variance
(alpha, beta)¶ Variance of the beta distribution with params alpha and beta

max_prediction
()¶ Returns the index (or indices), posterior mean, and posterior variance of the variable(s) with the maximal mean probability of success

predict
(index)¶ Predicts the probability of success for the variable indexed by index

sample
(vis=False, stop=False)¶ Samples probabilities of success from the given values

static
sample_variance
(alpha, beta)¶ Sample variance of the beta distribution with params alpha and beta

snapshot
()¶ Return copies of the model params

update
(index, value)¶ Update the model based on an observation of value at index index
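
As a rough sketch of the Beta-Bernoulli bookkeeping, the functions below use the standard closed forms for the Beta mean and variance and the usual conjugate update. The function beta_update is a hypothetical stand-in for what update(index, value) might do for a single variable; the library's actual update rule is not shown in this reference.

```python
def beta_mean(alpha, beta):
    """Mean of Beta(alpha, beta): alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of Beta(alpha, beta)."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1.0))


def beta_update(alpha, beta, value):
    """Hypothetical conjugate update for one observation in [0, 1].

    A success (value = 1) increments alpha, a failure (value = 0)
    increments beta.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    return alpha + value, beta + (1.0 - value)
```

With the default prior alpha_prior=1.0, beta_prior=1.0, one observed success gives Beta(2, 1), whose mean is 2/3.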


class
dexnet.learning.
GaussianModel
(num_vars)¶ Gaussian model for predictions over a discrete set of candidates.

num_vars
¶ int – the number of variables to track

max_prediction
()¶ Returns the index, mean, and variance of the variable(s) with the maximal predicted value.

predict
(index)¶ Predict the value of the index’th variable.
Parameters: index (int) – the variable to find the predicted value for

sample
(stop=False)¶ Sample discrete predictions from the model. Mean follows a t-distribution

snapshot
()¶ Returns a concise description of the current model for debugging and logging purposes.

update
(index, value)¶ Update the model based on current data.
Parameters:  index (int) – the index of the variable that was evaluated
 value (float) – the value of the variable

variances
¶ Confidence bounds on the mean
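
One way to track a per-variable mean and its uncertainty is Welford's online algorithm, sketched below. The class RunningGaussian and its method names are illustrative assumptions, not the GaussianModel API; the sketch only shows how a running mean, a sample variance, and the variance of the mean estimate relate.

```python
class RunningGaussian:
    """Illustrative per-variable running statistics (Welford's algorithm)."""

    def __init__(self, num_vars):
        self.num_obs = [0] * num_vars
        self.means = [0.0] * num_vars
        self._m2 = [0.0] * num_vars  # sum of squared deviations from the mean

    def update(self, index, value):
        """Fold one observed value into the statistics for one variable."""
        self.num_obs[index] += 1
        delta = value - self.means[index]
        self.means[index] += delta / self.num_obs[index]
        self._m2[index] += delta * (value - self.means[index])

    def sample_variance(self, index):
        """Unbiased sample variance; infinite until two values are seen."""
        n = self.num_obs[index]
        return self._m2[index] / (n - 1) if n > 1 else float("inf")

    def mean_variance(self, index):
        """Variance of the mean estimate: sample variance divided by n."""
        n = self.num_obs[index]
        return self.sample_variance(index) / n if n > 1 else float("inf")
```

After observing 1, 2, and 3 for one variable, the mean is 2, the sample variance is 1, and the variance of the mean is 1/3.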

Correlated BetaBernoulli model for predictions over a discrete set of candidates.
list
– the objects to track
NearestNeighbor
– nearest neighbor structure to use for neighborhood lookups
Kernel
– kernel instance to measure similarities
float – for computing radius of neighborhood, between 0 and 1
float – prior alpha parameter of the Beta distribution
float – prior beta parameter of the Beta distribution
Create the full kernel matrix for debugging purposes
Return the index with the highest lower confidence bound
Return copies of the model params
Update the model based on current data
Parameters:  index (int) – the index of the variable that was evaluated
 value (float) – the value of the variable

class
dexnet.learning.
TerminationCondition
¶ Returns true when a condition is satisfied. Used for supplying different termination conditions to optimization algorithms

class
dexnet.learning.
MaxIterTerminationCondition
(max_iters)¶ Terminate based on reaching a maximum number of iterations.

max_iters
¶ int
– the maximum number of allowed iterations


class
dexnet.learning.
ProgressTerminationCondition
(eps)¶ Terminate based on lack of progress.

eps
¶ float
– the minimum admissible progress that must be made on each iteration to continue


class
dexnet.learning.
ConfidenceTerminationCondition
(eps)¶ Terminate based on model confidence.

eps
¶ float
– the amount of confidence in the predicted objective value that the model must have to terminate


class
dexnet.learning.
OrTerminationCondition
(term_conditions)¶ Terminate based on the OR of several termination conditions

term_conditions
¶ list
of TerminationCondition
– termination conditions that are ORed to get the final termination results


class
dexnet.learning.
AndTerminationCondition
(term_conditions)¶ Terminate based on the AND of several termination conditions

term_conditions
¶ list
of TerminationCondition
– termination conditions that are ANDed to get the final termination results
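
The composition of termination conditions can be sketched as small callables. The class names and the call signature (k, cur_val, prev_val) below are assumptions made for illustration; this reference does not state how the real classes are invoked.

```python
class MaxIterCondition:
    """Stops once the iteration counter k reaches max_iters."""

    def __init__(self, max_iters):
        self.max_iters = max_iters

    def __call__(self, k, cur_val, prev_val):
        return k >= self.max_iters


class ProgressCondition:
    """Stops when the objective changed by less than eps."""

    def __init__(self, eps):
        self.eps = eps

    def __call__(self, k, cur_val, prev_val):
        return abs(cur_val - prev_val) < self.eps


class OrCondition:
    """Stops when any wrapped condition is satisfied."""

    def __init__(self, term_conditions):
        self.term_conditions = term_conditions

    def __call__(self, k, cur_val, prev_val):
        return any(c(k, cur_val, prev_val) for c in self.term_conditions)


class AndCondition:
    """Stops only when every wrapped condition is satisfied."""

    def __init__(self, term_conditions):
        self.term_conditions = term_conditions

    def __call__(self, k, cur_val, prev_val):
        return all(c(k, cur_val, prev_val) for c in self.term_conditions)
```

An OR of a maximum-iteration condition and a progress condition gives the common "stop when converged or out of budget" behavior.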


class
dexnet.learning.
ThompsonSelectionPolicy
(model=None)¶ Chooses the next point using the Thompson sampling selection policy

choose_next
(stop=False)¶ Returns the index of the maximal random sample, breaking ties uniformly at random
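
A minimal sketch of Thompson selection over Beta posteriors, using only the standard library. The function thompson_choose_next and its arguments are hypothetical; it draws one sample per candidate, takes the maximum, and breaks ties uniformly at random as the description above states.

```python
import random


def thompson_choose_next(alphas, betas, rng=random):
    """Return the index of the largest posterior sample.

    alphas and betas are the per-candidate Beta parameters. Ties are
    broken uniformly at random.
    """
    samples = [rng.betavariate(a, b) for a, b in zip(alphas, betas)]
    best = max(samples)
    ties = [i for i, s in enumerate(samples) if s == best]
    return rng.choice(ties)
```

Because each call draws fresh samples, candidates with wide posteriors are still selected occasionally, which is what drives exploration.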


class
dexnet.learning.
BetaBernoulliGittinsIndex98Policy
(model=None)¶ Chooses the next point using the Beta-Bernoulli Gittins index policy with gamma = 0.98

choose_next
()¶ Returns the index of the maximal random sample, breaking ties uniformly at random


class
dexnet.learning.
BetaBernoulliBayesUCBPolicy
(horizon=1000, c=6, model=None)¶ Chooses the next point using the Bayes UCB selection policy

choose_next
(stop=False)¶ Returns the index of the maximal random sample, breaking ties uniformly at random


class
dexnet.learning.
Objective
¶ Acts as a function that returns a numeric value for classes of input data, with checks for valid input.

check_valid_input
(x)¶ Return whether or not a point is valid for the objective.
Parameters: x ( object
) – point at which to evaluate the objective

evaluate
(x)¶ Evaluates a function to be maximized at some point x.
Parameters: x ( object
) – point at which to evaluate the objective


class
dexnet.learning.
DifferentiableObjective
¶ Objectives that are at least twice differentiable.

gradient
(x)¶ Evaluate the gradient at x.
Parameters: x ( object
) – point at which to evaluate the objective

hessian
(x)¶ Evaluate the hessian at x.
Parameters: x ( object
) – point at which to evaluate the objective


class
dexnet.learning.
MaximizationObjective
(obj)¶ Wrapper for maximization of some supplied objective function. Provided mainly for symmetry with MinimizationObjective, since solvers maximize by default.

class
dexnet.learning.
MinimizationObjective
(obj)¶ Wrapper for minimization of some supplied objective function. Used because internally all solvers attempt to maximize by default.

evaluate
(x)¶ Return negative, as all solvers will be assuming a maximization
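
The negation trick can be shown with a toy example. The classes SquaredDistance and NegatedObjective are hypothetical stand-ins, not Dex-Net classes; the point is only that a maximizer applied to the negated objective finds the minimizer of the original.

```python
class SquaredDistance:
    """Toy objective to minimize: squared distance to a target value."""

    def __init__(self, target):
        self.target = target

    def evaluate(self, x):
        return (x - self.target) ** 2


class NegatedObjective:
    """Wraps an objective so that maximizing it minimizes the original."""

    def __init__(self, obj):
        self.obj = obj

    def evaluate(self, x):
        return -self.obj.evaluate(x)


candidates = [0.0, 1.0, 2.0, 3.0]
wrapped = NegatedObjective(SquaredDistance(2.0))
# A solver that maximizes picks the candidate closest to the target.
best = max(candidates, key=wrapped.evaluate)
```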


class
dexnet.learning.
NonDeterministicObjective
(det_objective)¶ Wrapper for nondeterministic objective function evaluations. Samples random values of the input data x.

evaluate
(x)¶ Evaluates a function to be maximized at some point x.
Parameters: x ( object
with a sample() function) – point at which to evaluate the nondeterministic objective


class
dexnet.learning.
ZeroOneObjective
(b=0)¶ Zero One Loss based on thresholding.

b
¶ int
– threshold value, 1 iff x > b, 0 otherwise

check_valid_input
(x)¶ Check whether or not input is valid for the objective


class
dexnet.learning.
IdentityObjective
¶ Just returns the value x

check_valid_input
(x)¶ Check whether or not input is valid for the objective


class
dexnet.learning.
RandomBinaryObjective
¶ Returns a 0 or 1 based on some underlying random probability of success for the data points. Evaluated data points must have a sample_success method that returns 0 or 1.

check_valid_input
(x)¶ Check whether or not input is valid for the objective


class
dexnet.learning.
RandomContinuousObjective
¶ Returns a continuous value based on some underlying random probability of success for the data points. Evaluated data points must have a sample method.

check_valid_input
(x)¶ Check whether or not input is valid for the objective


class
dexnet.learning.
LeastSquaresObjective
(A, b)¶ Classic least-squares loss 0.5 * norm(Ax - b)**2

A
¶ numpy.ndarray
– A matrix in least squares 0.5 * norm(Ax - b)**2

b
¶ numpy.ndarray
– b vector in least squares 0.5 * norm(Ax - b)**2
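
For reference, the gradient and Hessian of the least-squares loss have the standard closed forms below. These are textbook results rather than anything taken from the Dex-Net source, and this reference does not say which sign convention the class applies when a solver maximizes.

```latex
f(x) = \tfrac{1}{2}\,\lVert Ax - b \rVert_2^2, \qquad
\nabla f(x) = A^{\top}(Ax - b), \qquad
\nabla^2 f(x) = A^{\top}A
```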


class
dexnet.learning.
LogisticCrossEntropyObjective
(X, y)¶ Logistic cross entropy loss.

X
¶ numpy.ndarray
– X matrix in logistic function 1 / (1 + exp(-X^T beta))

y
¶ numpy.ndarray
– y vector, true labels


class
dexnet.learning.
CrossEntropyLoss
(true_p)¶ Cross entropy loss.

true_p
¶ numpy.ndarray
– the true probabilities for all admissible datapoints


class
dexnet.learning.
SquaredErrorLoss
(true_p)¶ Squared error (x - x_true)**2

true_p
¶ numpy.ndarray
– the true labels for all admissible inputs


class
dexnet.learning.
WeightedSquaredErrorLoss
(true_p)¶ Weighted squared error w * (x - x_true)**2

true_p
¶ numpy.ndarray
– the true labels for all admissible inputs

evaluate
(est_p, weights)¶ Evaluates the squared loss of the estimated p with given weights
Parameters: est_p (list of float) – points at which to evaluate the objective


class
dexnet.learning.
CCBPLogLikelihood
(true_p)¶ CCBP log likelihood of the true params under a current posterior distribution

true_p
¶ list
of Number
– true probabilities of datapoints

evaluate
(alphas, betas)¶ Evaluates the CCBP likelihood of the true data under estimated CCBP posterior parameters alpha and beta
Parameters:  alphas (list of Number) – posterior alpha values
 betas (list of Number) – posterior beta values


class
dexnet.learning.
SamplingSolver
(objective)¶ Optimization methods based on a sampling strategy

class
dexnet.learning.
AdaptiveSamplingResult
(best_candidates, best_pred_means, best_pred_vars, total_time, checkpt_times, iters, indices, vals, models)¶ Struct to store the results of sampling / optimization.

best_candidates
¶ list of candidate objects – list of the best candidates as estimated by the optimizer

best_pred_means
¶ list of floats – list of the predicted mean objective value for the best candidates

best_pred_vars
¶ list of floats – list of the variance in the predicted objective value for the best candidates

total_time
¶ float – the total optimization time

checkpt_times
¶ list of floats – the time since start at which the snapshots were taken

iters
¶ list of ints – the iterations at which snapshots were taken

indices
¶ list of ints – the indices of the candidates selected at each snapshot iteration

vals
¶ list of objective output values – the value returned by the evaluated candidate at each snapshot iteration

models
¶ list of
Model
– the state of the current candidate objective value predictive model at each snapshot iteration

best_pred_ind
¶ list of int – the indices of the candidate predicted to be the best by the model at each snapshot iteration


class
dexnet.learning.
BetaBernoulliBandit
(objective, candidates, policy, alpha_prior=1.0, beta_prior=1.0)¶ Class for running Beta-Bernoulli Multi-Armed Bandits

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over

policy
¶ DiscreteSelectionPolicy
– a policy to use to select the next candidate to evaluate (e.g. ThompsonSampling)

alpha_prior
¶ float – the prior to use on the alpha parameter

beta_prior
¶ float – the prior to use on the beta parameter

reset_model
(candidates)¶ Needed to independently maximize over subsets of data
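
The overall bandit loop can be sketched as follows, assuming Thompson selection and candidates that expose a sample_success method returning 0 or 1 (the requirement stated for RandomBinaryObjective). The names CoinCandidate and run_beta_bernoulli_bandit are hypothetical; the real class takes an objective and a policy object and returns richer results.

```python
import random


class CoinCandidate:
    """Hypothetical candidate that succeeds with fixed probability p."""

    def __init__(self, p, rng):
        self.p = p
        self.rng = rng

    def sample_success(self):
        return 1 if self.rng.random() < self.p else 0


def run_beta_bernoulli_bandit(candidates, max_iters,
                              alpha_prior=1.0, beta_prior=1.0, rng=random):
    """Select, evaluate, and update for max_iters rounds.

    Returns the index with the highest posterior mean together with the
    final alpha and beta parameters.
    """
    alphas = [alpha_prior] * len(candidates)
    betas = [beta_prior] * len(candidates)
    for _ in range(max_iters):
        # Thompson selection: sample each posterior, evaluate the argmax.
        samples = [rng.betavariate(a, b) for a, b in zip(alphas, betas)]
        index = samples.index(max(samples))
        value = candidates[index].sample_success()
        # Conjugate update for the evaluated candidate only.
        alphas[index] += value
        betas[index] += 1 - value
    means = [a / (a + b) for a, b in zip(alphas, betas)]
    return means.index(max(means)), alphas, betas
```

Most evaluations end up on the stronger candidate, so the budget is spent where the posterior is most promising rather than spread uniformly.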


class
dexnet.learning.
UniformAllocationMean
(objective, candidates, alpha_prior=1.0, beta_prior=1.0)¶ Uniform Allocation with Beta-Bernoulli Multi-Armed Bandits

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over

alpha_prior
¶ float – the prior to use on the alpha parameter

beta_prior
¶ float – the prior to use on the beta parameter


class
dexnet.learning.
ThompsonSampling
(objective, candidates, alpha_prior=1.0, beta_prior=1.0)¶ Thompson Sampling with Beta-Bernoulli Multi-Armed Bandits

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over

alpha_prior
¶ float – the prior to use on the alpha parameter

beta_prior
¶ float – the prior to use on the beta parameter


class
dexnet.learning.
GittinsIndex98
(objective, candidates, alpha_prior=1.0, beta_prior=1.0)¶ Gittins Index Policy using gamma = 0.98 with Beta-Bernoulli Multi-Armed Bandits

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over

alpha_prior
¶ float – the prior to use on the alpha parameter

beta_prior
¶ float – the prior to use on the beta parameter


class
dexnet.learning.
GaussianBandit
(objective, candidates, policy)¶ Multi-Armed Bandit class using independent Gaussian random variables to model the objective value of each candidate.

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over

policy
¶ DiscreteSelectionPolicy
– a policy to use to select the next candidate to evaluate (e.g. ThompsonSampling)


class
dexnet.learning.
GaussianUniformAllocationMean
(objective, candidates)¶ Uniform Allocation with Independent Gaussian Multi-Armed Bandit model

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over


class
dexnet.learning.
GaussianThompsonSampling
(objective, candidates)¶ Thompson Sampling with Independent Gaussian Multi-Armed Bandit model

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over


class
dexnet.learning.
GaussianUCBSampling
(objective, candidates)¶ UCB with Independent Gaussian Multi-Armed Bandit model

candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over

Multi-Armed Bandit class using Continuous Correlated Beta Processes (CCBPs) to model the objective value of each candidate.
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
DiscreteSelectionPolicy
– a policy to use to select the next candidate to evaluate (e.g. ThompsonSampling)
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
float – the lower confidence bound used for best arm prediction (e.g. 0.95 returns the 5th percentile of the belief distribution as the estimated objective value for each candidate)
Needed to independently maximize over subsets of data
Thompson Sampling with CCBP Multi-Armed Bandit model
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
float – the lower confidence bound used for best arm prediction (e.g. 0.95 returns the 5th percentile of the belief distribution as the estimated objective value for each candidate)
Bayes UCB with CCBP Multi-Armed Bandit model (see “On Bayesian Upper Confidence Bounds for Bandit Problems” by Kaufmann et al.)
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – TODO
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
int – horizon parameter for Bayes UCB
int – quantile parameter for Bayes UCB
float – the lower confidence bound used for best arm prediction (e.g. 0.95 returns the 5th percentile of the belief distribution as the estimated objective value for each candidate)
Gittins Index Policy for gamma=0.98 with CCBP Multi-Armed Bandit model
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
float – the lower confidence bound used for best arm prediction (e.g. 0.95 returns the 5th percentile of the belief distribution as the estimated objective value for each candidate)

class
dexnet.learning.
ConfusionMatrix
(num_categories)¶ Confusion matrix for classification errors

class
dexnet.learning.
Tensor
(shape, dtype=<type ‘numpy.float32’>)¶ Abstraction for 4D tensor objects.

add
(datapoint)¶ Adds the datapoint to the tensor if room is available.

data_slice
(slice_ind)¶ Returns a slice of datapoints

datapoint
(ind)¶ Returns the datapoint at the given index.

static
load
(filename, compressed=True)¶ Loads a tensor from disk.

reset
()¶ Resets the current index.

save
(filename, compressed=True)¶ Save a tensor to disk.

set_datapoint
(ind, datapoint)¶ Sets the value of the datapoint at the given index.


class
dexnet.learning.
TensorDataset
(filename, config, access_mode=’WRITE’)¶ Encapsulates learning datasets and different training and test splits of the data.

add
(datapoint)¶ Adds a datapoint to the file.

datapoint
(ind)¶ Loads a tensor datapoint for a given global index.
Parameters: ind (int) – global index in the tensor Returns: the desired tensor datapoint Return type: TensorDatapoint

datapoint_indices
¶ Returns an array of all dataset indices.

datapoint_indices_for_tensor
(tensor_index)¶ Returns the indices for all datapoints in the given tensor.

flush
()¶ Flushes the data tensors. Alternate handle to write.

generate_tensor_filename
(field_name, file_num, compressed=True)¶ Generate a filename for a tensor.

load_tensor
(field_name, file_num)¶ Loads a tensor for a given field and file num.
Parameters:  field_name (str) – the name of the field to load
 file_num (int) – the number of the file to load from
Returns: the desired tensor
Return type:

next
()¶ Read the next datapoint.
Returns: the next datapoint Return type: TensorDatapoint

static
open
(dataset_dir)¶ Opens a tensor dataset.

split
(attribute, train_pct, val_pct)¶ Splits the dataset along the given attribute.

tensor_dir
¶ Return the tensor directory.

tensor_index
(datapoint_index)¶ Returns the index of the tensor containing the referenced datapoint.

tensor_indices
¶ Returns an array of all tensor indices.

write
()¶ Writes all tensors to the next file number.
