learning¶
Classes to facilitate machine learning with Dex-Net, including wrappers for GQ-CNN training datasets and Multi-Armed Bandit methods for grasp robustness evaluation.
-
class
dexnet.learning.
Model
¶ A predictor of some value of the input data
-
predict
(x)¶ Predict the function of the data at some point x. For probabilistic models this returns the mean prediction
-
snapshot
()¶ Returns a concise description of the current model for debugging and logging purposes
-
update
()¶ Update the model based on current data
-
-
class
dexnet.learning.
DiscreteModel
¶ Maintains a prediction over a discrete set of points
-
max_prediction
()¶ Returns the index (or indices), posterior mean, and posterior variance of the variable(s) with the maximal mean predicted value
-
num_vars
()¶ Returns the number of variables in the model
-
sample
()¶ Sample discrete predictions from the model. For deterministic models, returns the deterministic prediction
-
-
class
dexnet.learning.
Snapshot
(best_pred_ind, num_obs)¶ Abstract class for storing the current state of a model
-
class
dexnet.learning.
BernoulliSnapshot
(best_pred_ind, means, num_obs)¶ Stores the current state of a Bernoulli model
-
class
dexnet.learning.
BetaBernoulliSnapshot
(best_pred_ind, alphas, betas, num_obs)¶ Stores the current state of a Beta Bernoulli model
-
class
dexnet.learning.
GaussianSnapshot
(best_pred_ind, means, variances, sample_vars, num_obs)¶ Stores the current state of a Gaussian model
-
class
dexnet.learning.
BernoulliModel
(num_vars, mean_prior=0.5)¶ Standard Bernoulli model for predictions over a discrete set of candidates
-
num_vars
¶ int
– the number of variables to track
-
prior_means
¶ (float) prior on mean probability of success for candidates
-
static
bernoulli_mean
(p)¶ Mean of the Bernoulli distribution with success probability p
-
static
bernoulli_variance
(p, n)¶ Uses Wald interval for variance prediction
-
max_prediction
()¶ Returns the index (or indices), posterior mean, and posterior variance of the variable(s) with the maximal mean probability of success
-
predict
(index)¶ Predicts the probability of success for the variable indexed by index
-
sample
()¶ Samples probabilities of success from the given values
-
snapshot
()¶ Return copies of the model params
-
update
(index, value)¶ Update the model based on an observation of value at index index
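The running-mean and Wald-variance bookkeeping described for this class can be sketched as follows. This is a minimal illustration under stated assumptions, not dexnet's implementation: the class name `RunningBernoulli`, the rule that the first observation replaces the prior mean, and the infinite variance before any observation are all hypothetical choices.

```python
class RunningBernoulli:
    """Hypothetical stand-in for a per-candidate Bernoulli estimate."""

    def __init__(self, num_vars, mean_prior=0.5):
        self.means = [mean_prior] * num_vars  # current estimate per variable
        self.num_obs = [0] * num_vars         # observations seen per variable

    @staticmethod
    def wald_variance(p, n):
        # Wald approximation: the sample proportion has variance p(1-p)/n.
        # With no observations the variance is treated as unbounded (assumption).
        return p * (1.0 - p) / n if n > 0 else float("inf")

    def update(self, index, value):
        # Incremental average of the 0/1 outcomes; the first observation
        # replaces the prior mean (an assumption of this sketch).
        n = self.num_obs[index] + 1
        prev = self.means[index] if n > 1 else 0.0
        self.means[index] = prev + (value - prev) / n
        self.num_obs[index] = n

    def max_prediction(self):
        # Indices of every variable that attains the maximal mean.
        best = max(self.means)
        return [i for i, m in enumerate(self.means) if m == best]
```

Calling `update(1, 1)`, `update(1, 0)`, `update(1, 1)` on a three-variable instance leaves variable 1 with a mean of 2/3, and `max_prediction()` then returns `[1]`.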
-
-
class
dexnet.learning.
BetaBernoulliModel
(num_vars, alpha_prior=1.0, beta_prior=1.0)¶ Beta-Bernoulli model for predictions over a discrete set of candidates
-
num_vars
¶ int – the number of variables to track
-
alpha_prior
¶ float – prior alpha parameter of the Beta distribution
-
beta_prior
¶ float – prior beta parameter of the Beta distribution
-
static
beta_mean
(alpha, beta)¶ Mean of the beta distribution with params alpha and beta
-
static
beta_variance
(alpha, beta)¶ Variance of the beta distribution with params alpha and beta
-
max_prediction
()¶ Returns the index (or indices), posterior mean, and posterior variance of the variable(s) with the maximal mean probability of success
-
predict
(index)¶ Predicts the probability of success for the variable indexed by index
-
sample
(vis=False, stop=False)¶ Samples probabilities of success from the given values
-
static
sample_variance
(alpha, beta)¶ Sample variance for the beta distribution with params alpha and beta
-
snapshot
()¶ Return copies of the model params
-
update
(index, value)¶ Update the model based on an observation of value at index index
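For reference, the textbook Beta-Bernoulli quantities behind this class can be sketched in a few lines. The function bodies below are the standard closed-form expressions, not code copied from dexnet, and `beta_bernoulli_update` is a hypothetical helper name used only for illustration.

```python
def beta_mean(alpha, beta):
    # Mean of Beta(alpha, beta).
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    # Variance of Beta(alpha, beta).
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1.0))


def beta_bernoulli_update(alpha, beta, value):
    # Conjugate update: a success (value = 1) increments alpha,
    # a failure (value = 0) increments beta.
    return alpha + value, beta + (1.0 - value)
```

Starting from the default prior `(1.0, 1.0)`, three successes and one failure give the posterior `(4.0, 2.0)`, whose mean is 2/3.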
-
-
class
dexnet.learning.
GaussianModel
(num_vars)¶ Gaussian model for predictions over a discrete set of candidates.
-
num_vars
¶ int – the number of variables to track
-
max_prediction
()¶ Returns the index, mean, and variance of the variable(s) with the maximal predicted value.
-
predict
(index)¶ Predict the value of the index’th variable.
Parameters: index (int) – the variable to find the predicted value for
-
sample
(stop=False)¶ Sample discrete predictions from the model. Mean follows a t-distribution
-
snapshot
()¶ Returns a concise description of the current model for debugging and logging purposes.
-
update
(index, value)¶ Update the model based on current data.
Parameters: - index (int) – the index of the variable that was evaluated
- value (float) – the value of the variable
-
variances
¶ Confidence bounds on the mean
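One common way to maintain a per-variable Gaussian estimate is Welford's online algorithm, sketched below for a single variable. Whether dexnet tracks its statistics this way is an assumption; the class name `RunningGaussian` and its attributes are hypothetical. The variance of the mean (sample variance divided by the number of observations) is one plausible reading of the "confidence bounds on the mean" mentioned above.

```python
class RunningGaussian:
    """Hypothetical online mean/variance tracker for one variable."""

    def __init__(self):
        self.n = 0        # number of observations
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, value):
        # Welford's update: numerically stable, one pass, O(1) memory.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)

    @property
    def sample_variance(self):
        # Unbiased sample variance; undefined (treated as inf) for n < 2.
        return self._m2 / (self.n - 1) if self.n > 1 else float("inf")

    @property
    def mean_variance(self):
        # Variance of the estimated mean, i.e. the squared standard error.
        return self.sample_variance / self.n if self.n > 1 else float("inf")
```

After observing 2.0, 4.0 and 6.0 the tracker reports a mean of 4.0, a sample variance of 4.0 and a variance of the mean of 4/3.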
-
Correlated Beta-Bernoulli model for predictions over a discrete set of candidates.
list
– the objects to track
NearestNeighbor
– nearest neighbor structure to use for neighborhood lookups
Kernel
– kernel instance to measure similarities
float – for computing radius of neighborhood, between 0 and 1
float – prior alpha parameter of the Beta distribution
float – prior beta parameter of the Beta distribution
Create the full kernel matrix for debugging purposes
Return the index with the highest lower confidence bound
Return copies of the model params
Update the model based on current data
Parameters: - index (int) – the index of the variable that was evaluated
- value (float) – the value of the variable
-
class
dexnet.learning.
TerminationCondition
¶ Returns true when a condition is satisfied. Used for supplying different termination conditions to optimization algorithms
-
class
dexnet.learning.
MaxIterTerminationCondition
(max_iters)¶ Terminate based on reaching a maximum number of iterations.
-
max_iters
¶ int
– the maximum number of allowed iterations
-
-
class
dexnet.learning.
ProgressTerminationCondition
(eps)¶ Terminate based on lack of progress.
-
eps
¶ float
– the minimum admissible progress that must be made on each iteration to continue
-
-
class
dexnet.learning.
ConfidenceTerminationCondition
(eps)¶ Terminate based on model confidence.
-
eps
¶ float
– the amount of confidence in the predicted objective value that the model must have to terminate
-
-
class
dexnet.learning.
OrTerminationCondition
(term_conditions)¶ Terminate based on the OR of several termination conditions
-
term_conditions
¶ list of TerminationCondition – termination conditions that are ORed to get the final termination results
-
-
class
dexnet.learning.
AndTerminationCondition
(term_conditions)¶ Terminate based on the AND of several termination conditions
-
term_conditions
¶ list of TerminationCondition – termination conditions that are ANDed to get the final termination results
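The way these conditions compose can be sketched with plain callables. The page does not document the call signature, so the `(k, cur_val, prev_val)` arguments and the class names `MaxIter`, `Progress`, `AnyOf` and `AllOf` below are hypothetical stand-ins rather than the dexnet API.

```python
class MaxIter:
    def __init__(self, max_iters):
        self.max_iters = max_iters

    def __call__(self, k, cur_val=None, prev_val=None):
        # Stop once the iteration counter reaches the limit.
        return k >= self.max_iters


class Progress:
    def __init__(self, eps):
        self.eps = eps

    def __call__(self, k, cur_val=None, prev_val=None):
        # Stop when the change between iterations falls below eps.
        if cur_val is None or prev_val is None:
            return False
        return abs(cur_val - prev_val) < self.eps


class AnyOf:
    def __init__(self, term_conditions):
        self.term_conditions = term_conditions

    def __call__(self, k, cur_val=None, prev_val=None):
        # OR: stop as soon as any condition fires.
        return any(t(k, cur_val, prev_val) for t in self.term_conditions)


class AllOf:
    def __init__(self, term_conditions):
        self.term_conditions = term_conditions

    def __call__(self, k, cur_val=None, prev_val=None):
        # AND: stop only when every condition fires.
        return all(t(k, cur_val, prev_val) for t in self.term_conditions)
```

With `stop = AnyOf([MaxIter(10), Progress(1e-3)])`, a solver loop would terminate either at iteration 10 or as soon as the objective stops improving, whichever comes first.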
-
-
class
dexnet.learning.
ThompsonSelectionPolicy
(model=None)¶ Chooses the next point using the Thompson sampling selection policy
-
choose_next
(stop=False)¶ Returns the index of the maximal random sample, breaking ties uniformly at random
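The selection rule described above (draw one sample per candidate, pick the largest, break ties uniformly at random) can be sketched for a Beta-Bernoulli posterior as follows. The function name `thompson_choose_next` and its arguments are hypothetical; the real policy reads its samples from the attached model.

```python
import random


def thompson_choose_next(alphas, betas, rng=random):
    # Draw one posterior sample per candidate from Beta(alpha_i, beta_i).
    samples = [rng.betavariate(a, b) for a, b in zip(alphas, betas)]
    best = max(samples)
    # Collect every index attaining the maximum and pick one uniformly.
    ties = [i for i, s in enumerate(samples) if s == best]
    return rng.choice(ties)
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is convenient when comparing policies.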
-
-
class
dexnet.learning.
BetaBernoulliGittinsIndex98Policy
(model=None)¶ Chooses the next point using the Beta-Bernoulli Gittins index policy with gamma = 0.98
-
choose_next
()¶ Returns the index of the maximal random sample, breaking ties uniformly at random
-
-
class
dexnet.learning.
BetaBernoulliBayesUCBPolicy
(horizon=1000, c=6, model=None)¶ Chooses the next point using the Bayes UCB selection policy
-
choose_next
(stop=False)¶ Returns the index of the maximal random sample, breaking ties uniformly at random
-
-
class
dexnet.learning.
Objective
¶ Acts as a function that returns a numeric value for classes of input data, with checks for valid input.
-
check_valid_input
(x)¶ Return whether or not a point is valid for the objective.
Parameters: x (object) – point at which to evaluate the objective
-
evaluate
(x)¶ Evaluates a function to be maximized at some point x.
Parameters: x (object) – point at which to evaluate the objective
-
-
class
dexnet.learning.
DifferentiableObjective
¶ Objectives that are at least twice differentiable.
-
gradient
(x)¶ Evaluate the gradient at x.
Parameters: x (object) – point at which to evaluate the objective
-
hessian
(x)¶ Evaluate the Hessian at x.
Parameters: x (object) – point at which to evaluate the objective
-
-
class
dexnet.learning.
MaximizationObjective
(obj)¶ Wrapper for maximization of some supplied objective function. Solvers already maximize by default, so this wrapper exists mainly for symmetry with MinimizationObjective.
-
class
dexnet.learning.
MinimizationObjective
(obj)¶ Wrapper for minimization of some supplied objective function. Used because internally all solvers attempt to maximize by default.
-
evaluate
(x)¶ Return negative, as all solvers will be assuming a maximization
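The negation trick can be illustrated with a small wrapper around any callable. The names `Negated` and `argmax` are hypothetical and only mirror the idea; the documented class wraps an `obj` whose exact interface is not shown on this page.

```python
class Negated:
    """Hypothetical wrapper: turns a cost to minimize into a value to maximize."""

    def __init__(self, obj):
        self.obj = obj  # assumed here to be a callable returning a number

    def evaluate(self, x):
        # Maximizing -f(x) is equivalent to minimizing f(x).
        return -self.obj(x)


def argmax(candidates, objective):
    # Stand-in for a solver that always maximizes objective.evaluate.
    return max(candidates, key=objective.evaluate)
```

For the cost `(x - 3) ** 2` over the candidates 0 through 5, `argmax` applied to the negated objective returns 3, the minimizer of the original cost.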
-
-
class
dexnet.learning.
NonDeterministicObjective
(det_objective)¶ Wrapper for non-deterministic objective function evaluations. Samples random values of the input data x.
-
evaluate
(x)¶ Evaluates a function to be maximized at some point x.
Parameters: x (object with a sample() function) – point at which to evaluate the nondeterministic objective
-
-
class
dexnet.learning.
ZeroOneObjective
(b=0)¶ Zero One Loss based on thresholding.
-
b
¶ int
– threshold value, 1 iff x > b, 0 otherwise
-
check_valid_input
(x)¶ Check whether or not input is valid for the objective
-
-
class
dexnet.learning.
IdentityObjective
¶ Just returns the value x
-
check_valid_input
(x)¶ Check whether or not input is valid for the objective
-
-
class
dexnet.learning.
RandomBinaryObjective
¶ Returns a 0 or 1 based on some underlying random probability of success for the data points. Evaluated data points must have a sample_success method that returns 0 or 1.
-
check_valid_input
(x)¶ Check whether or not input is valid for the objective
-
-
class
dexnet.learning.
RandomContinuousObjective
¶ Returns a continuous value based on some underlying random probability of success for the data points. Evaluated data points must have a sample method.
-
check_valid_input
(x)¶ Check whether or not input is valid for the objective
-
-
class
dexnet.learning.
LeastSquaresObjective
(A, b)¶ Classic least-squares loss 0.5 * norm(Ax - b)**2
-
A
¶ numpy.ndarray
– A matrix in least squares 0.5 * norm(Ax - b)**2
-
b
¶ numpy.ndarray
– b vector in least squares 0.5 * norm(Ax - b)**2
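The loss 0.5 * norm(Ax - b)**2 and its gradient A^T (Ax - b) can be sketched without any dependencies. The documented class takes numpy arrays; the version below uses nested lists purely to stay self-contained, and the function names are hypothetical.

```python
def residual(A, b, x):
    """Return r = Ax - b for a matrix A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
            for row, b_i in zip(A, b)]


def least_squares_value(A, b, x):
    # 0.5 * ||Ax - b||^2
    r = residual(A, b, x)
    return 0.5 * sum(r_i * r_i for r_i in r)


def least_squares_gradient(A, b, x):
    # Gradient with respect to x: A^T (Ax - b).
    r = residual(A, b, x)
    return [sum(row[j] * r_i for row, r_i in zip(A, r))
            for j in range(len(x))]
```

For `A = [[1, 0], [0, 2]]` and `b = [1, 2]`, the loss is 2.5 at the origin and 0 at `x = [1, 1]`, where the gradient also vanishes.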
-
-
class
dexnet.learning.
LogisticCrossEntropyObjective
(X, y)¶ Logistic cross entropy loss.
-
X
¶ numpy.ndarray
– X matrix in logistic function 1 / (1 + exp(-X^T beta))
-
y
¶ numpy.ndarray
– y vector, true labels
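A minimal sketch of the logistic cross-entropy loss follows. Two assumptions are made that this page does not confirm: each row of `X` is treated as one datapoint, and the loss is summed (not averaged) over datapoints. The function names are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_cross_entropy(X, y, beta):
    # Sum over datapoints of -[y log p + (1 - y) log(1 - p)],
    # where p = sigmoid(x . beta) for each row x of X.
    total = 0.0
    for row, label in zip(X, y):
        p = sigmoid(sum(x_j * b_j for x_j, b_j in zip(row, beta)))
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total
```

With all coefficients at zero every prediction is 0.5, so the loss equals the number of datapoints times log 2. The sketch does not guard against predictions that saturate at exactly 0 or 1.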
-
-
class
dexnet.learning.
CrossEntropyLoss
(true_p)¶ Cross entropy loss.
-
true_p
¶ numpy.ndarray
– the true probabilities for all admissible datapoints
-
-
class
dexnet.learning.
SquaredErrorLoss
(true_p)¶ Squared error (x - x_true)**2
-
true_p
¶ numpy.ndarray
– the true labels for all admissible inputs
-
-
class
dexnet.learning.
WeightedSquaredErrorLoss
(true_p)¶ Weighted squared error w * (x - x_true)**2
-
true_p
¶ numpy.ndarray
– the true labels for all admissible inputs
-
evaluate
(est_p, weights)¶ Evaluates the squared loss of the estimated p with given weights
Parameters: est_p (list of float) – points at which to evaluate the objective
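The weighted squared error w * (x - x_true)**2 can be sketched as a single reduction. Whether dexnet sums or averages the per-point terms is not stated here, so the sum below is an assumption, and the function name is hypothetical.

```python
def weighted_squared_error(est_p, true_p, weights):
    # Sum of w_i * (est_i - true_i)^2 over all points (summation is assumed).
    return sum(w * (e - t) ** 2 for e, t, w in zip(est_p, true_p, weights))
```

For estimates `[0.5, 1.0]`, true values `[1.0, 0.0]` and weights `[2.0, 1.0]`, the terms are 2 * 0.25 and 1 * 1.0.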
-
-
class
dexnet.learning.
CCBPLogLikelihood
(true_p)¶ CCBP log likelihood of the true params under a current posterior distribution
-
true_p
¶ list of Number – true probabilities of datapoints
-
evaluate
(alphas, betas)¶ Evaluates the CCBP likelihood of the true data under estimated CCBP posterior parameters alpha and beta
Parameters: - alphas (list of Number) – posterior alpha values
- betas (list of Number) – posterior beta values
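One plausible form of this quantity is the sum of Beta log-densities of the true probabilities under each candidate's posterior. The sketch below computes that sum with `math.lgamma`; treating the likelihood as a plain sum of independent Beta log-densities is an assumption, and the function names are hypothetical.

```python
import math


def log_beta_pdf(p, alpha, beta):
    # log of the Beta(alpha, beta) density at p, for 0 < p < 1.
    log_norm = (math.lgamma(alpha) + math.lgamma(beta)
                - math.lgamma(alpha + beta))
    return ((alpha - 1.0) * math.log(p)
            + (beta - 1.0) * math.log(1.0 - p)
            - log_norm)


def beta_log_likelihood(true_p, alphas, betas):
    # Sum of per-candidate log-densities (independence is assumed).
    return sum(log_beta_pdf(p, a, b)
               for p, a, b in zip(true_p, alphas, betas))
```

Under a uniform Beta(1, 1) posterior every log-density is 0, so the score only rises above 0 for a candidate once its posterior concentrates near the true probability. The true probabilities must lie strictly between 0 and 1 for the logarithms to be defined.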
-
-
class
dexnet.learning.
SamplingSolver
(objective)¶ Optimization methods based on a sampling strategy
-
class
dexnet.learning.
AdaptiveSamplingResult
(best_candidates, best_pred_means, best_pred_vars, total_time, checkpt_times, iters, indices, vals, models)¶ Struct to store the results of sampling / optimization.
-
best_candidates
¶ list of candidate objects – list of the best candidates as estimated by the optimizer
-
best_pred_means
¶ list of floats – list of the predicted mean objective value for the best candidates
-
best_pred_vars
¶ list of floats – list of the variance in the predicted objective value for the best candidates
-
total_time
¶ float – the total optimization time
-
checkpt_times
¶ list of floats – the time since start at which the snapshots were taken
-
iters
¶ list of ints – the iterations at which snapshots were taken
-
indices
¶ list of ints – the indices of the candidates selected at each snapshot iteration
-
vals
¶ list of objective output values – the value returned by the evaluated candidate at each snapshot iteration
-
models
¶ list of
Model
– the state of the current candidate objective value predictive model at each snapshot iteration
-
best_pred_ind
¶ list of int – the indices of the candidate predicted to be the best by the model at each snapshot iteration
-
-
class
dexnet.learning.
BetaBernoulliBandit
(objective, candidates, policy, alpha_prior=1.0, beta_prior=1.0)¶ Class for running Beta Bernoulli Multi-Armed Bandits
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
policy
¶ DiscreteSelectionPolicy
– a policy to use to select the next candidate to evaluate (e.g. ThompsonSampling)
-
alpha_prior
¶ float – the prior to use on the alpha parameter
-
beta_prior
¶ float – the prior to use on the beta parameter
-
reset_model
(candidates)¶ Needed to independently maximize over subsets of data
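The interplay of objective, policy and model in a Beta-Bernoulli bandit can be sketched as a single loop. This is an illustration under stated assumptions, not the dexnet solver: the function name `run_beta_bernoulli_bandit` is hypothetical, the objective is simulated from a list of success probabilities, Thompson sampling stands in for the policy, and a fixed iteration budget stands in for the termination condition.

```python
import random


def run_beta_bernoulli_bandit(success_probs, max_iters,
                              alpha_prior=1.0, beta_prior=1.0, seed=0):
    rng = random.Random(seed)
    n = len(success_probs)
    alphas = [alpha_prior] * n
    betas = [beta_prior] * n
    for _ in range(max_iters):
        # Policy step: sample each posterior and take the largest sample.
        samples = [rng.betavariate(a, b) for a, b in zip(alphas, betas)]
        index = samples.index(max(samples))
        # Objective step: simulate a 0/1 outcome for the chosen candidate.
        value = 1 if rng.random() < success_probs[index] else 0
        # Model step: conjugate Beta-Bernoulli update.
        alphas[index] += value
        betas[index] += 1 - value
    # Report the candidate with the highest posterior mean.
    means = [a / (a + b) for a, b in zip(alphas, betas)]
    return means.index(max(means)), alphas, betas
```

A call such as `run_beta_bernoulli_bandit([0.2, 0.5, 0.9], 500)` returns the index of the candidate the model believes is best, together with the final posterior parameters.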
-
-
class
dexnet.learning.
UniformAllocationMean
(objective, candidates, alpha_prior=1.0, beta_prior=1.0)¶ Uniform Allocation with Beta Bernoulli Multi-Armed Bandits
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
alpha_prior
¶ float – the prior to use on the alpha parameter
-
beta_prior
¶ float – the prior to use on the beta parameter
-
-
class
dexnet.learning.
ThompsonSampling
(objective, candidates, alpha_prior=1.0, beta_prior=1.0)¶ Thompson Sampling with Beta Bernoulli Multi-Armed Bandits
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
alpha_prior
¶ float – the prior to use on the alpha parameter
-
beta_prior
¶ float – the prior to use on the beta parameter
-
-
class
dexnet.learning.
GittinsIndex98
(objective, candidates, alpha_prior=1.0, beta_prior=1.0)¶ Gittins Index Policy using gamma = 0.98 with Beta Bernoulli Multi-Armed Bandits
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
alpha_prior
¶ float – the prior to use on the alpha parameter
-
beta_prior
¶ float – the prior to use on the beta parameter
-
-
class
dexnet.learning.
GaussianBandit
(objective, candidates, policy)¶ Multi-Armed Bandit class using independent Gaussian random variables to model the objective value of each candidate.
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
policy
¶ DiscreteSelectionPolicy
– a policy to use to select the next candidate to evaluate (e.g. ThompsonSampling)
-
-
class
dexnet.learning.
GaussianUniformAllocationMean
(objective, candidates)¶ Uniform Allocation with Independent Gaussian Multi-Armed Bandit model
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
-
class
dexnet.learning.
GaussianThompsonSampling
(objective, candidates)¶ Thompson Sampling with Independent Gaussian Multi-Armed Bandit model
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
-
class
dexnet.learning.
GaussianUCBSampling
(objective, candidates)¶ UCB with Independent Gaussian Multi-Armed Bandit model
-
candidates
¶ list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
-
Multi-Armed Bandit class using Continuous Correlated Beta Processes (CCBPs) to model the objective value of each candidate.
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
DiscreteSelectionPolicy
– a policy to use to select the next candidate to evaluate (e.g. ThompsonSampling)
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
float – the lower confidence bound used for best arm prediction (e.g. 0.95 -> return the 5th percentile of the belief distribution as the estimated objective value for each candidate)
Needed to independently maximize over subsets of data
Thompson Sampling with CCBP Multi-Armed Bandit model
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
float – the lower confidence bound used for best arm prediction (e.g. 0.95 -> return the 5th percentile of the belief distribution as the estimated objective value for each candidate)
Bayes UCB with CCBP Multi-Armed Bandit model (see “On Bayesian Upper Confidence Bounds for Bandit Problems” by Kaufmann et al.)
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – TODO
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
int – horizon parameter for Bayes UCB
int – quantile parameter for Bayes UCB
float – the lower confidence bound used for best arm prediction (e.g. 0.95 -> return the 5th percentile of the belief distribution as the estimated objective value for each candidate)
Gittins Index Policy for gamma=0.98 with CCBP Multi-Armed Bandit model
Objective
– the objective to optimize via sampling
list
of arbitrary objects that can be evaluated by the objective – the list of candidates to optimize over
NearestNeighbor
– nearest neighbor structure for fast lookups during module updates
Kernel
– kernel to use in CCBP model
float – the prior to use on the alpha parameter
float – the prior to use on the beta parameter
float – the lower confidence bound used for best arm prediction (e.g. 0.95 -> return the 5th percentile of the belief distribution as the estimated objective value for each candidate)
-
class
dexnet.learning.
ConfusionMatrix
(num_categories)¶ Confusion matrix for classification errors
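No methods are listed for this class, so the following is only a generic sketch of what a confusion matrix tracks. The class name `SimpleConfusionMatrix`, its `update` and `accuracy` methods, and the row/column convention (rows are actual categories, columns are predicted categories) are all hypothetical.

```python
class SimpleConfusionMatrix:
    """Hypothetical counts table: rows = actual, columns = predicted."""

    def __init__(self, num_categories):
        self.counts = [[0] * num_categories for _ in range(num_categories)]

    def update(self, predicted, actual):
        # Record one classification outcome.
        self.counts[actual][predicted] += 1

    def accuracy(self):
        # Fraction of outcomes on the diagonal (correct predictions).
        total = sum(sum(row) for row in self.counts)
        correct = sum(self.counts[i][i] for i in range(len(self.counts)))
        return correct / total if total else 0.0
```

Off-diagonal entries count the classification errors: `counts[0][1]` is the number of category-0 items that were predicted as category 1.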
-
class
dexnet.learning.
Tensor
(shape, dtype=<type 'numpy.float32'>)¶ Abstraction for 4-D tensor objects.
-
add
(datapoint)¶ Adds the datapoint to the tensor if room is available.
-
data_slice
(slice_ind)¶ Returns a slice of datapoints
-
datapoint
(ind)¶ Returns the datapoint at the given index.
-
static
load
(filename, compressed=True)¶ Loads a tensor from disk.
-
reset
()¶ Resets the current index.
-
save
(filename, compressed=True)¶ Save a tensor to disk.
-
set_datapoint
(ind, datapoint)¶ Sets the value of the datapoint at the given index.
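The add/reset behaviour described above resembles a fixed-capacity buffer with a write cursor. The sketch below shows that pattern with a plain list; the class name `FixedBuffer`, the `cur_index` attribute, and the boolean return value of `add` are hypothetical and not taken from dexnet.

```python
class FixedBuffer:
    """Hypothetical fixed-capacity store with a write cursor."""

    def __init__(self, capacity):
        self.data = [None] * capacity
        self.cur_index = 0  # next free slot

    def add(self, datapoint):
        # Store the datapoint only if room is available.
        if self.cur_index >= len(self.data):
            return False
        self.data[self.cur_index] = datapoint
        self.cur_index += 1
        return True

    def datapoint(self, ind):
        return self.data[ind]

    def set_datapoint(self, ind, datapoint):
        self.data[ind] = datapoint

    def reset(self):
        # Rewind the cursor; old contents are overwritten by later adds.
        self.cur_index = 0
```

Resetting only rewinds the cursor, so the buffer can be refilled after its contents have been saved without reallocating storage.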
-
-
class
dexnet.learning.
TensorDataset
(filename, config, access_mode='WRITE')¶ Encapsulates learning datasets and different training and test splits of the data.
-
add
(datapoint)¶ Adds a datapoint to the file.
-
datapoint
(ind)¶ Loads a tensor datapoint for a given global index.
Parameters: ind (int) – global index in the tensor
Returns: the desired tensor datapoint
Return type: TensorDatapoint
-
datapoint_indices
¶ Returns an array of all dataset indices.
-
datapoint_indices_for_tensor
(tensor_index)¶ Returns the indices for all datapoints in the given tensor.
-
flush
()¶ Flushes the data tensors. Alternate handle to write.
-
generate_tensor_filename
(field_name, file_num, compressed=True)¶ Generate a filename for a tensor.
-
load_tensor
(field_name, file_num)¶ Loads a tensor for a given field and file num.
Parameters: - field_name (str) – the name of the field to load
- file_num (int) – the number of the file to load from
Returns: the desired tensor
Return type:
-
next
()¶ Read the next datapoint.
Returns: the next datapoint
Return type: TensorDatapoint
-
static
open
(dataset_dir)¶ Opens a tensor dataset.
-
split
(attribute, train_pct, val_pct)¶ Splits the dataset along the given attribute.
-
tensor_dir
¶ Return the tensor directory.
-
tensor_index
(datapoint_index)¶ Returns the index of the tensor containing the referenced datapoint.
-
tensor_indices
¶ Returns an array of all tensor indices.
-
write
()¶ Writes all tensors to the next file number.
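The relationship between global datapoint indices and tensor (file) indices can be sketched with integer arithmetic, assuming every tensor file holds the same fixed number of datapoints. That fixed size is an assumption of this sketch, and the free functions and their extra arguments below are hypothetical, not the methods documented above.

```python
def tensor_index(datapoint_index, datapoints_per_tensor):
    # Index of the tensor file that contains the given global datapoint.
    return datapoint_index // datapoints_per_tensor


def datapoint_indices_for_tensor(tensor_idx, datapoints_per_tensor,
                                 num_datapoints):
    # Global indices stored in one tensor file; the last file may be partial.
    start = tensor_idx * datapoints_per_tensor
    stop = min(start + datapoints_per_tensor, num_datapoints)
    return list(range(start, stop))
```

With 100 datapoints per tensor and 250 datapoints in total, datapoint 250 would fall in a fourth file, while the third file (index 2) holds indices 200 through 249.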
-