Image Classes¶

Image¶

class perception.Image(data, frame=’unspecified’)¶

Bases: object

Abstract wrapper class for images.

__init__(data, frame=’unspecified’)¶

Create an image from an array of data.

Parameters:

data (numpy.ndarray) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (e.g. R, G, B values). Alternatively, if the array is one-dimensional, it will be interpreted as an N by 1 image with a single element at each pixel, and if the array is two-dimensional, it will be interpreted as an N by M image with a single element at each pixel.
frame (str) – A string representing the frame of reference in which this image lies.

Raises:

ValueError – If the data is not a properly-formatted ndarray or frame is not a string.
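
The shape rules described for `data` can be sketched as follows. This is a dependency-free illustration that works on shape tuples only; the helper name `as_rows_cols_channels` is hypothetical and is not part of the perception package.

```python
def as_rows_cols_channels(shape):
    """Map a 1D, 2D, or 3D array shape to (rows, columns, channels).

    Hypothetical helper: it mirrors the rules described for the `data`
    argument and is not the library's own code.
    """
    if len(shape) == 1:
        # N values -> N-by-1 image with one element per pixel
        return (shape[0], 1, 1)
    if len(shape) == 2:
        # N-by-M values -> N-by-M image with one element per pixel
        return (shape[0], shape[1], 1)
    if len(shape) == 3:
        # Already rows x columns x channels
        return tuple(shape)
    raise ValueError("expected 1, 2, or 3 dimensions, got %d" % len(shape))
```

For example, a 480 by 640 array of depths would be treated as a 480 by 640 image with one channel.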

shape¶: tuple of int – The shape of the data array.

height¶: int – The number of rows in the image.

width¶: int – The number of columns in the image.

center¶: numpy.ndarray of int – The xy indices of the center of the image.

channels¶: int – The number of channels in each pixel. For example, RGB images have 3 channels.

type¶: numpy.dtype – The data type of the image’s elements.

raw_data¶: numpy.ndarray – The 3D array of data. The first dim is rows, the second is columns, and the third is pixel channels.

data¶: numpy.ndarray – The data array, but squeezed to get rid of extraneous dimensions.

frame¶: str – The frame of reference in which the image resides.

resize(size, interp)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
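
The three accepted forms of `size` can be illustrated with a small sketch. It assumes that a tuple is given in (height, width) order and that scaled sizes are rounded to the nearest integer; neither detail is stated above, and the helper name `output_size` is hypothetical.

```python
def output_size(height, width, size):
    """Compute the resized (height, width) for an int, float, or tuple size.

    Assumptions (not confirmed by the documentation): tuples are ordered
    (height, width) and scaled sizes are rounded to the nearest integer.
    """
    if isinstance(size, int):
        scale = size / 100.0      # int: percentage of current size
    elif isinstance(size, float):
        scale = size              # float: fraction of current size
    else:
        return (size[0], size[1])  # tuple: explicit output size
    return (int(round(height * scale)), int(round(width * scale)))
```

Under these assumptions, `output_size(480, 640, 50)` and `output_size(480, 640, 0.5)` describe the same result.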

transform(translation, theta, method=’opencv’)¶

Create a new image by translating and rotating the current image.

Parameters:	translation (`numpy.ndarray` of float) – The XY translation vector. theta (float) – Rotation angle in radians, with positive meaning counter-clockwise. method (`str`) – Method to use for image transformations (opencv or scipy)
Returns:	An image of the same type that has been rotated and translated.
Return type:	`Image`

gradients()¶

Return the gradient as a pair of numpy arrays.

Returns:	The gradients of the image along each dimension.
Return type:	`tuple` of `numpy.ndarray` of float

ij_to_linear(i, j)¶

Converts row / column coordinates to linear indices.

Parameters:	i (`numpy.ndarray` of int) – A list of row coordinates. j (`numpy.ndarray` of int) – A list of column coordinates.
Returns:	A list of linear coordinates.
Return type:	`numpy.ndarray` of int

linear_to_ij(linear_inds)¶

Converts linear indices to row and column coordinates.

Parameters:	linear_inds (`numpy.ndarray` of int) – A list of linear coordinates.
Returns:	A 2D ndarray whose first entry is the list of row indices and whose second entry is the list of column indices.
Return type:	`numpy.ndarray` of int
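
The two conversions are inverses of each other. The sketch below assumes row-major ordering (linear index = row * width + column), which the text above does not state explicitly; it uses plain lists and takes the image width as an explicit argument rather than reading it from an Image.

```python
def ij_to_linear(i, j, width):
    """Convert row/column coordinates to linear indices (row-major assumed)."""
    return [row * width + col for row, col in zip(i, j)]


def linear_to_ij(linear_inds, width):
    """Convert linear indices back to [rows, columns] (row-major assumed)."""
    rows = [k // width for k in linear_inds]
    cols = [k % width for k in linear_inds]
    return [rows, cols]
```

With a width of 4, the pixels (0, 3), (1, 0), and (2, 1) map to the linear indices 3, 4, and 9, and converting those indices back recovers the original coordinates.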

mask_by_ind(inds)¶

Create a new image by zeroing out data at locations not in the given indices.

Parameters:	inds (`numpy.ndarray` of int) – A 2D ndarray whose first entry is the list of row indices and whose second entry is the list of column indices. Data at these indices is preserved; data everywhere else is set to zero.
Returns:	A new Image of the same type, with data not indexed by inds set to zero.
Return type:	`Image`
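
A minimal sketch of this masking on a single-channel image stored as nested lists. The function name `mask_by_ind_sketch` is hypothetical, and the real method operates on the image's ndarray rather than on lists.

```python
def mask_by_ind_sketch(image, inds):
    """Return a copy of `image` with every pixel not listed in `inds` zeroed.

    `inds` is [rows, columns], matching the layout described above.
    """
    keep = set(zip(inds[0], inds[1]))
    return [
        [value if (r, c) in keep else 0 for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]
```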

mask_by_linear_ind(linear_inds)¶

Create a new image by zeroing out data at locations not in the given indices.

Parameters:	linear_inds (`numpy.ndarray` of int) – A list of linear coordinates.
Returns:	A new Image of the same type, with data not indexed by inds set to zero.
Return type:	`Image`

is_same_shape(other_im, check_channels=False)¶

Checks if two images have the same height and width (and optionally channels).

Parameters:	other_im (`Image`) – The image to compare against this one. check_channels (bool) – Whether or not to check equality of the channels.
Returns:	True if the images are the same shape, False otherwise.
Return type:	bool

static median_images(images)¶

Create a median Image from a list of Images.

Parameters:	images (`list` of `Image`) – A list of Image objects.

Returns:	A new Image of the same type whose data is the median of all of the images’ data.
Return type:	`Image`
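
The per-pixel median can be sketched with the standard library alone. This illustration assumes single-channel images of identical size stored as nested lists; the name `median_of_images` is hypothetical.

```python
from statistics import median


def median_of_images(images):
    """Return the pixel-wise median of equally sized single-channel images."""
    height, width = len(images[0]), len(images[0][0])
    return [
        [median(im[r][c] for im in images) for c in range(width)]
        for r in range(height)
    ]
```

A median over several frames of a static scene tends to suppress values that appear in only a few of the frames, which is a common reason to prefer it over a mean.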

static min_images(images)¶

Create a min Image from a list of Images.

Parameters:	images (`list` of `Image`) – A list of Image objects.

Returns:	A new Image of the same type whose data is the min of all of the images’ data.
Return type:	`Image`

__getitem__(indices)¶

Index the image’s data array.

Parameters:	indices (int or `tuple` of int) – int - A linear index. tuple - An ordered index in row, column, and (optionally) channel order.
Returns:	The indexed item.
Return type:	item
Raises:	`ValueError` – If the index is poorly formatted or out of bounds.

apply(method, *args, **kwargs)¶

Create a new image by applying a function to this image’s data.

Parameters:	method (`function`) – A function to call on the data. This takes in a ndarray as its first argument and optionally takes other arguments. It should return a modified data ndarray. args (arguments) – Additional args for method. kwargs (keyword arguments) – Additional keyword arguments for method.
Returns:	A new Image of the same type with new data generated by calling method on the current image’s data.
Return type:	`Image`

copy()¶

Returns a copy of this image.

Returns:	copy of this image
Return type:	`Image`

crop(height, width, center_i=None, center_j=None)¶

Crop the image centered around center_i, center_j.

Parameters:	height (int) – The height of the desired image. width (int) – The width of the desired image. center_i (int) – The center height point at which to crop. If not specified, the center of the image is used. center_j (int) – The center width point at which to crop. If not specified, the center of the image is used.
Returns:	A cropped Image of the same type.
Return type:	`Image`

focus(height, width, center_i=None, center_j=None)¶

Zero out all of the image outside of a crop box.

Parameters:	height (int) – The height of the desired crop box. width (int) – The width of the desired crop box. center_i (int) – The center height point of the crop box. If not specified, the center of the image is used. center_j (int) – The center width point of the crop box. If not specified, the center of the image is used.
Returns:	A new Image of the same type and size that is zeroed out except within the crop box.
Return type:	`Image`
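
The relationship between crop and focus can be sketched as below: both use the same box, but focus keeps the image size and zeroes everything outside it. The exact rounding and the clamping of the box to the image bounds are assumptions, and both function names are hypothetical.

```python
def crop_bounds(im_height, im_width, height, width, center_i=None, center_j=None):
    """Return (row_start, row_end, col_start, col_end) of the crop box.

    Assumptions: the default center is the integer image center, and the
    box is clamped to the image bounds.
    """
    if center_i is None:
        center_i = im_height // 2
    if center_j is None:
        center_j = im_width // 2
    row_start = max(0, center_i - height // 2)
    col_start = max(0, center_j - width // 2)
    row_end = min(im_height, row_start + height)
    col_end = min(im_width, col_start + width)
    return row_start, row_end, col_start, col_end


def focus_sketch(image, height, width, center_i=None, center_j=None):
    """Zero out every pixel of a nested-list image outside the crop box."""
    r0, r1, c0, c1 = crop_bounds(len(image), len(image[0]), height, width,
                                 center_i, center_j)
    return [
        [v if r0 <= r < r1 and c0 <= c < c1 else 0 for c, v in enumerate(row)]
        for r, row in enumerate(image)
    ]
```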

center_nonzero()¶

Recenters the image on the mean of the coordinates of nonzero pixels.

Returns:	A new Image of the same type and size that is re-centered at the mean location of the non-zero pixels.
Return type:	`Image`

nonzero_pixels()¶

Return an array of the nonzero pixels.

Returns:	Nx2 array of the nonzero pixels
Return type:	`numpy.ndarray`

zero_pixels()¶

Return an array of the zero pixels.

Returns:	Nx2 array of the zero pixels
Return type:	`numpy.ndarray`

finite_pixels()¶

Return an array of the finite pixels.

Returns:	Nx2 array of the finite pixels
Return type:	`numpy.ndarray`

nonzero_data()¶

Returns the values in the image at the nonzero pixels

Returns:	NxC array of the nonzero data
Return type:	`numpy.ndarray`

replace_zeros(val, zero_thresh=0.0)¶

Replaces all zeros in the image with a specified value.

Parameters:	val (image dtype) – The value to replace zeros with.

save(filename)¶

Writes the image to a file.

Parameters:	filename (`str`) – The file to save the image to. Must be one of .png, .jpg, .npy, or .npz.
Raises:	`ValueError` – If an unsupported file type is specified.

savefig(output_path, title, dpi=400, format=’png’, cmap=None)¶

Write the image to a file using pyplot.

Parameters:	output_path (`str`) – The directory in which to place the file. title (`str`) – The title of the file in which to save the image. dpi (int) – The resolution in dots per inch. format (`str`) – The file format to save. Available options include .png, .pdf, .ps, .eps, and .svg. cmap (`Colormap`, optional) – A Colormap object for pyplot.

static load_data(filename)¶

Loads a data matrix from a given file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz.
Returns:	The data array read from the file.
Return type:	`numpy.ndarray`

ColorImage¶

class perception.ColorImage(data, frame=’unspecified’)¶

Bases: perception.image.Image

An RGB color image.

__init__(data, frame=’unspecified’)¶

Create a color image from an array of data.

Parameters:

data (numpy.ndarray) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (i.e. R,G,B values). Alternatively, the image may have a single channel, in which case it is interpreted as greyscale.
frame (str) – A string representing the frame of reference in which this image lies.

Raises:

ValueError – If the data is not a properly-formatted ndarray or frame is not a string.

r_data¶: numpy.ndarray of uint8 – The red-channel data.

g_data¶: numpy.ndarray of uint8 – The green-channel data.

b_data¶: numpy.ndarray of uint8 – The blue-channel data.

swap_channels(channel_swap)¶

Swaps the two channels specified in the tuple.

Parameters:	channel_swap (`tuple` of int) – the two channels to swap
Returns:	color image with channels swapped
Return type:	`ColorImage`

resize(size, interp=’bilinear’)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
Returns:	The resized image.
Return type:	`ColorImage`

find_chessboard(sx=6, sy=9)¶

Finds the corners of an sx X sy chessboard in the image.

Parameters:	sx (int) – Number of chessboard corners in x-direction. sy (int) – Number of chessboard corners in y-direction.
Returns:	A list containing the 2D points of the corners of the detected chessboard, or None if no chessboard found.
Return type:	`list` of `numpy.ndarray`

mask_binary(binary_im)¶

Create a new image by zeroing out data at locations where binary_im == 0.0.

Parameters:	binary_im (`BinaryImage`) – A BinaryImage of the same size as this image, with pixel values of either zero or one. Wherever binary_im has zero pixels, the corresponding pixels of the new image are set to zero.
Returns:	A new Image of the same type, masked by the given binary image.
Return type:	`Image`

foreground_mask(tolerance, ignore_black=True, use_hsv=False, scale=8, bgmodel=None)¶

Creates a binary image mask for the foreground of an image against a uniformly colored background. The background is assumed to be the mode value of the histogram for each of the color channels.

Parameters:	tolerance (int) – A +/- level from the detected background color. Pixels within this range will be classified as background pixels and masked out. ignore_black (bool) – If True, the zero pixels will be ignored when computing the background model. use_hsv (bool) – If True, image will be converted to HSV for background model generation. scale (int) – Size of background histogram bins – there will be 255/scale bins in the color histogram for each channel. bgmodel (`list` of int) – A list containing the red, green, and blue channel modes of the background. If this is None, a background model will be generated using the other parameters.
Returns:	A binary image that masks out the background from the current ColorImage.
Return type:	`BinaryImage`

background_model(ignore_black=True, use_hsv=False, scale=8)¶

Creates a background model for the given image. The background color is given by the modes of each channel’s histogram.

Parameters:

ignore_black (bool) – If True, the zero pixels will be ignored when computing the background model.
use_hsv (bool) – If True, image will be converted to HSV for background model generation.
scale (int) – Size of background histogram bins – there will be 255/scale bins in the color histogram for each channel.

Returns:

A list containing the red, green, and blue channel modes of the background.
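
The idea behind the background model and the tolerance test in foreground_mask can be sketched as follows. This is a simplified illustration, not the library's implementation: it ignores zeros per channel rather than per pixel, reports the lower edge of the most populated bin, and treats a pixel as background only when every channel is within the tolerance. All three choices are assumptions, and both function names are hypothetical.

```python
from collections import Counter


def channel_mode(values, scale=8, ignore_black=True):
    """Return the lower edge of the most populated histogram bin.

    `values` holds one channel's intensities (0-255); bins are `scale` wide.
    """
    if ignore_black:
        values = [v for v in values if v != 0]
    bins = Counter(v // scale for v in values)
    most_common_bin = bins.most_common(1)[0][0]
    return most_common_bin * scale


def is_background(pixel, bgmodel, tolerance):
    """True if every channel of `pixel` is within `tolerance` of the model."""
    return all(abs(v - m) <= tolerance for v, m in zip(pixel, bgmodel))
```

A foreground mask would then mark as foreground every pixel for which `is_background` returns False.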

draw_box(box)¶

Draw a white box on the image.

Parameters:	box (`autolab_core.Box`) – A 2D box to draw in the image.

Returns:	A new image that is the same as the current one, but with the white box drawn in.
Return type:	`ColorImage`

nonzero_hsv_data()¶

Computes non zero hsv values.

Returns:	array of the hsv values for the image
Return type:	`numpy.ndarray`

segment_kmeans(rgb_weight, num_clusters, hue_weight=0.0)¶

Segment a color image using KMeans based on spatial and color distances. Black pixels will automatically be assigned to their own ‘background’ cluster.

Parameters:	rgb_weight (float) – weighting of RGB distance relative to spatial and hue distance num_clusters (int) – number of clusters to use hue_weight (float) – weighting of hue from hsv relative to spatial and RGB distance
Returns:	image containing the segment labels
Return type:	`SegmentationImage`

inpaint(win_size=3, rescale_factor=1.0)¶

Fills in the zero pixels in the image.

Parameters:	win_size (int) – size of window to use for inpainting rescale_factor (float) – amount to rescale the image for inpainting, smaller numbers increase speed
Returns:	color image with zero pixels filled in
Return type:	`ColorImage`

to_binary(threshold=0.0)¶

Converts the color image to binary.

Returns:	Binary image corresponding to the nonzero px of the original image
Return type:	`BinaryImage`

to_grayscale()¶

Converts the color image to grayscale using OpenCV.

Returns:	Grayscale image corresponding to original color image.
Return type:	`GrayscaleImage`

static open(filename, frame=’unspecified’)¶

Creates a ColorImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new color image.
Return type:	`ColorImage`

DepthImage¶

class perception.DepthImage(data, frame=’unspecified’)¶

Bases: perception.image.Image

A depth image in which individual pixels have a single floating-point depth channel.

__init__(data, frame=’unspecified’)¶

Create a depth image from an array of data.

Parameters:	data (`numpy.ndarray`) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (depths as floating point numbers). frame (`str`) – A string representing the frame of reference in which this image lies.
Raises:	`ValueError` – If the data is not a properly-formatted ndarray or frame is not a string.

resize(size, interp=’bilinear’)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
Returns:	The resized image.
Return type:	`DepthImage`

threshold(front_thresh=0.0, rear_thresh=100.0)¶

Creates a new DepthImage by setting all depths less than front_thresh or greater than rear_thresh to 0.

Parameters:	front_thresh (float) – The lower-bound threshold. rear_thresh (float) – The upper bound threshold.
Returns:	A new DepthImage created from the thresholding operation.
Return type:	`DepthImage`
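
A minimal sketch of this thresholding on a depth map stored as nested lists. The name `threshold_depths` is hypothetical, and treating depths exactly equal to a threshold as kept is an assumption.

```python
def threshold_depths(depths, front_thresh=0.0, rear_thresh=100.0):
    """Zero out depths below front_thresh or above rear_thresh."""
    return [
        [d if front_thresh <= d <= rear_thresh else 0.0 for d in row]
        for row in depths
    ]
```

This kind of range filter is typically used to discard readings that are too close to or too far from the sensor to be useful.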

threshold_gradients(grad_thresh)¶

Creates a new DepthImage by zeroing out all depths where the magnitude of the gradient at that point is greater than grad_thresh.

Parameters:	grad_thresh (float) – A threshold for the gradient magnitude.
Returns:	A new DepthImage created from the thresholding operation.
Return type:	`DepthImage`

threshold_gradients_pctile(thresh_pctile, min_mag=0.0)¶

Creates a new DepthImage by zeroing out all depths where the magnitude of the gradient at that point is greater than some percentile of all gradients.

Parameters:	thresh_pctile (float) – percentile to threshold all gradients above min_mag (float) – minimum magnitude of the gradient
Returns:	A new DepthImage created from the thresholding operation.
Return type:	`DepthImage`

inpaint(rescale_factor=1.0)¶

Fills in the zero pixels in the image.

Parameters:	rescale_factor (float) – amount to rescale the image for inpainting, smaller numbers increase speed
Returns:	depth image with zero pixels filled in
Return type:	`DepthImage`

mask_binary(binary_im)¶

Create a new image by zeroing out data at locations where binary_im == 0.0.

Parameters:	binary_im (`BinaryImage`) – A BinaryImage of the same size as this image, with pixel values of either zero or one. Wherever binary_im has zero pixels, the corresponding pixels of the new image are set to zero.
Returns:	A new Image of the same type, masked by the given binary image.
Return type:	`Image`

pixels_farther_than(depth_im)¶

Returns the pixels that are farther away than those in the corresponding depth image.

Parameters:	depth_im (`DepthImage`) – The depth image to compare against.
Returns:	the pixels
Return type:	`numpy.ndarray`

combine_with(depth_im)¶

Replaces all zeros in the source depth image with the corresponding values from a different depth image.

Parameters:	depth_im (`DepthImage`) – depth image to combine with
Returns:	the combined depth image
Return type:	`DepthImage`
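
The fill-in behavior described here can be sketched on nested lists. This follows the description literally (zeros in the source are replaced, nonzero values are kept) and is not taken from the library's code; the name `combine_depths` is hypothetical.

```python
def combine_depths(source, other):
    """Fill zero depths in `source` with the value at the same pixel in `other`."""
    return [
        [o if s == 0 else s for s, o in zip(src_row, other_row)]
        for src_row, other_row in zip(source, other)
    ]
```

A pixel that is zero in both images stays zero in the result.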

to_binary(threshold=0.0)¶

Creates a BinaryImage from the depth image. Points where the depth is greater than threshold are converted to ones, and all other points are zeros.

Parameters:	threshold (float) – The depth threshold.
Returns:	A BinaryImage where all 1 points had a depth greater than threshold in the DepthImage.
Return type:	`BinaryImage`

to_color(normalize=False)¶

Convert to a color image.

Parameters:	normalize (bool) – whether or not to normalize by the maximum depth
Returns:	color image corresponding to the depth image
Return type:	`ColorImage`

to_float()¶

Converts to 32-bit data.

Returns:	depth image with 32 bit float data
Return type:	`DepthImage`

point_normal_cloud(camera_intr)¶

Computes a PointNormalCloud from the depth image.

Parameters:	camera_intr (`CameraIntrinsics`) – The camera parameters on which this depth image was taken.
Returns:	A PointNormalCloud created from the depth image.
Return type:	`autolab_core.PointNormalCloud`

static open(filename, frame=’unspecified’)¶

Creates a DepthImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new depth image.
Return type:	`DepthImage`

IrImage¶

class perception.IrImage(data, frame=’unspecified’)¶

Bases: perception.image.Image

An IR image in which individual pixels have a single uint16 channel.

__init__(data, frame=’unspecified’)¶

Create an IR image from an array of data.

Parameters:	data (`numpy.ndarray`) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (IR values as uint16’s). frame (`str`) – A string representing the frame of reference in which this image lies.
Raises:	`ValueError` – If the data is not a properly-formatted ndarray or frame is not a string.

resize(size, interp=’bilinear’)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
Returns:	The resized image.
Return type:	`IrImage`

static open(filename, frame=’unspecified’)¶

Creates an IrImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new IR image.
Return type:	`IrImage`

GrayscaleImage¶

class perception.GrayscaleImage(data, frame=’unspecified’)

Bases: perception.image.Image

A grayscale image in which individual pixels have a single intensity channel.

__init__(data, frame=’unspecified’)

Create a grayscale image from an array of data.

Parameters:	data (`numpy.ndarray`) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (grayscale intensities). frame (`str`) – A string representing the frame of reference in which this image lies.
Raises:	`ValueError` – If the data is not a properly-formatted ndarray or frame is not a string.

resize(size, interp=’bilinear’)

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
Returns:	The resized image.
Return type:	`GrayscaleImage`

static open(filename, frame=’unspecified’)

Creates a GrayscaleImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new grayscale image.
Return type:	`GrayscaleImage`

BinaryImage¶

class perception.BinaryImage(data, frame=’unspecified’, threshold=128)¶

Bases: perception.image.Image

A binary image in which individual pixels are either black or white (0 or 255).

__init__(data, frame=’unspecified’, threshold=128)¶

Create a BinaryImage image from an array of data.

Parameters:

data (numpy.ndarray) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (only one channel, all uint8). The data array will be thresholded and will end up only containing elements that are 255 or 0.
threshold (int) – A threshold value. Any value in the data array greater than threshold will be set to 255, and all others will be set to 0.
frame (str) – A string representing the frame of reference in which this image lies.

Raises:

ValueError – If the data is not a properly-formatted ndarray or frame is not a string.
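
The thresholding applied at construction time can be sketched as below, using nested lists instead of an ndarray. The function name `binarize` is hypothetical.

```python
def binarize(data, threshold=128):
    """Set values greater than `threshold` to 255 and all others to 0."""
    return [[255 if v > threshold else 0 for v in row] for row in data]
```

Note that the comparison is strict: with the default threshold, a value of exactly 128 becomes 0.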

resize(size, interp=’bilinear’)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
Returns:	The resized image.
Return type:	`BinaryImage`

mask_binary(binary_im)¶

Takes AND operation with other binary image.

Parameters:	binary_im (`BinaryImage`) – binary image for and operation
Returns:	AND of this binary image and other image
Return type:	`BinaryImage`

prune_contours(area_thresh=1000.0, dist_thresh=20, preserve_topology=True)¶

Removes all white connected components with area less than area_thresh.

Parameters:	area_thresh (float) – The minimum area for which a white connected component will not be zeroed out. dist_thresh (int) – If a connected component is within dist_thresh of the top of the image, it will not be pruned out, regardless of its area.
Returns:	The new pruned binary image.
Return type:	`BinaryImage`

find_contours(min_area=0.0, max_area=inf)¶

Returns a list of connected components with an area between min_area and max_area.

Parameters:	min_area (float) – The minimum area for a contour. max_area (float) – The maximum area for a contour.
Returns:	A list of resulting contours
Return type:	`list` of `Contour`

contour_mask(contour)¶: Generates a binary image with only the given contour filled in.

boundary_map()¶

Computes the boundary pixels in the image and sets them to nonzero values.

Returns:	binary image with nonzeros on the boundary of the original image
Return type:	`BinaryImage`

closest_nonzero_pixel(pixel, direction, w=13, t=0.5)¶

Starting at pixel, moves pixel by direction * t until there is a non-zero pixel within a radius w of pixel. Then, returns pixel.

Parameters:	pixel (`numpy.ndarray` of float) – The initial pixel location at which to start. direction (`numpy.ndarray` of float) – The 2D direction vector in which to move pixel. w (int) – A circular radius in which to check for non-zero pixels. As soon as the current pixel has some non-zero pixel within a radius w of it, this function returns the current pixel location. t (float) – The step size with which to move pixel along direction.
Returns:	The first pixel location along the direction vector at which there exists some non-zero pixel within a radius w.
Return type:	`numpy.ndarray` of float
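
The search described above can be sketched as a simple stepping loop. This is an illustration under stated assumptions, not the library's code: pixel and direction are taken to be in (row, column) order, the image is a nested list, and the `max_steps` argument is a hypothetical safeguard added so that the loop always terminates.

```python
import math


def closest_nonzero_pixel_sketch(image, pixel, direction, w=13, t=0.5,
                                 max_steps=1000):
    """Step from `pixel` along `direction` until a nonzero pixel is within w.

    Returns the (row, column) location reached, or None if nothing is found
    within `max_steps` steps (a hypothetical limit, not a library parameter).
    """
    height, width = len(image), len(image[0])
    y, x = float(pixel[0]), float(pixel[1])
    for _ in range(max_steps):
        ci, cj = int(round(y)), int(round(x))
        # Only scan the square window that can contain points within radius w.
        for i in range(max(0, ci - w), min(height, ci + w + 1)):
            for j in range(max(0, cj - w), min(width, cj + w + 1)):
                if image[i][j] != 0 and math.hypot(i - y, j - x) <= w:
                    return (y, x)
        y += t * direction[0]
        x += t * direction[1]
    return None
```

A smaller step size `t` gives a more precise stopping location at the cost of more iterations.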

add_frame(left_boundary, right_boundary, upper_boundary, lower_boundary)¶

Adds a frame to the image, i.e. turns the boundaries white.

Parameters:	left_boundary (int) – the leftmost boundary of the frame right_boundary (int) – the rightmost boundary of the frame (must be greater than left_boundary) upper_boundary (int) – the upper boundary of the frame lower_boundary (int) – the lower boundary of the frame (must be greater than upper_boundary)
Returns:	binary image with white (255) on the boundaries
Return type:	`BinaryImage`

most_free_pixel()¶

Find the black pixel with the largest distance from the white pixels.

Returns:	2-vector containing the most free pixel
Return type:	`numpy.ndarray`

diff_with_target(binary_im)¶

Creates a color image to visualize the overlap between two images. Nonzero pixels that match in both images are green. Nonzero pixels of this image that aren’t in the other image are yellow. Nonzero pixels of the other image that aren’t in this image are red.

Parameters:	binary_im (`BinaryImage`) – binary image to take the difference with
Returns:	color image to visualize the image difference
Return type:	`ColorImage`
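
The color coding can be sketched on nested lists of binary values. The RGB triples chosen for each color, and the use of black where both images are zero, are assumptions; the function name is hypothetical.

```python
GREEN, YELLOW, RED, BLACK = (0, 255, 0), (255, 255, 0), (255, 0, 0), (0, 0, 0)


def diff_colors(this_im, other_im):
    """Return per-pixel RGB tuples describing the overlap of two binary images."""
    def color(a, b):
        if a and b:
            return GREEN    # nonzero in both images
        if a:
            return YELLOW   # nonzero only in this image
        if b:
            return RED      # nonzero only in the other image
        return BLACK        # zero in both (assumed to stay black)

    return [
        [color(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(this_im, other_im)
    ]
```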

num_adjacent(i, j)¶

Counts the number of adjacent nonzero pixels to a given pixel.

Parameters:	i (int) – row index of query pixel j (int) – col index of query pixel
Returns:	number of adjacent nonzero pixels
Return type:	int
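
A sketch of the neighbor count on a nested-list image. It assumes 4-connectivity (up, down, left, right) and skips neighbors that fall outside the image; the description above does not say which neighborhood is used, so both points are assumptions.

```python
def num_adjacent_sketch(image, i, j):
    """Count nonzero 4-connected neighbors of pixel (i, j)."""
    height, width = len(image), len(image[0])
    count = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        # Neighbors outside the image are ignored rather than wrapped.
        if 0 <= ni < height and 0 <= nj < width and image[ni][nj] != 0:
            count += 1
    return count
```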

to_sdf()¶

Converts the 2D image to a 2D signed distance field.

Returns:	2D float array of the signed distance field
Return type:	`numpy.ndarray`

to_color()¶

Creates a ColorImage from the binary image.

Returns:	The newly-created color image.
Return type:	`ColorImage`

static open(filename, frame=’unspecified’)¶

Creates a BinaryImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new binary image.
Return type:	`BinaryImage`

SegmentationImage¶

class perception.SegmentationImage(data, frame=’unspecified’)¶

Bases: perception.image.Image

An image containing integer-valued segment labels.

__init__(data, frame=’unspecified’)¶

Create a Segmentation image from an array of data.

Parameters:	data (`numpy.ndarray`) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (only one channel, all uint8). The integer-valued data should correspond to segment labels. frame (`str`) – A string representing the frame of reference in which this image lies.
Raises:	`ValueError` – If the data is not a properly-formatted ndarray or frame is not a string.

border_pixels(grad_sigma=0.5, grad_lower_thresh=0.1, grad_upper_thresh=1.0)¶

Returns the pixels on the boundary between all segments, excluding the zero segment.

Parameters:	grad_sigma (float) – standard deviation used for gaussian gradient filter grad_lower_thresh (float) – lower threshold on gradient threshold used to determine the boundary pixels grad_upper_thresh (float) – upper threshold on gradient threshold used to determine the boundary pixels
Returns:	Nx2 array of pixels on the boundary
Return type:	`numpy.ndarray`

segment_mask(segnum)¶

Returns a binary image of just the segment corresponding to the given number.

Parameters:	segnum (int) – the number of the segment to generate a mask for
Returns:	binary image data
Return type:	`BinaryImage`
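
Extracting one segment as a mask can be sketched on a nested list of labels. Using 255 for pixels inside the segment follows the BinaryImage convention of 0 or 255; the function name is hypothetical.

```python
def segment_mask_sketch(labels, segnum):
    """Return a 0/255 mask that is 255 wherever the label equals `segnum`."""
    return [[255 if v == segnum else 0 for v in row] for row in labels]
```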

resize(size, interp=’nearest’)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)

static open(filename, frame=’unspecified’)¶: Opens a segmentation image

PointCloudImage¶

class perception.PointCloudImage(data, frame=’unspecified’)¶

Bases: perception.image.Image

A point cloud image in which individual pixels have three float channels.

__init__(data, frame=’unspecified’)¶

Create a PointCloudImage image from an array of data.

Parameters:	data (`numpy.ndarray`) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (three floats). frame (`str`) – A string representing the frame of reference in which this image lies.
Raises:	`ValueError` – If the data is not a properly-formatted ndarray or frame is not a string.

resize(size, interp=’bilinear’)¶

Resize the image.

Parameters:	size (int, float, or tuple) – int - Percentage of current size. float - Fraction of current size. tuple - Size of the output image. interp (`str`, optional) – Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’, or ‘cubic’)
Returns:	The resized image.
Return type:	`PointCloudImage`

to_point_cloud()¶

Convert the image to a PointCloud object.

Returns:	The corresponding PointCloud.
Return type:	`autolab_core.PointCloud`

normal_cloud_im()¶

Generate a NormalCloudImage from the PointCloudImage.

Returns:	The corresponding NormalCloudImage.
Return type:	`NormalCloudImage`

static open(filename, frame=’unspecified’)¶

Creates a PointCloudImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new PointCloudImage.
Return type:	`PointCloudImage`

NormalCloudImage¶

class perception.NormalCloudImage(data, frame=’unspecified’)¶

Bases: perception.image.Image

A normal cloud image in which individual pixels have three float channels.

__init__(data, frame=’unspecified’)¶

Create a NormalCloudImage image from an array of data.

Parameters:	data (`numpy.ndarray`) – An array of data with which to make the image. The first dimension of the data should index rows, the second columns, and the third individual pixel elements (three floats). frame (`str`) – A string representing the frame of reference in which this image lies.
Raises:	`ValueError` – If the data is not a properly-formatted ndarray or frame is not a string.

resize(size, interp=’bilinear’)¶

This method is not implemented for NormalCloudImage.

Raises:	`NotImplementedError`

to_normal_cloud()¶

Convert the image to a NormalCloud object.

Returns:	The corresponding NormalCloud.
Return type:	`autolab_core.NormalCloud`

static open(filename, frame=’unspecified’)¶

Creates a NormalCloudImage from a file.

Parameters:	filename (`str`) – The file to load the data from. Must be one of .png, .jpg, .npy, or .npz. frame (`str`) – A string representing the frame of reference in which the new image lies.
Returns:	The new NormalCloudImage.
Return type:	`NormalCloudImage`

RenderMode¶

class perception.RenderMode¶

Bases: object

Supported rendering modes.

ObjectRender¶

class perception.ObjectRender(image, T_camera_world=RigidTransform(rotation=[[ 1. 0. 0.] [ 0. 1. 0.] [ 0. 0. 1.]], translation=[ 0. 0. 0.], from_frame=camera, to_frame=table), obj_key=None, stable_pose=None)¶

Bases: object

Class to encapsulate images of an object rendered from a virtual camera.

Note

In this class, the table’s frame of reference is the ‘world’ frame for the renderer.

__init__(image, T_camera_world=RigidTransform(rotation=[[ 1. 0. 0.] [ 0. 1. 0.] [ 0. 0. 1.]], translation=[ 0. 0. 0.], from_frame=camera, to_frame=table), obj_key=None, stable_pose=None)¶

Create an ObjectRender.

Parameters:	image (`Image`) – The image to be encapsulated. T_camera_world (`autolab_core.RigidTransform`) – A rigid transform from camera to world coordinates (positions the camera in the world). TODO – this should be renamed. obj_key (`str`, optional) – A string identifier for the object being rendered. stable_pose (`meshpy.StablePose`) – The object’s stable pose.

T_obj_camera¶

Returns the transformation from camera to object when the object is in the given stable pose.

Returns:	The desired transform.
Return type:	`autolab_core.RigidTransform`