mmdet3d.apis ¶

tensor¶

Float matrix with shape (N, box_dim).

Type: Tensor

box_dim¶

Integer indicating the dimension of a box. Each row is (x, y, z, x_size, y_size, z_size, yaw, …).

Type: int

with_yaw¶

Whether the box has a yaw rotation. If False, the value of yaw will be set to 0 as minmax boxes.

Type: bool

property bev: torch.Tensor¶

2D BEV box of each box with rotation in XYWHR format, in shape (N, 5).

Type: Tensor

property bottom_center: torch.Tensor¶

A tensor with center of each box in shape (N, 3).

Type: Tensor

property bottom_height: torch.Tensor¶

A vector with bottom height of each box in shape (N, ).

Type: Tensor

classmethod cat(boxes_list)[source]¶

Concatenate a list of Boxes into a single Boxes.

Parameters: boxes_list (Sequence[BaseInstance3DBoxes]) – List of boxes.
Returns: The concatenated boxes.
Return type: BaseInstance3DBoxes

property center: torch.Tensor¶

Calculate the center of all the boxes.

Note

In MMDetection3D’s convention, the bottom center is usually taken as the default center.

The relative position of the center differs between kinds of boxes, e.g., the relative center of a box is (0.5, 1.0, 0.5) in camera coordinates and (0.5, 0.5, 0) in LiDAR coordinates. It is recommended to use bottom_center or gravity_center for clearer usage.

Returns: A tensor with center of each box in shape (N, 3).
Return type: Tensor
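
The relationship between the two recommended centers can be sketched as follows. This is a minimal standalone illustration, not mmdet3d code: the helper names are hypothetical, and it assumes the axis conventions described for each box mode in this reference (z points up in LiDAR and depth coordinates, y points down in camera coordinates).

```python
def gravity_center_lidar(bottom_center, height):
    """Hypothetical helper: z is up, so the gravity center sits half a height above the bottom."""
    x, y, z = bottom_center
    return (x, y, z + 0.5 * height)


def gravity_center_camera(bottom_center, height):
    """Hypothetical helper: y is down, so the gravity center sits half a height towards -y."""
    x, y, z = bottom_center
    return (x, y - 0.5 * height, z)


print(gravity_center_lidar((1.0, 2.0, 0.0), 1.5))
print(gravity_center_camera((1.0, 1.5, 10.0), 1.5))
```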

clone()[source]¶

Clone the boxes.

Returns: Box object with the same properties as self.
Return type: BaseInstance3DBoxes

abstract convert_to(dst, rt_mat=None, correct_yaw=False)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Box mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
correct_yaw (bool) – Whether to convert the yaw angle to the target coordinate. Defaults to False.

Returns

The converted box of the same type in the dst mode.

Return type

BaseInstance3DBoxes

property corners: torch.Tensor¶

A tensor with 8 corners of each box in shape (N, 8, 3).

Type: Tensor

cpu()[source]¶

Convert current boxes to cpu device.

Returns: A new boxes object on the cpu device.
Return type: BaseInstance3DBoxes

cuda(*args, **kwargs)[source]¶

Convert current boxes to cuda device.

Returns: A new boxes object on the cuda device.
Return type: BaseInstance3DBoxes

detach()[source]¶

Detach the boxes.

Returns: Box object with the same properties as self.
Return type: BaseInstance3DBoxes

property device: torch.device¶

The device that the boxes are on.

Type: torch.device

property dims: torch.Tensor¶

Size dimensions of each box in shape (N, 3).

Type: Tensor

abstract flip(bev_direction='horizontal', points=None)[source]¶

Flip the boxes in BEV along given BEV direction.

Parameters

bev_direction (str) – Direction by which to flip. Can be chosen from ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.
points (Tensor or np.ndarray or BasePoints, optional) – Points to flip. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the flipped points.

Return type

Tensor or np.ndarray or BasePoints or None

property gravity_center: torch.Tensor¶

A tensor with center of each box in shape (N, 3).

Type: Tensor

property height: torch.Tensor¶

A vector with height of each box in shape (N, ).

Type: Tensor

classmethod height_overlaps(boxes1, boxes2)[source]¶

Calculate height overlaps of two boxes.

Note

This function calculates the height overlaps between boxes1 and boxes2. boxes1 and boxes2 should be of the same type.

Parameters

boxes1 (BaseInstance3DBoxes) – Boxes 1 contain N boxes.
boxes2 (BaseInstance3DBoxes) – Boxes 2 contain M boxes.

Returns

Calculated height overlap of the boxes.

Return type

Tensor
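
A minimal standalone sketch of a pairwise height overlap, assuming a z-up convention (LiDAR or depth) and an N x M result. Both the output shape and the plain-list representation are assumptions made for illustration; the function below is not the library implementation.

```python
def height_overlaps(boxes1, boxes2):
    """Each box is a (bottom_height, top_height) pair; returns an N x M nested list."""
    result = []
    for bottom1, top1 in boxes1:
        row = []
        for bottom2, top2 in boxes2:
            # Overlap is the shared vertical extent, clamped at zero when the boxes do not meet.
            overlap = min(top1, top2) - max(bottom1, bottom2)
            row.append(max(overlap, 0.0))
        result.append(row)
    return result


print(height_overlaps([(0.0, 2.0), (5.0, 6.0)], [(1.0, 3.0)]))
```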

in_range_3d(box_range)[source]¶

Check whether the boxes are in the given range.

Parameters: box_range (Tensor or np.ndarray or Sequence[float]) – The range of box (x_min, y_min, z_min, x_max, y_max, z_max).

Note

In the original implementation of SECOND, checking whether a box is in the range means checking whether its points lie in a convex polygon. This implementation reduces the burden for simpler cases.

Returns: A binary vector indicating whether each box is inside the reference range.
Return type: Tensor
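
One way to picture the simplified check is a per-box comparison of a single reference point against the range. The sketch below is standalone and hypothetical: it assumes that only the box center is tested and that the bounds are exclusive, neither of which is stated in this reference.

```python
def in_range_3d(centers, box_range):
    """centers: list of (x, y, z); box_range: (x_min, y_min, z_min, x_max, y_max, z_max)."""
    x_min, y_min, z_min, x_max, y_max, z_max = box_range
    return [
        x_min < x < x_max and y_min < y < y_max and z_min < z < z_max
        for x, y, z in centers
    ]


mask = in_range_3d([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], (-5.0, -5.0, -1.0, 5.0, 5.0, 3.0))
print(mask)
```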

in_range_bev(box_range)[source]¶

Check whether the boxes are in the given range.

Parameters: box_range (Tensor or np.ndarray or Sequence[float]) – The range of box in order of (x_min, y_min, x_max, y_max).

Note

The original implementation of SECOND checks whether boxes are in a range by checking whether the points are in a convex polygon. This implementation reduces the burden for simpler cases.

Returns: A binary vector indicating whether each box is inside the reference range.
Return type: Tensor

limit_yaw(offset=0.5, period=3.141592653589793)[source]¶

Limit the yaw to a given period and offset.

Parameters

offset (float) – The offset of the yaw. Defaults to 0.5.
period (float) – The expected period. Defaults to np.pi.

Return type

None

property nearest_bev: torch.Tensor¶

A tensor of 2D BEV box of each box without rotation.

Type: Tensor

new_box(data)[source]¶

Create a new box object with data.

The new box and its tensor have properties similar to those of self and self.tensor, respectively.

Parameters: data (Tensor or np.ndarray or Sequence[Sequence[float]]) – Data to be copied.
Returns: A new bbox object with data. The object’s other properties are similar to those of self.
Return type: BaseInstance3DBoxes

nonempty(threshold=0.0)[source]¶

Find boxes that are non-empty.

A box is considered empty if any of its sides is no larger than threshold.

Parameters: threshold (float) – The threshold of minimal sizes. Defaults to 0.0.
Returns: A binary vector which represents whether each box is empty (False) or non-empty (True).
Return type: Tensor
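
The rule above can be sketched on plain Python rows of the form (x, y, z, x_size, y_size, z_size, yaw). This is a standalone illustration with a hypothetical function, not the tensor-based library method.

```python
def nonempty(boxes, threshold=0.0):
    """Return True for each box whose three side lengths all exceed the threshold."""
    return [all(size > threshold for size in box[3:6]) for box in boxes]


boxes = [
    (0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 0.0),  # all sides positive
    (0.0, 0.0, 0.0, 1.0, 0.0, 3.0, 0.0),  # y_size is zero, so the box is empty
]
print(nonempty(boxes))
```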

numpy()[source]¶

Convert self.tensor to a numpy array.

Return type: numpy.ndarray

classmethod overlaps(boxes1, boxes2, mode='iou')[source]¶

Calculate 3D overlaps of two boxes.

Note

This function calculates the overlaps between boxes1 and boxes2. boxes1 and boxes2 should be of the same type.

Parameters

boxes1 (BaseInstance3DBoxes) – Boxes 1 contain N boxes.
boxes2 (BaseInstance3DBoxes) – Boxes 2 contain M boxes.
mode (str) – Mode of iou calculation. Defaults to ‘iou’.

Returns

Calculated 3D overlap of the boxes.

Return type

Tensor

points_in_boxes_all(points, boxes_override=None)[source]¶

Find all boxes in which each point is.

Parameters

points (Tensor) – Points in shape (1, M, 3) or (M, 3), 3 dimensions are (x, y, z) in LiDAR or depth coordinate.
boxes_override (Tensor, optional) – Boxes to override self.tensor. Defaults to None.

Returns

A tensor indicating whether a point is in a box, with shape (M, T), where T is the number of boxes. Denote this tensor as A: if the m-th point is in the t-th box, then A[m, t] == 1, otherwise A[m, t] == 0.

Return type

Tensor
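
A minimal standalone sketch of the indicator matrix A for rotated boxes. It assumes LiDAR-style rows (x, y, z, x_size, y_size, z_size, yaw) with a bottom-center origin and yaw around the z axis; the function names and the inclusive boundary handling are assumptions for illustration, and the real method operates on tensors.

```python
import math


def point_in_box(point, box):
    """Test one point against one box by moving the point into the box's local frame."""
    px, py, pz = point
    cx, cy, cz, x_size, y_size, z_size, yaw = box
    shift_x, shift_y = px - cx, py - cy
    # Undo the yaw rotation so the box becomes axis-aligned.
    local_x = shift_x * math.cos(yaw) + shift_y * math.sin(yaw)
    local_y = -shift_x * math.sin(yaw) + shift_y * math.cos(yaw)
    return (abs(local_x) <= x_size / 2
            and abs(local_y) <= y_size / 2
            and 0.0 <= pz - cz <= z_size)


def points_in_boxes_all(points, boxes):
    """Return an M x T nested list with 1 where point m lies in box t, else 0."""
    return [[int(point_in_box(p, b)) for b in boxes] for p in points]


boxes = [(0.0, 0.0, 0.0, 4.0, 2.0, 2.0, 0.0),
         (0.0, 0.0, 0.0, 4.0, 2.0, 2.0, math.pi / 2)]
points = [(1.5, 0.0, 1.0), (0.0, 1.5, 1.0), (5.0, 5.0, 5.0)]
print(points_in_boxes_all(points, boxes))
```

Under the same assumptions, the first column index holding a 1 in each row (or -1 when the row is all zeros) corresponds to the result described for points_in_boxes_part.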

points_in_boxes_part(points, boxes_override=None)[source]¶

Find the box in which each point is.

Parameters

points (Tensor) – Points in shape (1, M, 3) or (M, 3), 3 dimensions are (x, y, z) in LiDAR or depth coordinate.
boxes_override (Tensor, optional) – Boxes to override self.tensor. Defaults to None.

Note

If a point is enclosed by multiple boxes, the index of the first box will be returned.

Returns

The index of the first box that each point is in with shape (M, ). Default value is -1 (if the point is not enclosed by any box).

Return type

Tensor

abstract rotate(angle, points=None)[source]¶

Rotate boxes with points (optional) with the given angle or rotation matrix.

Parameters

angle (Tensor or np.ndarray or float) – Rotation angle or rotation matrix.
points (Tensor or np.ndarray or BasePoints, optional) – Points to rotate. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the rotated points and the rotation matrix rot_mat_T.

Return type

tuple or None

scale(scale_factor)[source]¶

Scale the box with horizontal and vertical scaling factors.

Parameters

scale_factor (float) – Scale factor to scale the boxes.

Return type

None

property shape: torch.Size¶

Shape of boxes.

Type: torch.Size

to(device, *args, **kwargs)[source]¶

Convert current boxes to a specific device.

Parameters: device (str or torch.device) – The name of the device.
Returns: A new boxes object on the specific device.
Return type: BaseInstance3DBoxes

property top_height: torch.Tensor¶

A vector with top height of each box in shape (N, ).

Type: Tensor

translate(trans_vector)[source]¶

Translate boxes with the given translation vector.

Parameters: trans_vector (Tensor or np.ndarray) – Translation vector of size 1x3.
Return type: None

property volume: torch.Tensor¶

A vector with volume of each box in shape (N, ).

Type: Tensor

property yaw: torch.Tensor¶

A vector with yaw of each box in shape (N, ).

Type: Tensor

class mmdet3d.structures.bbox_3d.Box3DMode(value)[source]¶

Enum of different ways to represent a box.

Coordinates in LiDAR:

            up z
               ^   x front
               |  /
               | /
left y <------ 0

The relative coordinate of bottom center in a LiDAR box is (0.5, 0.5, 0), and the yaw is around the z axis, thus the rotation axis=2.

Coordinates in Camera:

        z front
       /
      /
     0 ------> x right
     |
     |
     v
down y

The relative coordinate of bottom center in a CAM box is (0.5, 1.0, 0.5), and the yaw is around the y axis, thus the rotation axis=1.

Coordinates in Depth:

up z
   ^   y front
   |  /
   | /
   0 ------> x right

The relative coordinate of bottom center in a DEPTH box is (0.5, 0.5, 0), and the yaw is around the z axis, thus the rotation axis=2.

static convert(box, src, dst, rt_mat=None, with_yaw=True, correct_yaw=False)[source]¶

Convert boxes from src mode to dst mode.

Parameters

box (Sequence[float] or np.ndarray or Tensor or BaseInstance3DBoxes) – Can be a k-tuple, k-list or an Nxk array/tensor.
src (Box3DMode) – The source box mode.
dst (Box3DMode) – The target box mode.
rt_mat (np.ndarray or Tensor, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
with_yaw (bool) – If box is an instance of BaseInstance3DBoxes, whether or not it has a yaw angle. Defaults to True.
correct_yaw (bool) – If the yaw is rotated by rt_mat. Defaults to False.

Returns

The converted box of the same type.

Return type

Union[Sequence[float], numpy.ndarray, torch.Tensor, mmdet3d.structures.bbox_3d.base_box3d.BaseInstance3DBoxes]

class mmdet3d.structures.bbox_3d.CameraInstance3DBoxes(tensor, box_dim=7, with_yaw=True, origin=(0.5, 1.0, 0.5))[source]¶

3D boxes of instances in CAM coordinates.

Coordinates in Camera:

        z front (yaw=-0.5*pi)
       /
      /
     0 ------> x right (yaw=0)
     |
     |
     v
down y

The relative coordinate of bottom center in a CAM box is (0.5, 1.0, 0.5), and the yaw is around the y axis, thus the rotation axis=1. The yaw is 0 at the positive direction of x axis, and decreases from the positive direction of x to the positive direction of z.

Parameters

tensor (Tensor or np.ndarray or Sequence[Sequence[float]]) – The boxes data with shape (N, box_dim).
box_dim (int) – Number of the dimension of a box. Each row is (x, y, z, x_size, y_size, z_size, yaw). Defaults to 7.
with_yaw (bool) – Whether the box is with yaw rotation. If False, the value of yaw will be set to 0 as minmax boxes. Defaults to True.
origin (Tuple[float]) – Relative position of the box origin. Defaults to (0.5, 1.0, 0.5). Boxes given with a different origin are converted to the (0.5, 1.0, 0.5) convention.

tensor¶

Float matrix with shape (N, box_dim).

Type: Tensor

box_dim¶

Integer indicating the dimension of a box. Each row is (x, y, z, x_size, y_size, z_size, yaw, …).

Type: int

with_yaw¶

Whether the box has a yaw rotation. If False, the value of yaw will be set to 0 as minmax boxes.

Type: bool

property bev: torch.Tensor¶

2D BEV box of each box with rotation in XYWHR format, in shape (N, 5).

Type: Tensor

property bottom_height: torch.Tensor¶

A vector with bottom height of each box in shape (N, ).

Type: Tensor

convert_to(dst, rt_mat=None, correct_yaw=False)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Box mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
correct_yaw (bool) – Whether to convert the yaw angle to the target coordinate. Defaults to False.

Returns

The converted box of the same type in the dst mode.

Return type

BaseInstance3DBoxes

property corners: torch.Tensor¶

Convert boxes to corners in clockwise order, in the form of (x0y0z0, x0y0z1, x0y1z1, x0y1z0, x1y0z0, x1y0z1, x1y1z1, x1y1z0).

             front z
                  /
                 /
   (x0, y0, z1) + -----------  + (x1, y0, z1)
               /|            / |
              / |           /  |
(x0, y0, z0) + ----------- +   + (x1, y1, z1)
             |  /      .   |  /
             | / origin    | /
(x0, y1, z0) + ----------- + -------> right x
             |             (x1, y1, z0)
             |
             v
        down y

Returns: A tensor with 8 corners of each box in shape (N, 8, 3).
Return type: Tensor

flip(bev_direction='horizontal', points=None)[source]¶

Flip the boxes in BEV along given BEV direction.

In CAM coordinates, it flips the x (horizontal) or z (vertical) axis.

Parameters

bev_direction (str) – Direction by which to flip. Can be chosen from ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.
points (Tensor or np.ndarray or BasePoints, optional) – Points to flip. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the flipped points.

Return type

Tensor or np.ndarray or BasePoints or None

property gravity_center: torch.Tensor¶

A tensor with center of each box in shape (N, 3).

Type: Tensor

property height: torch.Tensor¶

A vector with height of each box in shape (N, ).

Type: Tensor

classmethod height_overlaps(boxes1, boxes2)[source]¶

Calculate height overlaps of two boxes.

Note

This function calculates the height overlaps between boxes1 and boxes2. boxes1 and boxes2 should be of the same type.

Parameters

boxes1 (CameraInstance3DBoxes) – Boxes 1 contain N boxes.
boxes2 (CameraInstance3DBoxes) – Boxes 2 contain M boxes.

Returns

Calculated height overlap of the boxes.

Return type

Tensor

property local_yaw: torch.Tensor¶

A vector with local yaw of each box in shape (N, ). local_yaw is equal to alpha in KITTI, which is commonly used in monocular 3D object detection tasks, so only CameraInstance3DBoxes has this property.

Type: Tensor

points_in_boxes_all(points, boxes_override=None)[source]¶

Find all boxes in which each point is.

Parameters

points (Tensor) – Points in shape (1, M, 3) or (M, 3), 3 dimensions are (x, y, z) in LiDAR or depth coordinate.
boxes_override (Tensor, optional) – Boxes to override self.tensor. Defaults to None.

Returns

The index of all boxes in which each point is with shape (M, T).

Return type

Tensor

points_in_boxes_part(points, boxes_override=None)[source]¶

Find the box in which each point is.

Parameters

points (Tensor) – Points in shape (1, M, 3) or (M, 3), 3 dimensions are (x, y, z) in LiDAR or depth coordinate.
boxes_override (Tensor, optional) – Boxes to override self.tensor. Defaults to None.

Returns

The index of the first box that each point is in with shape (M, ). Default value is -1 (if the point is not enclosed by any box).

Return type

Tensor

rotate(angle, points=None)[source]¶

Rotate boxes with points (optional) with the given angle or rotation matrix.

Parameters

angle (Tensor or np.ndarray or float) – Rotation angle or rotation matrix.
points (Tensor or np.ndarray or BasePoints, optional) – Points to rotate. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the rotated points and the rotation matrix rot_mat_T.

Return type

tuple or None

property top_height: torch.Tensor¶

A vector with top height of each box in shape (N, ).

Type: Tensor

class mmdet3d.structures.bbox_3d.Coord3DMode(value)[source]¶

Enum of different ways to represent a box and point cloud.

Coordinates in LiDAR:

            up z
               ^   x front
               |  /
               | /
left y <------ 0

The relative coordinate of bottom center in a LiDAR box is (0.5, 0.5, 0), and the yaw is around the z axis, thus the rotation axis=2.

Coordinates in Camera:

        z front
       /
      /
     0 ------> x right
     |
     |
     v
down y

The relative coordinate of bottom center in a CAM box is (0.5, 1.0, 0.5), and the yaw is around the y axis, thus the rotation axis=1.

Coordinates in Depth:

up z
   ^   y front
   |  /
   | /
   0 ------> x right

The relative coordinate of bottom center in a DEPTH box is (0.5, 0.5, 0), and the yaw is around the z axis, thus the rotation axis=2.

static convert(input, src, dst, rt_mat=None, with_yaw=True, correct_yaw=False, is_point=True)[source]¶

Convert boxes or points from src mode to dst mode.

Parameters

input (Sequence[float] or np.ndarray or Tensor or BaseInstance3DBoxes or BasePoints) – Can be a k-tuple, k-list or an Nxk array/tensor.
src (Box3DMode or Coord3DMode) – The source mode.
dst (Box3DMode or Coord3DMode) – The target mode.
rt_mat (np.ndarray or Tensor, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
with_yaw (bool) – If box is an instance of BaseInstance3DBoxes, whether or not it has a yaw angle. Defaults to True.
correct_yaw (bool) – If the yaw is rotated by rt_mat. Defaults to False.
is_point (bool) – If input is neither an instance of BaseInstance3DBoxes nor an instance of BasePoints, whether or not it is point data. Defaults to True.

Returns

Sequence[float] or np.ndarray or Tensor or BaseInstance3DBoxes or BasePoints: The converted box or points of the same type.

static convert_box(box, src, dst, rt_mat=None, with_yaw=True, correct_yaw=False)[source]¶

Convert boxes from src mode to dst mode.

Parameters

box (Sequence[float] or np.ndarray or Tensor or BaseInstance3DBoxes) – Can be a k-tuple, k-list or an Nxk array/tensor.
src (Box3DMode) – The source box mode.
dst (Box3DMode) – The target box mode.
rt_mat (np.ndarray or Tensor, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
with_yaw (bool) – If box is an instance of BaseInstance3DBoxes, whether or not it has a yaw angle. Defaults to True.
correct_yaw (bool) – If the yaw is rotated by rt_mat. Defaults to False.

Returns

The converted box of the same type.

Return type

Union[Sequence[float], numpy.ndarray, torch.Tensor, mmdet3d.structures.bbox_3d.base_box3d.BaseInstance3DBoxes]

static convert_point(point, src, dst, rt_mat=None)[source]¶

Convert points from src mode to dst mode.

Parameters

point (Sequence[float] or np.ndarray or Tensor or BasePoints) – Can be a k-tuple, k-list or an Nxk array/tensor.
src (Coord3DMode) – The source point mode.
dst (Coord3DMode) – The target point mode.
rt_mat (np.ndarray or Tensor, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.

Returns

The converted point of the same type.

Return type

Sequence[float] or np.ndarray or Tensor or BasePoints
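
To make the axis diagrams above concrete, the following standalone sketch relabels a single LiDAR point into camera axes. It is a hypothetical helper, not the library call: it assumes both sensors share an origin and are aligned exactly as drawn (LiDAR x front, y left, z up; camera x right, y down, z front). A real sensor pair needs a calibrated rt_mat.

```python
def lidar_to_camera_axes(point):
    """Relabel axes only: camera right = -LiDAR left, camera down = -LiDAR up, camera front = LiDAR front."""
    x_lidar, y_lidar, z_lidar = point
    return (-y_lidar, -z_lidar, x_lidar)


# A point 10 m ahead, 2 m to the left and 1 m up in LiDAR coordinates.
print(lidar_to_camera_axes((10.0, 2.0, 1.0)))  # (-2.0, -1.0, 10.0)
```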

class mmdet3d.structures.bbox_3d.DepthInstance3DBoxes(tensor, box_dim=7, with_yaw=True, origin=(0.5, 0.5, 0))[source]¶

3D boxes of instances in DEPTH coordinates.

Coordinates in Depth:

up z    y front (yaw=0.5*pi)
   ^   ^
   |  /
   | /
   0 ------> x right (yaw=0)

The relative coordinate of bottom center in a Depth box is (0.5, 0.5, 0), and the yaw is around the z axis, thus the rotation axis=2. The yaw is 0 at the positive direction of x axis, and increases from the positive direction of x to the positive direction of y.

Parameters

tensor (Union[torch.Tensor, numpy.ndarray, Sequence[Sequence[float]]]) –
box_dim (int) –
with_yaw (bool) –
origin (Tuple[float, float, float]) –

tensor¶

Float matrix with shape (N, box_dim).

Type: Tensor

box_dim¶

Integer indicating the dimension of a box. Each row is (x, y, z, x_size, y_size, z_size, yaw, …).

Type: int

with_yaw¶

Whether the box has a yaw rotation. If False, the value of yaw will be set to 0 as minmax boxes.

Type: bool

convert_to(dst, rt_mat=None, correct_yaw=False)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Box mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
correct_yaw (bool) – Whether to convert the yaw angle to the target coordinate. Defaults to False.

Returns

The converted box of the same type in the dst mode.

Return type

BaseInstance3DBoxes

property corners: torch.Tensor¶

Convert boxes to corners in clockwise order, in the form of (x0y0z0, x0y0z1, x0y1z1, x0y1z0, x1y0z0, x1y0z1, x1y1z1, x1y1z0).

                            up z
             front y           ^
                  /            |
                 /             |
   (x0, y1, z1) + -----------  + (x1, y1, z1)
               /|            / |
              / |           /  |
(x0, y0, z1) + ----------- +   + (x1, y1, z0)
             |  /      .   |  /
             | / origin    | /
(x0, y0, z0) + ----------- + --------> right x
                           (x1, y0, z0)

Returns: A tensor with 8 corners of each box in shape (N, 8, 3).
Return type: Tensor

enlarged_box(extra_width)[source]¶

Enlarge the length, width and height of boxes.

Parameters: extra_width (float or Tensor) – Extra width to enlarge the box.
Returns: Enlarged boxes.
Return type: DepthInstance3DBoxes

flip(bev_direction='horizontal', points=None)[source]¶

Flip the boxes in BEV along given BEV direction.

In Depth coordinates, it flips the x (horizontal) or y (vertical) axis.

Parameters

bev_direction (str) – Direction by which to flip. Can be chosen from ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.
points (Tensor or np.ndarray or BasePoints, optional) – Points to flip. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the flipped points.

Return type

Tensor or np.ndarray or BasePoints or None

get_surface_line_center()[source]¶

Compute surface and line center of bounding boxes.

Returns: Surface and line center of bounding boxes.
Return type: Tuple[Tensor, Tensor]

rotate(angle, points=None)[source]¶

Rotate boxes with points (optional) with the given angle or rotation matrix.

Parameters

angle (Tensor or np.ndarray or float) – Rotation angle or rotation matrix.
points (Tensor or np.ndarray or BasePoints, optional) – Points to rotate. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the rotated points and the rotation matrix rot_mat_T.

Return type

tuple or None

class mmdet3d.structures.bbox_3d.LiDARInstance3DBoxes(tensor, box_dim=7, with_yaw=True, origin=(0.5, 0.5, 0))[source]¶

3D boxes of instances in LIDAR coordinates.

Coordinates in LiDAR:

                         up z    x front (yaw=0)
                            ^   ^
                            |  /
                            | /
(yaw=0.5*pi) left y <------ 0

The relative coordinate of bottom center in a LiDAR box is (0.5, 0.5, 0), and the yaw is around the z axis, thus the rotation axis=2. The yaw is 0 at the positive direction of x axis, and increases from the positive direction of x to the positive direction of y.

Parameters

tensor (Union[torch.Tensor, numpy.ndarray, Sequence[Sequence[float]]]) –
box_dim (int) –
with_yaw (bool) –
origin (Tuple[float, float, float]) –

tensor¶

Float matrix with shape (N, box_dim).

Type: Tensor

box_dim¶

Integer indicating the dimension of a box. Each row is (x, y, z, x_size, y_size, z_size, yaw, …).

Type: int

with_yaw¶

Whether the box has a yaw rotation. If False, the value of yaw will be set to 0 as minmax boxes.

Type: bool

convert_to(dst, rt_mat=None, correct_yaw=False)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Box mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. The conversion from src coordinates to dst coordinates usually comes along the change of sensors, e.g., from camera to LiDAR. This requires a transformation matrix.
correct_yaw (bool) – Whether to convert the yaw angle to the target coordinate. Defaults to False.

Returns

The converted box of the same type in the dst mode.

Return type

BaseInstance3DBoxes

property corners: torch.Tensor¶

Convert boxes to corners in clockwise order, in the form of (x0y0z0, x0y0z1, x0y1z1, x0y1z0, x1y0z0, x1y0z1, x1y1z1, x1y1z0).

                               up z
                front x           ^
                     /            |
                    /             |
      (x1, y0, z1) + -----------  + (x1, y1, z1)
                  /|            / |
                 / |           /  |
   (x0, y0, z1) + ----------- +   + (x1, y1, z0)
                |  /      .   |  /
                | / origin    | /
left y <------- + ----------- + (x0, y1, z0)
    (x0, y0, z0)

Returns: A tensor with 8 corners of each box in shape (N, 8, 3).
Return type: Tensor

enlarged_box(extra_width)[source]¶

Enlarge the length, width and height of boxes.

Parameters: extra_width (float or Tensor) – Extra width to enlarge the box.
Returns: Enlarged boxes.
Return type: LiDARInstance3DBoxes

flip(bev_direction='horizontal', points=None)[source]¶

Flip the boxes in BEV along given BEV direction.

In LIDAR coordinates, it flips the y (horizontal) or x (vertical) axis.

Parameters

bev_direction (str) – Direction by which to flip. Can be chosen from ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.
points (Tensor or np.ndarray or BasePoints, optional) – Points to flip. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the flipped points.

Return type

Tensor or np.ndarray or BasePoints or None

rotate(angle, points=None)[source]¶

Rotate boxes with points (optional) with the given angle or rotation matrix.

Parameters

angle (Tensor or np.ndarray or float) – Rotation angle or rotation matrix.
points (Tensor or np.ndarray or BasePoints, optional) – Points to rotate. Defaults to None.

Returns

When points is None, the function returns None, otherwise it returns the rotated points and the rotation matrix rot_mat_T.

Return type

tuple or None

mmdet3d.structures.bbox_3d.get_box_type(box_type)[source]¶

Get the type and mode of box structure.

Parameters: box_type (str) – The type of box structure. Valid values are “LiDAR”, “Camera” and “Depth”.
Raises: ValueError – A ValueError is raised when box_type does not belong to the three valid types.
Returns: Box type and box mode.
Return type: tuple

mmdet3d.structures.bbox_3d.get_proj_mat_by_coord_type(img_meta, coord_type)[source]¶

Obtain the projection matrix for the given coordinate type from the image meta information.

Parameters

img_meta (dict) – Meta information.
coord_type (str) – ‘DEPTH’ or ‘CAMERA’ or ‘LIDAR’. Case-insensitive.

Returns

Transformation matrix.

Return type

Tensor

mmdet3d.structures.bbox_3d.limit_period(val, offset=0.5, period=3.141592653589793)[source]¶

Limit the value into a period for periodic function.

Parameters

val (np.ndarray or Tensor) – The value to be converted.
offset (float) – Offset to set the value range. Defaults to 0.5.
period (float) – Period of the value. Defaults to np.pi.

Returns

Value in the range of [-offset * period, (1-offset) * period].

Return type

np.ndarray or Tensor
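
A common way to wrap a value into such a range is to subtract a whole number of periods. The scalar sketch below is a standalone illustration of that idea using only the standard library; the documented function works on arrays and tensors, and this is not its source.

```python
import math


def limit_period(val, offset=0.5, period=math.pi):
    """Wrap val into the range [-offset * period, (1 - offset) * period]."""
    return val - math.floor(val / period + offset) * period


print(limit_period(4.0))                       # 4.0 minus one period of pi
print(limit_period(-1.0, offset=0.0, period=2 * math.pi))
```

With the defaults, yaw values end up within half a period on either side of zero. Setting offset to 0 maps them to a non-negative range instead.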

mmdet3d.structures.bbox_3d.mono_cam_box2vis(cam_box)[source]¶

This is a post-processing function on the bboxes from Mono-3D task. If we want to perform projection visualization, we need to:

1. rotate the box along x-axis for np.pi / 2 (roll)
2. change orientation from local yaw to global yaw
3. convert yaw by (np.pi / 2 - yaw)

After applying this function, we can project and draw it on 2D images.

Parameters: cam_box (CameraInstance3DBoxes) – 3D bbox in camera coordinate system before conversion. Could be gt bbox loaded from dataset or network prediction output.
Returns: Box after conversion.
Return type: CameraInstance3DBoxes

mmdet3d.structures.bbox_3d.points_cam2img(points_3d, proj_mat, with_depth=False)[source]¶

Project points in camera coordinates to image coordinates.

Parameters

points_3d (Tensor or np.ndarray) – Points in shape (N, 3).
proj_mat (Tensor or np.ndarray) – Transformation matrix between coordinates.
with_depth (bool) – Whether to keep depth in the output. Defaults to False.

Returns

Points in image coordinates with shape [N, 2] if with_depth=False, else [N, 3].

Return type

Tensor or np.ndarray
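
The projection can be illustrated with a plain pinhole model. The sketch below is standalone and assumes proj_mat is a 3x3 intrinsic matrix given as nested lists; the documented function accepts tensors and arrays, so treat this only as a picture of the arithmetic.

```python
def points_cam2img(points_3d, intrinsics, with_depth=False):
    """Project (x, y, z) camera points to pixels with a 3x3 intrinsic matrix."""
    projected = []
    for x, y, z in points_3d:
        # Multiply by the intrinsic matrix to get homogeneous image coordinates.
        u_h, v_h, w_h = (row[0] * x + row[1] * y + row[2] * z for row in intrinsics)
        pixel = (u_h / w_h, v_h / w_h)
        projected.append(pixel + (w_h,) if with_depth else pixel)
    return projected


K = [[100.0, 0.0, 50.0],
     [0.0, 100.0, 40.0],
     [0.0, 0.0, 1.0]]
print(points_cam2img([(1.0, 2.0, 10.0)], K))                   # [(60.0, 60.0)]
print(points_cam2img([(1.0, 2.0, 10.0)], K, with_depth=True))  # [(60.0, 60.0, 10.0)]
```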

mmdet3d.structures.bbox_3d.points_img2cam(points, cam2img)[source]¶

Project points in image coordinates to camera coordinates.

Parameters

points (Tensor or np.ndarray) – 2.5D points in 2D images with shape [N, 3], 3 corresponds with x, y in the image and depth.
cam2img (Tensor or np.ndarray) – Camera intrinsic matrix. The shape can be [3, 3], [3, 4] or [4, 4].

Returns

Points in 3D space with shape [N, 3], 3 corresponds with x, y, z in 3D space.

Return type

Tensor or np.ndarray
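
The inverse mapping can be sketched the same way. This standalone example assumes a 3x3 intrinsic matrix without skew, given as nested lists; it is an illustration of the arithmetic rather than the library implementation.

```python
def points_img2cam(points, cam2img):
    """Lift (u, v, depth) image points back to (x, y, z) camera coordinates."""
    fx, fy = cam2img[0][0], cam2img[1][1]
    cx, cy = cam2img[0][2], cam2img[1][2]
    return [((u - cx) * d / fx, (v - cy) * d / fy, d) for u, v, d in points]


K = [[100.0, 0.0, 50.0],
     [0.0, 100.0, 40.0],
     [0.0, 0.0, 1.0]]
print(points_img2cam([(60.0, 60.0, 10.0)], K))  # [(1.0, 2.0, 10.0)]
```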

mmdet3d.structures.bbox_3d.rotation_3d_in_axis(points, angles, axis=0, return_mat=False, clockwise=False)[source]¶

Rotate points by angles according to axis.

Parameters

points (np.ndarray or Tensor) – Points with shape (N, M, 3).
angles (np.ndarray or Tensor or float) – Vector of angles with shape (N, ).
axis (int) – The axis to be rotated. Defaults to 0.
return_mat (bool) – Whether or not to return the rotation matrix (transposed). Defaults to False.
clockwise (bool) – Whether the rotation is clockwise. Defaults to False.

Raises

ValueError – When the axis is not in range [-3, -2, -1, 0, 1, 2], it will raise ValueError.

Returns

Tuple[np.ndarray, np.ndarray] or Tuple[Tensor, Tensor] or np.ndarray or Tensor: Rotated points with shape (N, M, 3) and rotation matrix with shape (N, 3, 3).

Return type

Union[Tuple[numpy.ndarray, numpy.ndarray], Tuple[torch.Tensor, torch.Tensor], numpy.ndarray, torch.Tensor]
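
A batched rotation of this kind can be sketched in NumPy. The sketch below assumes the right-hand rule for counter-clockwise rotation and returns the untransposed matrix. The name `rotate_about_axis` is hypothetical, and the library's own sign and transpose conventions should be checked against its source.

```python
import numpy as np


def rotate_about_axis(points, angles, axis=0, clockwise=False):
    """Rotate (N, M, 3) points by (N,) angles around one coordinate axis."""
    if axis not in (-3, -2, -1, 0, 1, 2):
        raise ValueError(
            f"axis should be in [-3, -2, -1, 0, 1, 2], got {axis}")
    points = np.asarray(points, dtype=float)
    angles = np.asarray(angles, dtype=float).reshape(-1)
    axis %= 3
    if clockwise:
        angles = -angles
    cos, sin = np.cos(angles), np.sin(angles)
    # i and j span the plane of rotation, in right-handed order.
    i, j = (axis + 1) % 3, (axis + 2) % 3
    rot = np.zeros((angles.shape[0], 3, 3))
    rot[:, axis, axis] = 1.0
    rot[:, i, i] = cos
    rot[:, i, j] = -sin
    rot[:, j, i] = sin
    rot[:, j, j] = cos
    # Apply R to every point of every batch item: p' = R @ p.
    rotated = np.einsum('nmk,njk->nmj', points, rot)
    return rotated, rot


pts = np.array([[[1.0, 0.0, 0.0]]])  # N=1, M=1
rotated, rot = rotate_about_axis(pts, [np.pi / 2], axis=2)
# rotated is approximately [[[0., 1., 0.]]]
```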

mmdet3d.structures.bbox_3d.xywhr2xyxyr(boxes_xywhr)[source]¶

Convert rotated boxes in XYWHR format to XYXYR format.

Parameters: boxes_xywhr (Tensor or np.ndarray) – Rotated boxes in XYWHR format.
Returns: Converted boxes in XYXYR format.
Return type: Tensor or np.ndarray
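
The conversion amounts to replacing the center and size with the two corners of the unrotated box while keeping the rotation. A minimal NumPy sketch follows, assuming (x, y) is the box center and (w, h) are the extents along x and y. The name `xywhr_to_xyxyr` is hypothetical.

```python
import numpy as np


def xywhr_to_xyxyr(boxes_xywhr):
    """Convert (..., 5) boxes from (x, y, w, h, r) to (x1, y1, x2, y2, r)."""
    boxes = np.asarray(boxes_xywhr, dtype=float)
    half_w = boxes[..., 2] / 2
    half_h = boxes[..., 3] / 2
    out = np.zeros_like(boxes)
    out[..., 0] = boxes[..., 0] - half_w  # x1
    out[..., 1] = boxes[..., 1] - half_h  # y1
    out[..., 2] = boxes[..., 0] + half_w  # x2
    out[..., 3] = boxes[..., 1] + half_h  # y2
    out[..., 4] = boxes[..., 4]           # rotation is unchanged
    return out


converted = xywhr_to_xyxyr([[10.0, 20.0, 4.0, 6.0, 0.5]])
# converted is [[8., 17., 12., 23., 0.5]]
```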

ops¶

points¶

class mmdet3d.structures.points.BasePoints(tensor, points_dim=3, attribute_dims=None)[source]¶

Base class for Points.

Parameters

tensor (Tensor or np.ndarray or Sequence[Sequence[float]]) – The points data with shape (N, points_dim).
points_dim (int) – Integer indicating the dimension of a point. Each row is (x, y, z, …). Defaults to 3.
attribute_dims (dict, optional) – Dictionary to indicate the meaning of extra dimension. Defaults to None.

tensor¶

Float matrix with shape (N, points_dim).

Type: Tensor

points_dim¶

Integer indicating the dimension of a point. Each row is (x, y, z, …).

Type: int

attribute_dims¶

Dictionary to indicate the meaning of extra dimension. Defaults to None.

Type: dict, optional

rotation_axis¶

Default rotation axis for points rotation.

Type: int

property bev: torch.Tensor¶

BEV of the points in shape (N, 2).

Type: Tensor

classmethod cat(points_list)[source]¶

Concatenate a list of Points into a single Points.

Parameters: points_list (Sequence[BasePoints]) – List of points.
Returns: The concatenated points.
Return type: BasePoints

clone()[source]¶

Clone the points.

Returns: Point object with the same properties as self.
Return type: BasePoints

property color: Optional[torch.Tensor]¶

A tensor with the color of each point in shape (N, 3).

Type: Tensor or None

abstract convert_to(dst, rt_mat=None)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Point mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. Converting from src coordinates to dst coordinates usually accompanies a change of sensors, e.g., from camera to LiDAR, and requires a transformation matrix.

Returns

The converted point of the same type in the dst mode.

property coord: torch.Tensor¶

Coordinates of each point in shape (N, 3).

Type: Tensor

cpu()[source]¶

Convert current points to cpu device.

Returns: A new points object on the cpu device.
Return type: BasePoints

cuda(*args, **kwargs)[source]¶

Convert current points to cuda device.

Returns: A new points object on the cuda device.
Return type: BasePoints

detach()[source]¶

Detach the points.

Returns: Point object with the same properties as self.
Return type: BasePoints

property device: torch.device¶

The device that the points are on.

Type: torch.device

abstract flip(bev_direction='horizontal')[source]¶

Flip the points along given BEV direction.

Parameters: bev_direction (str) – Flip direction (horizontal or vertical). Defaults to ‘horizontal’.
Return type: None

property height: Optional[torch.Tensor]¶

A vector with the height of each point in shape (N, ).

Type: Tensor or None
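
The relationship between attribute_dims and the color and height properties can be sketched as a column lookup. The layout used here, a dict mapping an attribute name to one column index or a list of column indices, is an assumption for illustration. The helper `get_attribute` is hypothetical.

```python
import numpy as np

# Two points with columns (x, y, z, r, g, b, height).
points = np.array([[1.0, 2.0, 3.0, 255.0, 0.0, 0.0, 0.4],
                   [4.0, 5.0, 6.0, 0.0, 255.0, 0.0, 1.2]])
# Assumed layout: attribute name -> column index or list of indices.
attribute_dims = {"color": [3, 4, 5], "height": 6}


def get_attribute(points, attribute_dims, name):
    """Return the columns for `name`, or None if the attribute is absent."""
    if attribute_dims is None or name not in attribute_dims:
        return None
    return points[:, attribute_dims[name]]


color = get_attribute(points, attribute_dims, "color")    # shape (2, 3)
height = get_attribute(points, attribute_dims, "height")  # shape (2,)
missing = get_attribute(points, None, "color")            # None
```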

in_range_3d(point_range)[source]¶

Check whether the points are in the given range.

Parameters: point_range (Tensor or np.ndarray or Sequence[float]) – The range of points in order of (x_min, y_min, z_min, x_max, y_max, z_max).

Note

The original implementation of SECOND checks whether a box is in range by testing whether its points lie inside a convex polygon. Here a simpler check is used to reduce the burden for simpler cases.

Returns: A binary vector indicating whether each point is inside the reference range.
Return type: Tensor

in_range_bev(point_range)[source]¶

Check whether the points are in the given range.

Parameters: point_range (Tensor or np.ndarray or Sequence[float]) – The range of point in order of (x_min, y_min, x_max, y_max).
Returns: A binary vector indicating whether each point is inside the reference range.
Return type: Tensor
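
Both range checks reduce to element-wise comparisons against the bounds. The NumPy sketch below assumes strict inequalities and, for the BEV case, that the BEV plane is spanned by the first two columns. Both are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np


def in_range_3d_mask(points, point_range):
    """Mask of points strictly inside (x_min, y_min, z_min, x_max, y_max, z_max)."""
    points = np.asarray(points, dtype=float)
    low = np.asarray(point_range[:3], dtype=float)
    high = np.asarray(point_range[3:], dtype=float)
    xyz = points[:, :3]
    return np.all((xyz > low) & (xyz < high), axis=1)


def in_range_bev_mask(points, point_range, bev_axes=(0, 1)):
    """Mask of points strictly inside (x_min, y_min, x_max, y_max) in BEV."""
    points = np.asarray(points, dtype=float)
    bev = points[:, list(bev_axes)]
    x_min, y_min, x_max, y_max = point_range
    return ((bev[:, 0] > x_min) & (bev[:, 1] > y_min)
            & (bev[:, 0] < x_max) & (bev[:, 1] < y_max))


pts = np.array([[0.0, 0.0, 0.0],
                [5.0, 0.0, 0.0],
                [0.0, 0.0, -3.0]])
mask_3d = in_range_3d_mask(pts, (-1, -1, -1, 1, 1, 1))  # [True, False, False]
mask_bev = in_range_bev_mask(pts, (-1, -1, 1, 1))       # [True, False, True]
```

The third point shows the difference: it is outside the 3D range because of its z value but inside the BEV range, which ignores the vertical axis.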

new_point(data)[source]¶

Create a new point object with data.

The new point object and its tensor have properties similar to those of self and self.tensor, respectively.

Parameters: data (Tensor or np.ndarray or Sequence[Sequence[float]]) – Data to be copied.
Returns: A new point object with data. Its other properties are similar to those of self.
Return type: BasePoints

numpy()[source]¶

Convert self.tensor to a NumPy array.

Return type: numpy.ndarray

rotate(rotation, axis=None)[source]¶

Rotate points with the given rotation matrix or angle.

Parameters

rotation (Tensor or np.ndarray or float) – Rotation matrix or angle.
axis (int, optional) – Axis to rotate around. Defaults to None.

Returns

Rotation matrix.

Return type

Tensor

scale(scale_factor)[source]¶

Scale the points with horizontal and vertical scaling factors.

Parameters

scale_factor (float) – Scale factor to scale the points.

property shape: torch.Size¶

Shape of points.

Type: torch.Size

shuffle()[source]¶

Shuffle the points.

Returns: The shuffled index.
Return type: Tensor
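
Returning the shuffled index is useful because the same permutation can be applied to any per-point data, such as labels. A minimal NumPy sketch, with the hypothetical helper `shuffle_points` and made-up labels:

```python
import numpy as np


def shuffle_points(points, rng):
    """Return the points in random order together with the index used."""
    idx = rng.permutation(points.shape[0])
    return points[idx], idx


rng = np.random.default_rng(0)
pts = np.arange(12, dtype=float).reshape(4, 3)  # row i starts with 3 * i
labels = np.array([10, 11, 12, 13])             # label of row i is 10 + i

shuffled, idx = shuffle_points(pts, rng)
labels_shuffled = labels[idx]  # labels stay aligned with their points
```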

to(device, *args, **kwargs)[source]¶

Convert current points to a specific device.

Parameters: device (str or torch.device) – The name of the device.
Returns: A new points object on the specific device.
Return type: BasePoints

translate(trans_vector)[source]¶

Translate points with the given translation vector.

Parameters: trans_vector (Tensor or np.ndarray) – Translation vector of shape (3, ) or (N, 3).
Return type: None
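
The geometric operations documented above (scale, translate, rotate) act on the coordinate columns only. The NumPy sketch below illustrates this on an array with one extra attribute column. The function names, the intensity column, and the restriction to rotation around the z axis are assumptions for the example, not the library's implementation.

```python
import numpy as np

# Columns: x, y, z, plus a hypothetical intensity attribute.
pts = np.array([[1.0, 0.0, 2.0, 0.7],
                [0.0, 1.0, 2.0, 0.3]])


def scale(points, scale_factor):
    out = points.copy()
    out[:, :3] *= scale_factor  # attributes are left untouched
    return out


def translate(points, trans_vector):
    out = points.copy()
    out[:, :3] += np.asarray(trans_vector, dtype=float)  # (3,) or (N, 3)
    return out


def rotate_z(points, angle):
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    out = points.copy()
    out[:, :3] = points[:, :3] @ rot.T
    return out, rot


scaled = scale(pts, 2.0)                    # first row: [2., 0., 4., 0.7]
moved = translate(pts, [1.0, 1.0, 1.0])     # second row: [1., 2., 3., 0.3]
turned, rot_mat = rotate_z(pts, np.pi / 2)  # first row xyz is about [0., 1., 2.]
```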

class mmdet3d.structures.points.CameraPoints(tensor, points_dim=3, attribute_dims=None)[source]¶

Points of instances in CAM coordinates.

Parameters

tensor (Tensor or np.ndarray or Sequence[Sequence[float]]) – The points data with shape (N, points_dim).
points_dim (int) – Integer indicating the dimension of a point. Each row is (x, y, z, …). Defaults to 3.
attribute_dims (dict, optional) – Dictionary to indicate the meaning of extra dimension. Defaults to None.

tensor¶

Float matrix with shape (N, points_dim).

Type: Tensor

points_dim¶

Integer indicating the dimension of a point. Each row is (x, y, z, …).

Type: int

attribute_dims¶

Dictionary to indicate the meaning of extra dimension. Defaults to None.

Type: dict, optional

rotation_axis¶

Default rotation axis for points rotation.

Type: int

property bev: torch.Tensor¶

BEV of the points in shape (N, 2).

Type: Tensor

convert_to(dst, rt_mat=None)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Point mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. Converting from src coordinates to dst coordinates usually accompanies a change of sensors, e.g., from camera to LiDAR, and requires a transformation matrix.

Returns

The converted point of the same type in the dst mode.

flip(bev_direction='horizontal')[source]¶

Flip the points along given BEV direction.

Parameters: bev_direction (str) – Flip direction (horizontal or vertical). Defaults to ‘horizontal’.
Return type: None

class mmdet3d.structures.points.DepthPoints(tensor, points_dim=3, attribute_dims=None)[source]¶

Points of instances in DEPTH coordinates.

Parameters

tensor (Tensor or np.ndarray or Sequence[Sequence[float]]) – The points data with shape (N, points_dim).
points_dim (int) – Integer indicating the dimension of a point. Each row is (x, y, z, …). Defaults to 3.
attribute_dims (dict, optional) – Dictionary to indicate the meaning of extra dimension. Defaults to None.

tensor¶

Float matrix with shape (N, points_dim).

Type: Tensor

points_dim¶

Integer indicating the dimension of a point. Each row is (x, y, z, …).

Type: int

attribute_dims¶

Dictionary to indicate the meaning of extra dimension. Defaults to None.

Type: dict, optional

rotation_axis¶

Default rotation axis for points rotation.

Type: int

convert_to(dst, rt_mat=None)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Point mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. Converting from src coordinates to dst coordinates usually accompanies a change of sensors, e.g., from camera to LiDAR, and requires a transformation matrix.

Returns

The converted point of the same type in the dst mode.

flip(bev_direction='horizontal')[source]¶

Flip the points along given BEV direction.

Parameters: bev_direction (str) – Flip direction (horizontal or vertical). Defaults to ‘horizontal’.
Return type: None

class mmdet3d.structures.points.LiDARPoints(tensor, points_dim=3, attribute_dims=None)[source]¶

Points of instances in LIDAR coordinates.

Parameters

tensor (Tensor or np.ndarray or Sequence[Sequence[float]]) – The points data with shape (N, points_dim).
points_dim (int) – Integer indicating the dimension of a point. Each row is (x, y, z, …). Defaults to 3.
attribute_dims (dict, optional) – Dictionary to indicate the meaning of extra dimension. Defaults to None.

tensor¶

Float matrix with shape (N, points_dim).

Type: Tensor

points_dim¶

Integer indicating the dimension of a point. Each row is (x, y, z, …).

Type: int

attribute_dims¶

Dictionary to indicate the meaning of extra dimension. Defaults to None.

Type: dict, optional

rotation_axis¶

Default rotation axis for points rotation.

Type: int

convert_to(dst, rt_mat=None)[source]¶

Convert self to dst mode.

Parameters

dst (int) – The target Point mode.
rt_mat (Tensor or np.ndarray, optional) – The rotation and translation matrix between different coordinates. Defaults to None. Converting from src coordinates to dst coordinates usually accompanies a change of sensors, e.g., from camera to LiDAR, and requires a transformation matrix.

Returns

The converted point of the same type in the dst mode.

flip(bev_direction='horizontal')[source]¶

Flip the points along given BEV direction.

Parameters: bev_direction (str) – Flip direction (horizontal or vertical). Defaults to ‘horizontal’.
Return type: None
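
Flipping along a BEV direction amounts to negating one coordinate column, and which column that is depends on the coordinate system. The NumPy sketch below makes that dependence explicit. The `FLIP_AXIS` mapping is an assumption about the axis conventions of each coordinate system and should be verified against the source of the corresponding class. The function name is hypothetical.

```python
import numpy as np

# Assumed mapping: coordinate system -> BEV direction -> negated column.
FLIP_AXIS = {
    "lidar": {"horizontal": 1, "vertical": 0},
    "camera": {"horizontal": 0, "vertical": 2},
    "depth": {"horizontal": 0, "vertical": 1},
}


def flip_points(points, coord, bev_direction="horizontal"):
    """Return a flipped copy of (N, D) points for the given system."""
    if bev_direction not in ("horizontal", "vertical"):
        raise ValueError(f"unknown bev_direction: {bev_direction}")
    out = np.array(points, dtype=float)  # np.array copies the input
    axis = FLIP_AXIS[coord][bev_direction]
    out[:, axis] = -out[:, axis]
    return out


lidar_flipped = flip_points([[1.0, 2.0, 3.0]], "lidar")             # [[1., -2., 3.]]
camera_flipped = flip_points([[1.0, 2.0, 3.0]], "camera", "vertical")  # [[1., 2., -3.]]
```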

mmdet3d.testing¶

mmdet3d.visualization¶

mmdet3d.utils¶

class mmdet3d.utils.ArrayConverter(template_array=None)[source]¶

Utility class for data-type agnostic processing.

Parameters

template_array (np.ndarray or torch.Tensor or list or tuple or int or float, optional) – Template array. Defaults to None.

convert(input_array, target_type=None, target_array=None)[source]¶

Convert input array to target data type.

Parameters

input_array (np.ndarray or torch.Tensor or list or tuple or int or float) – Input array.
target_type (Type, optional) – Type to which input array is converted. It should be np.ndarray or torch.Tensor. Defaults to None.
target_array (np.ndarray or torch.Tensor, optional) – Template array to which input array is converted. Defaults to None.

Raises

ValueError – If input is list or tuple and cannot be converted to a NumPy array, a ValueError is raised.
TypeError – If input type does not belong to the above range, or the contents of a list or tuple do not share the same data type, a TypeError is raised.

Returns

The converted array.

Return type

np.ndarray or torch.Tensor

recover(input_array)[source]¶

Recover input type to original array type.

Parameters: input_array (np.ndarray or torch.Tensor) – Input array.
Returns: Converted array.
Return type: np.ndarray or torch.Tensor or int or float
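
The convert and recover pair lets a function do its work in one array type and hand back a result matching the caller's input. The class below is a hypothetical, NumPy-only analogue written to illustrate that pattern. It is not mmdet3d's ArrayConverter and omits torch.Tensor handling entirely.

```python
import numpy as np


class MiniConverter:
    """Hypothetical NumPy-only illustration of the convert/recover pattern."""

    SUPPORTED = (np.ndarray, list, tuple, int, float)

    def __init__(self, template):
        self.set_template(template)

    def set_template(self, array):
        if not isinstance(array, self.SUPPORTED):
            raise TypeError(f"unsupported template type: {type(array)}")
        # Remember the scalar type so recover() can restore it later.
        self.scalar_type = type(array) if isinstance(array, (int, float)) else None

    def convert(self, input_array):
        if not isinstance(input_array, self.SUPPORTED):
            raise TypeError(f"unsupported input type: {type(input_array)}")
        return np.asarray(input_array, dtype=float)

    def recover(self, output_array):
        if self.scalar_type is not None:
            return self.scalar_type(output_array.item())
        return output_array


conv = MiniConverter(3)
doubled = conv.recover(conv.convert(3) * 2)  # back to a Python int: 6

conv_list = MiniConverter([1.0, 2.0])
shifted = conv_list.recover(conv_list.convert([1.0, 2.0]) + 1)  # ndarray [2., 3.]
```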

set_template(array)[source]¶

Set template array.

Parameters

array (np.ndarray or torch.Tensor or list or tuple or int or float) – Template array.

Raises

ValueError – If input is list or tuple and cannot be converted to a NumPy array, a ValueError is raised.
TypeError – If input type does not belong to the above range, or the contents of a list or tuple do not share the same data type, a TypeError is raised.

Return type