vis4d.op.detect.rpn
Faster RCNN RPN Head.
Functions
get_default_rpn_box_codec | Get the default bounding box encoder and decoder for RPN.
Classes
RPN2RoI | Generate Proposals (RoIs) from RPN network output.
RPNHead | Faster RCNN RPN Head.
RPNLoss | Loss of region proposal network.
RPNLosses | RPN loss container.
RPNOut | Output of RPN head.
- class RPNOut(cls: list[torch.Tensor], box: list[torch.Tensor])
Output of RPN head.
-
cls:
list[Tensor] Alias for field number 0
-
box:
list[Tensor] Alias for field number 1
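The "Alias for field number" notes suggest that RPNOut is a named tuple, so its fields can be read by name, by position, or by unpacking. A minimal stand-in sketch follows; RPNOutSketch is a hypothetical name, and plain lists replace the tensors so that the snippet has no dependencies:

```python
from typing import NamedTuple


class RPNOutSketch(NamedTuple):
    """Hypothetical stand-in for RPNOut; plain lists replace tensors."""

    cls: list  # one entry per feature level (field number 0)
    box: list  # one entry per feature level (field number 1)


out = RPNOutSketch(cls=["cls_level_0"], box=["box_level_0"])

# Fields are reachable by name, by index, or by unpacking.
by_name = out.cls
by_index = out[0]
cls_outs, box_outs = out
```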
- get_default_rpn_box_codec(target_means=(0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0))
Get the default bounding box encoder and decoder for RPN.
- Return type:
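To illustrate what target_means and target_stds do, here is a dependency-free sketch of a delta-XYWH codec. It assumes the usual Faster R-CNN parameterization (center offsets scaled by anchor size, log-ratios for width and height); the functions encode and decode are hypothetical helpers, not the library's encoder and decoder classes:

```python
import math


def encode(anchor, target, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
    """Encode a target box relative to an anchor; boxes are (x1, y1, x2, y2)."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    ax, ay = anchor[0] + 0.5 * aw, anchor[1] + 0.5 * ah
    tw, th = target[2] - target[0], target[3] - target[1]
    tx, ty = target[0] + 0.5 * tw, target[1] + 0.5 * th
    deltas = ((tx - ax) / aw, (ty - ay) / ah, math.log(tw / aw), math.log(th / ah))
    # Normalize the raw deltas with the configured means and stds.
    return tuple((d - m) / s for d, m, s in zip(deltas, means, stds))


def decode(anchor, deltas, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
    """Invert encode: turn normalized deltas back into an (x1, y1, x2, y2) box."""
    dx, dy, dw, dh = (d * s + m for d, m, s in zip(deltas, means, stds))
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    ax, ay = anchor[0] + 0.5 * aw, anchor[1] + 0.5 * ah
    cx, cy = ax + dx * aw, ay + dy * ah
    w, h = aw * math.exp(dw), ah * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


anchor = (0.0, 0.0, 10.0, 10.0)
target = (2.0, 2.0, 12.0, 14.0)
roundtrip = decode(anchor, encode(anchor, target))
```

With the default means of 0 and stds of 1, the normalization step is the identity, so the network regresses the raw deltas directly.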
- class RPNHead(num_anchors, num_convs=1, in_channels=256, feat_channels=256, start_level=2)
Faster RCNN RPN Head.
Creates RPN network output from a multi-scale feature map input.
- __init__(num_anchors, num_convs=1, in_channels=256, feat_channels=256, start_level=2)
Creates an instance of the class.
- Parameters:
num_anchors (int) – Number of anchors per cell.
num_convs (int, optional) – Number of conv layers before RPN heads. Defaults to 1.
in_channels (int, optional) – Feature channel size of input feature maps. Defaults to 256.
feat_channels (int, optional) – Feature channel size of conv layers. Defaults to 256.
start_level (int, optional) – Starting level of feature maps. Defaults to 2.
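The output shapes implied by these parameters can be worked out without running a network. The sketch below assumes that feature levels before start_level are skipped and that each remaining level yields a classification map with A channels and a regression map with 4 * A channels, matching the [N, 1 * A, H, W] and [N, 4 * A, H, W] shapes that RPN2RoI.forward expects. The function rpn_output_shapes and the example pyramid are hypothetical:

```python
def rpn_output_shapes(feature_shapes, num_anchors, start_level=2):
    """Return (cls_shapes, box_shapes) for feature maps given as (N, C, H, W)."""
    cls_shapes, box_shapes = [], []
    # Assumption: levels before start_level are not fed to the head.
    for n, _, h, w in feature_shapes[start_level:]:
        cls_shapes.append((n, num_anchors, h, w))      # one score per anchor
        box_shapes.append((n, 4 * num_anchors, h, w))  # four deltas per anchor
    return cls_shapes, box_shapes


# Hypothetical feature pyramid for a batch of two images.
feats = [
    (2, 3, 256, 256),
    (2, 64, 128, 128),
    (2, 256, 64, 64),
    (2, 256, 32, 32),
    (2, 256, 16, 16),
]
cls_shapes, box_shapes = rpn_output_shapes(feats, num_anchors=3)
```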
- class RPN2RoI(anchor_generator, box_decoder=None, num_proposals_pre_nms_train=2000, num_proposals_pre_nms_test=1000, max_per_img=1000, proposal_nms_threshold=0.7, min_proposal_size=(0, 0))
Generate Proposals (RoIs) from RPN network output.
This class acts as a stateless functor that does the following:
1. Create the anchor grid for the feature grids (classification and regression outputs) at all scales.
2. For each image and each level, take a top-k pre-selection of the flattened classification scores and box energies from the feature output before NMS.
3. For each image, decode the class scores and box energies into proposal boxes and apply NMS.
4. Return the proposal boxes for all images.
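The per-image part of the procedure above (top-k pre-selection per level, then NMS) can be sketched in plain Python. This is an illustration under simplifying assumptions, not the library's implementation: boxes are taken as already decoded, NMS is the standard greedy variant run across all levels together, and the names iou and propose are hypothetical:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def propose(levels, pre_nms_topk=1000, max_per_img=1000, nms_threshold=0.7):
    """levels: one list of (score, box) pairs per feature level."""
    candidates = []
    for pairs in levels:
        # Top-k pre-selection per level, highest scores first.
        candidates += sorted(pairs, key=lambda p: p[0], reverse=True)[:pre_nms_topk]
    candidates.sort(key=lambda p: p[0], reverse=True)
    kept = []
    for score, box in candidates:
        if len(kept) >= max_per_img:
            break
        # Greedy NMS: keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k) <= nms_threshold for _, k in kept):
            kept.append((score, box))
    return kept


levels = [
    [(0.9, (0, 0, 10, 10)), (0.8, (1, 0, 11, 10)), (0.1, (50, 50, 60, 60))],
    [(0.7, (100, 100, 120, 120))],
]
kept = propose(levels, pre_nms_topk=2)
```

In this example the 0.1 box is cut by the top-k step, and the 0.8 box is suppressed because its IoU with the 0.9 box exceeds the 0.7 threshold.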
- __init__(anchor_generator, box_decoder=None, num_proposals_pre_nms_train=2000, num_proposals_pre_nms_test=1000, max_per_img=1000, proposal_nms_threshold=0.7, min_proposal_size=(0, 0))
Creates an instance of the class.
- Parameters:
anchor_generator (AnchorGenerator) – Creates the anchor grid that serves as the reference for bounding box regression.
box_decoder (DeltaXYWHBBoxDecoder, optional) – Decodes box energies predicted by the network into 2D bounding box parameters. Defaults to None. If None, uses the default decoder.
num_proposals_pre_nms_train (int, optional) – How many boxes are kept prior to NMS during training. Defaults to 2000.
num_proposals_pre_nms_test (int, optional) – How many boxes are kept prior to NMS during inference. Defaults to 1000.
max_per_img (int, optional) – Maximum boxes per image. Defaults to 1000.
proposal_nms_threshold (float, optional) – NMS threshold on proposal boxes. Defaults to 0.7.
min_proposal_size (tuple[int, int], optional) – Minimum size of a proposal box. Defaults to (0, 0).
- forward(class_outs, regression_outs, images_hw)
Compute proposals from RPN network outputs.
Generates the anchor grid for all scales. Then, for each batch element, gathers the classification, regression, and anchor pairs for all scales, decodes those pairs into proposals, and post-processes them with NMS.
- Parameters:
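class_outs (list[torch.Tensor]) – Classification outputs of shape [N, 1 * A, H, W], one tensor per scale, where N is the batch size and A is the number of anchors per cell.
regression_outs (list[torch.Tensor]) – Regression outputs of shape [N, 4 * A, H, W], one tensor per scale.
images_hw (list[tuple[int, int]]) – List of image sizes as (height, width).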
- Returns:
Proposal boxes and scores.
- Return type:
- class RPNLosses(rpn_loss_cls: torch.Tensor, rpn_loss_bbox: torch.Tensor)
RPN loss container.
-
rpn_loss_cls:
Tensor Alias for field number 0
-
rpn_loss_bbox:
Tensor Alias for field number 1
- class RPNLoss(anchor_generator, box_encoder, matcher=None, sampler=None, loss_cls=<function binary_cross_entropy_with_logits>, loss_bbox=<function l1_loss>)[source]
Loss of region proposal network.
- __init__(anchor_generator, box_encoder, matcher=None, sampler=None, loss_cls=<function binary_cross_entropy_with_logits>, loss_bbox=<function l1_loss>)[source]
Creates an instance of the class.
- Parameters:
anchor_generator (AnchorGenerator) – Generates anchor grid priors.
box_encoder (DeltaXYWHBBoxEncoder) – Encodes bounding boxes to the desired network output.
matcher (Matcher, optional) – Matches ground truth boxes to anchor grid priors. Defaults to None. If None, uses MaxIoUMatcher.
sampler (Sampler, optional) – Samples anchors for training. Defaults to None. If None, uses RandomSampler.
loss_cls (TorchLossFunc) – Classification loss function. Defaults to F.binary_cross_entropy_with_logits.
loss_bbox (TorchLossFunc) – Regression loss function. Defaults to l1_loss.
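As an illustration of what a max-IoU matcher does with the anchor grid, the sketch below labels each anchor by its best overlap with any ground truth box. The thresholds of 0.7 and 0.3 follow the original Faster R-CNN recipe and are an assumption here, not confirmed defaults of MaxIoUMatcher; iou and match_anchors are hypothetical helpers:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return (label, gt_index) per anchor: 1 positive, 0 negative, -1 ignored."""
    matches = []
    for anchor in anchors:
        overlaps = [iou(anchor, gt) for gt in gt_boxes]
        # Index of the best-overlapping ground truth box, or -1 if there is none.
        best = max(range(len(overlaps)), key=overlaps.__getitem__, default=-1)
        best_iou = overlaps[best] if best >= 0 else 0.0
        if best_iou >= pos_thr:
            matches.append((1, best))
        elif best_iou < neg_thr:
            matches.append((0, -1))
        else:
            matches.append((-1, -1))  # in-between overlap: ignored
    return matches


anchors = [(0, 0, 10, 10), (0, 0, 10, 20), (50, 50, 60, 60)]
gt_boxes = [(0, 0, 10, 12)]
matches = match_anchors(anchors, gt_boxes)
```

A sampler would then draw a subset of the positive and negative anchors, and the ignored ones would not contribute to the loss.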
- forward(cls_outs, reg_outs, target_boxes, images_hw, target_class_ids=None)
Compute RPN classification and regression losses.
- Parameters:
cls_outs (list[torch.Tensor]) – Network classification outputs at all scales.
reg_outs (list[torch.Tensor]) – Network regression outputs at all scales.
target_boxes (list[torch.Tensor]) – Target bounding boxes.
images_hw (list[tuple[int, int]]) – Image dimensions without padding.
target_class_ids (list[torch.Tensor] | None) – Target class labels.
- Returns:
Classification and regression losses.
- Return type:
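The two default loss functions can be written out by hand to show what the returned values measure. This sketch assumes mean reduction and operates on plain Python lists; how the library normalizes the losses over sampled anchors is not stated in this page. The helper names mirror the defaults but are local re-implementations, not the PyTorch functions:

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy on raw logits, in the numerically stable form."""
    total = 0.0
    for x, z in zip(logits, targets):
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


def l1_loss(pred, target):
    """Mean absolute error over all box-delta coordinates."""
    diffs = [abs(p - t) for pb, tb in zip(pred, target) for p, t in zip(pb, tb)]
    return sum(diffs) / len(diffs)


# Objectness logits and labels for three sampled anchors (1 = object).
logits = [2.0, -1.0, 0.0]
labels = [1.0, 0.0, 0.0]
rpn_loss_cls = bce_with_logits(logits, labels)

# Predicted vs. encoded target deltas for the single positive anchor.
rpn_loss_bbox = l1_loss([(0.1, 0.2, 0.0, 0.0)], [(0.0, 0.0, 0.0, 0.0)])
```

Only positive anchors have a regression target, so the box loss is computed on that subset, while the classification loss covers both positive and negative samples.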