deepness.processing.models.detector.Detector

class Detector(model_file_path: str)

Bases: ModelBase

Class implementing object detection features

The Detector model detects objects in images. It follows the style of YOLOv5/YOLOv7 models.

Methods

check_loaded_model_outputs

Check if model outputs are valid. A valid model has 1 or 2 output layers, an output layer shape of length 3, and a batch size of 1.

crop_mask

get_channel_name

Get channel name by id if exists in model metadata

get_class_display_name

Get class display name

get_detector_type

Get detector type from metadata if exists

get_input_shape

Get shape of the input for the model

get_input_size_in_pixels

Get number of input pixels in x and y direction (the same value)

get_metadata_detection_confidence

Get detection confidence from metadata if exists

get_metadata_detection_iou_threshold

Get detection iou threshold from metadata if exists

get_metadata_model_type

Get model type from metadata

get_metadata_regression_output_scaling

Get regression output scaling from metadata if exists

get_metadata_resolution

Get resolution from metadata if exists

get_metadata_segmentation_small_segment

Get segmentation small segment from metadata if exists

get_metadata_segmentation_threshold

Get segmentation threshold from metadata if exists

get_metadata_standarization_parameters

Get standardization parameters from metadata if exists

get_metadata_tile_size

Get tile size from metadata if exists

get_metadata_tiles_overlap

Get tiles overlap from metadata if exists

get_model_batch_size

Get batch size of the model

get_model_type_from_metadata

Get model type from metadata

get_number_of_channels

Returns number of channels in the input layer

get_number_of_output_channels

Get number of output channels

get_output_shapes

Get shapes of the outputs for the model

get_outputs_channel_names

Get class names from metadata

non_max_suppression_fast

Apply non-maximum suppression to bounding boxes

postprocessing

Postprocess model output

preprocessing

Preprocess the batch of images for the model (resize, normalization, etc)

process

Process a batch of tile images

process_mask

set_inference_params

Set inference parameters

set_model_type_param

Set model type parameters

sigmoid

xywh2xyxy

Convert bounding box from (x,y,w,h) to (x1,y1,x2,y2) format

Attributes

confidence

Confidence threshold

iou_threshold

IoU threshold

model_type

Model type

check_loaded_model_outputs()

Check if model outputs are valid. A valid model:

  • has 1 or 2 output layers

  • has an output layer shape of length 3

  • has a batch size of 1
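The three rules above can be sketched as a standalone check on a list of output shapes. This is only an illustration: `outputs_look_valid` is a hypothetical helper, not part of the class, and it assumes that the shape rules apply to the first output layer and that the batch dimension comes first.

```python
from typing import List, Sequence


def outputs_look_valid(output_shapes: List[Sequence]) -> bool:
    """Hypothetical helper mirroring the three documented rules."""
    # Rule 1: the model has 1 or 2 output layers.
    if len(output_shapes) not in (1, 2):
        return False
    # Assumption: the shape rules are checked on the first output layer.
    first_shape = output_shapes[0]
    # Rule 2: the output layer shape has length 3.
    if len(first_shape) != 3:
        return False
    # Rule 3: the batch size (assumed to be the first dimension) is 1.
    return first_shape[0] == 1


print(outputs_look_valid([(1, 25200, 85)]))
print(outputs_look_valid([(4, 25200, 85)]))
```

The real method may report invalid outputs differently (for example by raising an exception); the boolean return here is a choice made for the sketch.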

confidence

Confidence threshold

Type:

float

get_channel_name(layer_id: int, channel_id: int) -> str

Get channel name by id if exists in model metadata

Parameters:
  • layer_id (int) – Layer id (index of the model output layer)

  • channel_id (int) – Channel id (index in the output tensor)

Returns:

Channel name or empty string if not found

Return type:

str
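The "name or empty string" behaviour can be sketched with a plain nested-list lookup. `channel_name_or_empty` is a hypothetical function, and the `names` argument is assumed to have the same structure as the value returned by get_outputs_channel_names (one list of names per model output, or None).

```python
from typing import List, Optional


def channel_name_or_empty(
    names: Optional[List[List[str]]], layer_id: int, channel_id: int
) -> str:
    """Hypothetical lookup that never raises and falls back to ''."""
    if names is None:
        return ""  # no class names in the metadata
    if not 0 <= layer_id < len(names):
        return ""  # unknown output layer
    layer_names = names[layer_id]
    if not 0 <= channel_id < len(layer_names):
        return ""  # unknown channel in that layer
    return layer_names[channel_id]


names = [["car", "tree"], ["mask"]]
print(channel_name_or_empty(names, 0, 1))
```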

classmethod get_class_display_name()

Get class display name

Returns:

Class display name

Return type:

str

get_detector_type() -> str | None

Get detector type from metadata if exists

Returns:

Detector type (string value of the DetectorType enum) or None if not found

Return type:

Optional[str]

get_input_shape() -> tuple

Get shape of the input for the model

Returns:

Shape of the input (batch_size, channels, height, width)

Return type:

tuple

get_input_size_in_pixels() -> int

Get number of input pixels in x and y direction (the same value)

Returns:

Number of pixels in x and y direction

Return type:

int
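Since the input shape is documented as (batch_size, channels, height, width) and the pixel count is the same in x and y, the relation between the two getters can be sketched as follows. `input_size_in_pixels` is a hypothetical free function, and raising on a non-square input is an assumption made for the sketch.

```python
def input_size_in_pixels(input_shape: tuple) -> int:
    """Hypothetical helper: side length of a square model input."""
    batch_size, channels, height, width = input_shape
    if height != width:
        # Assumption: only square inputs are supported.
        raise ValueError(f"Expected a square input, got {height}x{width}")
    return height


print(input_size_in_pixels((1, 3, 640, 640)))
```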

get_metadata_detection_confidence() -> float | None

Get detection confidence from metadata if exists

Returns:

Detection confidence or None if not found

Return type:

Optional[float]

get_metadata_detection_iou_threshold() -> float | None

Get detection iou threshold from metadata if exists

Returns:

Detection iou threshold or None if not found

Return type:

Optional[float]
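Because the metadata getters return None when a value is missing, callers typically need a fallback. A minimal sketch of that pattern is shown below; `value_or_default` and the default numbers are hypothetical and not taken from the library.

```python
from typing import Optional


def value_or_default(metadata_value: Optional[float], default: float) -> float:
    """Hypothetical helper: use the metadata value unless it is missing."""
    # Compare with None explicitly so that a stored 0.0 is not replaced.
    return default if metadata_value is None else metadata_value


# Hypothetical defaults, used only when the metadata has no value.
confidence = value_or_default(None, 0.5)
iou_threshold = value_or_default(0.0, 0.5)
print(confidence, iou_threshold)
```

The explicit `is None` test matters here: writing `metadata_value or default` would silently discard a legitimate threshold of 0.0.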

get_metadata_model_type() -> str | None

Get model type from metadata

Returns:

Model type or None if not found

Return type:

Optional[str]

get_metadata_regression_output_scaling() -> float | None

Get regression output scaling from metadata if exists

Returns:

Regression output scaling or None if not found

Return type:

Optional[float]

get_metadata_resolution() -> float | None

Get resolution from metadata if exists

Returns:

Resolution or None if not found

Return type:

Optional[float]

get_metadata_segmentation_small_segment() -> int | None

Get segmentation small segment from metadata if exists

Returns:

Segmentation small segment or None if not found

Return type:

Optional[int]

get_metadata_segmentation_threshold() -> float | None

Get segmentation threshold from metadata if exists

Returns:

Segmentation threshold or None if not found

Return type:

Optional[float]

get_metadata_standarization_parameters() -> StandardizationParameters | None

Get standardization parameters from metadata if exists

Returns:

Standardization parameters or None if not found

Return type:

Optional[StandardizationParameters]

get_metadata_tile_size() -> int | None

Get tile size from metadata if exists

Returns:

Tile size or None if not found

Return type:

Optional[int]

get_metadata_tiles_overlap() -> int | None

Get tiles overlap from metadata if exists

Returns:

Tiles overlap or None if not found

Return type:

Optional[int]

get_model_batch_size() -> int | None

Get batch size of the model

Returns:

Batch size or None if not found (dynamic batch size)

Return type:

Optional[int]
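The distinction between a fixed and a dynamic batch size can be sketched from an input shape alone. This assumes that a dynamic dimension is represented by a non-integer placeholder (for example a symbolic name or None); `batch_size_from_shape` is a hypothetical helper.

```python
from typing import Optional, Sequence


def batch_size_from_shape(input_shape: Sequence) -> Optional[int]:
    """Hypothetical helper: fixed batch size, or None when it is dynamic."""
    first_dim = input_shape[0]
    # Assumption: a dynamic batch dimension is not stored as an int.
    return first_dim if isinstance(first_dim, int) else None


print(batch_size_from_shape([1, 3, 640, 640]))
print(batch_size_from_shape(["batch", 3, 640, 640]))
```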

classmethod get_model_type_from_metadata(model_file_path: str) -> str | None

Get model type from metadata

Parameters:

model_file_path (str) – Path to the model file

Returns:

Model type or None if not found

Return type:

Optional[str]

get_number_of_channels() -> int

Returns number of channels in the input layer

Returns:

Number of channels in the input layer

Return type:

int

get_number_of_output_channels()

Get number of output channels

Returns:

Number of output channels

Return type:

int

get_output_shapes() -> List[tuple]

Get shapes of the outputs for the model

Returns:

Shapes of the outputs (batch_size, channels, height, width)

Return type:

List[tuple]

get_outputs_channel_names() -> List[List[str]] | None

Get class names from metadata

Returns:

List of class names for each model output or None if not found

Return type:

List[List[str]] | None

iou_threshold

IoU threshold

Type:

float

model_type: DetectorType | None

Model type

Type:

Optional[DetectorType]

static non_max_suppression_fast(boxes: ndarray, probs: ndarray, iou_threshold: float) -> List

Apply non-maximum suppression to bounding boxes

Based on: https://github.com/amusi/Non-Maximum-Suppression/blob/master/nms.py

Parameters:
  • boxes (np.ndarray) – Bounding boxes in (x1,y1,x2,y2) format

  • probs (np.ndarray) – Confidence scores

  • iou_threshold (float) – IoU threshold

Returns:

List of indexes of bounding boxes to keep

Return type:

List
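For orientation, a greedy non-maximum suppression with the same inputs and output can be sketched in NumPy as below. It is not the implementation linked above and may differ from it in details (for example in how box areas are computed); `nms_sketch` is a hypothetical name.

```python
from typing import List

import numpy as np


def nms_sketch(boxes: np.ndarray, probs: np.ndarray, iou_threshold: float) -> List[int]:
    """Greedy NMS sketch: boxes are (x1, y1, x2, y2), probs are scores."""
    if len(boxes) == 0:
        return []
    boxes = np.asarray(boxes, dtype=np.float64)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(probs)[::-1]  # indexes sorted by descending score
    keep: List[int] = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box.
        inter_w = np.clip(np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]), 0, None)
        inter_h = np.clip(np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]), 0, None)
        inter = inter_w * inter_h
        union = areas[best] + areas[rest] - inter
        iou = inter / np.maximum(union, 1e-9)  # guard against zero-area boxes
        # Drop boxes that overlap the best box by more than the threshold.
        order = rest[iou <= iou_threshold]
    return keep


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]])
probs = np.array([0.9, 0.8, 0.7])
print(nms_sketch(boxes, probs, iou_threshold=0.5))
```

In this example the first two boxes overlap with an IoU of about 0.68, so with a threshold of 0.5 only the higher-scoring one and the distant third box remain.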

postprocessing(model_output)

Postprocess model output

NOTE: Maybe refactor this, as it has many added layers of checks which can be simplified.

Parameters:

model_output (list) – Model output

Returns:

Batch of lists of detections

Return type:

list

preprocessing(tiles_batched: ndarray) -> ndarray

Preprocess the batch of images for the model (resize, normalization, etc)

Parameters:

tiles_batched (np.ndarray) – Batch of images to preprocess (N,H,W,C), RGB, 0-255

Returns:

Preprocessed batch of images (N,C,H,W), RGB, 0-1

Return type:

np.ndarray
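The documented change of layout and value range, from (N,H,W,C) with values 0-255 to (N,C,H,W) with values 0-1, can be sketched in two NumPy operations. This covers only scaling and axis reordering; any resizing or other steps the real method performs are left out, and `preprocess_sketch` is a hypothetical name.

```python
import numpy as np


def preprocess_sketch(tiles_batched: np.ndarray) -> np.ndarray:
    """Scale uint8 NHWC tiles to float32 NCHW in the range 0-1."""
    scaled = tiles_batched.astype(np.float32) / 255.0  # 0-255 -> 0-1
    return np.transpose(scaled, (0, 3, 1, 2))  # (N,H,W,C) -> (N,C,H,W)


tiles = np.zeros((2, 4, 5, 3), dtype=np.uint8)
tiles[0, 1, 2, 0] = 255
out = preprocess_sketch(tiles)
print(out.shape, out.dtype)
```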

process(tiles_batched: ndarray)

Process a batch of tile images

Parameters:

tiles_batched (np.ndarray) – Batch of tile images to process (each tile [TILE_SIZE x TILE_SIZE x channels], type uint8, values 0 to 255)

Returns:

Single prediction

Return type:

np.ndarray

set_inference_params(confidence: float, iou_threshold: float)

Set inference parameters

Parameters:
  • confidence (float) – Confidence threshold

  • iou_threshold (float) – IoU threshold

set_model_type_param(model_type: DetectorType)

Set model type parameters

Parameters:

model_type (DetectorType) – Model type

static xywh2xyxy(x: ndarray) -> ndarray

Convert bounding box from (x,y,w,h) to (x1,y1,x2,y2) format

Parameters:

x (np.ndarray) – Bounding box in (x,y,w,h) format

Returns:

Bounding box in (x1,y1,x2,y2) format

Return type:

np.ndarray
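The conversion can be sketched as below, assuming the YOLO convention in which (x, y) is the box centre and (w, h) is the full width and height. The description above does not state that convention, so treat it as an assumption; `xywh2xyxy_sketch` is a hypothetical name.

```python
import numpy as np


def xywh2xyxy_sketch(x: np.ndarray) -> np.ndarray:
    """Convert (x, y, w, h) boxes to (x1, y1, x2, y2).

    Assumption: (x, y) is the box centre, as in YOLO-style outputs.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.empty_like(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2  # left edge
    y[..., 1] = x[..., 1] - x[..., 3] / 2  # top edge
    y[..., 2] = x[..., 0] + x[..., 2] / 2  # right edge
    y[..., 3] = x[..., 1] + x[..., 3] / 2  # bottom edge
    return y


print(xywh2xyxy_sketch(np.array([[10, 20, 4, 6]])))
```

Indexing with `...` lets the same function accept a single box of shape (4,) or a batch of shape (N, 4).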