blocks.datasets

Classes

class blocks.datasets.IteraitDataset(config_str)[source]

Bases: datasets.BaseDataset

WARNING: must be adapted to Andy.

Base Iterait dataset providing methods for:
  • downloading annotations from Andy

  • creating symlinks to data

  • building hipipe datasets

All Iterait dataset configs must contain data_root.

Available options to be configured are:
  • task_ids: a list of Andy task IDs to download the annotations for

  • dataset_ids: a list of Andy dataset IDs to make symlinks to

  • hipipe_dirs: a list of directories with hipipe CMake files to be built

  • hipipe_build_type: hipipe build type (optional, defaults to Debug)

  • annotator_url: Andy API url (optional, defaults to https://andy.iterait.com)

  • annotator_data_root: Andy data root (optional, defaults to /var/annotator/data, the value of DEFAULT_ANNOTATOR_DATA_ROOT)

example usage in config
dataset:
  # ...
  data_root: data
  iterait:
    task_ids: [1, 2, 3]
    dataset_ids: [42]
    hipipe_dirs: dataset
    hipipe_build_type: Release
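
The optional settings fall back to the documented class-level defaults. A minimal sketch of that lookup, assuming the decoded iterait section arrives as a plain dict. The helper resolve_options and the empty-list fallbacks are hypothetical and not part of blocks.datasets:

```python
# Values copied from the documented class constants.
DEFAULT_HIPIPE_BUILD_TYPE = 'Debug'
DEFAULT_ANNOTATOR_URL = 'https://andy.iterait.com'


def resolve_options(iterait_section):
    """Hypothetical helper: fill in the documented defaults for missing options."""
    return {
        # Empty lists for missing ID/dir options are an assumption of this sketch.
        'task_ids': iterait_section.get('task_ids', []),
        'dataset_ids': iterait_section.get('dataset_ids', []),
        'hipipe_dirs': iterait_section.get('hipipe_dirs', []),
        'hipipe_build_type': iterait_section.get('hipipe_build_type',
                                                 DEFAULT_HIPIPE_BUILD_TYPE),
        'annotator_url': iterait_section.get('annotator_url', DEFAULT_ANNOTATOR_URL),
    }


options = resolve_options({'task_ids': [1, 2, 3], 'dataset_ids': [42]})
```

Here options['hipipe_build_type'] resolves to the Debug default because the section does not set it.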

Tip

Start your work with emloop dataset init ..., which will create the data_root, download the most up-to-date annotations, create symlinks to the data, and build hipipe streams if necessary.

Tip

If you wrap a hipipe dataset, use the blocks.utils.reflection.try_import() function. Otherwise, you will not be able to create the dataset until you build it manually.
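
The idea behind a guarded import can be sketched as follows. This is not the actual blocks.utils.reflection.try_import() implementation, whose signature is not documented here. try_import_sketch and the module name are hypothetical:

```python
import importlib


def try_import_sketch(module_name):
    """Hypothetical stand-in: return the module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# A made-up name for a compiled hipipe module that has not been built yet.
stream_module = try_import_sketch('hypothetical_hipipe_stream_module')
# A module that is always available, for comparison.
json_module = try_import_sketch('json')
```

With such a guard, the dataset class can still be created before the build step has produced the compiled module, and the missing module is detected only when the stream is actually requested.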

Inheritance diagram of IteraitDataset

ANNOTATOR_DATA_ROOT_CFGNAME = 'annotator_data_root'

Name of the Annotator data root configuration.

ANNOTATOR_PASSWORD_ENV_VARIABLE = 'ANDY_PASS'

Name of the Annotator password environment variable.

ANNOTATOR_URL_CFGNAME = 'annotator_url'

Name of the Annotator URL configuration.

ANNOTATOR_USERNAME_ENV_VARIABLE = 'ANDY_USER'

Name of the Annotator username environment variable.
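
A minimal sketch of reading the Annotator credentials from the two environment variables named above. The helper read_credentials and its error handling are assumptions and may differ from how IteraitDataset consumes these variables:

```python
import os

ANNOTATOR_USERNAME_ENV_VARIABLE = 'ANDY_USER'
ANNOTATOR_PASSWORD_ENV_VARIABLE = 'ANDY_PASS'


def read_credentials(environ=os.environ):
    """Hypothetical helper: return (username, password) or raise if either is unset."""
    try:
        return (environ[ANNOTATOR_USERNAME_ENV_VARIABLE],
                environ[ANNOTATOR_PASSWORD_ENV_VARIABLE])
    except KeyError as missing:
        raise RuntimeError(
            'Environment variable {} is not set.'.format(missing)) from None
```

Passing the mapping as an argument keeps the helper easy to exercise without touching the real process environment.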

DATASET_IDS_CFGNAME = 'dataset_ids'

Name of the Annotator dataset IDs configuration.

DATA_ROOT_CFGNAME = 'data_root'

Name of the data root configuration.

DEFAULT_ANNOTATOR_DATA_ROOT = '/var/annotator/data'

Default value for the Annotator data root configuration.

DEFAULT_ANNOTATOR_URL = 'https://andy.iterait.com'

Default value for the Annotator URL configuration.

DEFAULT_HIPIPE_BUILD_TYPE = 'Debug'

Default value for the hipipe build type configuration.

HIPIPE_BUILD_TYPE_CFGNAME = 'hipipe_build_type'

Name of the hipipe build type configuration.

HIPIPE_DIRS_CFGNAME = 'hipipe_dirs'

Name of the hipipe dirs configuration.

ITERAIT_SECTION_CFGNAME = 'iterait'

Name of the dataset iterait config section.

TASK_IDS_CFGNAME = 'task_ids'

Name of the Annotator task IDs configuration.

__init__(config_str)[source]

Create new dataset.

Decode the given YAML config string and pass the obtained **kwargs to _configure_dataset().

Parameters

config_str (str) – dataset configuration as YAML string

build()[source]

Build hipipe streams.

Return type

None

download_annotations()[source]

Download the most up-to-date annotations for the configured task IDs.

Return type

None

init()[source]
Initialize the dataset, in particular:
  • create data_root dir if necessary

  • symlink all the data dirs if dataset_ids is specified

  • download the annotations if task_ids is specified

  • build hipipe streams if hipipe_dirs is specified

Return type

None
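
The conditional steps listed for init() can be sketched as below. InitOrderSketch is a hypothetical stand-in that only records which steps would run. It performs no file system or network operations, and the step order simply follows the list above:

```python
class InitOrderSketch:
    """Hypothetical stand-in that only records which steps init() would run."""

    def __init__(self, task_ids=(), dataset_ids=(), hipipe_dirs=()):
        self.task_ids = task_ids
        self.dataset_ids = dataset_ids
        self.hipipe_dirs = hipipe_dirs
        self.steps = []

    def init(self):
        # The data root is prepared regardless of the other options.
        self.steps.append('create data_root')
        if self.dataset_ids:
            self.steps.append('symlink data')
        if self.task_ids:
            self.steps.append('fetch annotations')
        if self.hipipe_dirs:
            self.steps.append('build hipipe')


sketch = InitOrderSketch(task_ids=[1], hipipe_dirs=['dataset'])
sketch.init()
```

Since dataset_ids is empty in this example, the symlink step is skipped.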

Make symlinks to the data dirs for the configured dataset IDs.

Return type

None

class blocks.datasets.RPNDataset(config_str)[source]

Bases: blocks.datasets.iterait_dataset.IteraitDataset

Base RPN dataset wrapper for region classification and regression as the target.

This dataset may be adjusted for any region type such as rectangles or ellipses.

Inheritance diagram of RPNDataset

__init__(config_str)[source]

Create new RPNDataset.

Parameters

config_str (str) – yaml encoded configuration string

abstract _gen_anchors(pos=(0, 0))[source]

Generate anchors at the given position.

Parameters

pos – (x, y) anchor centers

Return type

Iterable[Sequence[float]]

Returns

anchors generator

_overlap_to_label(overlap)[source]

Map overlap ratio to a -1/0/1 label. The semantics are the following:
  • -1: neutral anchor (neither positive nor negative)

  • 0: negative anchor (no object is within it)

  • 1: positive anchor (object region has high overlap with the anchor)

Parameters

overlap (float) – overlap ratio in 0-1 interval

Return type

int

Returns

anchor label
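
A minimal sketch of such a mapping. Both thresholds are illustrative assumptions (0.3 and 0.7 are common choices in region proposal networks) and are not taken from RPNDataset:

```python
def overlap_to_label(overlap, negative_threshold=0.3, positive_threshold=0.7):
    """Map an overlap ratio in [0, 1] to -1 (neutral), 0 (negative) or 1 (positive).

    Both thresholds are assumptions made for this sketch.
    """
    if overlap >= positive_threshold:
        return 1
    if overlap <= negative_threshold:
        return 0
    return -1
```

Neutral anchors are typically excluded from the classification loss, which is why they get a label distinct from both classes.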

abstract anchor_region_overlap(anchor, region)[source]

Calculate anchor-region overlap.

Parameters
  • anchor – anchor to compare with the region

  • region – region to compare with the anchor

Return type

float

Returns

overlap ratio in 0-1 interval

abstract apply_diff(anchor, diff)[source]

Apply predicted diff to the given anchor and return the result.

Parameters
  • anchor – anchor to be adjusted

  • diff – predicted diff to apply to the anchor

Return type

Sequence[float]

Returns

new anchor

configure_shape(features_spatial_dim, pool_amount)[source]

Configure the dataset with the spatial dimension of the features and the amount of pooling applied to the input images.

Note

The dataset has to be configured at least once prior to utilizing the streams.

Parameters
  • features_spatial_dim (Tuple[int, int]) – (height, width) feature spatial dimension

  • pool_amount (int) – amount of pooling of the input images (two maxpool-2 would yield 4)

Return type

None
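
A short sketch of how the two arguments might be derived. The input size is hypothetical, and the floor division assumes pooling without padding:

```python
import math

# Two maxpool-2 layers, as in the pool_amount description above.
pool_sizes = [2, 2]
pool_amount = math.prod(pool_sizes)

# Hypothetical input image size in pixels.
image_height, image_width = 480, 640

# (height, width) of the feature map, assuming each pooling layer floors the size.
features_spatial_dim = (image_height // pool_amount, image_width // pool_amount)

# The values would then be passed on as
# dataset.configure_shape(features_spatial_dim, pool_amount)
```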

abstract diffs_dim()[source]

Return trainable region-anchor diffs dimension. E.g. for rectangles this would be 4.

Return type

int

abstract get_diff(anchor, region)[source]

Get trainable difference of the given anchor to the given region.

Parameters
  • anchor – anchor to compute the difference from

  • region – region to compute the difference to

Return type

Sequence[float]

Returns

anchor-region trainable difference

abstract get_regions(batch, index)[source]

Return an iterable of regions for the given batch and example index.

Return type

Iterable[Any]

get_sensible_anchor_indices(region)[source]

Return an iterable of anchor indices (x, y) to be considered for the given region.

By default, all anchor indices are returned, which may be computationally expensive. You may want to limit the number of returned indices, e.g. return only the anchors within a certain radius around the region.

Return type

Iterable[Tuple[int, int]]
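
One way to restrict the candidates is sketched below. The helper anchor_indices_near is hypothetical, and it assumes the region center has already been converted to feature-map coordinates (for example by dividing pixel coordinates by pool_amount):

```python
def anchor_indices_near(center, radius, features_spatial_dim):
    """Yield (x, y) feature-map indices within `radius` cells of `center`.

    The neighbourhood is a square window (Chebyshev distance) clipped
    to the feature map borders.
    """
    height, width = features_spatial_dim
    cx, cy = center
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            yield x, y


nearby = list(anchor_indices_near((0, 0), 1, (3, 4)))
```

An overriding get_sensible_anchor_indices() could return such a generator instead of the full grid.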

property n_anchors_per_position

The number of anchors for each position.

transform_batch(batch)[source]

Extend the given batch with anchors target.

Warning

Works only for batches with batch size 1.

Parameters

batch (Mapping[str, Sequence[Any]]) – input batch of size 1

Return type

Mapping[str, Sequence[Any]]

Returns

batch extended with anchor targets

class blocks.datasets.RectangleRPNDataset(config_str)[source]

Bases: blocks.datasets.rpn.RPNDataset

RPNDataset implementation for rectangles.

Warning

This dataset does not implement the get_regions() method and hence cannot be used directly.

Inheritance diagram of RectangleRPNDataset

_gen_anchors(pos=(0, 0))[source]

Generate anchors at the given position.

Parameters

pos – (x, y) anchor centers

Return type

Iterable[Sequence[float]]

Returns

anchors generator
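
A minimal sketch of a rectangle anchor generator. The (cx, cy, width, height) layout, the scales and the aspect ratios are all assumptions and do not reflect the actual RectangleRPNDataset configuration:

```python
import math


def gen_rectangle_anchors(pos=(0, 0), scales=(64, 128), aspect_ratios=(0.5, 1.0, 2.0)):
    """Yield (cx, cy, width, height) anchors centered at pos.

    Each anchor keeps the area scale * scale while its width/height
    ratio follows the given aspect ratio.
    """
    cx, cy = pos
    for scale in scales:
        for ratio in aspect_ratios:
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            yield (cx, cy, width, height)


anchors = list(gen_rectangle_anchors(pos=(100, 100)))
```

In this sketch, the number of anchors per position equals len(scales) * len(aspect_ratios), which is the quantity n_anchors_per_position would report.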

anchor_region_overlap(anchor, region)[source]

Calculate the overlap ratio of two rectangles.

Parameters
  • anchor – anchor rectangle

  • region – region rectangle

Return type

float

Returns

overlap ratio in 0-1 interval
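
A common choice for an overlap ratio in the 0-1 interval is intersection over union (IoU). The sketch below assumes rectangles in a (cx, cy, width, height) layout. Both the layout and the use of IoU are assumptions rather than documented behaviour:

```python
def rectangle_iou(anchor, region):
    """Intersection over union of two (cx, cy, width, height) rectangles."""

    def corners(rect):
        # Convert the center-based layout to (x1, y1, x2, y2) corners.
        cx, cy, width, height = rect
        return cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2

    ax1, ay1, ax2, ay2 = corners(anchor)
    bx1, by1, bx2, by2 = corners(region)
    # Clamp at zero so that disjoint rectangles have an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0
```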

apply_diff(anchor, diff)[source]

Apply predicted diff to the given anchor and return the result.

Parameters
  • anchor – anchor to be adjusted

  • diff – predicted diff to apply to the anchor

Return type

Sequence[float]

Returns

new anchor

diffs_dim()[source]

Return trainable region-anchor diffs dimension.

Return type

int

get_diff(anchor, target)[source]

Get trainable difference of the given anchor to the given region.

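Parameters
  • anchor – anchor to compute the difference from

  • target – target region to compute the difference to
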
Return type

Sequence[float]

Returns

anchor-region trainable difference
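
The relationship between get_diff() and apply_diff() can be illustrated with the offset parameterization popularized by Faster R-CNN. This is a sketch under assumptions: the (cx, cy, width, height) layout and the formulas are not confirmed to match RectangleRPNDataset. It does yield four values per anchor, in line with the rectangle example given for diffs_dim():

```python
import math


def get_diff(anchor, target):
    """Offsets that move a (cx, cy, width, height) anchor onto the target."""
    ax, ay, aw, ah = anchor
    tx, ty, tw, th = target
    # Center shifts are normalized by the anchor size; sizes use a log ratio.
    return ((tx - ax) / aw, (ty - ay) / ah, math.log(tw / aw), math.log(th / ah))


def apply_diff(anchor, diff):
    """Inverse of get_diff: shift and rescale the anchor by the given offsets."""
    ax, ay, aw, ah = anchor
    dx, dy, dw, dh = diff
    return (ax + dx * aw, ay + dy * ah, aw * math.exp(dw), ah * math.exp(dh))


anchor = (10.0, 10.0, 4.0, 4.0)
target = (12.0, 9.0, 8.0, 2.0)
diff = get_diff(anchor, target)
restored = apply_diff(anchor, diff)
```

Applying the diff computed for a target to the same anchor recovers the target up to floating-point error, which is the property a network trained on these diffs relies on at prediction time.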