Dataset requirements
To ensure the highest model quality, your input data must meet certain standards. This page lists the dataset requirements in two sections: the first covers must-meet requirements, which have to be fulfilled in order to use the arivis AI toolkit at all; the second covers best practices that we highly recommend you follow to optimize your results.
Must-Meet Requirements
Requirements that must be met for ML training and segmentation to run.
Homogeneous bit depth
The bit depth is the same for all training and subsequent segmentation images.
Homogeneous channel number
The number of channels is the same for all training and subsequent segmentation images. The alpha channel in PNG files is removed automatically and is not used for training or segmentation.
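The two homogeneity rules above can be checked before uploading. The sketch below is a hypothetical pre-flight helper (the function names and the `(bit_depth, channels)` tuple convention are our own, not part of the toolkit); it mirrors the documented behavior that a PNG alpha channel is dropped, so RGBA images count as 3-channel.

```python
def plane_signature(bit_depth, channels):
    """Normalise one image's properties. The toolkit strips the PNG alpha
    channel automatically, so RGBA (4 channels) counts as 3 channels here."""
    return bit_depth, 3 if channels == 4 else channels

def dataset_is_homogeneous(planes):
    """planes: iterable of (bit_depth, channels) tuples, one per image.
    Returns True when all images share a single bit depth and channel count
    after alpha removal, i.e. when both must-meet rules are satisfied."""
    signatures = {plane_signature(bits, chans) for bits, chans in planes}
    return len(signatures) == 1
```

For example, a mix of 8-bit RGB and 8-bit RGBA images passes (alpha is discarded), while mixing 8-bit and 16-bit images does not.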
Minimum image and dataset size
For training, the minimum image size is 128 × 128 pixels per plane (recommended: 1024 × 1024 pixels), and the minimum total pixel count across the dataset is 724 × 724 pixels.
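These two minimums (a per-plane floor of 128 pixels on each axis, plus a 724 × 724 total pixel count) can be validated together. This is a minimal sketch under that reading of the rule; the function name and the list-of-shapes input are hypothetical conveniences, not a toolkit API.

```python
def dataset_meets_minimums(plane_shapes, min_axis=128, min_total=724 * 724):
    """plane_shapes: list of (height, width) tuples, one per training plane.
    Every plane must measure at least 128 px on each axis, and the summed
    pixel count over all planes must reach 724 * 724."""
    per_plane_ok = all(h >= min_axis and w >= min_axis for h, w in plane_shapes)
    total_ok = sum(h * w for h, w in plane_shapes) >= min_total
    return per_plane_ok and total_ok
```

Note that a single 128 × 128 plane satisfies the per-plane rule but not the total-pixel rule, so a dataset of minimum-size planes needs many of them.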
Maximum image size
Currently, all data processing happens in memory, which limits the size of the images that can be processed. The limits below were measured for specific sets of image parameters; the real limits for your application may differ depending on image sizes, dimensions, and properties.
- Maximum processable 2D plane size for images without Time or Z dimension

| Image format | Image size (pixels) |
|---|---|
| RGB 8 bit | 16 000 x 16 000 |
| Grayscale 16 bit | 17 000 x 17 000 |
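Because processing is in-memory, these plane limits track the raw per-pixel footprint. The helper below is a hypothetical estimator (not a toolkit function) for the uncompressed size of one plane; actual memory use during processing is higher, since intermediate buffers are not counted.

```python
def plane_bytes(width, height, channels, bit_depth):
    """Raw, uncompressed in-memory footprint of one 2D plane in bytes.
    Ignores processing overhead, so treat the result as a lower bound."""
    return width * height * channels * (bit_depth // 8)

# The two table limits correspond to raw footprints of the same order:
# RGB 8 bit,       16 000 x 16 000 -> 768 000 000 bytes (~0.77 GB)
# grayscale 16 bit, 17 000 x 17 000 -> 578 000 000 bytes (~0.58 GB)
```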
- Maximum number of planes along the Time or Z dimension for the stack of images

Stack of RGB 8-bit images

| Image size (pixels) | Number of planes |
|---|---|
| 512 x 512 | 4000 |
| 1024 x 1024 | 3600 |
| 2646 x 2056 | 750 |
Stack of grayscale 16-bit images

| Image size (pixels) | Number of planes |
|---|---|
| 512 x 512 | 6000 |
| 1024 x 1024 | 5500 |
| 2464 x 2056 | 1100 |
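For plane sizes not listed in the tables, a rough upper bound can be derived from a memory budget. This is a back-of-the-envelope sketch only: the tabulated limits do not correspond to a single fixed budget (processing overhead varies with image size and properties), and the function and its `budget_bytes` parameter are our own illustration, not part of the toolkit.

```python
def max_planes(width, height, channels, bit_depth, budget_bytes):
    """Estimate how many uncompressed planes of the given geometry fit in
    a memory budget. A crude upper bound: real limits are lower because
    processing needs working memory beyond the raw pixel data."""
    bytes_per_plane = width * height * channels * (bit_depth // 8)
    return budget_bytes // bytes_per_plane
```

Comparing against the tables (e.g. 4000 planes of 512 x 512 RGB 8-bit is roughly 3 GB of raw pixels, while 3600 planes of 1024 x 1024 is roughly 11 GB) confirms that no single budget explains every row, so always validate against the published limits.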
Best practices
Recommendations that should be followed to produce optimal results.
Same acquisition mode
Images are acquired in the same acquisition mode for both training and segmentation (e.g. both in reflection mode, rather than training on reflected images and segmenting transmitted images).
Size of objects/representative region
- Objects or representative regions should be of similar pixel size across training and segmentation images.
- For optimal results, object sizes should be no larger than 320 × 320 pixels.
Image size
- For optimal performance, keep image sizes homogeneous during training: the smallest image axis across all training images determines the area that the trained model "sees", so mismatched sizes can exclude valuable image context and reduce segmentation accuracy. If all training images are larger than 1024 × 1024 pixels, this recommendation does not apply.
- For segmentation and continued training, use images of at least 1024 × 1024 pixels, or at least the smallest image axis across all images used in the first training, whichever is smaller. This minimum size ensures that the trained model can effectively capture the necessary features in the image, leading to better segmentation results.
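The recommended minimum size above can be computed from the first-training set. A minimal sketch, assuming shapes are given as `(height, width)` tuples; the function name is hypothetical.

```python
def recommended_min_size(training_shapes, default_min=1024):
    """Recommended minimum image size for segmentation and continued
    training: the smaller of 1024 px and the smallest axis across all
    images used in the first training."""
    smallest_axis = min(min(h, w) for h, w in training_shapes)
    return min(default_min, smallest_axis)
```

For instance, if the first training included a 900 × 1500 image, later images should measure at least 900 pixels on each axis; if all first-training images exceeded 1024 pixels, the 1024-pixel default applies.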
Have questions? We're here to help.
Have a question about dataset requirements? Feel free to reach out to our support team.