Dataset requirements
To ensure the highest model quality, your input data must meet certain standards. This page lists the dataset requirements in two sections: the first covers must-meet requirements, which have to be fulfilled in order to use the arivis AI toolkit at all; the second covers best practices that we highly recommend you follow to optimize your results.
Must-Meet Requirements
Requirements that must be met for ML training and segmentation to run.
Homogeneous bit depth
The bit depth is the same for all training and subsequent segmentation images.
Homogeneous channel number
The number of channels is the same for all training and subsequent segmentation images. The alpha channel in PNG files is removed automatically and is not used for training or segmentation.
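The two homogeneity rules above can be checked before uploading. The sketch below is a hypothetical pre-flight helper (the function names and the `(bit_depth, channels)` tuple convention are our own, not part of the toolkit); it mirrors the documented behavior that a PNG alpha channel is dropped, so RGBA images count as 3-channel.

```python
def plane_signature(bit_depth, channels):
    """Normalise one image's properties. The toolkit strips the PNG alpha
    channel automatically, so RGBA (4 channels) counts as 3 channels here."""
    return bit_depth, 3 if channels == 4 else channels

def dataset_is_homogeneous(planes):
    """planes: iterable of (bit_depth, channels) tuples, one per image.
    Returns True when all images share a single bit depth and channel count
    after alpha removal, i.e. when both must-meet rules are satisfied."""
    signatures = {plane_signature(bits, chans) for bits, chans in planes}
    return len(signatures) == 1
```

For example, a mix of 8-bit RGB and 8-bit RGBA images passes (alpha is discarded), while mixing 8-bit and 16-bit images does not.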
Minimum image and dataset size
For training, the minimum image size is 128 × 128 pixels per plane (recommended: 1024 × 1024 pixels), and the minimum total pixel count across the dataset is 724 × 724 pixels.
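These two minimums (a per-plane floor of 128 pixels on each axis, plus a 724 × 724 total pixel count) can be validated together. This is a minimal sketch under that reading of the rule; the function name and the list-of-shapes input are hypothetical conveniences, not a toolkit API.

```python
def dataset_meets_minimums(plane_shapes, min_axis=128, min_total=724 * 724):
    """plane_shapes: list of (height, width) tuples, one per training plane.
    Every plane must measure at least 128 px on each axis, and the summed
    pixel count over all planes must reach 724 * 724."""
    per_plane_ok = all(h >= min_axis and w >= min_axis for h, w in plane_shapes)
    total_ok = sum(h * w for h, w in plane_shapes) >= min_total
    return per_plane_ok and total_ok
```

Note that a single 128 × 128 plane satisfies the per-plane rule but not the total-pixel rule, so a dataset of minimum-size planes needs many of them.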
Maximum image size
Currently, all data processing happens in memory, which limits the size of the images that can be processed. The limits below were measured for specific sets of image parameters; the real limits for your application may differ depending on image sizes, dimensions, and properties.
- Maximum processable 2D plane size for images without Time or Z dimension

| Image format | Image size (pixels) |
|---|---|
| RGB 8 bit | 16 000 x 16 000 |
| Grayscale 16 bit | 17 000 x 17 000 |
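Because processing is in-memory, these plane limits track the raw per-pixel footprint. The helper below is a hypothetical estimator (not a toolkit function) for the uncompressed size of one plane; actual memory use during processing is higher, since intermediate buffers are not counted.

```python
def plane_bytes(width, height, channels, bit_depth):
    """Raw, uncompressed in-memory footprint of one 2D plane in bytes.
    Ignores processing overhead, so treat the result as a lower bound."""
    return width * height * channels * (bit_depth // 8)

# The two table limits correspond to raw footprints of the same order:
# RGB 8 bit,       16 000 x 16 000 -> 768 000 000 bytes (~0.77 GB)
# grayscale 16 bit, 17 000 x 17 000 -> 578 000 000 bytes (~0.58 GB)
```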
- Maximum number of planes along the Time or Z dimension for the stack of images

Stack of RGB 8-bit images

| Image size (pixels) | Number of planes |
|---|---|
| 512 x 512 | 4000 |
| 1024 x 1024 | 3600 |
| 2646 x 2056 | 750 |
Stack of grayscale 16-bit images

| Image size (pixels) | Number of planes |
|---|---|
| 512 x 512 | 6000 |
| 1024 x 1024 | 5500 |
| 2464 x 2056 | 1100 |
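For plane sizes not listed in the tables, a rough upper bound can be derived from a memory budget. This is a back-of-the-envelope sketch only: the tabulated limits do not correspond to a single fixed budget (processing overhead varies with image size and properties), and the function and its `budget_bytes` parameter are our own illustration, not part of the toolkit.

```python
def max_planes(width, height, channels, bit_depth, budget_bytes):
    """Estimate how many uncompressed planes of the given geometry fit in
    a memory budget. A crude upper bound: real limits are lower because
    processing needs working memory beyond the raw pixel data."""
    bytes_per_plane = width * height * channels * (bit_depth // 8)
    return budget_bytes // bytes_per_plane
```

Comparing against the tables (e.g. 4000 planes of 512 x 512 RGB 8-bit is roughly 3 GB of raw pixels, while 3600 planes of 1024 x 1024 is roughly 11 GB) confirms that no single budget explains every row, so always validate against the published limits.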
Best practices
Recommendations that should be followed to produce optimal results.
Same acquisition mode
Images are acquired in the same acquisition mode for both training and segmentation (e.g. both in reflection mode, rather than training on reflected images and segmenting transmitted images).
Size of objects/representative region
- Objects or representative regions should be of similar pixel size across training and segmentation images.
- For optimal results, object sizes should be no larger than 320 × 320 pixels.
Image size
- For optimal performance, keep image sizes homogeneous during training: the smallest image axis across all training images determines the area that the trained model "sees", so mismatched sizes can exclude valuable image context and reduce segmentation accuracy. If all training images are larger than 1024 × 1024 pixels, this recommendation does not apply.
- For segmentation and continued training, use images of at least 1024 × 1024 pixels, or at least the smallest image axis across all images used in the first training, whichever is smaller. This minimum size ensures that the trained model can effectively capture the necessary features in the image, leading to better segmentation results.
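The recommended minimum size above can be computed from the first-training set. A minimal sketch, assuming shapes are given as `(height, width)` tuples; the function name is hypothetical.

```python
def recommended_min_size(training_shapes, default_min=1024):
    """Recommended minimum image size for segmentation and continued
    training: the smaller of 1024 px and the smallest axis across all
    images used in the first training."""
    smallest_axis = min(min(h, w) for h, w in training_shapes)
    return min(default_min, smallest_axis)
```

For instance, if the first training included a 900 × 1500 image, later images should measure at least 900 pixels on each axis; if all first-training images exceeded 1024 pixels, the 1024-pixel default applies.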
Have questions? We're here to help.
Have a question about dataset requirements? Feel free to reach out to our support team.