Dataloop’s SDK and CLI documentation

Drive your AI to production with end-to-end data management, automation pipelines and a quality-first data labeling platform

Command Line Interface

CLI for Dataloop

usage: dlp [-h] [-v]
           {shell,upgrade,logout,login,login-token,login-secret,login-m2m,init,checkout-state,help,version,api,projects,datasets,items,videos,app,services,triggers,deploy,generate,packages,ls,pwd,cd,mkdir,clear,exit}
           ...

Positional Arguments

operation

Possible choices: shell, upgrade, logout, login, login-token, login-secret, login-m2m, init, checkout-state, help, version, api, projects, datasets, items, videos, app, services, triggers, deploy, generate, packages, ls, pwd, cd, mkdir, clear, exit

supported operations

Named Arguments

-v, --version

dtlpy version

Default: False

Sub-commands:

shell

Open interactive Dataloop shell

dlp shell [-h]

upgrade

Update dtlpy package

dlp upgrade [-h] [-u ]
optional named arguments
-u, --url

Package URL. Default: ‘dtlpy’

logout

Logout

dlp logout [-h]

login

Login using web Auth0 interface

dlp login [-h]

login-token

Login by passing a valid token

dlp login-token [-h] -t 
required named arguments
-t, --token

valid token

login-secret

Login with client id and secret

dlp login-secret [-h] [-e ] [-p ] [-i ] [-s ]
required named arguments
-e, --email

user email

-p, --password

user password

-i, --client-id

client id

-s, --client-secret

client secret

login-m2m

Login with client id and secret

dlp login-m2m [-h] [-e ] [-p ] [-i ] [-s ]
required named arguments
-e, --email

user email

-p, --password

user password

-i, --client-id

client id

-s, --client-secret

client secret

init

Initialize a .dataloop context

dlp init [-h]

checkout-state

Print checkout state

dlp checkout-state [-h]

help

Get help

dlp help [-h]

version

DTLPY SDK version

dlp version [-h]

api

Connection and environment

dlp api [-h] {info,setenv} ...
Positional Arguments
api

Possible choices: info, setenv

gate operations

Sub-commands:
info

Print api information

dlp api info [-h]
setenv

Set platform environment

dlp api setenv [-h] -e 
required named arguments
-e, --env

working environment
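
For example, to switch the CLI to a different platform environment (the environment name below is a placeholder for one available in your deployment):

dlp api setenv -e prod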

projects

Operations with projects

dlp projects [-h] {ls,create,checkout,web} ...
Positional Arguments
projects

Possible choices: ls, create, checkout, web

projects operations

Sub-commands:
ls

List all projects

dlp projects ls [-h]
create

Create a new project

dlp projects create [-h] [-p ]
required named arguments
-p, --project-name

project name

checkout

checkout a project

dlp projects checkout [-h] [-p ]
required named arguments
-p, --project-name

project name

web

Open in web browser

dlp projects web [-h] [-p ]
optional named arguments
-p, --project-name

project name

datasets

Operations with datasets

dlp datasets [-h] {web,ls,create,checkout} ...
Positional Arguments
datasets

Possible choices: web, ls, create, checkout

datasets operations

Sub-commands:
web

Open in web browser

dlp datasets web [-h] [-p ] [-d ]
optional named arguments
-p, --project-name

project name

-d, --dataset-name

dataset name

ls

List of datasets in project

dlp datasets ls [-h] [-p ]
optional named arguments
-p, --project-name

project name. Default taken from checked out (if checked out)

create

Create a new dataset

dlp datasets create [-h] -d  [-p ] [-c]
required named arguments
-d, --dataset-name

dataset name

optional named arguments
-p, --project-name

project name. Default taken from checked out (if checked out)

-c, --checkout

checkout the new dataset

Default: False
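
For example, to create a dataset in a specific project and check it out in one step (names below are placeholders):

dlp datasets create -d my-dataset -p my-project -c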

checkout

checkout a dataset

dlp datasets checkout [-h] [-d ] [-p ]
required named arguments
-d, --dataset-name

dataset name

optional named arguments
-p, --project-name

project name. Default taken from checked out (if checked out)

items

Operations with items

dlp items [-h] {web,ls,upload,download} ...
Positional Arguments
items

Possible choices: web, ls, upload, download

items operations

Sub-commands:
web

Open in web browser

dlp items web [-h] [-r ] [-p ] [-d ]
required named arguments
-r, --remote-path

remote path

optional named arguments
-p, --project-name

project name

-d, --dataset-name

dataset name

ls

List of items in dataset

dlp items ls [-h] [-p ] [-d ] [-o ] [-r ] [-t ]
optional named arguments
-p, --project-name

project name. Default taken from checked out (if checked out)

-d, --dataset-name

dataset name. Default taken from checked out (if checked out)

-o, --page

page number (integer)

Default: 0

-r, --remote-path

remote path

-t, --type

Item type

upload

Upload directory to dataset

dlp items upload [-h] -l  [-p ] [-d ] [-r ] [-f ] [-lap ] [-ow]
required named arguments
-l, --local-path

local path

optional named arguments
-p, --project-name

project name. Default taken from checked out (if checked out)

-d, --dataset-name

dataset name. Default taken from checked out (if checked out)

-r, --remote-path

remote path to upload to. default: /

-f, --file-types

Comma separated list of file types to upload, e.g “.jpg,.png”. default: all

-lap, --local-annotations-path

Path for local annotations to upload with items

-ow, --overwrite

Overwrite existing item

Default: False
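
A typical invocation, uploading only images from a local folder along with Dataloop-format annotations (all paths and names below are placeholders):

dlp items upload -l ./images -d my-dataset -r /train -f ".jpg,.png" -lap ./annotations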

download

Download dataset to a local directory

dlp items download [-h] [-p ] [-d ] [-ao ] [-aft ] [-afl ] [-r ] [-ow]
                   [-t] [-wt] [-th ] [-l ] [-wb]
optional named arguments
-p, --project-name

project name. Default taken from checked out (if checked out)

-d, --dataset-name

dataset name. Default taken from checked out (if checked out)

-ao, --annotation-options

which annotation to download. options: json,instance,mask

-aft, --annotation-filter-type

annotation type filter when downloading annotations. options: box,segment,binary etc

-afl, --annotation-filter-label

labels filter when downloading annotations.

-r, --remote-path

remote path to download from. default: /

-ow, --overwrite

Overwrite existing item

Default: False

-t, --not-items-folder

Download WITHOUT ‘items’ folder

Default: False

-wt, --with-text

Annotations will have text in mask

Default: False

-th, --thickness

Annotation line thickness

Default: “1”

-l, --local-path

local path

-wb, --without-binaries

Don’t download item binaries

Default: False
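
For example, to download a dataset with its JSON annotations into a local folder, overwriting existing files (names below are placeholders):

dlp items download -d my-dataset -l ./exports -ao json -ow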

videos

Operations with videos

dlp videos [-h] {play,upload} ...
Positional Arguments
videos

Possible choices: play, upload

videos operations

Sub-commands:
play

Play video

dlp videos play [-h] [-l ] [-p ] [-d ]
optional named arguments
-l, --item-path

Video remote path in platform. e.g /dogs/dog.mp4

-p, --project-name

project name. Default taken from checked out (if checked out)

-d, --dataset-name

dataset name. Default taken from checked out (if checked out)

upload

Upload a single video

dlp videos upload [-h] -f  -p  -d  [-r ] [-sc ] [-ss ] [-st ] [-e]
required named arguments
-f, --filename

local filename to upload

-p, --project-name

project name

-d, --dataset-name

dataset name

optional named arguments
-r, --remote-path

remote path

Default: “/”

-sc, --split-chunks

Video splitting parameter: Number of chunks to split

-ss, --split-seconds

Video splitting parameter: Seconds of each chunk

-st, --split-times

Video splitting parameter: List of seconds to split at. e.g 600,1800,2000

-e, --encode

encode video to mp4, remove bframes and upload

Default: False
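
For example, to upload a long video split into 60-second chunks (paths and names below are placeholders):

dlp videos upload -f ./clip.mp4 -p my-project -d my-videos -ss 60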

app

Operations with application

dlp app [-h] {init,pack,publish,update,install,pull} ...
Positional Arguments
app

Possible choices: init, pack, publish, update, install, pull

application operations

Sub-commands:
init

Initialize the structure in order to deploy a dpk

dlp app init [-h] [--name NAME] [--description DESCRIPTION]
             [--categories CATEGORIES] [--icon ICON] [--scope SCOPE]
Optional named arguments
--name

the name of the app

--description

the description of the app

--categories

the categories of the app (comma separated)

--icon

the icon of the app

--scope

the scope of the app (default is organization)

pack

Pack the project as dpk file

dlp app pack [-h]
publish

Publish the app

dlp app publish [-h] --project-name PROJECT_NAME
Required named arguments
--project-name

The name of the project

update

Update the app

dlp app update [-h] --app-name APP_NAME --new-version NEW_VERSION
               --project-name PROJECT_NAME
Required named arguments
--app-name

Locates the app by the name

--new-version

Sets the new version of the specified app

--project-name

The name of the project

install

Install the app to the platform

dlp app install [-h] --dpk-id DPK_ID [--project-name PROJECT_NAME]
                [--org-id ORG_ID]
Required named arguments
--dpk-id

The id of the dpk

--project-name

The name of the project

Optional named arguments
--org-id

The id of the org

pull

Pull the app from the marketplace

dlp app pull [-h] --dpk-name APP_NAME
Required named arguments
--dpk-name

The name of the dpk

services

Operations with services

dlp services [-h] {execute,tear-down,ls,log,delete} ...
Positional Arguments
services

Possible choices: execute, tear-down, ls, log, delete

services operations

Sub-commands:
execute

Create an execution

dlp services execute [-h] [-f FUNCTION_NAME] [-s SERVICE_NAME]
                     [-pr PROJECT_NAME] [-as] [-i ITEM_ID] [-d DATASET_ID]
                     [-a ANNOTATION_ID] [-in INPUTS]
optional named arguments
-f, --function-name

which function to run

-s, --service-name

which service to run

-pr, --project-name

Project name

-as, --async

Async execution

Default: True

-i, --item-id

Item input

-d, --dataset-id

Dataset input

-a, --annotation-id

Annotation input

-in, --inputs

Dictionary string input

Default: “{}”
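
For example, to run a service’s run function on a single item (ids and names below are placeholders):

dlp services execute -s my-service -f run -i item_id -pr my-project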

tear-down

Tear down a service from a service.json file

dlp services tear-down [-h] [-l LOCAL_PATH] [-pr PROJECT_NAME]
optional named arguments
-l, --local-path

path to service.json file

-pr, --project-name

Project name

ls

List project’s services

dlp services ls [-h] [-pr PROJECT_NAME] [-pkg PACKAGE_NAME]
optional named arguments
-pr, --project-name

Project name

-pkg, --package-name

Package name

log

Get services log

dlp services log [-h] [-pr PROJECT_NAME] [-f SERVICE_NAME] [-t START]
required named arguments
-pr, --project-name

Project name

-f, --service-name

Service name

-t, --start

Log start time

delete

Delete Service

dlp services delete [-h] [-f SERVICE_NAME] [-p PROJECT_NAME]
                    [-pkg PACKAGE_NAME]
optional named arguments
-f, --service-name

Service name

-p, --project-name

Project name

-pkg, --package-name

Package name

triggers

Operations with triggers

dlp triggers [-h] {create,delete,ls} ...
Positional Arguments
triggers

Possible choices: create, delete, ls

triggers operations

Sub-commands:
create

Create a Service Trigger

dlp triggers create [-h] -r RESOURCE -a ACTIONS [-p PROJECT_NAME]
                    [-pkg PACKAGE_NAME] [-f SERVICE_NAME] [-n NAME]
                    [-fl FILTERS] [-fn FUNCTION_NAME]
required named arguments
-r, --resource

Resource name

-a, --actions

Actions

optional named arguments
-p, --project-name

Project name

-pkg, --package-name

Package name

-f, --service-name

Service name

-n, --name

Trigger name

-fl, --filters

Json filter

Default: “{}”

-fn, --function-name

Function name

Default: “run”
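
For example, to fire a service whenever a new item is created (the resource and action values are illustrative; names below are placeholders):

dlp triggers create -r Item -a Created -f my-service -n on-item-created -p my-project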

delete

Delete Trigger

dlp triggers delete [-h] -t TRIGGER_NAME [-f SERVICE_NAME] [-p PROJECT_NAME]
                    [-pkg PACKAGE_NAME]
required named arguments
-t, --trigger-name

Trigger name

optional named arguments
-f, --service-name

Service name

-p, --project-name

Project name

-pkg, --package-name

Package name

ls

List triggers

dlp triggers ls [-h] [-pr PROJECT_NAME] [-pkg PACKAGE_NAME] [-s SERVICE_NAME]
optional named arguments
-pr, --project-name

Project name

-pkg, --package-name

Package name

-s, --service-name

Service name

deploy

Deploy with a JSON file

dlp deploy [-h] [-f JSON_FILE] [-p PROJECT_NAME]
required named arguments
-f

Path to json file

-p

Project name

generate

Generate a JSON file

dlp generate [-h] [--option PACKAGE_TYPE] [-p PACKAGE_NAME]
optional named arguments
--option

catalogue of examples

-p, --package-name

Package name

packages

Operations with packages

dlp packages [-h] {ls,push,deploy,test,checkout,delete} ...
Positional Arguments
packages

Possible choices: ls, push, deploy, test, checkout, delete

package operations

Sub-commands:
ls

List packages

dlp packages ls [-h] [-p PROJECT_NAME]
optional named arguments
-p, --project-name

Project name

push

Create package in platform

dlp packages push [-h] [-src ] [-cid ] [-pr ] [-p ] [-c]
optional named arguments
-src, --src-path

Path to the package source code

-cid, --codebase-id

Codebase id

-pr, --project-name

Project name

-p, --package-name

Package name

-c, --checkout

checkout the new package

Default: False

deploy

Deploy package to platform

dlp packages deploy [-h] [-p ] [-pr ] [--module-name ] [-c]
optional named arguments
-p, --package-name

Package name

-pr, --project-name

Project name

--module-name

Package module name

Default: “default_module”

-c, --checkout

checkout the new package

Default: False

test

Test the package locally using mock.json

dlp packages test [-h] [-c ] [-f ]
optional named arguments
-c, --concurrency

Concurrency level for the local test

Default: 10

-f, --function-name

Function to test

Default: “run”

checkout

checkout a package

dlp packages checkout [-h] [-p ]
required named arguments
-p, --package-name

package name

delete

Delete Package

dlp packages delete [-h] [-pkg PACKAGE_NAME] [-p PROJECT_NAME]
optional named arguments
-pkg, --package-name

Package name

-p, --project-name

Project name

ls

List directories

dlp ls [-h]

pwd

Get current working directory

dlp pwd [-h]

cd

Change current working directory

dlp cd [-h] dir
Positional Arguments
dir

directory to change to

mkdir

Make directory

dlp mkdir [-h] name
Positional Arguments
name

directory name to create

clear

Clear shell

dlp clear [-h]

exit

Exit interactive shell

dlp exit [-h]

Repositories

Organizations

class Organizations(client_api: ApiClient)[source]

Bases: object

Organizations Repository

Read our documentation and SDK documentation to learn more about Organizations in the Dataloop platform.

add_member(email: str, role: MemberOrgRole = MemberOrgRole.MEMBER, organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None)[source]

Add members to your organization. Read about members and groups here.

Prerequisites: To add members to an organization, you must be an owner in that organization.

You must provide at least ONE of the following params: organization, organization_name, or organization_id.

Parameters
  • email (str) – the member’s email

  • role (str) – MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

  • organization (entities.Organization) – Organization object

Returns

True if successful or error if unsuccessful

Return type

bool

Example:

dl.organizations.add_member(email='user@domain.com',
                            organization_id='organization_id',
                            role=dl.MemberOrgRole.MEMBER)
cache_action(organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None, mode=CacheAction.APPLY, pod_type=PodType.SMALL)[source]

Add or remove Cache for the org

Prerequisites: You must be an organization owner

You must provide at least ONE of the following params: organization, organization_name, or organization_id.

Parameters
  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

  • organization (entities.Organization) – Organization object

  • mode (str) – dl.CacheAction.APPLY or dl.CacheAction.DESTROY

  • pod_type (entities.PodType) – dl.PodType.SMALL, dl.PodType.MEDIUM, dl.PodType.HIGH

Returns

True if success

Return type

bool

Example:

dl.organizations.cache_action(organization_id='organization_id',
                              mode=dl.CacheAction.APPLY)
delete_member(user_id: str, organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None, sure: bool = False, really: bool = False) bool[source]

Delete member from the Organization.

Prerequisites: Must be an organization owner to delete members.

You must provide at least ONE of the following params: organization_id, organization_name, organization.

Parameters
  • user_id (str) – user id

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

  • organization (entities.Organization) – Organization object

  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

True if success and error if not

Return type

bool

Example:

dl.organizations.delete_member(user_id='user_id',
                               organization_id='organization_id',
                               sure=True,
                               really=True)
get(organization_id: Optional[str] = None, organization_name: Optional[str] = None, fetch: Optional[bool] = None) Organization[source]

Get Organization object to be able to use it in your code.

Prerequisites: You must be a superuser to use this method.

You must provide at least ONE of the following params: organization_name or organization_id.

Parameters
  • organization_id (str) – optional - search by id

  • organization_name (str) – optional - search by name

  • fetch (bool) – optional - fetch entity from platform, default taken from cookie

Returns

Organization object

Return type

dtlpy.entities.organization.Organization

Example:

dl.organizations.get(organization_id='organization_id')
list() List[Organization][source]

Lists all the organizations in Dataloop.

Prerequisites: You must be a superuser to use this method.

Returns

List of Organization objects

Return type

list

Example:

dl.organizations.list()
list_groups(organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None)[source]

List all organization groups (groups that were created within the organization).

Prerequisites: You must be an organization owner to use this method.

You must provide at least ONE of the following params: organization, organization_name, or organization_id.

Parameters
  • organization (entities.Organization) – Organization object

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

Returns

groups list

Return type

list

Example:

dl.organizations.list_groups(organization_id='organization_id')
list_integrations(organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None, only_available=False)[source]

List all organization integrations with external cloud storage.

Prerequisites: You must be an organization owner to use this method.

You must provide at least ONE of the following params: organization_id, organization_name, or organization.

Parameters
  • organization (entities.Organization) – Organization object

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

  • only_available (bool) – if True list only the available integrations

Returns

integrations list

Return type

list

Example:

dl.organizations.list_integrations(organization='organization-entity',
                                    only_available=True)
list_members(organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None, role: Optional[MemberOrgRole] = None)[source]

List all organization members.

Prerequisites: You must be an organization owner to use this method.

You must provide at least ONE of the following params: organization_id, organization_name, or organization.

Parameters
  • organization (entities.Organization) – Organization object

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

  • role (entities.MemberOrgRole) – MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

Returns

projects list

Return type

list

Example:

dl.organizations.list_members(organization='organization-entity',
                              role=dl.MemberOrgRole.MEMBER)
update(plan: str, organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None) Organization[source]

Update an organization.

Prerequisites: You must be a superuser to update an organization.

You must provide at least ONE of the following params: organization, organization_name, or organization_id.

Parameters
  • plan (str) – OrganizationsPlans.FREEMIUM, OrganizationsPlans.PREMIUM

  • organization (entities.Organization) – Organization object

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

Returns

organization object

Return type

dtlpy.entities.organization.Organization

Example:

dl.organizations.update(organization='organization-entity',
                        plan=dl.OrganizationsPlans.FREEMIUM)
update_member(email: str, role: MemberOrgRole = MemberOrgRole.MEMBER, organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None)[source]

Update member role.

Prerequisites: You must be an organization owner to update a member’s role.

You must provide at least ONE of the following params: organization, organization_name, or organization_id.

Parameters
  • email (str) – the member’s email

  • role (str) – MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

  • organization_id (str) – Organization id

  • organization_name (str) – Organization name

  • organization (entities.Organization) – Organization object

Returns

json of the member fields

Return type

dict

Example:

dl.organizations.update_member(email='user@domain.com',
                               organization_id='organization_id',
                               role=dl.MemberOrgRole.MEMBER)

Integrations

Integrations Repository

class Integrations(client_api: ApiClient, org: Optional[Organization] = None, project: Optional[Project] = None)[source]

Bases: object

Integrations Repository

The Integrations class allows you to manage data integration from your external storage (e.g., S3, GCS, Azure) into your Dataloop’s Dataset storage, as well as sync data in your Dataloop’s Datasets with data in your external storage.

For more information on Organization Storage Integration see the Dataloop documentation and SDK External Storage.

create(integrations_type: ExternalStorage, name: str, options: dict)[source]

Create an integration between an external storage and the organization.

Examples for options:
  • s3 - {key: “”, secret: “”}

  • gcs - {key: “”, secret: “”, content: “”}

  • azureblob - {key: “”, secret: “”, clientId: “”, tenantId: “”}

  • key_value - {key: “”, value: “”}

  • aws-sts - {key: “”, secret: “”, roleArns: “”}

Prerequisites: You must be an owner in the organization.

Parameters
  • integrations_type (str) – integrations type dl.ExternalStorage

  • name (str) – integrations name

  • options (dict) – dict of storage secrets

Returns

success

Return type

bool

Example:

project.integrations.create(integrations_type=dl.ExternalStorage.S3,
                            name='S3Integration',
                            options={'key': 'Access key ID', 'secret': 'Secret access key'})
delete(integrations_id: str, sure: bool = False, really: bool = False) bool[source]

Delete integrations from the organization.

Prerequisites: You must be an organization owner to delete an integration.

Parameters
  • integrations_id (str) – integrations id

  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

success

Return type

bool

Example:

project.integrations.delete(integrations_id='integrations_id', sure=True, really=True)
get(integrations_id: str)[source]

Get an organization integration. Use this method to access your integration and use it in your code.

Prerequisites: You must be an owner in the organization.

Parameters

integrations_id (str) – integrations id

Returns

Integration object

Return type

dtlpy.entities.integration.Integration

Example:

project.integrations.get(integrations_id='integrations_id')
list(only_available=False)[source]

List all the organization’s integrations with external storage.

Prerequisites: You must be an owner in the organization.

Parameters

only_available (bool) – if True list only the available integrations.

Returns

integrations list

Return type

list

Example:

project.integrations.list(only_available=True)
update(new_name: str, integrations_id: str)[source]

Update the integration’s name.

Prerequisites: You must be an owner in the organization.

Parameters
  • new_name (str) – new name

  • integrations_id (str) – integrations id

Returns

Integration object

Return type

dtlpy.entities.integration.Integration

Example:

project.integrations.update(integrations_id='integrations_id', new_name="new_integration_name")

Projects

class Projects(client_api: ApiClient, org=None)[source]

Bases: object

Projects Repository

The Projects class allows the user to manage projects and their properties.

For more information on Projects see the Dataloop documentation and SDK documentation.

add_member(email: str, project_id: str, role: MemberRole = MemberRole.DEVELOPER)[source]

Add a member to the project.

Prerequisites: You must be in the role of an owner to add a member to a project.

Parameters
  • email (str) – member email

  • project_id (str) – The Id of the project

  • role – The required role for the user. Use the enum dl.MemberRole

Returns

dict that represent the user

Return type

dict

Example:

dl.projects.add_member(project_id='project_id', email='user@dataloop.ai', role=dl.MemberRole.DEVELOPER)
checkout(identifier: Optional[str] = None, project_name: Optional[str] = None, project_id: Optional[str] = None, project: Optional[Project] = None)[source]

Checkout (switch) to a project to work on.

Prerequisites: All users can checkout a project.

You must provide at least ONE of the following params: project_id, project_name.

Parameters
  • identifier (str) – project name or partial id that you wish to switch to

  • project_name (str) – The Name of the project

  • project_id (str) – The Id of the project

  • project (dtlpy.entities.project.Project) – project object

Example:

dl.projects.checkout(project_id='project_id')
create(project_name: str, checkout: bool = False) Project[source]

Create a new project.

Prerequisites: Any user can create a project.

Parameters
  • project_name (str) – The Name of the project

  • checkout (bool) – set the project as a default project object (cookies)

Returns

Project object

Return type

dtlpy.entities.project.Project

Example:

dl.projects.create(project_name='project_name')
delete(project_name: Optional[str] = None, project_id: Optional[str] = None, sure: bool = False, really: bool = False) bool[source]

Delete a project forever!

Prerequisites: You must be in the role of an owner to delete a project.

Parameters
  • project_name (str) – optional - search by name

  • project_id (str) – optional - search by id

  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

True if success, error if not

Return type

bool

Example:

dl.projects.delete(project_id='project_id', sure=True, really=True)
get(project_name: Optional[str] = None, project_id: Optional[str] = None, checkout: bool = False, fetch: Optional[bool] = None, log_error=True) Project[source]

Get a Project object.

Prerequisites: You must be in the role of an owner to get a project object.

You must checkout a project or provide at least ONE of the following params: project_id, project_name.

Parameters
  • project_name (str) – optional - search by name

  • project_id (str) – optional - search by id

  • checkout (bool) – set the project as a default project object (cookies)

  • fetch (bool) – optional - fetch entity from platform (True), default taken from cookie

  • log_error (bool) – optional - show the logs errors

Returns

Project object

Return type

dtlpy.entities.project.Project

Example:

dl.projects.get(project_id='project_id')
list() List[Project][source]

Get the user’s project list

Prerequisites: You must be a superuser to list all users’ projects.

Returns

List of Project objects

Example:

dl.projects.list()
list_members(project: Project, role: Optional[MemberRole] = None)[source]

Get a list of the project members.

Prerequisites: You must be in the role of an owner to list project members.

Parameters
  • project (dtlpy.entities.project.Project) – Project object

  • role (entities.MemberRole) – The required role for the user. Use the enum dl.MemberRole

Returns

list of the project members

Return type

list

Example:

dl.projects.list_members(project='project_entity', role=dl.MemberRole.DEVELOPER)
open_in_web(project_name: Optional[str] = None, project_id: Optional[str] = None, project: Optional[Project] = None)[source]

Open the project in our web platform.

Prerequisites: All users can open a project in the web.

Parameters
  • project_name (str) – optional - search by name

  • project_id (str) – optional - search by id

  • project (dtlpy.entities.project.Project) – project object

Example:

dl.projects.open_in_web(project_id='project_id')
remove_member(email: str, project_id: str)[source]

Remove a member from the project.

Prerequisites: You must be in the role of an owner to delete a member from a project.

Parameters
  • email (str) – member email

  • project_id (str) – The Id of the project

Returns

dict that represents the user

Return type

dict

Example:

dl.projects.remove_member(project_id='project_id', email='user@dataloop.ai')
update(project: Project, system_metadata: bool = False) Project[source]

Update a project’s information (e.g., name, member roles, etc.).

Prerequisites: You must be in the role of an owner to update a project.

Parameters
  • project (dtlpy.entities.project.Project) – project object

  • system_metadata (bool) – True, if you want to change the metadata system

Returns

Project object

Return type

dtlpy.entities.project.Project

Example:

dl.projects.update(project='project_entity')
update_member(email: str, project_id: str, role: MemberRole = MemberRole.DEVELOPER)[source]

Update member’s information/details in the project.

Prerequisites: You must be in the role of an owner to update a member.

Parameters
  • email (str) – member email

  • project_id (str) – The Id of the project

  • role – The required role for the user. Use the enum dl.MemberRole

Returns

dict that represent the user

Return type

dict

Example:

dl.projects.update_member(project_id='project_id', email='user@dataloop.ai', role=dl.MemberRole.DEVELOPER)

Datasets

Datasets Repository

class Datasets(client_api: ApiClient, project: Optional[Project] = None)[source]

Bases: object

Datasets Repository

The Datasets class allows the user to manage datasets. Read more about datasets in our documentation and SDK documentation.

checkout(identifier: Optional[str] = None, dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, dataset: Optional[Dataset] = None)[source]

Checkout (switch) to a dataset to work on it.

Prerequisites: You must be an owner or developer to use this method.

You must provide at least ONE of the following params: dataset_id, dataset_name.

Parameters
  • identifier (str) – dataset name or partial id that you wish to switch to

  • dataset_name (str) – The Name of the dataset

  • dataset_id (str) – The Id of the dataset

  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

Example:

project.datasets.checkout(dataset_id='dataset_id')
clone(dataset_id: str, clone_name: str, filters: Optional[Filters] = None, with_items_annotations: bool = True, with_metadata: bool = True, with_task_annotations_status: bool = True)[source]

Clone a dataset. Read more about cloning datasets and items in our documentation and SDK documentation.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • dataset_id (str) – id of the dataset you wish to clone

  • clone_name (str) – new dataset name

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a query dict

  • with_items_annotations (bool) – true to clone with items annotations

  • with_metadata (bool) – true to clone with metadata

  • with_task_annotations_status (bool) – true to clone with task annotations’ status

Returns

dataset object

Return type

dtlpy.entities.dataset.Dataset

Example:

project.datasets.clone(dataset_id='dataset_id',
                      clone_name='dataset_clone_name',
                      with_metadata=True,
                      with_items_annotations=False,
                      with_task_annotations_status=False)
create(dataset_name: str, labels=None, attributes=None, ontology_ids=None, driver: Optional[Driver] = None, driver_id: Optional[str] = None, checkout: bool = False, expiration_options: Optional[ExpirationOptions] = None, index_driver: Optional[IndexDriver] = None, recipe_id: Optional[str] = None) Dataset[source]

Create a new dataset

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • dataset_name (str) – The Name of the dataset

  • labels (list) – dictionary of {tag: color} or list of label entities

  • attributes (list) – dataset’s ontology’s attributes

  • ontology_ids (list) – optional - dataset ontology

  • driver (dtlpy.entities.driver.Driver) – optional - storage driver Driver object or driver name

  • driver_id (str) – optional - driver id

  • checkout (bool) – set the dataset as a default dataset object (cookies)

  • expiration_options (ExpirationOptions) – dl.ExpirationOptions object that contain definitions for dataset like MaxItemDays

  • index_driver (str) – dl.IndexDriver, dataset driver version

  • recipe_id (str) – optional - recipe id

Returns

Dataset object

Return type

dtlpy.entities.dataset.Dataset

Example:

project.datasets.create(dataset_name='dataset_name', ontology_ids='ontology_ids')
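
A sketch of creating a dataset with an initial label map via the labels parameter above (the label names and colors are illustrative):

project.datasets.create(dataset_name='dataset_name',
                        labels={'car': '#00ffff', 'person': '#ff0000'},
                        checkout=True)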
delete(dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, sure: bool = False, really: bool = False)[source]

Delete a dataset forever!

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • dataset_name (str) – optional - search by name

  • dataset_id (str) – optional - search by id

  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

True if success

Return type

bool

Example:

project.datasets.delete(dataset_id='dataset_id', sure=True, really=True)

directory_tree(dataset: Optional[Dataset] = None, dataset_name: Optional[str] = None, dataset_id: Optional[str] = None)[source]

Get dataset’s directory tree.

Prerequisites: You must be an owner or developer to use this method.

You must provide at least ONE of the following params: dataset, dataset_name, dataset_id.

Parameters
  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

  • dataset_name (str) – optional - search by name

  • dataset_id (str) – optional - search by id

Returns

DirectoryTree

Example:

project.datasets.directory_tree(dataset='dataset_entity')
static download_annotations(dataset: Dataset, local_path: Optional[str] = None, filters: Optional[Filters] = None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters: Optional[Filters] = None, overwrite: bool = False, thickness: int = 1, with_text: bool = False, remote_path: Optional[str] = None, include_annotations_in_output: bool = True, export_png_files: bool = False, filter_output_annotations: bool = False, alpha: Optional[float] = None, export_version=ExportVersion.V1) str[source]

Download dataset’s annotations by filters.

You may filter the dataset both for items and for annotations and download annotations.

Optional – download annotations as: mask, instance, image mask of the item.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

  • local_path (str) – local folder or filename to save to.

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • annotation_options (list) – type of download annotations: list(dl.ViewAnnotationOptions)

  • annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download

  • overwrite (bool) – optional - default = False to overwrite the existing files

  • thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1

  • with_text (bool) – optional - add text to annotations, default = False

  • remote_path (str) – DEPRECATED and ignored

  • include_annotations_in_output (bool) – default - False , if export should contain annotations

  • export_png_files (bool) – default - if True, semantic annotations should be exported as png files

  • filter_output_annotations (bool) – default - False, given an export by filter - determine if to filter out annotations

  • alpha (float) – opacity value [0 1], default 1

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

Returns

local_path of the directory where all the items were downloaded

Return type

str

Example:

project.datasets.download_annotations(dataset='dataset_entity',
                                     local_path='local_path',
                                     annotation_options=dl.ViewAnnotationOptions,
                                     overwrite=False,
                                     thickness=1,
                                     with_text=False,
                                     alpha=1
                                     )
get(dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, checkout: bool = False, fetch: Optional[bool] = None) Dataset[source]

Get dataset by name or id.

Prerequisites: You must be an owner or developer to use this method.

You must provide at least ONE of the following params: dataset_id, dataset_name.

Parameters
  • dataset_name (str) – optional - search by name

  • dataset_id (str) – optional - search by id

  • checkout (bool) – set the dataset as a default dataset object (cookies)

  • fetch (bool) – optional - fetch entity from platform (True), default taken from cookie

Returns

Dataset object

Return type

dtlpy.entities.dataset.Dataset

Example:

project.datasets.get(dataset_id='dataset_id')
list(name=None, creator=None) List[Dataset][source]

List all datasets.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • name (str) – list by name

  • creator (str) – list by creator

Returns

List of datasets

Return type

list

Example:

project.datasets.list(name='name')
merge(merge_name: str, dataset_ids: list, project_ids: str, with_items_annotations: bool = True, with_metadata: bool = True, with_task_annotations_status: bool = True, wait: bool = True)[source]

Merge datasets. See our SDK docs for more information.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • merge_name (str) – new dataset name

  • dataset_ids (list) – list of ids of the datasets you wish to merge

  • project_ids (str) – the project id that includes the datasets

  • with_items_annotations (bool) – true to merge with items annotations

  • with_metadata (bool) – true to merge with metadata

  • with_task_annotations_status (bool) – true to merge with task annotations’ status

  • wait (bool) – wait for the command to finish

Returns

True if success

Return type

bool

Example:

project.datasets.merge(dataset_ids=['dataset_id1','dataset_id2'],
                      merge_name='dataset_merge_name',
                      with_metadata=True,
                      with_items_annotations=False,
                      with_task_annotations_status=False)
open_in_web(dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, dataset: Optional[Dataset] = None)[source]

Open the dataset in web platform.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • dataset_name (str) – optional - search by name

  • dataset_id (str) – optional - search by id

  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

Example:

project.datasets.open_in_web(dataset_id='dataset_id')
set_readonly(state: bool, dataset: Dataset)[source]

Set dataset readonly mode.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • state (bool) – True to set the dataset to readonly mode

  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

Example:

project.datasets.set_readonly(dataset='dataset_entity', state=True)
sync(dataset_id: str, wait: bool = True)[source]

Sync dataset with external storage.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • dataset_id (str) – The Id of the dataset to sync

  • wait (bool) – wait for the command to finish

Returns

True if success

Return type

bool

Example:

project.datasets.sync(dataset_id='dataset_id')
update(dataset: Dataset, system_metadata: bool = False, patch: Optional[dict] = None) Dataset[source]

Update dataset field.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

  • system_metadata (bool) – True, if you want to change the metadata system

  • patch (dict) – optional - specific patch to apply to the dataset

Returns

Dataset object

Return type

dtlpy.entities.dataset.Dataset

Example:

project.datasets.update(dataset='dataset_entity')
upload_annotations(dataset, local_path, filters: Optional[Filters] = None, clean=False, remote_root_path='/', export_version=ExportVersion.V1)[source]

Upload annotations to dataset.

Example for remote_root_path: If the item filepath is a/b/item and remote_root_path is /a, the start folder will be b instead of a

Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.

Parameters
  • dataset (dtlpy.entities.dataset.Dataset) – dataset to upload to

  • local_path (str) – local folder where the annotation files are

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • clean (bool) – True to remove the old annotations

  • remote_root_path (str) – the remote root path to match remote and local items

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

Example:

project.datasets.upload_annotations(dataset='dataset_entity',
                                     local_path='local_path',
                                     clean=False,
                                     export_version=dl.ExportVersion.V1
                                     )

Drivers

class Drivers(client_api: ApiClient, project: Optional[Project] = None)[source]

Bases: object

Drivers Repository

The Drivers class allows users to manage drivers that are used to connect with external storage. Read more about external storage in our documentation and SDK documentation.

create(name: str, driver_type: ExternalStorage, integration_id: str, bucket_name: str, integration_type: ExternalStorage, project_id: Optional[str] = None, allow_external_delete: bool = True, region: Optional[str] = None, storage_class: str = '', path: str = '')[source]

Create a storage driver.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • name (str) – the driver name

  • driver_type (str) – ExternalStorage.S3, ExternalStorage.GCS, ExternalStorage.AZUREBLOB

  • integration_id (str) – the integration id

  • bucket_name (str) – the external bucket name

  • integration_type (str) – ExternalStorage.S3, ExternalStorage.GCS, ExternalStorage.AZUREBLOB, ExternalStorage.AWS_STS

  • project_id (str) – project id

  • allow_external_delete (bool) – true to allow deleting files from external storage when files are deleted in your Dataloop storage

  • region (str) – relevant only for s3 - the bucket region

  • storage_class (str) – relevant only for S3

  • path (str) – Optional. By default, path is the root folder. Path is case sensitive.

Returns

driver object

Return type

dtlpy.entities.driver.Driver

Example:

project.drivers.create(name='driver_name',
           driver_type=dl.ExternalStorage.S3,
           integration_id='integration_id',
           bucket_name='bucket_name',
           project_id='project_id',
           region='eu-west-1')
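
Once created, a driver is typically passed to Datasets.create to bind a new dataset to the external bucket; a minimal sketch with placeholder ids and names (integration_type mirrors the driver type here):

driver = project.drivers.create(name='driver_name',
                                driver_type=dl.ExternalStorage.S3,
                                integration_id='integration_id',
                                integration_type=dl.ExternalStorage.S3,
                                bucket_name='bucket_name',
                                project_id='project_id',
                                region='eu-west-1')
dataset = project.datasets.create(dataset_name='my-dataset', driver_id=driver.id)
project.datasets.sync(dataset_id=dataset.id)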
delete(driver_name: Optional[str] = None, driver_id: Optional[str] = None, sure: bool = False, really: bool = False)[source]

Delete a driver forever!

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • driver_name (str) – optional - search by name

  • driver_id (str) – optional - search by id

  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

True if success

Return type

bool

Example:

project.drivers.delete(driver_id='driver_id', sure=True, really=True)

get(driver_name: Optional[str] = None, driver_id: Optional[str] = None) Driver[source]

Get a Driver object to use in your code.

Prerequisites: You must be in the role of an owner or developer.

You must provide at least ONE of the following params: driver_name, driver_id.

Parameters
  • driver_name (str) – optional - search by name

  • driver_id (str) – optional - search by id

Returns

Driver object

Return type

dtlpy.entities.driver.Driver

Example:

project.drivers.get(driver_id='driver_id')
list() List[Driver][source]

Get the project’s drivers list.

Prerequisites: You must be in the role of an owner or developer.

Returns

List of Drivers objects

Return type

list

Example:

project.drivers.list()

Items

class Items(client_api: ApiClient, datasets: Optional[Datasets] = None, dataset: Optional[Dataset] = None, dataset_id=None, items_entity=None, project=None)[source]

Bases: object

Items Repository

The Items class allows you to manage items in your datasets. For information on actions related to items see Organizing Your Dataset, Item Metadata, and Item Metadata-Based Filtering.

clone(item_id: str, dst_dataset_id: str, remote_filepath: Optional[str] = None, metadata: Optional[dict] = None, with_annotations: bool = True, with_metadata: bool = True, with_task_annotations_status: bool = False, allow_many: bool = False, wait: bool = True)[source]

Clone item. Read more about cloning datasets and items in our documentation and SDK documentation.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • item_id (str) – item to clone

  • dst_dataset_id (str) – destination dataset id

  • remote_filepath (str) – complete filepath

  • metadata (dict) – new metadata to add

  • with_annotations (bool) – clone annotations

  • with_metadata (bool) – clone metadata

  • with_task_annotations_status (bool) – clone task annotations status

  • allow_many (bool) – if True, multiple clones of the same item in a single dataset are allowed (default=False)

  • wait (bool) – wait for the command to finish

Returns

Item object

Return type

dtlpy.entities.item.Item

Example:

dataset.items.clone(item_id='item_id',
        dst_dataset_id='dist_dataset_id',
        with_metadata=True,
        with_task_annotations_status=False,
        with_annotations=False)
delete(filename: Optional[str] = None, item_id: Optional[str] = None, filters: Optional[Filters] = None)[source]

Delete item from platform.

Prerequisites: You must be in the role of an owner or developer.

You must provide at least ONE of the following params: item id, filename, filters.

Parameters
  • filename (str) – optional - search item by remote path

  • item_id (str) – optional - search item by id

  • filters (dtlpy.entities.filters.Filters) – optional - delete items by filter

Returns

True if success

Return type

bool

Example:

dataset.items.delete(item_id='item_id')
download(filters: Optional[Filters] = None, items=None, local_path: Optional[str] = None, file_types: Optional[list] = None, save_locally: bool = True, to_array: bool = False, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters: Optional[Filters] = None, overwrite: bool = False, to_items_folder: bool = True, thickness: int = 1, with_text: bool = False, without_relative_path=None, avoid_unnecessary_annotation_download: bool = False, include_annotations_in_output: bool = True, export_png_files: bool = False, filter_output_annotations: bool = False, alpha: float = 1, export_version=ExportVersion.V1)[source]

Download dataset items by filters.

Filters the dataset for items and saves them locally.

Optional – download annotation, mask, instance, and image mask of the item.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • items (List[dtlpy.entities.item.Item] or dtlpy.entities.item.Item) – download Item entity or item_id (or a list of item)

  • local_path (str) – local folder or filename to save to.

  • file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]

  • save_locally (bool) – bool. save to disk or return a buffer

  • to_array (bool) – returns an ndarray when True and save_locally=False

  • annotation_options (list) – download annotations options: list(dl.ViewAnnotationOptions)

  • annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download

  • overwrite (bool) – optional - default = False

  • to_items_folder (bool) – Create ‘items’ folder and download items to it

  • thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1

  • with_text (bool) – optional - add text to annotations, default = False

  • without_relative_path (bool) – bool - download items without the relative path from platform

  • avoid_unnecessary_annotation_download (bool) – default - False

  • include_annotations_in_output (bool) – default - False , if export should contain annotations

  • export_png_files (bool) – default - if True, semantic annotations should be exported as png files

  • filter_output_annotations (bool) – default - False, given an export by filter - determine if to filter out annotations

  • alpha (float) – opacity value [0 1], default 1

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

Returns

generator of local_path per each downloaded item

Return type

generator or single item

Example:

dataset.items.download(local_path='local_path',
                     annotation_options=dl.ViewAnnotationOptions,
                     overwrite=False,
                     thickness=1,
                     with_text=False,
                     alpha=1,
                     save_locally=True
                     )
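
With save_locally=False the method returns a buffer instead of writing to disk; a minimal sketch with a placeholder item id:

buffer = dataset.items.download(items='item_id', save_locally=False)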
get(filepath: Optional[str] = None, item_id: Optional[str] = None, fetch: Optional[bool] = None, is_dir: bool = False) Item[source]

Get Item object

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • filepath (str) – optional - search by remote path

  • item_id (str) – optional - search by id

  • fetch (bool) – optional - fetch entity from platform, default taken from cookie

  • is_dir (bool) – True if you want to get an item from dir type

Returns

Item object

Return type

dtlpy.entities.item.Item

Example:

dataset.items.get(item_id='item_id')
get_all_items(filters: Optional[Filters] = None) List[Item][source]

Get all items in dataset.

Prerequisites: You must be in the role of an owner or developer.

Parameters

filters (dtlpy.entities.filters.Filters) – dl.Filters entity to filter items

Returns

list of all items

Return type

list

Example:

dataset.items.get_all_items()
list(filters: Optional[Filters] = None, page_offset: Optional[int] = None, page_size: Optional[int] = None) PagedEntities[source]

List items in a dataset.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • page_offset (int) – start page

  • page_size (int) – page size

Returns

Pages object

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

dataset.items.list(page_offset=0, page_size=100)
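
Pages returned by list can be iterated to walk the full result set; a minimal sketch using an illustrative directory filter:

pages = dataset.items.list(filters=dl.Filters(field='dir', values='/train'))
for page in pages:
    for item in page:
        print(item.name)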
make_dir(directory, dataset: Optional[Dataset] = None) Item[source]

Create a directory in a dataset.

Prerequisites: All users.

Parameters
  • directory (str) – name of the directory to create

  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

Returns

Item object

Return type

dtlpy.entities.item.Item

Example:

dataset.items.make_dir(directory='directory_name')
move_items(destination: str, filters: Optional[Filters] = None, items=None, dataset: Optional[Dataset] = None) bool[source]

Move items to another directory. If the directory does not exist, it will be created.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • destination (str) – destination directory

  • filters (dtlpy.entities.filters.Filters) – optional - query of the items to move (either this or items)

  • items – optional - item entities to move (either this or filters)

  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

Returns

True if success

Return type

bool

Example:

dataset.items.move_items(destination='directory_name')
open_in_web(filepath=None, item_id=None, item=None)[source]

Open the item in web platform

Prerequisites: You must be in the role of an owner or developer or be an annotation manager/annotator with access to that item through task.

Parameters
  • filepath (str) – optional - search by remote path

  • item_id (str) – optional - search by id

  • item (dtlpy.entities.item.Item) – item object

Example:

dataset.items.open_in_web(item_id='item_id')
set_items_entity(entity)[source]

Set the item entity type to Artifact, Item, or Codebase.

Parameters

entity (entities.Item, entities.Artifact, entities.Codebase) – entity type [entities.Item, entities.Artifact, entities.Codebase]

update(item: Optional[Item] = None, filters: Optional[Filters] = None, update_values=None, system_update_values=None, system_metadata: bool = False)[source]

Update item metadata.

Prerequisites: You must be in the role of an owner or developer.

You must provide at least ONE of the following params: update_values, system_update_values.

Parameters
  • item (dtlpy.entities.item.Item) – Item object

  • filters (dtlpy.entities.filters.Filters) – optional update filtered items by given filter

  • update_values – optional field to be updated and new values

  • system_update_values – values in system metadata to be updated

  • system_metadata (bool) – True, if you want to update the metadata system

Returns

Item object

Return type

dtlpy.entities.item.Item

Example:

dataset.items.update(item='item_entity')
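
update can also run in bulk: pass filters to select items and update_values with the fields to write. A hedged sketch (the metadata keys are illustrative; exact keys depend on your item schema):

filters = dl.Filters(field='dir', values='/train')
dataset.items.update(filters=filters,
                     update_values={'metadata': {'user': {'reviewed': True}}})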
update_status(status: ItemStatus, items=None, item_ids=None, filters=None, dataset=None, clear=False)[source]

Update item status in task

Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned a task with the item.

You must provide at least ONE of the following params: items, item_ids, filters.

Parameters
  • status (entities.ItemStatus) – ItemStatus.COMPLETED, ItemStatus.APPROVED, ItemStatus.DISCARDED

  • items (list) – list of items

  • item_ids (list) – list of item ids

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • dataset (dtlpy.entities.dataset.Dataset) – dataset object

  • clear (bool) – True to clear the status

Example:

dataset.items.update_status(item_ids='item_id', status=dl.ItemStatus.COMPLETED)
upload(local_path: str, local_annotations_path: Optional[str] = None, remote_path: str = '/', remote_name: Optional[str] = None, file_types: Optional[list] = None, overwrite: bool = False, item_metadata: Optional[dict] = None, output_entity=Item, no_output: bool = False, export_version: str = ExportVersion.V1, item_description: Optional[str] = None)[source]

Upload local file to dataset. Local filesystem will remain unchanged. If local_path ends with “*” (e.g. “/images/*”), items will be uploaded without the head directory.

Prerequisites: Any user can upload items.

Parameters
  • local_path (str) – list of local file, local folder, BufferIO, numpy.ndarray or url to upload

  • local_annotations_path (str) – path to dataloop format annotations json files.

  • remote_path (str) – remote path to save.

  • remote_name (str) – remote base name to save. When uploading a numpy.ndarray as local_path, a remote_name with a .jpg or .png extension is mandatory

  • file_types (list) – list of file type to upload. e.g [‘.jpg’, ‘.png’]. default is all

  • item_metadata (dict) – metadata dict to upload to item or ExportMetadata option to export metadata from annotation file

  • overwrite (bool) – optional - default = False

  • output_entity – output type

  • no_output (bool) – do not return the items after upload

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

  • item_description (str) – add a string description to the uploaded item

Returns

Output (generator/single item)

Return type

generator or single item

Example:

dataset.items.upload(local_path='local_path',
                     local_annotations_path='local_annotations_path',
                     overwrite=True,
                     item_metadata={'Hello': 'World'}
                     )
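
A sketch of the wildcard behavior described above, uploading a folder’s contents without the head directory (the path is a placeholder):

dataset.items.upload(local_path='/images/*', remote_path='/')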

Annotations

class Annotations(client_api: ApiClient, item=None, dataset=None, dataset_id=None)[source]

Bases: object

Annotations Repository

The Annotation class allows you to manage the annotations of data items. For information on annotations explore our documentation at: Classification SDK, Annotation Labels and Attributes, Show Video with Annotations.

builder()[source]

Create Annotation collection.

Prerequisites: You must have an item to be annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Returns

Annotation collection object

Return type

dtlpy.entities.annotation_collection.AnnotationCollection

Example:

item.annotations.builder()
delete(annotation: Optional[Annotation] = None, annotation_id: Optional[str] = None, filters: Optional[Filters] = None) bool[source]

Remove an annotation from an item.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters
Returns

True/False

Return type

bool

Example:

item.annotations.delete(annotation_id='annotation_id')
download(filepath: str, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.JSON, img_filepath: Optional[str] = None, height: Optional[float] = None, width: Optional[float] = None, thickness: int = 1, with_text: bool = False, alpha: float = 1)[source]

Save annotation to file.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters
  • filepath (str) – Target download directory

  • annotation_format (str) – the format to download; options: list(dl.ViewAnnotationOptions)

  • img_filepath (str) – img file path - needed for img_mask

  • height (float) – optional - image height

  • width (float) – optional - image width

  • thickness (int) – optional - line thickness, default=1

  • with_text (bool) – optional - draw annotation with text, default = False

  • alpha (float) – opacity value [0 1], default 1

Returns

file path where the annotations were saved

Return type

str

Example:

item.annotations.download(
              filepath='file_path',
              annotation_format=dl.ViewAnnotationOptions.MASK,
              img_filepath='img_filepath',
              height=100,
              width=100,
              thickness=1,
              with_text=False,
              alpha=1)
get(annotation_id: str) Annotation[source]

Get a single annotation.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters

annotation_id (str) – The id of the annotation

Returns

Annotation object or None

Return type

dtlpy.entities.annotation.Annotation

Example:

item.annotations.get(annotation_id='annotation_id')
list(filters: Optional[Filters] = None, page_offset: Optional[int] = None, page_size: Optional[int] = None)[source]

List Annotations of a specific item. You must get the item first and then list the annotations with the desired filters.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters
Returns

Pages object

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

item.annotations.list(filters=dl.Filters(
                             resource=dl.FiltersResource.ANNOTATION,
                             field='type',
                             values='box'),
          page_size=100,
          page_offset=0)
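
Because the method returns a Pages object, results can be iterated page by page. A sketch, assuming the standard paged-entities iteration:

# iterate the returned pages, then the annotations within each page
pages = item.annotations.list()
for page in pages:
    for annotation in page:
        print(annotation.id)
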
show(image=None, thickness: int = 1, with_text: bool = False, height: Optional[float] = None, width: Optional[float] = None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.MASK, alpha: float = 1)[source]

Show annotations. To use this method, you must get the item first and then show the annotations with the desired filters. The method returns an array showing all the annotations.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters
  • image (ndarray) – empty, or an image to draw on

  • thickness (int) – optional - line thickness, default=1

  • with_text (bool) – add label to annotation

  • height (float) – item height

  • width (float) – item width

  • annotation_format (str) – the format to show; options: list(dl.ViewAnnotationOptions)

  • alpha (float) – opacity value [0 1], default 1

Returns

ndarray of the annotations

Return type

ndarray

Example:

item.annotations.show(image='nd array',
          thickness=1,
          with_text=False,
          height=100,
          width=100,
          annotation_format=dl.ViewAnnotationOptions.MASK,
          alpha=1)
update(annotations, system_metadata=False)[source]

Update an existing annotation. For example, you may change the annotation’s label and then use the update method.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer, or be assigned a task that includes that item as an annotation manager or annotator.

Parameters
Returns

True if successful or error if unsuccessful

Return type

bool

Example:

item.annotations.update(annotations='annotation')
update_status(annotation: Optional[Annotation] = None, annotation_id: Optional[str] = None, status: AnnotationStatus = AnnotationStatus.ISSUE) Annotation[source]

Set status on annotation.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager.

Parameters
Returns

Annotation object

Return type

dtlpy.entities.annotation.Annotation

Example:

item.annotations.update_status(annotation_id='annotation_id', status=dl.AnnotationStatus.ISSUE)
upload(annotations) AnnotationCollection[source]

Upload a new annotation/annotations. You must first create the annotation using the annotation builder method.

Prerequisites: Any user can upload annotations.

Parameters

annotations (List[dtlpy.entities.annotation.Annotation] or dtlpy.entities.annotation.Annotation) – list or single annotation of type Annotation

Returns

list of annotation objects

Return type

entities.AnnotationCollection

Example:

item.annotations.upload(annotations='builder')
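
A common flow is to create a collection with builder(), add annotation definitions to it, and then upload the collection. A sketch (the box coordinates and label are hypothetical):

# build an annotation collection, add a box annotation, then upload it
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100, label='label_name'))
item.annotations.upload(annotations=builder)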

Recipes

class Recipes(client_api: ApiClient, dataset: Optional[Dataset] = None, project: Optional[Project] = None, project_id: Optional[str] = None)[source]

Bases: object

Recipes Repository

The Recipes class allows you to manage recipes and their properties. For more information on Recipes, see our documentation and SDK documentation.

clone(recipe: Optional[Recipe] = None, recipe_id: Optional[str] = None, shallow: bool = False)[source]

Clone recipe.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • recipe (dtlpy.entities.recipe.Recipe) – Recipe object

  • recipe_id (str) – Recipe id

  • shallow (bool) – if True, link to the existing ontology; otherwise, clone all ontologies linked to the recipe as well

Returns

Cloned recipe object

Return type

dtlpy.entities.recipe.Recipe

Example:

dataset.recipes.clone(recipe_id='recipe_id')
create(project_ids=None, ontology_ids=None, labels=None, recipe_name=None, attributes=None, annotation_instruction_file=None) Recipe[source]

Create a new Recipe. Note: If the param ontology_ids is None, an ontology will be created first.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • project_ids (str) – project ids

  • ontology_ids (str or list) – ontology ids

  • labels – labels

  • recipe_name (str) – recipe name

  • attributes – attributes

  • annotation_instruction_file (str) – file path or url of the recipe instruction

Returns

Recipe entity

Return type

dtlpy.entities.recipe.Recipe

Example:

dataset.recipes.create(recipe_name='My Recipe', labels=labels)
delete(recipe_id: str, force: bool = False)[source]

Delete recipe from platform.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • recipe_id (str) – recipe id

  • force (bool) – force delete recipe

Returns

True if successful

Return type

bool

Example:

dataset.recipes.delete(recipe_id='recipe_id')
get(recipe_id: str) Recipe[source]

Get a Recipe object to use in your code.

Prerequisites: You must be in the role of an owner or developer.

Parameters

recipe_id (str) – recipe id

Returns

Recipe object

Return type

dtlpy.entities.recipe.Recipe

Example:

dataset.recipes.get(recipe_id='recipe_id')
list(filters: Optional[Filters] = None) List[Recipe][source]

List recipes for a dataset.

Prerequisites: You must be in the role of an owner or developer.

Parameters

filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

Returns

list of all recipes

Return type

list

Example:

dataset.recipes.list()
open_in_web(recipe: Optional[Recipe] = None, recipe_id: Optional[str] = None)[source]

Open the recipe in the web platform.

Prerequisites: All users.

Parameters

Example:

dataset.recipes.open_in_web(recipe_id='recipe_id')
update(recipe: Recipe, system_metadata=False) Recipe[source]

Update recipe.

Prerequisites: You must be in the role of an owner or developer.

Parameters
Returns

Recipe object

Return type

dtlpy.entities.recipe.Recipe

Example:

dataset.recipes.update(recipe='recipe_entity')

Ontologies

class Ontologies(client_api: ApiClient, recipe: Optional[Recipe] = None, project: Optional[Project] = None, dataset: Optional[Dataset] = None)[source]

Bases: object

Ontologies Repository

The Ontologies class allows users to manage ontologies and their properties. Read more about ontology in our SDK docs.

create(labels, title=None, project_ids=None, attributes=None) Ontology[source]

Create a new ontology.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • labels – recipe tags

  • title (str) – ontology title, name

  • project_ids (list) – recipe project/s

  • attributes (list) – recipe attributes

Returns

Ontology object

Return type

dtlpy.entities.ontology.Ontology

Example:

recipe.ontologies.create(labels='labels_entity',
                      title='new_ontology',
                      project_ids='project_ids')
delete(ontology_id)[source]

Delete Ontology from the platform.

Prerequisites: You must be in the role of an owner or developer.

Parameters

ontology_id – ontology id

Returns

True if successful

Return type

bool

Example:

recipe.ontologies.delete(ontology_id='ontology_id')
delete_attributes(ontology_id, keys: list)[source]

Delete attributes in bulk.

Parameters
  • ontology_id (str) – ontology id

  • keys (list) – Keys of attributes to delete

Returns

True if successful

Return type

bool

Example:

ontology.delete_attributes(['1'])
get(ontology_id: str) Ontology[source]

Get Ontology object to use in your code.

Prerequisites: You must be in the role of an owner or developer.

Parameters

ontology_id (str) – ontology id

Returns

Ontology object

Return type

dtlpy.entities.ontology.Ontology

Example:

recipe.ontologies.get(ontology_id='ontology_id')
static labels_to_roots(labels)[source]

Convert a labels dictionary to a list of platform representations of labels.

Parameters

labels (dict) – labels dict

Returns

platform representation of labels

list(project_ids=None) List[Ontology][source]

List ontologies for a recipe.

Prerequisites: You must be in the role of an owner or developer.

Parameters

project_ids

Returns

list of all the ontologies

Example:

recipe.ontologies.list(project_ids='project_ids')
update(ontology: Ontology, system_metadata=False) Ontology[source]

Update the Ontology metadata.

Prerequisites: You must be in the role of an owner or developer.

Parameters
Returns

Ontology object

Return type

dtlpy.entities.ontology.Ontology

Example:

recipe.ontologies.update(ontology='ontology_entity')
update_attributes(ontology_id: str, title: str, key: str, attribute_type: AttributesTypes, scope: Optional[list] = None, optional: Optional[bool] = None, values: Optional[list] = None, attribute_range: Optional[AttributesRange] = None)[source]

Add a new attribute, or update it if it already exists.

Parameters
  • ontology_id (str) – ontology_id

  • title (str) – attribute title

  • key (str) – the key of the attribute; must be unique

  • attribute_type (AttributesTypes) – the type of the attribute (dl.AttributesTypes)

  • scope (list) – list of the labels or * for all labels

  • optional (bool) – optional attribute

  • values (list) – list of the attribute values (for checkbox and radio button)

  • attribute_range (dict or AttributesRange) – dl.AttributesRange object

Returns

True if successful

Return type

bool

Example:

ontology.update_attributes(key='1',
                           title='checkbox',
                           attribute_type=dl.AttributesTypes.CHECKBOX,
                           values=[1,2,3])
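
For attribute types that take a range (e.g. a slider) rather than a list of values, attribute_range is passed instead. A sketch, assuming dl.AttributesRange takes min_range, max_range and step:

# add a slider attribute that applies to all labels
ontology.update_attributes(key='2',
                           title='slider',
                           attribute_type=dl.AttributesTypes.SLIDER,
                           scope=['*'],
                           attribute_range=dl.AttributesRange(min_range=0, max_range=10, step=1))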

Tasks

class Tasks(client_api: ApiClient, project: Optional[Project] = None, dataset: Optional[Dataset] = None, project_id: Optional[str] = None)[source]

Bases: object

Tasks Repository

The Tasks class allows the user to manage tasks and their properties. For more information, read in our SDK documentation about Creating Tasks, Redistributing and Reassigning Tasks, and Task Assignment.

add_items(task: Optional[Task] = None, task_id=None, filters: Optional[Filters] = None, items=None, assignee_ids=None, query=None, workload=None, limit=None, wait=True) Task[source]

Add items to a Task.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

Parameters
  • task (dtlpy.entities.task.Task) – task object

  • task_id (str) – the Id of the task

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • items (list) – list of items (item Ids or objects) to add to the task

  • assignee_ids (list) – list of assignees to work on the task

  • query (dict) – query to filter the items for the task

  • workload (list) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit('annotator@hi.com', 80), dl.WorkloadUnit('annotator2@hi.com', 20)]

  • limit (int) – the maximum number of items the task can include

  • wait (bool) – wait until adding the items finishes

Returns

task entity

Return type

dtlpy.entities.task.Task

Example:

dataset.tasks.add_items(task='task_entity',
                        items=[items])
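
To customize the distribution between assignees, workload can be passed instead of assignee_ids, using the WorkloadUnit objects described above. A sketch:

# split the added items 80/20 between two assignees
dataset.tasks.add_items(task_id='task_id',
                        items=[items],
                        workload=[dl.WorkloadUnit(assignee_id='annotator1@dataloop.ai', load=80),
                                  dl.WorkloadUnit(assignee_id='annotator2@dataloop.ai', load=20)])
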
create(task_name, due_date=None, assignee_ids=None, workload=None, dataset=None, task_owner=None, task_type='annotation', task_parent_id=None, project_id=None, recipe_id=None, assignments_ids=None, metadata=None, filters=None, items=None, query=None, available_actions=None, wait=True, check_if_exist: Filters = False, limit=None, batch_size=None, max_batch_workload=None, allowed_assignees=None, priority=TaskPriority.MEDIUM) Task[source]

Create a new Task (Annotation or QA).

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

Parameters
  • task_name (str) – the name of the task

  • due_date (float) – date by which the task should be finished; for example, due_date=datetime.datetime(day=1, month=1, year=2029).timestamp()

  • assignee_ids (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails

  • workload (List[WorkloadUnit]) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit('annotator@hi.com', 80), dl.WorkloadUnit('annotator2@hi.com', 20)]

  • dataset (entities.Dataset) – dataset object, the dataset the task refers to

  • task_owner (str) – task owner. Provide user email

  • task_type (str) – task type “annotation” or “qa”

  • task_parent_id (str) – optional if type is qa - parent annotation task id

  • project_id (str) – the Id of the project where task will be created

  • recipe_id (str) – recipe id for the task

  • assignments_ids (list) – assignments ids to the task

  • metadata (dict) – metadata for the task

  • filters (entities.Filters) – dl.Filters entity to filter items for the task

  • items (List[entities.Item]) – list of items (item Id or objects) to insert to the task

  • query (dict DQL) – filter items for the task

  • available_actions (list) – list of available actions (statuses) that will be available for the task items; The default statuses are: “Completed” and “Discarded”

  • wait (bool) – wait until task creation finishes

  • check_if_exist (entities.Filters) – dl.Filters entity to check whether the task already exists according to the filter

  • limit (int) – the maximum number of items the task can include

  • batch_size (int) – Pulling batch size (items), use with pulling allocation method. Restrictions - Min 3, max 100

  • max_batch_workload (int) – max_batch_workload: Max items in assignment, use with pulling allocation method. Restrictions - Min batchSize + 2, max batchSize * 2

  • allowed_assignees (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails

  • priority (entities.TaskPriority) – priority of the task options in entities.TaskPriority

Returns

Task object

Return type

dtlpy.entities.task.Task

Example:

dataset.tasks.create(task_name='task_name',
                     due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
                     assignee_ids=['annotator1@dataloop.ai', 'annotator2@dataloop.ai'])
create_qa_task(task: Task, assignee_ids, due_date=None, filters=None, items=None, query=None, workload=None, metadata=None, available_actions=None, wait=True, batch_size=None, max_batch_workload=None, allowed_assignees=None, priority=TaskPriority.MEDIUM) Task[source]

Create a new QA Task.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

Parameters
  • task (dtlpy.entities.task.Task) – the parent annotation task object

  • assignee_ids (list) – list the QA task assignees (contributors) that should be working on the task. Provide a list of users’ emails

  • due_date (float) – date by which the QA task should be finished; for example, due_date=datetime.datetime(day=1, month=1, year=2029).timestamp()

  • filters (entities.Filters) – dl.Filters entity to filter items for the task

  • items (List[entities.Item]) – list of items (item Id or objects) to insert to the task

  • query (dict DQL) – filter items for the task

  • workload (List[WorkloadUnit]) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit('annotator@hi.com', 80), dl.WorkloadUnit('annotator2@hi.com', 20)]

  • metadata (dict) – metadata for the task

  • available_actions (list) – list of available actions (statuses) that will be available for the task items; The default statuses are: “Approved” and “Discarded”

  • wait (bool) – wait until task creation finishes

  • batch_size (int) – Pulling batch size (items), use with pulling allocation method. Restrictions - Min 3, max 100

  • max_batch_workload (int) – Max items in assignment, use with pulling allocation method. Restrictions - Min batchSize + 2, max batchSize * 2

  • allowed_assignees (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails

  • priority (entities.TaskPriority) – priority of the task options in entities.TaskPriority

Returns

task object

Return type

dtlpy.entities.task.Task

Example:

dataset.tasks.create_qa_task(task='task_entity',
                             due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
                             assignee_ids=['annotator1@dataloop.ai', 'annotator2@dataloop.ai'])
delete(task: Optional[Task] = None, task_name: Optional[str] = None, task_id: Optional[str] = None, wait: bool = True)[source]

Delete the Task.

Prerequisites: You must be in the role of an owner or developer or annotation manager who created that task.

Parameters
  • task (dtlpy.entities.task.Task) – the task object

  • task_name (str) – the name of the task

  • task_id (str) – the Id of the task

  • wait (bool) – wait until task deletion finishes

Returns

True if successful

Return type

bool

Example:

dataset.tasks.delete(task_id='task_id')
get(task_name=None, task_id=None) Task[source]

Get a Task object to use in your code.

Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.

Parameters
  • task_name (str) – optional - search by name

  • task_id (str) – optional - search by id

Returns

task object

Return type

dtlpy.entities.task.Task

Example:

dataset.tasks.get(task_id='task_id')
get_items(task_id: Optional[str] = None, task_name: Optional[str] = None, dataset: Optional[Dataset] = None, filters: Optional[Filters] = None) PagedEntities[source]

Get the task items to use in your code.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

If a filters param is provided, you will receive a PagedEntity output of the task items. If no filter is provided, you will receive a list of the items.

Parameters
Returns

list of the items or PagedEntity output of items

Return type

list or dtlpy.entities.paged_entities.PagedEntities

Example:

dataset.tasks.get_items(task_id= 'task_id')
list(project_ids=None, status=None, task_name=None, pages_size=None, page_offset=None, recipe=None, creator=None, assignments=None, min_date=None, max_date=None, filters: Optional[Filters] = None) Union[List[Task], PagedEntities][source]

List all tasks.

Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.

Parameters
  • project_ids – search tasks by given list of project ids

  • status (str) – search tasks by a given task status

  • task_name (str) – search tasks by a given task name

  • pages_size (int) – page size of the output generator

  • page_offset (int) – page offset of the output generator

  • recipe (dtlpy.entities.recipe.Recipe) – Search tasks that use a given recipe. Provide the required recipe object

  • creator (str) – search tasks created by a given creator (user email)

  • assignments (dtlpy.entities.assignment.Assignment) – assignments object

  • min_date (double) – search all tasks created AFTER a given date, use a milliseconds format. For example: 1661780622008

  • max_date (double) – search all tasks created BEFORE a given date, use a milliseconds format. For example: 1661780622008

  • filters (dtlpy.entities.filters.Filters) – dl.Filters entity to filters tasks using DQL

Returns

List of Task objects

Example:

dataset.tasks.list(project_ids='project_ids', pages_size=100, page_offset=0)
open_in_web(task_name: Optional[str] = None, task_id: Optional[str] = None, task: Optional[Task] = None)[source]

Open the task in the web platform.

Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.

Parameters

Example:

dataset.tasks.open_in_web(task_id='task_id')
query(filters=None, project_ids=None)[source]

List all tasks by filter.

Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • project_ids (list) – list of project ids of the required tasks

Returns

Paged entity - task pages generator

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

dataset.tasks.query(project_ids='project_ids')
remove_items(task: Optional[Task] = None, task_id=None, filters: Optional[Filters] = None, query=None, items=None, wait=True)[source]

Remove items from a task.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

Parameters
Returns

True if successful; raises an error if failed

Return type

bool

Example:

dataset.tasks.remove_items(task='task_entity',
                           items=[items])
set_status(status: str, operation: str, task_id: str, item_ids: List[str])[source]

Update an item status within a task.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

Parameters
  • status (str) – string that describes the status

  • operation (str) – the status action: ‘create’ or ‘delete’

  • task_id (str) – the Id of the task

  • item_ids (list) – list of item ids

Returns

True if successful

Return type

bool

Example:

dataset.tasks.set_status(task_id='task_id', status='complete', operation='create')
update(task: Optional[Task] = None, system_metadata=False) Task[source]

Update a Task.

Prerequisites: You must be in the role of an owner or developer or annotation manager who created that task.

Parameters
Returns

Task object

Return type

dtlpy.entities.task.Task

Example:

dataset.tasks.update(task='task_entity')

Assignments

class Assignments(client_api: ApiClient, project: Optional[Project] = None, task: Optional[Task] = None, dataset: Optional[Dataset] = None, project_id=None)[source]

Bases: object

Assignments Repository

The Assignments class allows users to manage assignments and their properties. Read more about Task Assignment in our SDK documentation.

create(assignee_id: str, task: Optional[Task] = None, filters: Optional[Filters] = None, items: Optional[list] = None) Assignment[source]

Create a new assignment.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • assignee_id (str) – the email of the user the assignment will be assigned to

  • task (dtlpy.entities.task.Task) – the task object that includes the assignment

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • items (list) – list of items (item Id or objects) to insert to the assignment

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

task.assignments.create(assignee_id='annotator1@dataloop.ai')
get(assignment_name: Optional[str] = None, assignment_id: Optional[str] = None)[source]

Get an Assignment object to use in your code.

Parameters
  • assignment_name (str) – optional - search by name

  • assignment_id (str) – optional - search by id

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

task.assignments.get(assignment_id='assignment_id')
get_items(assignment: Optional[Assignment] = None, assignment_id=None, assignment_name=None, dataset=None, filters=None) PagedEntities[source]

Get all the items in the assignment.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
Returns

pages of the items

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

task.assignments.get_items(assignment_id='assignment_id')
list(project_ids: Optional[list] = None, status: Optional[str] = None, assignment_name: Optional[str] = None, assignee_id: Optional[str] = None, pages_size: Optional[int] = None, page_offset: Optional[int] = None, task_id: Optional[int] = None) List[Assignment][source]

Get a list of Assignments to use in your code.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • project_ids (list) – search assignment by given list of project ids

  • status (str) – search assignment by a given task status

  • assignment_name (str) – search assignment by a given assignment name

  • assignee_id (str) – the email of the user the assignment is assigned to

  • pages_size (int) – page size of the output generator

  • page_offset (int) – page offset of the output generator

  • task_id (str) – search assignment by given task id

Returns

List of Assignment objects

Return type

miscellaneous.List[dtlpy.entities.assignment.Assignment]

Example:

task.assignments.list(status='complete', assignee_id='user@dataloop.ai', pages_size=100, page_offset=0)
open_in_web(assignment_name: Optional[str] = None, assignment_id: Optional[str] = None, assignment: Optional[str] = None)[source]

Open the assignment in the platform.

Prerequisites: All users.

Parameters

Example:

task.assignments.open_in_web(assignment_id='assignment_id')
reassign(assignee_id: str, assignment: Optional[Assignment] = None, assignment_id: Optional[str] = None, task: Optional[Task] = None, task_id: Optional[str] = None, wait: bool = True)[source]

Reassign an assignment.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • assignee_id (str) – the email of the user the assignment will be reassigned to

  • assignment (dtlpy.entities.assignment.Assignment) – assignment object

  • assignment_id – the Id of the assignment

  • task (dtlpy.entities.task.Task) – task object

  • task_id (str) – the Id of the task that include the assignment

  • wait (bool) – wait until reassign assignment finish

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

task.assignments.reassign(assignee_id='annotator1@dataloop.ai')
redistribute(workload: Workload, assignment: Optional[Assignment] = None, assignment_id: Optional[str] = None, task: Optional[Task] = None, task_id: Optional[str] = None, wait: bool = True)[source]

Redistribute an assignment.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

task.assignments.redistribute(workload=dl.Workload([dl.WorkloadUnit(assignee_id="annotator1@dataloop.ai", load=50),
                                                    dl.WorkloadUnit(assignee_id="annotator2@dataloop.ai", load=50)]))
set_status(status: str, operation: str, item_id: str, assignment_id: str) bool[source]

Set item status within assignment.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • status (str) – string that describes the status

  • operation (str) – the status action: ‘create’ or ‘delete’

  • item_id (str) – the id of the item whose status will be set

  • assignment_id – the Id of the assignment

Returns

True if successful

Return type

bool

Example:

task.assignments.set_status(assignment_id='assignment_id',
                            status='complete',
                            operation='create',
                            item_id='item_id')
update(assignment: Optional[Assignment] = None, system_metadata: bool = False) Assignment[source]

Update an assignment.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • assignment (dtlpy.entities.assignment.Assignment) – assignment entity

  • system_metadata (bool) – True, if you want to change the system metadata

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

task.assignments.update(assignment='assignment_entity', system_metadata=False)

Packages

class LocalServiceRunner(client_api: ApiClient, packages, cwd=None, multithreading=False, concurrency=10, package: Optional[Package] = None, module_name='default_module', function_name='run', class_name='ServiceRunner', entry_point='main.py', mock_file_path=None)[source]

Bases: object

Service Runner Class

get_field(field_name, field_type, mock_json, project=None, mock_inputs=None)[source]

Get field in mock json.

Parameters
  • field_name – field name

  • field_type – field type

  • mock_json – mock json

  • project – project

  • mock_inputs – mock inputs

Returns

get_mainpy_run_service()[source]

Get mainpy run service

Returns

run_local_project(project=None)[source]

Run local project

Parameters

project – project entity

class Packages(client_api: ApiClient, project: Optional[Project] = None)[source]

Bases: object

Packages Repository

The Packages class allows users to manage packages (code used for running in Dataloop’s FaaS) and their properties. Read more about Packages.

build(package: Package, module_name=None, init_inputs=None, local_path=None, from_local=None)[source]

Instantiate a module from the package code. Returns a loaded instance of the runner class.

Parameters
  • package – Package entity

  • module_name – Name of the module to build the runner class

  • init_inputs (str) – dictionary of the class init variables (if any); used to init the module class

  • local_path (str) – local path of the package (if from_local=False - codebase will be downloaded)

  • from_local (bool) – if True, the codebase will not be downloaded (only local files are used)

Returns

dl.BaseServiceRunner
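
Example (a sketch; the returned runner is a loaded instance of the module's runner class, so its functions can be called locally):

# build the runner from an existing package entity
runner = project.packages.build(package=package, module_name='default_module')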

build_requirements(filepath) list[source]

Build a requirement list (list of packages your code requires to run) from a file path. The file listing the requirements MUST BE a txt file.

Prerequisites: You must be in the role of an owner or developer.

Parameters

filepath – path of the requirements file

Returns

a list of dl.PackageRequirement

Return type

list
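
Example (a sketch, assuming a standard pip-style requirements.txt in the working directory):

project.packages.build_requirements(filepath='requirements.txt')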

static build_trigger_dict(actions, name='default_module', filters=None, function='run', execution_mode: TriggerExecutionMode = 'Once', type_t: TriggerType = 'Event')[source]

Build a trigger dictionary to trigger FaaS. Read more about FaaS Triggers.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • actions – list of dl.TriggerAction

  • name (str) – trigger name

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • function (str) – function name

  • execution_mode (str) – execution mode dl.TriggerExecutionMode

  • type_t (str) – trigger type dl.TriggerType

Returns

trigger dict

Return type

dict

Example:

project.packages.build_trigger_dict(actions=dl.TriggerAction.CREATED,
                                  function='run',
                                  execution_mode=dl.TriggerExecutionMode.ONCE)
static check_cls_arguments(cls, missing, function_name, function_inputs)[source]

Check class arguments. This method checks that the package function is correct.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • cls – packages class

  • missing (list) – list of the missing params

  • function_name (str) – name of function

  • function_inputs (list) – list of function inputs

checkout(package: Optional[Package] = None, package_id: Optional[str] = None, package_name: Optional[str] = None)[source]

Checkout (switch) to a package.

Prerequisites: You must be in the role of an owner or developer.

Parameters

Example:

project.packages.checkout(package='package_entity')
delete(package: Optional[Package] = None, package_name=None, package_id=None)[source]

Delete a Package object.

Prerequisites: You must be in the role of an owner or developer.

Parameters
Returns

True if success

Return type

bool

Example:

project.packages.delete(package_name='package_name')
deploy(package_id: Optional[str] = None, package_name: Optional[str] = None, package: Optional[Package] = None, service_name: Optional[str] = None, project_id: Optional[str] = None, revision: Optional[str] = None, init_input: Optional[Union[List[FunctionIO], FunctionIO, dict]] = None, runtime: Optional[Union[KubernetesRuntime, dict]] = None, sdk_version: Optional[str] = None, agent_versions: Optional[dict] = None, bot: Optional[Union[Bot, str]] = None, pod_type: Optional[InstanceCatalog] = None, verify: bool = True, checkout: bool = False, module_name: Optional[str] = None, run_execution_as_process: Optional[bool] = None, execution_timeout: Optional[int] = None, drain_time: Optional[int] = None, on_reset: Optional[str] = None, max_attempts: Optional[int] = None, force: bool = False, secrets: Optional[list] = None, **kwargs) Service[source]

Deploy a package. A service is required to run the code in your package.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • package_id (str) – package id

  • package_name (str) – package name

  • package (dtlpy.entities.package.Package) – package entity

  • service_name (str) – service name

  • project_id (str) – project id

  • revision (str) – package revision - default=latest

  • init_input – config to run at startup

  • runtime (dict) – runtime resources

  • sdk_version (str) – optional - sdk version

  • agent_versions (dict) – optional - versions of sdk, agent runner and agent proxy

  • bot (str) – bot email

  • pod_type (str) – pod type dl.InstanceCatalog

  • verify (bool) – verify the inputs

  • checkout (bool) – checkout

  • module_name (str) – module name

  • run_execution_as_process (bool) – run execution as process

  • execution_timeout (int) – execution timeout

  • drain_time (int) – drain time

  • on_reset (str) – on reset

  • max_attempts (int) – Maximum execution retries in-case of a service reset

  • force (bool) – optional - terminate old replicas immediately

  • secrets (list) – list of the integrations ids

Returns

Service object

Return type

dtlpy.entities.service.Service

Example:

project.packages.deploy(service_name=package_name,
                        execution_timeout=3 * 60 * 60,
                        module_name=module.name,
                        runtime=dl.KubernetesRuntime(
                            concurrency=10,
                            pod_type=dl.InstanceCatalog.REGULAR_S,
                            autoscaler=dl.KubernetesRabbitmqAutoscaler(
                                min_replicas=1,
                                max_replicas=20,
                                queue_length=20
                            )
                        )
                    )
deploy_from_file(project, json_filepath)[source]

Deploy package and service from a JSON file.

Prerequisites: You must be in the role of an owner or developer.

Parameters
Returns

the package and the services

Example:

project.packages.deploy_from_file(project='project_entity', json_filepath='json_filepath')
static generate(name=None, src_path: Optional[str] = None, service_name: Optional[str] = None, package_type='default_package_type')[source]

Generate a new package. Provide a file path to a JSON file with all the details of the package and service to generate the package.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • name (str) – name

  • src_path (str) – source file path

  • service_name (str) – service name

  • package_type (str) – package type from PackageCatalog

Example:

project.packages.generate(name='package_name',
                          src_path='src_path')
get(package_name: Optional[str] = None, package_id: Optional[str] = None, checkout: bool = False, fetch=None) Package[source]

Get Package object to use in your code.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • package_id (str) – package id

  • package_name (str) – package name

  • checkout (bool) – set the package as a default package object (cookies)

  • fetch – optional - fetch entity from platform, default taken from cookie

Returns

Package object

Return type

dtlpy.entities.package.Package

Example:

project.packages.get(package_id='package_id')
list(filters: Optional[Filters] = None, project_id: Optional[str] = None) PagedEntities[source]

List project packages.

Prerequisites: You must be in the role of an owner or developer.

Parameters
Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

project.packages.list()
open_in_web(package: Optional[Package] = None, package_id: Optional[str] = None, package_name: Optional[str] = None)[source]

Open the package in the web platform.

Prerequisites: You must be in the role of an owner or developer.

Parameters

Example:

project.packages.open_in_web(package_id='package_id')
pull(package: Package, version=None, local_path=None, project_id=None)[source]

Pull (download) the package to a local path.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • package (dtlpy.entities.package.Package) – package entity

  • version (str) – the package version to pull

  • local_path – the path of where to save the package

  • project_id – the project id that include the package

Returns

local path where the package was pulled

Return type

str

Example:

project.packages.pull(package='package_entity', local_path='local_path')
push(project: Optional[Project] = None, project_id: Optional[str] = None, package_name: Optional[str] = None, src_path: Optional[str] = None, codebase: Optional[Union[GitCodebase, ItemCodebase, FilesystemCodebase]] = None, modules: Optional[List[PackageModule]] = None, is_global: Optional[bool] = None, checkout: bool = False, revision_increment: Optional[str] = None, version: Optional[str] = None, ignore_sanity_check: bool = False, service_update: bool = False, service_config: Optional[dict] = None, slots: Optional[List[PackageSlot]] = None, requirements: Optional[List[PackageRequirement]] = None, package_type=None, metadata=None) Package[source]

Push your local package to the UI.

Prerequisites: You must be in the role of an owner or developer.

Project will be taken in the following hierarchy: project(input) -> project_id(input) -> self.project(context) -> checked out

Parameters
  • project (dtlpy.entities.project.Project) – optional - project entity to deploy to. default from context or checked-out

  • project_id (str) – optional - project id to deploy to. default from context or checked-out

  • package_name (str) – package name

  • src_path (str) – path to package codebase

  • codebase (dtlpy.entities.codebase.Codebase) – codebase object

  • modules (list) – list of modules PackageModules of the package

  • is_global (bool) – whether the package is global or local

  • checkout (bool) – checkout package to local dir

  • revision_increment (str) – optional - str - version bumping method - major/minor/patch - default = None

  • version (str) – semver version of the package

  • ignore_sanity_check (bool) – NOT RECOMMENDED - skip code sanity check before pushing

  • service_update (bool) – optional - bool - update the service

  • service_config (dict) – Service object as dict. Contains the spec of the default service to create.

  • slots (list) – optional - list of slots PackageSlot of the package

  • requirements (list) – list of package requirements

  • package_type (str) – default ‘faas’, options: ‘app’, ‘ml’

  • metadata (dict) – dictionary of system and user metadata

Returns

Package object

Return type

dtlpy.entities.package.Package

Example:

project.packages.push(package_name='package_name',
                        modules=[module],
                        version='1.0.0',
                        src_path=os.getcwd()
                    )
revisions(package: Optional[Package] = None, package_id: Optional[str] = None)[source]

Get the package revisions history.

Prerequisites: You must be in the role of an owner or developer.

Parameters

Example:

project.packages.revisions(package='package_entity')
test_local_package(cwd: Optional[str] = None, concurrency: Optional[int] = None, package: Optional[Package] = None, module_name: str = 'default_module', function_name: str = 'run', class_name: str = 'ServiceRunner', entry_point: str = 'main.py', mock_file_path: Optional[str] = None)[source]

Test local package in local environment.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • cwd (str) – path to the file

  • concurrency (int) – the concurrency of the test

  • package (dtlpy.entities.package.Package) – entities.package

  • module_name (str) – module name

  • function_name (str) – function name

  • class_name (str) – class name

  • entry_point (str) – the file to run like main.py

  • mock_file_path (str) – the mock file that contains the inputs

Returns

the output list created by the tested function

Return type

list

Example:

project.packages.test_local_package(cwd='path_to_package',
                                    package='package_entity',
                                    function_name='run')
update(package: Package, revision_increment: Optional[str] = None) Package[source]

Update Package changes to the platform.

Prerequisites: You must be in the role of an owner or developer.

Parameters
Returns

Package object

Return type

dtlpy.entities.package.Package

Example:

project.packages.update(package='package_entity')

Codebases

class Codebases(client_api: ApiClient, project: Optional[Project] = None, dataset: Optional[Dataset] = None, project_id: Optional[str] = None)[source]

Bases: object

Codebase Repository

The Codebases class allows the user to manage codebases and their properties. The codebase is the code the user uploads for the user’s packages to run. Read more about codebases in our FaaS (Function as a Service) documentation.

clone_git(codebase: Codebase, local_path: str)[source]

Clone a codebase.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • codebase (dtlpy.entities.codebase.Codebase) – codebase object

  • local_path (str) – local path

Returns

the path where the clone was created

Return type

str

Example:

package.codebases.clone_git(codebase='codebase_entity', local_path='local_path')
get(codebase_name: Optional[str] = None, codebase_id: Optional[str] = None, version: Optional[str] = None)[source]

Get a Codebase object to use in your code.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • codebase_name (str) – optional - search by name

  • codebase_id (str) – optional - search by id

  • version (str) – codebase version. default is latest. options: “all”, “latest” or a version number, e.g. “10”

Returns

Codebase object

Example:

package.codebases.get(codebase_name='codebase_name')

static get_current_version(all_versions_pages, zip_md)[source]

This method returns the current version of the codebase and other versions found.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • all_versions_pages (codebase) – codebase object

  • zip_md – zipped file of codebase

Returns

the current version and all other versions found of the codebase

Return type

int, int

Example:

package.codebases.get_current_version(all_versions_pages='codebase_entity', zip_md='path')
list() PagedEntities[source]

List all codebases.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

package.codebases.list()

list_versions(codebase_name: str)[source]

List all codebase versions.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters

codebase_name (str) – codebase name

Returns

list of versions

Return type

list

Example:

package.codebases.list_versions(codebase_name='codebase_name')

pack(directory: str, name: Optional[str] = None, extension: str = 'zip', description: str = '', ignore_directories: Optional[List[str]] = None)[source]

Zip a local code directory and post to codebases.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • directory (str) – local directory to pack

  • name (str) – codebase name

  • extension (str) – the extension of the file

  • description (str) – codebase description

  • ignore_directories (list[str]) – directories to ignore.

Returns

Codebase object

Return type

dtlpy.entities.codebase.Codebase

Example:

package.codebases.pack(directory='path_dir', name='codebase_name')
pull_git(codebase, local_path)[source]

Pull (download) a codebase.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • codebase (dtlpy.entities.codebase.Codebase) – codebase object

  • local_path (str) – local path

Returns

the path where the pull was saved

Return type

str

Example:

package.codebases.pull_git(codebase='codebase_entity', local_path='local_path')
unpack(codebase: Optional[Codebase] = None, codebase_name: Optional[str] = None, codebase_id: Optional[str] = None, local_path: Optional[str] = None, version: Optional[str] = None)[source]

Unpack codebase locally. Download source code and unzip.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • codebase (dtlpy.entities.codebase.Codebase) – dl.Codebase object

  • codebase_name (str) – search by name

  • codebase_id (str) – search by id

  • local_path (str) – local path to save codebase

  • version (str) – codebase version to unpack. default - latest

Returns

String (dirpath)

Return type

str

Example:

package.codebases.unpack(codebase='codebase_entity', local_path='local_path')

Services

class ServiceLog(_json: dict, service: Service, services: Services, start=None, follow=None, execution_id=None, function_name=None, replica_id=None, system=False)[source]

Bases: object

Service Log

view(until_completed)[source]

View logs

Parameters

until_completed

class Services(client_api: ApiClient, project: Optional[Project] = None, package: Optional[Package] = None, project_id=None)[source]

Bases: object

Services Repository

The Services class allows the user to manage services and their properties. Services are created from the packages users create. See our documentation for more information about services.

activate_slots(service: Service, project_id: Optional[str] = None, task_id: Optional[str] = None, dataset_id: Optional[str] = None, org_id: Optional[str] = None, user_email: Optional[str] = None, slots: Optional[List[PackageSlot]] = None, role=None, prevent_override: bool = True, visible: bool = True, icon: str = 'fas fa-magic', **kwargs)[source]

Activate service slots (creates buttons in the UI that activate services).

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service (dtlpy.entities.service.Service) – service entity

  • project_id (str) – project id

  • task_id (str) – task id

  • dataset_id (str) – dataset id

  • org_id (str) – org id

  • user_email (str) – user email

  • slots (list) – list of entities.PackageSlot

  • role (str) – user role: MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

  • prevent_override (bool) – True to prevent override

  • visible (bool) – visible

  • icon (str) – icon

  • kwargs – all additional arguments

Returns

list of user settings for activated slots

Return type

list

Example:

package.services.activate_slots(service='service_entity',
                                project_id='project_id',
                                slots=List[entities.PackageSlot],
                                icon='fas fa-magic')
checkout(service: Optional[Service] = None, service_name: Optional[str] = None, service_id: Optional[str] = None)[source]

Checkout (switch) to a service.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters

Example:

package.services.checkout(service_id='service_id')
delete(service_name: Optional[str] = None, service_id: Optional[str] = None)[source]

Delete Service object

Prerequisites: You must be in the role of an owner or developer. You must have a package.

You must provide at least ONE of the following params: service_id, service_name.

Parameters
  • service_name (str) – by name

  • service_id (str) – by id

Returns

True

Return type

bool

Example:

package.services.delete(service_id='service_id')
deploy(service_name: Optional[str] = None, package: Optional[Package] = None, bot: Optional[Union[Bot, str]] = None, revision: Optional[str] = None, init_input: Optional[Union[List[FunctionIO], FunctionIO, dict]] = None, runtime: Optional[Union[KubernetesRuntime, dict]] = None, pod_type: Optional[InstanceCatalog] = None, sdk_version: Optional[str] = None, agent_versions: Optional[dict] = None, verify: bool = True, checkout: bool = False, module_name: Optional[str] = None, project_id: Optional[str] = None, driver_id: Optional[str] = None, func: Optional[Callable] = None, run_execution_as_process: Optional[bool] = None, execution_timeout: Optional[int] = None, drain_time: Optional[int] = None, max_attempts: Optional[int] = None, on_reset: Optional[str] = None, force: bool = False, secrets: Optional[list] = None, **kwargs) Service[source]

Deploy service.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service_name (str) – name

  • package (dtlpy.entities.package.Package) – package entity

  • bot (str) – bot email

  • revision (str) – package revision (version)

  • init_input – config to run at startup

  • runtime (dict) – runtime resources

  • pod_type (str) – pod type dl.InstanceCatalog

  • sdk_version (str) – optional - sdk version

  • agent_versions (dict) – optional - versions of sdk

  • verify (bool) – if true, verify the inputs

  • checkout (bool) – if true, checkout (switch) to service

  • module_name (str) – module name

  • project_id (str) – project id

  • driver_id (str) – driver id

  • func (Callable) – function to deploy

  • run_execution_as_process (bool) – if true, run execution as process

  • execution_timeout (int) – execution timeout in seconds

  • drain_time (int) – drain time in seconds

  • max_attempts (int) – maximum execution retries in-case of a service reset

  • on_reset (str) – what happens on reset

  • force (bool) – optional - if true, terminate old replicas immediately

  • secrets (list) – list of the integrations ids

  • kwargs – list of additional arguments

Returns

Service object

Return type

dtlpy.entities.service.Service

Example:

package.services.deploy(service_name=package_name,
                        execution_timeout=3 * 60 * 60,
                        module_name=module.name,
                        runtime=dl.KubernetesRuntime(
                            concurrency=10,
                            pod_type=dl.InstanceCatalog.REGULAR_S,
                            autoscaler=dl.KubernetesRabbitmqAutoscaler(
                                min_replicas=1,
                                max_replicas=20,
                                queue_length=20
                            )
                        )
                    )
deploy_from_local_folder(cwd=None, service_file=None, bot=None, checkout=False, force=False) Service[source]

Deploy from local folder in local environment.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • cwd (str) – optional - package working directory. Default=cwd

  • service_file (str) – optional - service file. Default=None

  • bot (str) – bot

  • checkout – checkout

  • force (bool) – optional - terminate old replicas immediately

Returns

Service object

Return type

dtlpy.entities.service.Service

Example:

package.services.deploy_from_local_folder(cwd='file_path',
                                          service_file='service_file')
execute(service: Optional[Service] = None, service_id: Optional[str] = None, service_name: Optional[str] = None, sync: bool = False, function_name: Optional[str] = None, stream_logs: bool = False, execution_input=None, resource=None, item_id=None, dataset_id=None, annotation_id=None, project_id=None) Execution[source]

Execute a function on an existing service.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service (dtlpy.entities.service.Service) – service entity

  • service_id (str) – service id

  • service_name (str) – service name

  • sync (bool) – wait for function to end

  • function_name (str) – function name to run

  • stream_logs (bool) – prints logs of the new execution. only works with sync=True

  • execution_input – input dictionary or list of FunctionIO entities

  • resource (str) – dl.PackageInputType - input type.

  • item_id (str) – str - optional - input to function

  • dataset_id (str) – str - optional - input to function

  • annotation_id (str) – str - optional - input to function

  • project_id (str) – str - resource’s project

Returns

entities.Execution

Return type

dtlpy.entities.execution.Execution

Example:

package.services.execute(service='service_entity',
                         function_name='run',
                         item_id='item_id',
                         project_id='project_id')
get(service_name=None, service_id=None, checkout=False, fetch=None) Service[source]

Get service to use in your code.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service_name (str) – optional - search by name

  • service_id (str) – optional - search by id

  • checkout (bool) – if true, checkout (switch) to service

  • fetch – optional - fetch entity from platform, default taken from cookie

Returns

Service object

Return type

dtlpy.entities.service.Service

Example:

package.services.get(service_id='service_id')
list(filters: Optional[Filters] = None) PagedEntities[source]

List all services (services can be listed for a package or for a project).

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters

filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

package.services.list()
log(service, size=100, checkpoint=None, start=None, end=None, follow=False, text=None, execution_id=None, function_name=None, replica_id=None, system=False, view=True, until_completed=True)[source]

Get service logs.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service (dtlpy.entities.service.Service) – service object

  • size (int) – size

  • checkpoint (dict) – the information from the last point checked in the service

  • start (str) – iso format time

  • end (str) – iso format time

  • follow (bool) – if true, keep streaming future logs

  • text (str) – text

  • execution_id (str) – execution id

  • function_name (str) – function name

  • replica_id (str) – replica id

  • system (bool) – system

  • view (bool) – if true, print out all the logs

  • until_completed (bool) – wait until completed

Returns

ServiceLog entity

Return type

ServiceLog

Example:

package.services.log(service='service_entity')
name_validation(name: str)[source]

Validate the service name.

Prerequisites: You must be in the role of an owner or developer.

Parameters

name (str) – service name

Example:

package.services.name_validation(name='name')
open_in_web(service: Optional[Service] = None, service_id: Optional[str] = None, service_name: Optional[str] = None)[source]

Open the service in web platform

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service (dtlpy.entities.service.Service) – service entity

  • service_id (str) – service id

  • service_name (str) – service name

Example:

package.services.open_in_web(service_id='service_id')
pause(service_name: Optional[str] = None, service_id: Optional[str] = None, force: bool = False)[source]

Pause service.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

You must provide at least ONE of the following params: service_id, service_name

Parameters
  • service_name (str) – service name

  • service_id (str) – service id

  • force (bool) – optional - terminate old replicas immediately

Returns

True if success

Return type

bool

Example:

package.services.pause(service_id='service_id')
resume(service_name: Optional[str] = None, service_id: Optional[str] = None, force: bool = False)[source]

Resume service.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

You must provide at least ONE of the following params: service_id, service_name.

Parameters
  • service_name (str) – service name

  • service_id (str) – service id

  • force (bool) – optional - terminate old replicas immediately

Returns

json of the service

Return type

dict

Example:

package.services.resume(service_id='service_id')
revisions(service: Optional[Service] = None, service_id: Optional[str] = None)[source]

Get service revisions history.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

You must provide at least ONE of the following params: service, service_id

Parameters
  • service (dtlpy.entities.service.Service) – service entity

  • service_id (str) – service id

Example:

package.services.revisions(service_id='service_id')
status(service_name=None, service_id=None)[source]

Get service status.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

You must provide at least ONE of the following params: service_id, service_name

Parameters
  • service_name (str) – service name

  • service_id (str) – service id

Returns

status json

Return type

dict

Example:

package.services.status(service_id='service_id')
update(service: Service, force: bool = False) Service[source]

Update service changes to platform.

Prerequisites: You must be in the role of an owner or developer. You must have a package.

Parameters
  • service (dtlpy.entities.service.Service) – service entity

  • force (bool) – optional - if true, terminate old replicas immediately

Returns

Service entity

Return type

dtlpy.entities.service.Service

Example:

package.services.update(service='service_entity')

Bots

class Bots(client_api: ApiClient, project: Project)[source]

Bases: object

Bots Repository

The Bots class allows the user to manage bots and their properties. See our documentation for more information on bots.

create(name: str, return_credentials: bool = False)[source]

Create a new Bot.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • name (str) – bot name

  • return_credentials (bool) – if True, return the bot’s password upon creation

Returns

Bot object

Return type

dtlpy.entities.bot.Bot

Example:

service.bots.create(name='bot', return_credentials=False)
delete(bot_id: Optional[str] = None, bot_email: Optional[str] = None)[source]

Delete a Bot.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

You must provide at least ONE of the following params: bot_id, bot_email

Parameters
  • bot_id (str) – bot id to delete

  • bot_email (str) – bot email to delete

Returns

True if successful

Return type

bool

Example:

service.bots.delete(bot_id='bot_id')
get(bot_email: Optional[str] = None, bot_id: Optional[str] = None, bot_name: Optional[str] = None)[source]

Get a Bot object.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • bot_email (str) – get bot by email

  • bot_id (str) – get bot by id

  • bot_name (str) – get bot by name

Returns

Bot object

Return type

dtlpy.entities.bot.Bot

Example:

service.bots.get(bot_id='bot_id')
list() List[Bot][source]

Get a project or package bots list.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Returns

List of Bots objects

Return type

list

Example:

service.bots.list()

Triggers

class Triggers(client_api: ApiClient, project: Optional[Project] = None, service: Optional[Service] = None, project_id: Optional[str] = None, pipeline: Optional[Pipeline] = None)[source]

Bases: object

Triggers Repository

The Triggers class allows users to manage triggers and their properties. Triggers activate services. See our documentation for more information on triggers.

create(service_id: Optional[str] = None, trigger_type: TriggerType = TriggerType.EVENT, name: Optional[str] = None, webhook_id=None, function_name='run', project_id=None, active=True, filters=None, resource: TriggerResource = TriggerResource.ITEM, actions: Optional[TriggerAction] = None, execution_mode: TriggerExecutionMode = TriggerExecutionMode.ONCE, start_at=None, end_at=None, inputs=None, cron=None, pipeline_id=None, pipeline=None, pipeline_node_id=None, root_node_namespace=None, **kwargs) BaseTrigger[source]

Create a Trigger. Can create two types: a cron trigger or an event trigger. Inputs are different for each type.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Inputs for all types:

Parameters
  • service_id (str) – Id of services to be triggered

  • trigger_type (str) – can be cron or event. use enum dl.TriggerType for the full list

  • name (str) – name of the trigger

  • webhook_id (str) – id for webhook to be called

  • function_name (str) – the function name to be called when triggered (must be defined in the package)

  • project_id (str) – project id where trigger will work

  • active (bool) – optional - True/False, default = True, if true trigger is active

Inputs for event trigger:
  • filters (dtlpy.entities.filters.Filters) – optional - Item/Annotation metadata filters, default = none

  • resource (str) – optional - Dataset/Item/Annotation/ItemStatus, default = Item

  • actions (str) – optional - Created/Updated/Deleted, default = Created

  • execution_mode (str) – how many times the trigger should be activated; default is “Once”. enum dl.TriggerExecutionMode

Inputs for cron trigger:
  • start_at – iso format date string to start activating the cron trigger

  • end_at – iso format date string to end the cron activation

  • inputs – dictionary {“name”: ”val”} of inputs to the function

  • cron (str) – cron spec specifying when it should run. more information: https://en.wikipedia.org/wiki/Cron

  • pipeline_id (str) – id of the pipeline to be triggered

  • pipeline – pipeline entity to be triggered

  • pipeline_node_id (str) – id of the pipeline root node to be triggered

  • root_node_namespace – namespace of the pipeline root node to be triggered

Returns

Trigger entity

Return type

dtlpy.entities.trigger.Trigger

Example:

service.triggers.create(name='triggername',
                        execution_mode=dl.TriggerExecutionMode.ONCE,
                        resource='Item',
                        actions='Created',
                        function_name='run',
                        filters={'$and': [{'hidden': False},
                                          {'type': 'file'}]})
delete(trigger_id=None, trigger_name=None)[source]

Delete Trigger object

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • trigger_id (str) – trigger id

  • trigger_name (str) – trigger name

Returns

True if successful, error if not

Return type

bool

Example:

service.triggers.delete(trigger_id='trigger_id')
get(trigger_id=None, trigger_name=None) BaseTrigger[source]

Get Trigger object

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • trigger_id (str) – trigger id

  • trigger_name (str) – trigger name

Returns

Trigger entity

Return type

dtlpy.entities.trigger.Trigger

Example:

service.triggers.get(trigger_id='trigger_id')
list(filters: Optional[Filters] = None) PagedEntities[source]

List triggers of a project, package, or service.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters

filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

service.triggers.list()
name_validation(name: str)[source]

This method validates the trigger name. If the name is not valid, an error will be raised; otherwise, nothing is returned.

Parameters

name (str) – trigger name
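
Example (assuming a Triggers repository on a service entity, matching the other examples in this section):

service.triggers.name_validation(name='trigger_name')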

resource_information(resource, resource_type, action='Created')[source]

Returns which function should run on an item (based on global triggers).

Prerequisites: You must be a superuser to run this method.

Parameters
  • resource – ‘Item’ / ‘Dataset’ / etc

  • resource_type – dictionary of the resource object

  • action – ‘Created’ / ‘Updated’ / etc.

Example:

service.triggers.resource_information(resource='Item', resource_type=item_object, action='Created')
update(trigger: BaseTrigger) BaseTrigger[source]

Update trigger

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters

trigger (dtlpy.entities.trigger.Trigger) – Trigger entity

Returns

Trigger entity

Return type

dtlpy.entities.trigger.Trigger

Example:

service.triggers.update(trigger='trigger_entity')

Executions

class Executions(client_api: ApiClient, service: Optional[Service] = None, project: Optional[Project] = None)[source]

Bases: object

Service Executions Repository

The Executions class allows the users to manage executions (executions of services) and their properties. See our documentation for more information about executions.

create(service_id: Optional[str] = None, execution_input: Optional[list] = None, function_name: Optional[str] = None, resource: Optional[PackageInputType] = None, item_id: Optional[str] = None, dataset_id: Optional[str] = None, annotation_id: Optional[str] = None, project_id: Optional[str] = None, sync: bool = False, stream_logs: bool = False, return_output: bool = False, return_curl_only: bool = False, timeout: Optional[int] = None) Execution[source]

Execute a function on an existing service

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • service_id (str) – service id to execute on

  • execution_input (List[FunctionIO] or dict) – input dictionary or list of FunctionIO entities

  • function_name (str) – function name to run

  • resource (str) – input type.

  • item_id (str) – optional - item id as input to function

  • dataset_id (str) – optional - dataset id as input to function

  • annotation_id (str) – optional - annotation id as input to function

  • project_id (str) – resource’s project

  • sync (bool) – if true, wait for function to end

  • stream_logs (bool) – prints the logs of the new execution; only works with sync=True

  • return_output (bool) – if True and sync is True - will return the output directly

  • return_curl_only (bool) – return only the cURL of the creation request, without actually executing it

  • timeout (int) – seconds to wait until TimeoutError is raised; if <= 0, wait until done. By default, waits up to the service timeout

Returns

execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.create(function_name='function_name', item_id='item_id', project_id='project_id')
get(execution_id: Optional[str] = None, sync: bool = False) Execution[source]

Get Service execution object

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • execution_id (str) – execution id

  • sync (bool) – if true, wait for the execution to finish

Returns

Service execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.get(execution_id='execution_id')
increment(execution: Execution)[source]

Increment the number of attempts an execution is allowed to make when running on a service that is not responding.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters

execution (dtlpy.entities.execution.Execution) –

Returns

int

Return type

int

Example:

service.executions.increment(execution='execution_entity')
list(filters: Optional[Filters] = None) PagedEntities[source]

List service executions

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters

filters (dtlpy.entities.filters.Filters) – dl.Filters entity to filters items

Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

service.executions.list()
logs(execution_id: str, follow: bool = True, until_completed: bool = True)[source]

Get execution logs.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • execution_id (str) – execution id

  • follow (bool) – if true, keep streaming future logs

  • until_completed (bool) – if true, wait until completed

Returns

executions logs

Example:

service.executions.logs(execution_id='execution_id')
progress_update(execution_id: str, status: Optional[ExecutionStatus] = None, percent_complete: Optional[int] = None, message: Optional[str] = None, output: Optional[str] = None, service_version: Optional[str] = None)[source]

Update Execution Progress.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • execution_id (str) – execution id

  • status (str) – ExecutionStatus

  • percent_complete (int) – percent work done

  • message (str) – message

  • output (str) – the output of the execution

  • service_version (str) – service version

Returns

Service execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.progress_update(execution_id='execution_id', status='complete', percent_complete=100)
rerun(execution: Execution, sync: bool = False)[source]

Rerun execution

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • execution (dtlpy.entities.execution.Execution) – execution entity

  • sync (bool) – if true, wait for the execution to finish

Returns

Execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.rerun(execution='execution_entity')
terminate(execution: Execution)[source]

Terminate Execution

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters

execution (dtlpy.entities.execution.Execution) –

Returns

execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.terminate(execution='execution_entity')
update(execution: Execution) Execution[source]

Update execution changes to platform

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters

execution (dtlpy.entities.execution.Execution) – execution entity

Returns

Service execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.update(execution='execution_entity')
wait(execution_id: str, timeout: Optional[int] = None)[source]

Wait for an execution to finish and return the Service execution object.

Prerequisites: You must be in the role of an owner or developer. You must have a service.

Parameters
  • execution_id (str) – execution id

  • timeout (int) – seconds to wait until TimeoutError is raised; if <= 0, wait until done. By default, waits up to the service timeout

Returns

Service execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.executions.wait(execution_id='execution_id')

Pipelines

class Pipelines(client_api: ApiClient, project: Optional[Project] = None)[source]

Bases: object

Pipelines Repository

The Pipelines class allows users to manage pipelines and their properties. See our documentation for more information on pipelines.

create(name: Optional[str] = None, project_id: Optional[str] = None, pipeline_json: Optional[dict] = None) Pipeline[source]

Create a new pipeline.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • name (str) – pipeline name

  • project_id (str) – project id

  • pipeline_json (dict) – json containing the pipeline fields

Returns

Pipeline object

Return type

dtlpy.entities.pipeline.Pipeline

Example:

project.pipelines.create(name='pipeline_name')
delete(pipeline: Optional[Pipeline] = None, pipeline_name: Optional[str] = None, pipeline_id: Optional[str] = None)[source]

Delete Pipeline object.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity

  • pipeline_name (str) – pipeline name

  • pipeline_id (str) – pipeline id

Returns

True if success

Return type

bool

Example:

project.pipelines.delete(pipeline_id='pipeline_id')
execute(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None, execution_input=None)[source]

Execute a pipeline and return the pipeline execution as an object.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity

  • pipeline_id (str) – pipeline id

  • pipeline_name (str) – pipeline name

  • execution_input – list of the dl.FunctionIO or dict of pipeline input - example {‘item’: ‘item_id’}

Returns

entities.PipelineExecution object

Return type

dtlpy.entities.pipeline_execution.PipelineExecution

Example:

project.pipelines.execute(pipeline='pipeline_entity', execution_input= {'item': 'item_id'} )
get(pipeline_name=None, pipeline_id=None, fetch=None) Pipeline[source]

Get Pipeline object to use in your code.

Prerequisites: You must be an owner or developer to use this method.

You must provide at least ONE of the following params: pipeline_name, pipeline_id.

Parameters
  • pipeline_id (str) – pipeline id

  • pipeline_name (str) – pipeline name

  • fetch – optional - fetch entity from platform, default taken from cookie

Returns

Pipeline object

Return type

dtlpy.entities.pipeline.Pipeline

Example:

project.pipelines.get(pipeline_id='pipeline_id')
install(pipeline: Optional[Pipeline] = None)[source]

Install (start) a pipeline.

Prerequisites: You must be an owner or developer to use this method.

Parameters

pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity

Returns

Composition object

Example:

project.pipelines.install(pipeline='pipeline_entity')
list(filters: Optional[Filters] = None, project_id: Optional[str] = None) PagedEntities[source]

List project pipelines.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • project_id (str) – project id

Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

project.pipelines.list()
open_in_web(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None)[source]

Open the pipeline in web platform.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity

  • pipeline_id (str) – pipeline id

  • pipeline_name (str) – pipeline name

Example:

project.pipelines.open_in_web(pipeline_id='pipeline_id')
pause(pipeline: Optional[Pipeline] = None)[source]

Pause a pipeline.

Prerequisites: You must be an owner or developer to use this method.

Parameters

pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity

Returns

Composition object

Example:

project.pipelines.pause(pipeline='pipeline_entity')
reset(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None, stop_if_running: bool = False)[source]

Reset pipeline counters.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity - optional

  • pipeline_id (str) – pipeline_id - optional

  • pipeline_name (str) – pipeline_name - optional

  • stop_if_running (bool) – if True and the pipeline is installed, stop the pipeline and reset the counters

Returns

bool

Example:

project.pipelines.reset(pipeline='pipeline_entity')
stats(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None)[source]

Get pipeline counters.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity - optional

  • pipeline_id (str) – pipeline id - optional

  • pipeline_name (str) – pipeline name - optional

Returns

PipelineStats

Return type

dtlpy.entities.pipeline.PipelineStats

Example:

project.pipelines.stats(pipeline='pipeline_entity')
update(pipeline: Optional[Pipeline] = None) Pipeline[source]

Update pipeline changes to platform.

Prerequisites: You must be an owner or developer to use this method.

Parameters

pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity

Returns

Pipeline object

Return type

dtlpy.entities.pipeline.Pipeline

Example:

project.pipelines.update(pipeline='pipeline_entity')

Pipeline Executions

class PipelineExecutions(client_api: ApiClient, project: Optional[Project] = None, pipeline: Optional[Pipeline] = None)[source]

Bases: object

PipelineExecutions Repository

The PipelineExecutions class allows users to manage pipeline executions. See our documentation for more information on pipelines.

create(pipeline_id: Optional[str] = None, execution_input=None)[source]

Execute a pipeline and return the execution object.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline_id – pipeline id

  • execution_input – list of the dl.FunctionIO or dict of pipeline input - example {‘item’: ‘item_id’}

Returns

entities.PipelineExecution object

Return type

dtlpy.entities.pipeline_execution.PipelineExecution

Example:

pipeline.pipeline_executions.create(pipeline_id='pipeline_id', execution_input={'item': 'item_id'})
get(pipeline_execution_id: str, pipeline_id: Optional[str] = None) PipelineExecution[source]

Get Pipeline Execution object

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • pipeline_execution_id (str) – pipeline execution id

  • pipeline_id (str) – pipeline id

Returns

PipelineExecution object

Return type

dtlpy.entities.pipeline_execution.PipelineExecution

Example:

pipeline.pipeline_executions.get(pipeline_execution_id='pipeline_execution_id')
list(filters: Optional[Filters] = None) PagedEntities[source]

List project pipeline executions.

Prerequisites: You must be an owner or developer to use this method.

Parameters

filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

Returns

Paged entity

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

pipeline.pipeline_executions.list()

General Commands

class Commands(client_api: ApiClient)[source]

Bases: object

Service Commands repository

abort(command_id: str)[source]

Abort Command

Parameters

command_id (str) – command id

Returns

get(command_id: Optional[str] = None, url: Optional[str] = None) Command[source]

Get Service command object

Parameters
  • command_id (str) –

  • url (str) – command url

Returns

Command object

list()[source]

List of commands

Returns

list of commands

wait(command_id, timeout=0, step=None, url=None, backoff_factor=0.1)[source]

Wait for command to finish

backoff_factor: a backoff factor to apply between attempts after the second try: {backoff factor} * (2 ** ({number of total retries} - 1)) seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.0s, 0.2s, 0.4s, …] between retries. It will never sleep longer than 8 seconds.

Parameters
  • command_id (str) – Command id to wait for

  • timeout (int) – int, seconds to wait until TimeoutError is raised. if 0 - wait until done

  • step (int) – int, seconds between polling

  • url (str) – url to the command

  • backoff_factor (float) – A backoff factor to apply between attempts after the second try

Returns

Command object
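
Example (a minimal sketch; `commands` is assumed to be a Commands repository instance obtained from the client, and the command id is illustrative):

command = commands.get(command_id='command_id')
command = commands.wait(command_id='command_id', timeout=0)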

Download Commands

Upload Commands

Entities

Organization

class CacheAction(value)[source]

Bases: str, Enum

An enumeration.

class MemberOrgRole(value)[source]

Bases: str, Enum

An enumeration.

class Organization(members: list, groups: list, account: dict, created_at, updated_at, id, name, logo_url, plan, owner, created_by, client_api: ApiClient, repositories=NOTHING)[source]

Bases: BaseEntity

Organization entity

add_member(email, role: MemberOrgRole = MemberOrgRole.MEMBER)[source]

Add members to your organization. Read about members and groups in our documentation: https://dataloop.ai/docs/org-members-groups.

Prerequisites: To add members to an organization, you must be in the role of an “owner” in that organization.

Parameters
  • email (str) – the member’s email

  • role (str) – MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

Returns

True if successful or error if unsuccessful

Return type

bool
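
Example (a minimal sketch, assuming an organization entity retrieved earlier):

organization.add_member(email='user@email.com', role=dl.MemberOrgRole.MEMBER)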

cache_action(mode=CacheAction.APPLY, pod_type=PodType.SMALL)[source]

Apply or destroy the cache for the organization.

Parameters
  • mode (str) – dl.CacheAction.APPLY or dl.CacheAction.DESTROY

  • pod_type (dl.PodType) – dl.PodType.SMALL, dl.PodType.MEDIUM, dl.PodType.HIGH

Returns

True if success

Return type

bool
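
Example (a hypothetical sketch, assuming an organization entity; mode and pod type values follow the enums documented above):

organization.cache_action(mode=dl.CacheAction.APPLY, pod_type=dl.PodType.SMALL)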

delete_member(user_id: str, sure: bool = False, really: bool = False)[source]

Delete member from the Organization.

Prerequisites: Must be an organization “owner” to delete members.

Parameters
  • user_id (str) – user id

  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

True if successful, error if not

Return type

bool
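
Example (assuming an organization entity; the user id is illustrative):

organization.delete_member(user_id='user_id', sure=True, really=True)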

classmethod from_json(_json, client_api, is_fetched=True)[source]

Build an Organization entity object from a json

Parameters
  • is_fetched (bool) – is Entity fetched from Platform

  • _json (dict) – _json response from host

  • client_api (dl.ApiClient) – ApiClient entity

Returns

Organization object

Return type

dtlpy.entities.organization.Organization

list_groups()[source]

List all organization groups (groups that were created within the organization).

Prerequisites: You must be an organization “owner” to use this method.

Returns

groups list

Return type

list

list_members(role: Optional[MemberOrgRole] = None)[source]

List all organization members.

Prerequisites: You must be an organization “owner” to use this method.

Parameters

role (str) – MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

Returns

members list

Return type

list
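
Example (assuming an organization entity):

members = organization.list_members(role=dl.MemberOrgRole.MEMBER)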

open_in_web()[source]

Open the organization in web platform

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(plan: str)[source]

Update Organization.

Prerequisites: You must be an Organization superuser to update an organization.

Parameters

plan (str) – OrganizationsPlans.FREEMIUM, OrganizationsPlans.PREMIUM

Returns

organization object

update_member(email: str, role: MemberOrgRole = MemberOrgRole.MEMBER)[source]

Update member role.

Prerequisites: You must be an organization “owner” to update a member’s role.

Parameters
  • email (str) – the member’s email

  • role (str) – MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

Returns

json of the member fields

Return type

dict
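
Example (assuming an organization entity):

organization.update_member(email='user@email.com', role=dl.MemberOrgRole.ADMIN)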

class OrganizationsPlans(value)[source]

Bases: str, Enum

An enumeration.

class PodType(value)[source]

Bases: str, Enum

An enumeration.

Integration

class Integration(id, name, type, org, created_at, created_by, update_at, client_api: ApiClient, project=None)[source]

Bases: BaseEntity

Integration object

delete(sure: bool = False, really: bool = False) bool[source]

Delete the integration from the Organization

Parameters
  • sure (bool) – are you sure you want to delete?

  • really (bool) – really really?

Returns

True

Return type

bool

classmethod from_json(_json: dict, client_api: ApiClient, is_fetched=True)[source]

Build an Integration entity object from a json

Parameters
  • _json – _json response from host

  • client_api – ApiClient entity

  • is_fetched – is Entity fetched from Platform

Returns

Integration object

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(new_name: str)[source]

Update the integration’s name

Parameters

new_name (str) – new name
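
Example (assuming an integration entity; the name is illustrative):

integration.update(new_name='new_integration_name')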

Project

class MemberRole(value)[source]

Bases: str, Enum

An enumeration.

class Project(contributors, created_at, creator, id, name, org, updated_at, role, account, is_blocked, feature_constraints, client_api: ApiClient, repositories=NOTHING)[source]

Bases: BaseEntity

Project entity

add_member(email, role: MemberRole = MemberRole.DEVELOPER)[source]

Add a member to the project.

Parameters
  • email (str) – member email

  • role – The required role for the user. Use the enum dl.MemberRole

Returns

dict that represents the user

Return type

dict
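
Example (assuming a project entity):

project.add_member(email='user@email.com', role=dl.MemberRole.DEVELOPER)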

checkout()[source]

Checkout (switch) to a project to work on.

delete(sure=False, really=False)[source]

Delete the project forever!

Parameters
  • sure (bool) – Are you sure you want to delete?

  • really (bool) – Really really sure?

Returns

True if successful, error if not

Return type

bool
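
Example (assuming a project entity):

project.delete(sure=True, really=True)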

classmethod from_json(_json, client_api, is_fetched=True)[source]

Build a Project object from a json

Parameters
  • is_fetched (bool) – is Entity fetched from Platform

  • _json (dict) – _json response from host

  • client_api (dl.ApiClient) – ApiClient entity

Returns

Project object

Return type

dtlpy.entities.project.Project

list_members(role: Optional[MemberRole] = None)[source]

List the project members.

Parameters

role – The required role for the user. Use the enum dl.MemberRole

Returns

list of the project members

Return type

list

open_in_web()[source]

Open the project in web platform

remove_member(email)[source]

Remove a member from the project.

Parameters

email (str) – member email

Returns

dict that represents the user

Return type

dict
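
Example (assuming a project entity):

project.remove_member(email='user@email.com')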

to_json()[source]

Returns platform _json format of project object

Returns

platform json format of project object

Return type

dict

update(system_metadata=False)[source]

Update the project

Parameters

system_metadata (bool) – optional - True if you want to also update the system metadata

Returns

Project object

Return type

dtlpy.entities.project.Project
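
Example (assuming a project entity):

project.update(system_metadata=False)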

update_member(email, role: MemberRole = MemberRole.DEVELOPER)[source]

Update a member’s information/details in the project.

Parameters
  • email (str) – member email

  • role – The required role for the user. Use the enum dl.MemberRole

Returns

dict that represents the user

Return type

dict
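
Example (assuming a project entity; dl.MemberRole.ANNOTATOR is one of the enum values mentioned above):

project.update_member(email='user@email.com', role=dl.MemberRole.ANNOTATOR)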

User

class User(created_at, updated_at, name, last_name, username, avatar, email, role, type, org, id, project, client_api=None, users=None)[source]

Bases: BaseEntity

User entity

classmethod from_json(_json, project, client_api, users=None)[source]

Build a User entity object from a json

Parameters
  • _json (dict) – _json response from host

  • project (dtlpy.entities.project.Project) – project entity

  • client_api (dl.ApiClient) – ApiClient entity

  • users – optional - users repository

Returns

User object

Return type

dtlpy.entities.user.User

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

Dataset

class Dataset(id, url, name, annotated, creator, projects, items_count, metadata, directoryTree, export, expiration_options, index_driver, created_at, items_url, readable_type, access_level, driver, readonly, client_api: ApiClient, project=None, datasets=None, repositories=NOTHING, ontology_ids=None, labels=None, directory_tree=None, recipe=None, ontology=None)[source]

Bases: BaseEntity

Dataset object

add_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, recipe_id=None, ontology_id=None, icon_path=None)[source]

Add single label to dataset

Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.

Parameters
  • label_name (str) – str - label name

  • color (tuple) – RGB color of the annotation, e.g. (255,0,0) or ‘#ff0000’ for red

  • children – children (sub-labels). list of sub-labels of this current label, each value is either dict or dl.Label

  • attributes (list) – add attributes to the labels

  • display_label (str) – display name of the label

  • label (dtlpy.entities.label.Label) – label object

  • recipe_id (str) – optional recipe id

  • ontology_id (str) – optional ontology id

  • icon_path (str) – path to an image to be displayed on the label

Returns

label entity

Return type

dtlpy.entities.label.Label

Example:

dataset.add_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
add_labels(label_list, ontology_id=None, recipe_id=None)[source]

Add labels to dataset

Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.

Parameters
  • label_list (list) – a list of labels to add to the dataset’s ontology. each value should be a dict, dl.Label or a string

  • ontology_id (str) – optional ontology id

  • recipe_id (str) – optional recipe id

Returns

label entities

Example:

dataset.add_labels(label_list=label_list)
checkout()[source]

Checkout the dataset

clone(clone_name, filters=None, with_items_annotations=True, with_metadata=True, with_task_annotations_status=True)[source]

Clone dataset

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • clone_name (str) – new dataset name

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a query dict

  • with_items_annotations (bool) – clone all item’s annotations

  • with_metadata (bool) – clone metadata

  • with_task_annotations_status (bool) – clone task annotations status

Returns

dataset object

Return type

dtlpy.entities.dataset.Dataset

Example:

dataset.clone(clone_name='dataset_clone_name',
              with_metadata=True,
              with_items_annotations=False,
              with_task_annotations_status=False)
delete(sure=False, really=False)[source]

Delete a dataset forever!

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • sure (bool) – are you sure you want to delete?

  • really (bool) – really really?

Returns

True if success

Return type

bool

Example:

dataset.delete(sure=True, really=True)
delete_attributes(keys: list, recipe_id: Optional[str] = None, ontology_id: Optional[str] = None)[source]

Delete a bulk of attributes

Parameters
  • recipe_id (str) – recipe id

  • ontology_id (str) – ontology id

  • keys (list) – Keys of attributes to delete

Returns

True if success

Return type

bool
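
Example (a hypothetical sketch; the keys and recipe id are illustrative):

dataset.delete_attributes(keys=['attr_key_1', 'attr_key_2'], recipe_id='recipe_id')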

delete_labels(label_names)[source]

Delete labels from dataset’s ontologies

Prerequisites: You must be in the role of an owner or developer.

Parameters

label_names – label object/ label name / list of label objects / list of label names

Example:

dataset.delete_labels(label_names=['myLabel1', 'Mylabel2'])
download(filters=None, local_path=None, file_types=None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters=None, overwrite=False, to_items_folder=True, thickness=1, with_text=False, without_relative_path=None, alpha=1, export_version=ExportVersion.V1)[source]

Download dataset items by filters. Filters the dataset for items and saves them locally. Optionally, also download each item’s annotation, mask, instance, and image mask.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • local_path (str) – local folder or filename to save to.

  • file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]

  • annotation_options (list) – type of download annotations: list(dl.ViewAnnotationOptions)

  • annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download

  • overwrite (bool) – optional - default = False to overwrite the existing files

  • to_items_folder (bool) – Create ‘items’ folder and download items to it

  • thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1

  • with_text (bool) – optional - add text to annotations, default = False

  • without_relative_path (bool) – bool - download items without the relative path from platform

  • alpha (float) – opacity value [0 1], default 1

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

Returns

List of local_path per each downloaded item

Example:

dataset.download(local_path='local_path',
                 annotation_options=[dl.ViewAnnotationOptions.JSON, dl.ViewAnnotationOptions.MASK],
                 overwrite=False,
                 thickness=1,
                 with_text=False,
                 alpha=1)
download_annotations(local_path=None, filters=None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters=None, overwrite=False, thickness=1, with_text=False, remote_path=None, include_annotations_in_output=True, export_png_files=False, filter_output_annotations=False, alpha=1, export_version=ExportVersion.V1)[source]

Download dataset annotations by filters. Filters the dataset for items and downloads their annotations locally. Optionally, also download the mask, instance, and image mask of each item.

Prerequisites: You must be in the role of an owner or developer.

Parameters
  • local_path (str) – local folder or filename to save to.

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • annotation_options (list(dtlpy.entities.annotation.ViewAnnotationOptions)) – download annotations options: list(dl.ViewAnnotationOptions)

  • annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download

  • overwrite (bool) – optional - default = False

  • thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1

  • with_text (bool) – optional - add text to annotations, default = False

  • remote_path (str) – DEPRECATED and ignored

  • include_annotations_in_output (bool) – default = True; whether the export should contain annotations

  • export_png_files (bool) – default = False; if True, semantic annotations will be exported as PNG files

  • filter_output_annotations (bool) – default = False; for an export by filter, determines whether to filter out annotations

  • alpha (float) – opacity value [0 1], default 1

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

Returns

local_path of the directory containing all the downloaded items

Return type

str

Example:

dataset.download_annotations(local_path='local_path',
                             annotation_options=[dl.ViewAnnotationOptions.JSON, dl.ViewAnnotationOptions.MASK],
                             overwrite=False,
                             thickness=1,
                             with_text=False,
                             alpha=1)
classmethod from_json(project: Project, _json: dict, client_api: ApiClient, datasets=None, is_fetched=True)[source]

Build a Dataset entity object from a json

Parameters
  • project – dataset’s project

  • _json (dict) – _json response from host

  • client_api – ApiClient entity

  • datasets – Datasets repository

  • is_fetched (bool) – is Entity fetched from Platform

Returns

Dataset object

Return type

dtlpy.entities.dataset.Dataset

get_recipe_ids()[source]

Get dataset recipe Ids

Returns

list of recipe ids

Return type

list

open_in_web()[source]

Open the dataset in web platform

static serialize_labels(labels_dict)[source]

Convert hex color format to rgb

Parameters

labels_dict (dict) – dict of labels

Returns

dict of converted labels

set_readonly(state: bool)[source]

Set dataset readonly mode

Prerequisites: You must be in the role of an owner or developer.

Parameters

state (bool) – state

Example:

dataset.set_readonly(state=True)
switch_recipe(recipe_id=None, recipe=None)[source]

Switch the recipe linked to the dataset with the given one

Parameters
  • recipe_id (str) – recipe id

  • recipe (dtlpy.entities.recipe.Recipe) – recipe entity

Example:

dataset.switch_recipe(recipe_id='recipe_id')
sync(wait=True)[source]

Sync dataset with external storage

Prerequisites: You must be in the role of an owner or developer.

Parameters

wait (bool) – wait for the command to finish

Returns

True if success

Return type

bool

Example:

dataset.sync()
to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(system_metadata=False)[source]

Update dataset field

Prerequisites: You must be an owner or developer to use this method.

Parameters

system_metadata (bool) – bool - True if you want to also update the system metadata

Returns

Dataset object

Return type

dtlpy.entities.dataset.Dataset

Example:

dataset.update()
update_attributes(title: str, key: str, attribute_type, recipe_id: Optional[str] = None, ontology_id: Optional[str] = None, scope: Optional[list] = None, optional: Optional[bool] = None, values: Optional[list] = None, attribute_range=None)[source]

Add a new attribute or update it if it exists

Parameters
  • ontology_id (str) – ontology id

  • recipe_id (str) – recipe id

  • title (str) – attribute title

  • key (str) – the key of the attribute; must be unique

  • attribute_type (AttributesTypes) – dl.AttributesTypes - your attribute type

  • scope (list) – list of the labels, or * for all labels

  • optional (bool) – optional attribute

  • values (list) – list of the attribute values (for checkbox and radio button)

  • attribute_range (dict or AttributesRange) – dl.AttributesRange object

Returns

True if success

Return type

bool

Example:

dataset.update_attributes(ontology_id='ontology_id',
                          key='1',
                          title='checkbox',
                          attribute_type=dl.AttributesTypes.CHECKBOX,
                          values=[1,2,3])
update_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, recipe_id=None, ontology_id=None, upsert=False, icon_path=None)[source]

Update a single label in the dataset

Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.

Parameters
  • label_name (str) – str - label name

  • color (tuple) – color

  • children – children (sub labels)

  • attributes (list) – add attributes to the labels

  • display_label (str) – display name of the label

  • label (dtlpy.entities.label.Label) – label

  • recipe_id (str) – optional recipe id

  • ontology_id (str) – optional ontology id

  • icon_path (str) – path to an image to be displayed on the label

Returns

label entity

Return type

dtlpy.entities.label.Label

Example:

dataset.update_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
update_labels(label_list, ontology_id=None, recipe_id=None, upsert=False)[source]

Update labels in the dataset

Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.

Parameters
  • label_list (list) – label list

  • ontology_id (str) – optional ontology id

  • recipe_id (str) – optional recipe id

  • upsert (bool) – if True, the label will be added in case it does not exist

Returns

label entities

Return type

dtlpy.entities.label.Label

Example:

dataset.update_labels(label_list=label_list)
upload_annotations(local_path, filters=None, clean=False, remote_root_path='/', export_version=ExportVersion.V1)[source]

Upload annotations to dataset.

Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.

Parameters
  • local_path (str) – str - local folder where the annotation files are.

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • clean (bool) – bool - if True, removes the old annotations

  • remote_root_path (str) – str - the remote root path to match remote and local items

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

For example, if the item filepath is a/b/item and remote_root_path is /a, the start folder will be b instead of a.

Example:

dataset.upload_annotations(local_path='local_path',
                           clean=False,
                           export_version=dl.ExportVersion.V1)
class ExpirationOptions(item_max_days: Optional[int] = None)[source]

Bases: object

ExpirationOptions object

class IndexDriver(value)[source]

Bases: str, Enum

An enumeration.

Driver

class Driver(bucket_name, creator, allow_external_delete, allow_external_modification, created_at, region, path, type, integration_id, integration_type, metadata, name, id, client_api: ApiClient, repositories=NOTHING)[source]

Bases: BaseEntity

Driver entity

delete(sure=False, really=False)[source]

Delete a driver forever!

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • sure (bool) – are you sure you want to delete?

  • really (bool) – really really?

Returns

True if success

Return type

bool

Example:

driver.delete(sure=True, really=True)
classmethod from_json(_json, client_api, is_fetched=True)[source]

Build a Driver entity object from a json

Parameters
  • _json – _json response from host

  • client_api – ApiClient entity

  • is_fetched – is Entity fetched from Platform

Returns

Driver object

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

class ExternalStorage(value)[source]

Bases: str, Enum

An enumeration.

Item

class ExportMetadata(value)[source]

Bases: Enum

An enumeration.

class Item(annotations_link, dataset_url, thumbnail, created_at, dataset_id, annotated, metadata, filename, stream, name, type, url, id, hidden, dir, spec, creator, description, src_item, annotations_count, client_api: ApiClient, platform_dict, dataset, model, project, project_id, repositories=NOTHING)[source]

Bases: BaseEntity

Item object

clone(dst_dataset_id=None, remote_filepath=None, metadata=None, with_annotations=True, with_metadata=True, with_task_annotations_status=False, allow_many=False, wait=True)[source]

Clone item

Parameters
  • dst_dataset_id (str) – destination dataset id

  • remote_filepath (str) – complete filepath

  • metadata (dict) – new metadata to add

  • with_annotations (bool) – clone annotations

  • with_metadata (bool) – clone metadata

  • with_task_annotations_status (bool) – clone task annotations status

  • allow_many (bool) – if True, multiple clones of the item in a single dataset are allowed (default=False)

  • wait (bool) – wait for the command to finish

Returns

Item object

Return type

dtlpy.entities.item.Item

Example:

item.clone(dst_dataset_id='dst_dataset_id',
           with_metadata=True,
           with_task_annotations_status=False,
           with_annotations=False)
delete()[source]

Delete item from platform

Returns

True

Return type

bool
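
Example (assuming an item entity):

item.delete()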

download(local_path=None, file_types=None, save_locally=True, to_array=False, annotation_options: Optional[ViewAnnotationOptions] = None, overwrite=False, to_items_folder=True, thickness=1, with_text=False, annotation_filters=None, alpha=1, export_version=ExportVersion.V1)[source]

Download the item and save it locally. Optionally, also download the item’s annotation, mask, instance, and image mask.

Parameters
  • local_path (str) – local folder or filename to save to.

  • file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]

  • save_locally (bool) – bool. save to disk or return a buffer

  • to_array (bool) – returns Ndarray when True and local_path = False

  • annotation_options (list) – download annotations options: list(dl.ViewAnnotationOptions)

  • annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download

  • overwrite (bool) – optional - default = False

  • to_items_folder (bool) – Create ‘items’ folder and download items to it

  • thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1

  • with_text (bool) – optional - add text to annotations, default = False

  • alpha (float) – opacity value [0 1], default 1

  • export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames

Returns

generator of local_path per each downloaded item

Return type

generator or single item

Example:

item.download(local_path='local_path',
              annotation_options=dl.ViewAnnotationOptions.MASK,
              overwrite=False,
              thickness=1,
              with_text=False,
              alpha=1,
              save_locally=True)
classmethod from_json(_json, client_api, dataset=None, project=None, model=None, is_fetched=True)[source]

Build an item entity object from a json

Parameters
  • project (dtlpy.entities.project.Project) – project entity

  • _json (dict) – _json response from host

  • dataset (dtlpy.entities.dataset.Dataset) – dataset in which the annotation’s item is located

  • model (dtlpy.entities.dataset.Model) – the model entity if item is an artifact of a model

  • client_api (dlApiClient) – ApiClient entity

  • is_fetched (bool) – is Entity fetched from Platform

Returns

Item object

Return type

dtlpy.entities.item.Item

move(new_path)[source]

Move item from one folder to another in the platform. If the directory doesn’t exist, it will be created.

Parameters

new_path (str) – new full path to move item to.

Returns

True if updated successfully

Return type

bool
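
Example (assuming an item entity; the destination path is illustrative):

item.move(new_path='/new_folder/item_name.jpg')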

open_in_web()[source]

Open the item in web platform

set_description(text: str)[source]

Update Item description

Parameters

text (str) – if None or an empty string, the description will be deleted

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(system_metadata=False)[source]

Update items metadata

Parameters

system_metadata (bool) – bool - True if you want to also update the system metadata

Returns

Item object

Return type

dtlpy.entities.item.Item

update_status(status: str, clear: bool = False, assignment_id: Optional[str] = None, task_id: Optional[str] = None)[source]

Update item status.

Parameters
  • status (str) – “completed”, “approved”, “discard”

  • clear (bool) – if true delete status

  • assignment_id (str) – assignment id

  • task_id (str) – task id

Returns

True/False

Return type

bool

Example:

item.update_status(status='completed',
                   task_id='task_id')
class ItemStatus(value)[source]

Bases: str, Enum

An enumeration.

class ModalityRefTypeEnum(value)[source]

Bases: str, Enum

State enum

class ModalityTypeEnum(value)[source]

Bases: str, Enum

State enum

Annotation

class Annotation(id, url, item_url, item, item_id, creator, created_at, updated_by, updated_at, type, source, dataset_url, platform_dict, metadata, fps, hash=None, dataset_id=None, status=None, object_id=None, automated=None, item_height=None, item_width=None, label_suggestions=None, annotation_definition: Optional[BaseAnnotationDefinition] = None, frames=None, current_frame=0, end_frame=0, end_time=0, start_frame=0, start_time=0, dataset=None, datasets=None, annotations=None, Annotation__client_api=None, items=None, recipe_2_attributes=None)[source]

Bases: BaseEntity

Annotations object

add_frame(annotation_definition, frame_num=None, fixed=True, object_visible=True)[source]

Add a frame state to annotation

Parameters
  • annotation_definition – annotation type object - must be same type as annotation

  • frame_num (int) – frame number

  • fixed (bool) – is fixed

  • object_visible (bool) – whether the annotated object is visible

Returns

True if success

Return type

bool

Example:

annotation.add_frame(frame_num=10,
                     annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100, label='labelName'))
add_frames(annotation_definition, frame_num=None, end_frame_num=None, start_time=None, end_time=None, fixed=True, object_visible=True)[source]

Add a range of frame states to an annotation

Prerequisites: Any user can upload annotations.

Parameters
  • annotation_definition – annotation type object - must be same type as annotation

  • frame_num (int) – first frame number

  • end_frame_num (int) – last frame number

  • start_time – starting time for video

  • end_time – ending time for video

  • fixed (bool) – is fixed

  • object_visible (bool) – whether the annotated object is visible

Returns

Example:

annotation.add_frames(frame_num=10,
                      annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100, label='labelName'))
delete()[source]

Remove an annotation from item

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Returns

True if success

Return type

bool

Example:

annotation.delete()
download(filepath: str, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.JSON, height: Optional[float] = None, width: Optional[float] = None, thickness: int = 1, with_text: bool = False, alpha: float = 1)[source]

Save annotation to file

Prerequisites: Any user can upload annotations.

Parameters
  • filepath (str) – local path to where annotation will be downloaded to

  • annotation_format (list) – options: list(dl.ViewAnnotationOptions)

  • height (float) – image height

  • width (float) – image width

  • thickness (int) – line thickness

  • with_text (bool) – get mask with text

  • alpha (float) – opacity value [0 1], default 1

Returns

filepath

Return type

str

Example:

annotation.download(filepath='filepath', annotation_format=dl.ViewAnnotationOptions.MASK)
classmethod from_json(_json, item=None, client_api=None, annotations=None, is_video=None, fps=None, item_metadata=None, dataset=None, is_audio=None)[source]

Create an annotation object from platform json

Parameters
  • _json (dict) – platform json

  • item (dtlpy.entities.item.Item) – item

  • client_api – ApiClient entity

  • annotations

  • is_video (bool) – is video

  • fps – video fps

  • item_metadata – item metadata

  • dataset – dataset entity

  • is_audio (bool) – is audio

Returns

annotation object

Return type

dtlpy.entities.annotation.Annotation

classmethod new(item=None, annotation_definition=None, object_id=None, automated=True, metadata=None, frame_num=None, parent_id=None, start_time=None, item_height=None, item_width=None, end_time=None)[source]

Create a new annotation object

Prerequisites: Any user can upload annotations.

Parameters
  • item (dtlpy.entities.item.Item) – item to annotate

  • annotation_definition – annotation type object

  • object_id (str) – object_id

  • automated (bool) – is automated

  • metadata (dict) – metadata

  • frame_num (int) – optional - first frame number if video annotation

  • parent_id (str) – add parent annotation ID

  • start_time – optional - start time if video annotation

  • item_height (float) – annotation item’s height

  • item_width (float) – annotation item’s width

  • end_time – optional - end time if video annotation

Returns

annotation object

Return type

dtlpy.entities.annotation.Annotation

Example:

annotation = dl.Annotation.new(item=item,
                               annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100, label='labelName'))
set_frame(frame)[source]

Set annotation to frame state

Prerequisites: Any user can upload annotations.

Parameters

frame (int) – frame number

Returns

True if success

Return type

bool

Example:

annotation.set_frame(frame=10)
show(image=None, thickness=None, with_text=False, height=None, width=None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.MASK, color=None, label_instance_dict=None, alpha=1, frame_num=None)[source]

Draw the annotation on the image array and return it

Prerequisites: Any user can upload annotations.

Parameters
  • image – empty or image to draw on

  • thickness (int) – line thickness

  • with_text (bool) – add label to annotation

  • height (float) – height

  • width (float) – width

  • annotation_format (dl.ViewAnnotationOptions) – one of dl.ViewAnnotationOptions

  • color (tuple) – optional - color tuple

  • label_instance_dict – the instance labels

  • alpha (float) – opacity value [0 1], default 1

  • frame_num (int) – for video annotation, show a specific frame

Returns

list or single ndarray of the annotations

Example:

annotation.show(image=image,
                thickness=1,
                annotation_format=dl.ViewAnnotationOptions.MASK)
to_json()[source]

Convert annotation object to a platform json representation

Returns

platform json

Return type

dict

update(system_metadata=False)[source]

Update an existing annotation in host.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters

system_metadata – True, if you want to change the system metadata

Returns

Annotation object

Return type

dtlpy.entities.annotation.Annotation

Example:

annotation.update()
update_status(status: AnnotationStatus = AnnotationStatus.ISSUE)[source]

Set status on annotation

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager.

Parameters

status (str) – can be AnnotationStatus.ISSUE, AnnotationStatus.APPROVED, AnnotationStatus.REVIEW, AnnotationStatus.CLEAR

Returns

Annotation object

Return type

dtlpy.entities.annotation.Annotation

Example:

annotation.update_status(status=dl.AnnotationStatus.ISSUE)
upload()[source]

Create a new annotation in host

Prerequisites: Any user can upload annotations.

Returns

Annotation entity

Return type

dtlpy.entities.annotation.Annotation

class AnnotationStatus(value)[source]

Bases: str, Enum

An enumeration.

class AnnotationType(value)[source]

Bases: str, Enum

An enumeration.

class ExportVersion(value)[source]

Bases: str, Enum

An enumeration.

class FrameAnnotation(annotation, annotation_definition, frame_num, fixed, object_visible, recipe_2_attributes=None, interpolation=False)[source]

Bases: BaseEntity

FrameAnnotation object

classmethod from_snapshot(annotation, _json, fps)[source]

Create a new frame state for the annotation

Parameters
  • annotation – annotation

  • _json – annotation type object - must be same type as annotation

  • fps – frames per second

Returns

FrameAnnotation object

classmethod new(annotation, annotation_definition, frame_num, fixed, object_visible=True)[source]

Create a new frame state for the annotation

Parameters
  • annotation – annotation

  • annotation_definition – annotation type object - must be same type as annotation

  • frame_num – frame number

  • fixed – is fixed

  • object_visible – whether the annotated object is visible

Returns

FrameAnnotation object

show(**kwargs)[source]

Show annotation as ndarray

Parameters

kwargs – see annotation definition

Returns

ndarray of the annotation

class ViewAnnotationOptions(value)[source]

Bases: str, Enum

The Annotations file types to download (JSON, MASK, INSTANCE, ANNOTATION_ON_IMAGE, VTT, OBJECT_ID).

State – Description

JSON – Dataloop json format

MASK – PNG file that contains the drawn annotations

INSTANCE – An image file that contains 2D annotations

ANNOTATION_ON_IMAGE – The source image with the annotations drawn on it

VTT – A text file containing supplementary information about a web video

OBJECT_ID – An image file that contains 2D annotations

Collection of Annotation entities

class AnnotationCollection(item=None, annotations=NOTHING, dataset=None, colors=None)[source]

Bases: BaseEntity

Collection of Annotation entities

add(annotation_definition, object_id=None, frame_num=None, end_frame_num=None, start_time=None, end_time=None, automated=True, fixed=True, object_visible=True, metadata=None, parent_id=None, model_info=None)[source]

Add annotations to collection

Parameters
  • annotation_definition – dl.Polygon, dl.Segmentation, dl.Point, dl.Box etc

  • object_id – Object id (any id given by user). For video, required in order to match annotations between frames

  • frame_num – video only, frame number

  • end_frame_num – video only, the end frame of the annotation

  • start_time – video only, start time of the annotation

  • end_time – video only, end time of the annotation

  • automated

  • fixed – video only, mark frame as fixed

  • object_visible – video only, whether the annotated object is visible

  • metadata – optional - metadata dictionary for annotation

  • parent_id – set a parent for this annotation (parent annotation ID)

  • model_info – optional - set model on annotation {'name': '', 'confidence': 0}

Returns
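
A minimal usage sketch (assumes an existing item entity; the box coordinates and label are illustrative):

builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100, label='labelName'))
item.annotations.upload(builder)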

delete()[source]

Remove an annotation from item

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Returns

True if success

Return type

bool

Example:

builder.delete()
download(filepath, img_filepath=None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.JSON, height=None, width=None, thickness=1, with_text=False, orientation=0, alpha=1)[source]

Save annotations to file

Prerequisites: Any user can upload annotations.

Parameters
  • filepath (str) – path to save annotation

  • img_filepath (str) – img file path - needed for img_mask

  • annotation_format (dl.ViewAnnotationOptions) – how to show the annotations. options: list(dl.ViewAnnotationOptions)

  • height (int) – height

  • width (int) – width

  • thickness (int) – thickness

  • with_text (bool) – add a text to the image

  • orientation (int) – the image orientation

  • alpha (float) – opacity value [0 1], default 1

Returns

file path of the downloaded annotation

Return type

str

Example:

builder.download(filepath='filepath', annotation_format=dl.ViewAnnotationOptions.MASK)
from_instance_mask(mask, instance_map=None)[source]

convert annotation from instance mask format

Parameters
  • mask – the mask annotation

  • instance_map – labels

classmethod from_json(_json: list, item=None, is_video=None, fps=25, height=None, width=None, client_api=None, is_audio=None)[source]

Create an annotation collection object from platform json

Parameters
  • _json (dict) – platform json

  • item (dtlpy.entities.item.Item) – item

  • client_api – ApiClient entity

  • is_video (bool) – is video

  • fps – video fps

  • height (float) – height

  • width (float) – width

  • is_audio (bool) – is audio

Returns

annotation collection object

Return type

dtlpy.entities.annotation_collection.AnnotationCollection

from_vtt_file(filepath)[source]

convert annotation from vtt format

Parameters

filepath (str) – path to the file

get_frame(frame_num)[source]

Get frame

Parameters

frame_num (int) – frame num

Returns

AnnotationCollection

print(to_return=False, columns=None)[source]
Parameters
  • to_return

  • columns

show(image=None, thickness=None, with_text=False, height=None, width=None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.MASK, label_instance_dict=None, color=None, alpha=1, frame_num=None)[source]

Show annotations according to annotation_format

Prerequisites: Any user can upload annotations.

Parameters
  • image (ndarray) – empty or image to draw on

  • height (int) – height

  • width (int) – width

  • thickness (int) – line thickness

  • with_text (bool) – add label to annotation

  • annotation_format (dl.ViewAnnotationOptions) – how to show the annotations. options: list(dl.ViewAnnotationOptions)

  • label_instance_dict (dict) – instance label map {‘Label’: 1, ‘More’: 2}

  • color (tuple) – optional - color tuple

  • alpha (float) – opacity value [0 1], default 1

  • frame_num (int) – for video annotation, show specific frame

Returns

ndarray of the annotations

Example:

builder.show(image=image,
             thickness=1,
             annotation_format=dl.ViewAnnotationOptions.MASK)
to_json()[source]

Convert annotation object to a platform json representation

Returns

platform json

Return type

dict

update(system_metadata=True)[source]

Update an existing annotation in host.

Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.

Parameters

system_metadata – True, if you want to change the system metadata

Returns

Annotation object

Return type

dtlpy.entities.annotation.Annotation

Example:

builder.update()
upload()[source]

Create a new annotation in host

Prerequisites: Any user can upload annotations.

Returns

Annotation entity

Return type

dtlpy.entities.annotation.Annotation

Example:

builder.upload()

Annotation Definition

Box Annotation Definition
class Box(left=None, top=None, right=None, bottom=None, label=None, attributes=None, description=None, angle=None)[source]

Bases: BaseAnnotationDefinition

Box annotation object. Create a box using two points: "top", "left", "bottom", "right" (to form a box [(left, top), (right, bottom)]). For a rotated box, add the "angle".

classmethod from_segmentation(mask, label, attributes=None)[source]

Convert binary mask to Box annotations

Parameters
  • mask – binary mask (0,1)

  • label – annotation label

  • attributes – annotations list of attributes

Returns

List of Box annotations, one for each separated segmentation

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Classification Annotation Definition
class Classification(label, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Classification annotation object

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Cuboid Annotation Definition
class Cube(label, front_tl, front_tr, front_br, front_bl, back_tl, back_tr, back_br, back_bl, angle=None, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Cube annotation object

classmethod from_boxes_and_angle(front_left, front_top, front_right, front_bottom, back_left, back_top, back_right, back_bottom, label, angle=0, attributes=None)[source]

Create a cuboid from the given front and back boxes with an angle; the angle is calculated from the center of each box

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Item Description Definition
class Description(text, description=None)[source]

Bases: BaseAnnotationDefinition

Description annotation object

Ellipse Annotation Definition
class Ellipse(x, y, rx, ry, angle, label, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Ellipse annotation object

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Note Annotation Definition
class Message(msg_id: Optional[str] = None, creator: Optional[str] = None, msg_time=None, body: Optional[str] = None)[source]

Bases: object

Note message object

class Note(left, top, right, bottom, label, attributes=None, messages=None, status='issue', assignee=None, create_time=None, creator=None, description=None)[source]

Bases: Box

Note annotation object

Point Annotation Definition
class Point(x, y, label, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Point annotation object

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Polygon Annotation Definition
class Polygon(geo, label, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Polygon annotation object

classmethod from_segmentation(mask, label, attributes=None, epsilon=None, max_instances=1, min_area=0)[source]

Convert binary mask to Polygon

Parameters
  • mask – binary mask (0,1)

  • label – annotation label

  • attributes – annotations list of attributes

  • epsilon – from opencv: specifies the approximation accuracy. This is the maximum distance between the original curve and its approximation. If 0, all points are returned

  • max_instances – maximum number of instances to return. If None, all will be returned

  • min_area – remove polygons with area lower than this threshold (pixels)

Returns

Polygon annotation
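
A minimal usage sketch (assumes mask is a binary ndarray; the label is illustrative):

polygon = dl.Polygon.from_segmentation(mask=mask,
                                       label='labelName',
                                       max_instances=1)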

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Polyline Annotation Definition
class Polyline(geo, label, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Polyline annotation object

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Pose Annotation Definition
class Pose(label, template_id, instance_id=None, attributes=None, points=None, description=None)[source]

Bases: BaseAnnotationDefinition

Pose annotation object

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Segmentation Annotation Definition
class Segmentation(geo, label, attributes=None, description=None, color=None)[source]

Bases: BaseAnnotationDefinition

Segmentation annotation object

classmethod from_polygon(geo, label, shape, attributes=None)[source]
Parameters
  • geo – list of x,y coordinates of the polygon ([[x, y], [x, y], …])

  • label – annotation’s label

  • shape – image shape (h,w)

  • attributes

Returns
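
A minimal usage sketch (the polygon coordinates and image shape are illustrative):

segmentation = dl.Segmentation.from_polygon(geo=[[100, 50], [150, 80], [120, 130]],
                                            label='labelName',
                                            shape=(480, 640))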

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

to_box()[source]
Returns

List of Box annotations, one for each separated segmentation

Audio Annotation Definition
class Subtitle(text, label, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

Subtitle annotation object

Undefined Annotation Definition
class UndefinedAnnotationType(type, label, coordinates, attributes=None, description=None)[source]

Bases: BaseAnnotationDefinition

UndefinedAnnotationType annotation object

show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]

Show annotation as ndarray

Parameters
  • image – empty or image to draw on

  • thickness – line thickness

  • with_text – not required

  • height – item height

  • width – item width

  • annotation_format – options: list(dl.ViewAnnotationOptions)

  • color – color

  • alpha – opacity value [0 1], default 1

Returns

ndarray

Similarity

class Collection(type: CollectionTypes, name, items=None)[source]

Bases: object

Base Collection Entity

add(ref, type: SimilarityTypeEnum = SimilarityTypeEnum.ID)[source]

Add item to collection

Parameters
  • ref

  • type – url, id

pop(ref)[source]
Parameters

ref

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

class CollectionItem(type: SimilarityTypeEnum, ref)[source]

Bases: object

Base CollectionItem

class CollectionTypes(value)[source]

Bases: str, Enum

An enumeration.

class MultiView(name, items=None)[source]

Bases: Collection

Multi Entity

property items

list of the collection items

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

class MultiViewItem(type, ref)[source]

Bases: CollectionItem

Single multi view item

class Similarity(ref, name=None, items=None)[source]

Bases: Collection

Similarity Entity

property items

list of the collection items

property target

Target item for similarity

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

class SimilarityItem(type, ref, target=False)[source]

Bases: CollectionItem

Single similarity item

class SimilarityTypeEnum(value)[source]

Bases: str, Enum

State enum

Filter

class Filters(field=None, values=None, operator: Optional[FiltersOperations] = None, method: Optional[FiltersMethod] = None, custom_filter=None, resource: FiltersResource = FiltersResource.ITEM, use_defaults=True, context=None, page_size=None)[source]

Bases: object

Filters entity to filter items from pages in platform
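
A minimal usage sketch (assumes an existing dataset entity; the folder value is illustrative):

filters = dl.Filters()
filters.add(field='dir', values='/folder')
pages = dataset.items.list(filters=filters)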

add(field, values, operator: Optional[FiltersOperations] = None, method: Optional[FiltersMethod] = None)[source]

Add filter

Parameters
  • field (str) – Metadata field / attribute

  • values – field values

  • operator (dl.FiltersOperations) – optional - in, gt, lt, eq, ne

  • method (dl.FiltersMethod) – Optional - or/and

Example:

filter.add(field='metadata.user', values=['1','2'], operator=dl.FiltersOperations.IN)
add_join(field, values, operator: Optional[FiltersOperations] = None, method: FiltersMethod = FiltersMethod.AND)[source]

join a query to the filter

Parameters
  • field (str) – Metadata field / attribute

  • values (str or list) – field values

  • operator (dl.FiltersOperations) – optional - in, gt, lt, eq, ne

  • method – optional - str - FiltersMethod.AND, FiltersMethod.OR

Example:

filter.add_join(field='metadata.user', values=['1','2'], operator=dl.FiltersOperations.IN)
generate_url_query_params(url)[source]

generate url query params

Parameters

url (str) –

has_field(field)[source]

Check whether the filter has a field

Parameters

field (str) – field to check

Returns

True if it has the field

Return type

bool

open_in_web(resource)[source]

Open the filter in the platform data browser (in a new web browser)

Parameters

resource (str) – dl entity to apply filter on. currently only supports dl.Dataset

platform_url(resource) str[source]

Build a url with filters param to open in web browser

Parameters

resource (str) – dl entity to apply filter on. currently only supports dl.Dataset

Returns

url string

Return type

str

pop(field)[source]

Pop field

Parameters

field (str) – field to pop

pop_join(field)[source]

Pop join

Parameters

field (str) – field to pop

prepare(operation=None, update=None, query_only=False, system_update=None, system_metadata=False)[source]

To dictionary for platform call

Parameters
  • operation (str) – operation

  • update – update

  • query_only (bool) – query only

  • system_update – system update

  • system_metadata – True, if you want to change the system metadata

Returns

dict of the filter

Return type

dict

sort_by(field, value: FiltersOrderByDirection = FiltersOrderByDirection.ASCENDING)[source]

sort the filter

Parameters
  • field (str) – field to sort by it

  • value (dl.FiltersOrderByDirection) – FiltersOrderByDirection.ASCENDING, FiltersOrderByDirection.DESCENDING

Example:

filter.sort_by(field='metadata.user', value=dl.FiltersOrderByDirection.ASCENDING)
class FiltersKnownFields(value)[source]

Bases: str, Enum

An enumeration.

class FiltersMethod(value)[source]

Bases: str, Enum

An enumeration.

class FiltersOperations(value)[source]

Bases: str, Enum

An enumeration.

class FiltersOrderByDirection(value)[source]

Bases: str, Enum

An enumeration.

class FiltersResource(value)[source]

Bases: str, Enum

An enumeration.

Recipe

class Recipe(id, creator, url, title, project_ids, description, ontology_ids, instructions, examples, custom_actions, metadata, ui_settings, client_api: ApiClient, dataset=None, project=None, repositories=NOTHING)[source]

Bases: BaseEntity

Recipe object

add_instruction(annotation_instruction_file)[source]

Add instruction to recipe

Parameters

annotation_instruction_file (str) – file path or url of the recipe instruction
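
A minimal usage sketch (the file path is illustrative):

recipe.add_instruction(annotation_instruction_file='/path/to/instructions.pdf')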

clone(shallow=False)[source]

Clone Recipe

Parameters

shallow (bool) – if True, link to the existing ontology; otherwise, clone all ontologies that are linked to the recipe as well

Returns

Cloned recipe object

Return type

dtlpy.entities.recipe.Recipe

delete(force: bool = False)[source]

Delete recipe from platform

Parameters

force (bool) – force delete recipe

Returns

True

Return type

bool

classmethod from_json(_json, client_api, dataset=None, project=None, is_fetched=True)[source]

Build a Recipe entity object from a json

Parameters
Returns

Recipe object

get_annotation_template_id(template_name)[source]

Get annotation template id by template name

Parameters

template_name (str) –

Returns

template id, or None if it does not exist

open_in_web()[source]

Open the recipes in web platform

Returns

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(system_metadata=False)[source]

Update Recipe

Parameters

system_metadata (bool) – True, if you want to change the system metadata

Returns

Recipe object

Return type

dtlpy.entities.recipe.Recipe

Ontology

class Ontology(client_api: ApiClient, id, creator, url, title, labels, metadata, attributes, recipe=None, dataset=None, project=None, repositories=NOTHING, instance_map=None, color_map=None)[source]

Bases: BaseEntity

Ontology object

add_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, add=True, icon_path=None, update_ontology=False)[source]

Add a single label to ontology

Parameters
  • label_name (str) – label name

  • color (tuple) – color

  • children – children (sub labels)

  • attributes (list) – attributes

  • display_label (str) – display_label

  • label (dtlpy.entities.label.Label) – label

  • add (bool) – to add or not

  • icon_path (str) – path to image to be displayed on the label

  • update_ontology (bool) – update the ontology, default = False for backward compatibility

Returns

Label entity

Return type

dtlpy.entities.label.Label

Example:

ontology.add_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
add_labels(label_list, update_ontology=False)[source]

Adds a list of labels to ontology

Parameters
  • label_list (list) – list of labels [{“value”: {“tag”: “tag”, “displayLabel”: “displayLabel”, “color”: “#color”, “attributes”: [attributes]}, “children”: [children]}]

  • update_ontology (bool) – update the ontology, default = False for backward compatibility

Returns

List of label entities added

Example:

ontology.add_labels(label_list=label_list)
property color_map

Color mapping of labels, {label: rgb}

Returns

dict

Return type

dict

delete()[source]

Delete ontology from platform

Returns

True

delete_attributes(keys: list)[source]

Delete a bulk of attributes

Parameters

keys (list) – Keys of attributes to delete

Returns

True if success

Return type

bool

Example:

ontology.delete_attributes(['1'])
delete_labels(label_names)[source]

Delete labels from ontology

Parameters

label_names – label object/ label name / list of label objects / list of label names

Returns

classmethod from_json(_json, client_api, recipe, dataset=None, project=None, is_fetched=True)[source]

Build an Ontology entity object from a json

Parameters
Returns

Ontology object

Return type

dtlpy.entities.ontology.Ontology

property instance_map

Instance mapping for creating instance masks

Returns

dictionary {label: map_id}

Return type

dict

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(system_metadata=False)[source]

Update the ontology

Parameters

system_metadata (bool) – True, if you want to change the system metadata

Returns

Ontology object

update_attributes(title: str, key: str, attribute_type, scope: Optional[list] = None, optional: Optional[bool] = None, values: Optional[list] = None, attribute_range=None)[source]

Add a new attribute or update it if it exists

Parameters
  • title (str) – attribute title

  • key (str) – the key of the attribute, must be unique

  • attribute_type (AttributesTypes) – dl.AttributesTypes your attribute type

  • scope (list) – list of the labels or * for all labels

  • optional (bool) – optional attribute

  • values (list) – list of the attribute values ( for checkbox and radio button)

  • attribute_range (dict or AttributesRange) – dl.AttributesRange object

Returns

True if success

Return type

bool
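
A minimal usage sketch (the attribute type, values and scope are illustrative):

ontology.update_attributes(title='Size',
                           key='size',
                           attribute_type=dl.AttributesTypes.CHECKBOX,
                           values=['small', 'big'],
                           scope=['*'])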

update_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, add=True, icon_path=None, upsert=False, update_ontology=False)[source]

Update a single label to ontology

Parameters
  • label_name (str) – label name

  • color (tuple) – color

  • children – children (sub labels)

  • attributes (list) – attributes

  • display_label (str) – display_label

  • label (dtlpy.entities.label.Label) – label

  • add (bool) – to add or not

  • icon_path (str) – path to image to be displayed on the label

  • update_ontology (bool) – update the ontology, default = False for backward compatibility

  • upsert (bool) – if True, the label will be added if it does not exist

Returns

Label entity

Return type

dtlpy.entities.label.Label

Example:

ontology.update_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
update_labels(label_list, upsert=False, update_ontology=False)[source]

Update a list of labels to ontology

Parameters
  • label_list (list) – list of labels [{“value”: {“tag”: “tag”, “displayLabel”: “displayLabel”, “color”: “#color”, “attributes”: [attributes]}, “children”: [children]}]

  • upsert (bool) – if True, the label will be added if it does not exist

  • update_ontology (bool) – update the ontology, default = False for backward compatibility

Returns

List of label entities added

Example:

ontology.update_labels(label_list=label_list)
Label

Task

class Task(name, status, project_id, metadata, id, url, task_owner, item_status, creator, due_date, dataset_id, spec, recipe_id, query, assignmentIds, annotation_status, progress, for_review, issues, updated_at, created_at, available_actions, total_items, priority, client_api, current_assignments=None, assignments=None, project=None, dataset=None, tasks=None, settings=None)[source]

Bases: object

Task object

add_items(filters=None, items=None, assignee_ids=None, workload=None, limit=None, wait=True, query=None)[source]

Add items to Task

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • items (list) – list of items (item Ids or objects) to add to the task

  • assignee_ids (list) – list of assignees who work on the task

  • workload (list) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit('annotator@hi.com', 80), dl.WorkloadUnit('annotator2@hi.com', 20)]

  • limit (int) – the maximum number of items the task can include

  • wait (bool) – wait until the add items operation finishes

  • query (dict) – query to filter the items for the task

Returns

task entity

Return type

dtlpy.entities.task.Task
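
A minimal usage sketch (assumes an existing item entity; the assignee email is illustrative):

task.add_items(items=[item],
               assignee_ids=['annotator1@dataloop.ai'])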

create_assignment(assignment_name, assignee_id, items=None, filters=None)[source]

Create a new assignment

Parameters
  • assignment_name (str) – assignment name

  • assignee_id (str) – the assignment assignees (contributors) that should be working on the task. Provide a user email

  • items (List[entities.Item]) – list of items (item Id or objects) to insert to the task

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

task.create_assignment(assignee_id='annotator1@dataloop.ai')
create_qa_task(due_date, assignee_ids, filters=None, items=None, query=None, workload=None, metadata=None, available_actions=None, wait=True, batch_size=None, max_batch_workload=None, allowed_assignees=None, priority=TaskPriority.MEDIUM)[source]

Create a new QA Task

Parameters
  • due_date (float) – date by which the QA task should be finished; for example, due_date=datetime.datetime(day=1, month=1, year=2029).timestamp()

  • assignee_ids (list) – list the QA task assignees (contributors) that should be working on the task. Provide a list of users’ emails

  • filters (entities.Filters) – dl.Filters entity to filter items for the task

  • items (List[entities.Item]) – list of items (item Id or objects) to insert to the task

  • query (dict DQL) – filter items for the task

  • workload (List[WorkloadUnit]) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit('annotator@hi.com', 80), dl.WorkloadUnit('annotator2@hi.com', 20)]

  • metadata (dict) – metadata for the task

  • available_actions (list) – list of available actions (statuses) that will be available for the task items; The default statuses are: “Approved” and “Discarded”

  • wait (bool) – wait until create task finish

  • batch_size (int) – Pulling batch size (items), use with pulling allocation method. Restrictions - Min 3, max 100

  • max_batch_workload (int) – Max items in assignment, use with pulling allocation method. Restrictions - Min batchSize + 2, max batchSize * 2

  • allowed_assignees (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails

  • priority (entities.TaskPriority) – priority of the task options in entities.TaskPriority

Returns

task object

Return type

dtlpy.entities.task.Task

Example:

task.create_qa_task(due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
                    assignee_ids=['annotator1@dataloop.ai', 'annotator2@dataloop.ai'])
delete(wait=True)[source]

Delete task from platform

Parameters

wait (bool) – wait until delete task finish

Returns

True

Return type

bool

classmethod from_json(_json, client_api, project=None, dataset=None)[source]

Return the task object from the json

Parameters
Returns

get_items(filters=None)[source]

Get the task items

Parameters

filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

Returns

list of the items or PagedEntity output of items

Return type

list or dtlpy.entities.paged_entities.PagedEntities

open_in_web()[source]

Open the task in web platform

Returns

remove_items(filters: Optional[Filters] = None, query=None, items=None, wait=True)[source]

remove items from Task.

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.

Parameters
  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters

  • query (dict) – query to filter the items

  • items (list) – list of items to remove from the task

  • wait (bool) – wait until remove items finish

Returns

True if success; raises an error if failed

Return type

bool

set_status(status: str, operation: str, item_ids: List[str])[source]

Update item status within task

Parameters
  • status (str) – string that describes the status

  • operation (str) – the status action: 'create' or 'delete'

  • item_ids (list) – list of item ids

Returns

True if success

Return type

bool
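
A minimal usage sketch (the item id is illustrative):

task.set_status(status='completed',
                operation='create',
                item_ids=['item_id'])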

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(system_metadata=False)[source]

Update an Annotation Task

Parameters

system_metadata (bool) – True, if you want to change the system metadata

class TaskPriority(value)[source]

Bases: int, Enum

An enumeration.

Assignment

class Assignment(name, annotator, status, project_id, metadata, id, url, task_id, dataset_id, annotation_status, item_status, total_items, for_review, issues, client_api, task=None, assignments=None, project=None, dataset=None, datasets=None)[source]

Bases: BaseEntity

Assignment object

get_items(dataset=None, filters=None)[source]

Get all the items in the assignment

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
Returns

pages of the items

Return type

dtlpy.entities.paged_entities.PagedEntities

Example:

task.assignments.get_items()
open_in_web()[source]

Open the assignment in web platform

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Returns

Example:

assignment.open_in_web()
reassign(assignee_id, wait=True)[source]

Reassign an assignment

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • assignee_id (str) – the email of the user to whom the assignment is reassigned

  • wait (bool) – wait until the reassign finishes

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

assignment.reassign(assignee_id='annotator1@dataloop.ai')
redistribute(workload, wait=True)[source]

Redistribute an assignment

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

assignment.redistribute(workload=dl.Workload([dl.WorkloadUnit(assignee_id="annotator1@dataloop.ai", load=50),
                                             dl.WorkloadUnit(assignee_id="annotator2@dataloop.ai", load=50)]))
set_status(status: str, operation: str, item_id: str)[source]

Set item status within assignment

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters
  • status (str) – string that describes the status

  • operation (str) – the status action: 'create' or 'delete'

  • item_id (str) – the id of the item whose status will be set

Returns

True if success

Return type

bool

Example:

assignment.set_status(status='completed',
                      operation='create',
                      item_id='item_id')
to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(system_metadata=False)[source]

Update an assignment

Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.

Parameters

system_metadata (bool) – True, if you want to change the system metadata

Returns

Assignment object

Return type

dtlpy.entities.assignment.Assignment

Example:

assignment.update(system_metadata=False)
class Workload(workload: list = NOTHING)[source]

Bases: object

Workload object

add(assignee_id)[source]

Add an assignee

Parameters

assignee_id

classmethod generate(assignee_ids, loads=None)[source]

Generate the loads for the given assignees

Parameters
  • assignee_ids

  • loads
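
A minimal usage sketch (the emails and loads are illustrative):

workload = dl.Workload.generate(assignee_ids=['annotator1@dataloop.ai', 'annotator2@dataloop.ai'],
                                loads=[60, 40])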

class WorkloadUnit(assignee_id: str, load: float = 0)[source]

Bases: object

WorkloadUnit object

Package

class Package(_dict)[source]

Bases: DlEntity

Package object

build(module_name=None, init_inputs=None, local_path=None, from_local=None)[source]

Instantiate a module from the package code. Returns a loaded instance of the runner class

Parameters
  • module_name – Name of the module to build the runner class

  • init_inputs (dict) – dictionary of the class init variables (if any); used to init the module class

  • local_path (str) – local path of the package (if from_local=False - codebase will be downloaded)

  • from_local (bool) – bool. if true - codebase will not be downloaded (only use local files)

Returns

dl.BaseServiceRunner
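
A minimal usage sketch (assumes the package codebase is available locally):

runner = package.build(module_name='default_module',
                       from_local=True,
                       local_path=os.getcwd())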

checkout()[source]

Checkout as package

Returns

delete() bool[source]

Delete Package object

Returns

True

deploy(service_name=None, revision=None, init_input=None, runtime=None, sdk_version=None, agent_versions=None, verify=True, bot=None, pod_type=None, module_name=None, run_execution_as_process=None, execution_timeout=None, drain_time=None, on_reset=None, max_attempts=None, force=False, secrets: Optional[list] = None, **kwargs)[source]

Deploy package

Parameters
  • service_name (str) – service name

  • revision (str) – package revision - default=latest

  • init_input – config to run at startup

  • runtime (dict) – runtime resources

  • sdk_version (str) – optional - sdk version

  • agent_versions (dict) – optional - versions of sdk, agent runner and agent proxy

  • bot (str) – bot email

  • pod_type (str) – pod type dl.InstanceCatalog

  • verify (bool) – verify the inputs

  • module_name (str) – module name

  • run_execution_as_process (bool) – run execution as process

  • execution_timeout (int) – execution timeout

  • drain_time (int) – drain time

  • on_reset (str) – on reset

  • max_attempts (int) – Maximum execution retries in-case of a service reset

  • force (bool) – optional - terminate old replicas immediately

  • secrets (list) – list of integration ids

Returns

Service object

Return type

dtlpy.entities.service.Service

Example:


service: dl.Service = package.deploy(service_name=package_name,
                                     execution_timeout=3 * 60 * 60,
                                     module_name=module.name,
                                     runtime=dl.KubernetesRuntime(concurrency=10,
                                                                  pod_type=dl.InstanceCatalog.REGULAR_S,
                                                                  autoscaler=dl.KubernetesRabbitmqAutoscaler(min_replicas=1,
                                                                                                             max_replicas=20,
                                                                                                             queue_length=20)))

classmethod from_json(_json, client_api, project, is_fetched=True)[source]

Turn platform representation of package into a package entity

Parameters
  • _json (dict) – platform representation of package

  • client_api (dl.ApiClient) – ApiClient entity

  • project (dtlpy.entities.project.Project) – project entity

  • is_fetched – is Entity fetched from Platform

Returns

Package entity

Return type

dtlpy.entities.package.Package

static get_ml_metadata(cls=None, available_methods=None, output_type=AnnotationType.CLASSIFICATION, input_type='image', default_configuration: Optional[dict] = None)[source]

Create ML metadata for the package

Parameters
  • cls – ModelAdapter class, to get the list of available_methods

  • available_methods – available user functions on the adapter. ['load', 'save', 'predict', 'train', 'evaluate']

  • output_type – annotation type the model creates, e.g. dl.AnnotationType.CLASSIFICATION

  • input_type – input file type the model gets, one of ['image', 'video', 'txt']

  • default_configuration – default service configuration for the deployed services

Returns

open_in_web()[source]

Open the package in web platform

pull(version=None, local_path=None) str[source]

Pull local package

Parameters
  • version (str) – version

  • local_path (str) – local path

Example:

path = package.pull(local_path='local_path')
push(codebase: Optional[Union[GitCodebase, ItemCodebase]] = None, src_path: Optional[str] = None, package_name: Optional[str] = None, modules: Optional[list] = None, checkout: bool = False, revision_increment: Optional[str] = None, service_update: bool = False, service_config: Optional[dict] = None, package_type='faas')[source]

Push local package

Parameters
  • codebase (dtlpy.entities.codebase.Codebase) – PackageCode object - defines how to store the package code

  • checkout (bool) – save package to local checkout

  • src_path (str) – location of package codebase folder to zip

  • package_name (str) – name of package

  • modules (list) – list of PackageModule

  • revision_increment (str) – optional - str - version bumping method - major/minor/patch - default = None

  • service_update (bool) – optional - bool - update the service

  • service_config (dict) – Service object as dict. Contains the spec of the default service to create.

  • package_type (str) – default is "faas", one of "app", "ml"

Returns

package entity

Return type

dtlpy.entities.package.Package

Example:

package = packages.push(package_name='package_name',
                        modules=[module],
                        src_path=os.getcwd())
test(cwd=None, concurrency=None, module_name='default_module', function_name='run', class_name='ServiceRunner', entry_point='main.py')[source]

Test local package in local environment.

Parameters
  • cwd (str) – path to the package folder

  • concurrency (int) – the concurrency of the test

  • module_name (str) – module name

  • function_name (str) – function name

  • class_name (str) – class name

  • entry_point (str) – the file to run like main.py

Returns

the output list created by the tested function

Return type

list

Example:

package.test(cwd='path_to_package',
            function_name='run')
to_json()[source]

Turn Package entity into a platform representation of Package

Returns

platform json of package

Return type

dict

update()[source]

Update Package changes to platform

Returns

Package entity

class RequirementOperator(value)[source]

Bases: str, Enum

An enumeration.

Package Function

class PackageFunction(outputs=NOTHING, name=NOTHING, description='', inputs=NOTHING, display_name=None, display_icon=None)[source]

Bases: BaseEntity

PackageFunction object

class PackageInputType(value)[source]

Bases: str, Enum

An enumeration.

Package Module

class PackageModule(name=NOTHING, init_inputs=NOTHING, entry_point='main.py', class_name='ServiceRunner', functions=NOTHING)[source]

Bases: BaseEntity

PackageModule object

add_function(function)[source]
Parameters

function

classmethod from_entry_point(entry_point)[source]

Create a dl.PackageModule entity using decorator on the service class.

Parameters

entry_point – path to the python file with the runner class (relative to the package path)

Returns

Slot

class PackageSlot(module_name='default_module', function_name='run', display_name=None, display_scopes: Optional[list] = None, display_icon=None, post_action: SlotPostAction = NOTHING, default_inputs: Optional[list] = None, input_options: Optional[list] = None)[source]

Bases: BaseEntity

PackageSlot object

class SlotDisplayScopeResource(value)[source]

Bases: str, Enum

An enumeration.

class SlotPostActionType(value)[source]

Bases: str, Enum

An enumeration.

class UiBindingPanel(value)[source]

Bases: str, Enum

An enumeration.

Codebase

Service

class InstanceCatalog(value)[source]

Bases: str, Enum

The Service Pod size.

State – Description

REGULAR_XS – regular pod with extra small size

REGULAR_S – regular pod with small size

REGULAR_M – regular pod with medium size

REGULAR_L – regular pod with large size

REGULAR_XL – regular pod with extra large size

HIGHMEM_XS – highmem pod with extra small size

HIGHMEM_S – highmem pod with small size

HIGHMEM_M – highmem pod with medium size

HIGHMEM_L – highmem pod with large size

HIGHMEM_XL – highmem pod with extra large size

GPU_K80_S – GPU pod with small size

GPU_K80_M – GPU pod with medium size

class KubernetesAutuscalerType(value)[source]

Bases: str, Enum

The Service Autoscaler Type (RABBITMQ, CPU).

State – Description

RABBITMQ – Service autoscaler will be based on RabbitMQ

CPU – Service autoscaler will be based on local CPU

class OnResetAction(value)[source]

Bases: str, Enum

The Execution action when the service resets (RERUN, FAILED).

State – Description

RERUN – When the service resets, rerun the execution

FAILED – When the service resets, fail the execution

class RuntimeType(value)[source]

Bases: str, Enum

Service runtime (KUBERNETES).

State – Description

KUBERNETES – Service runs on Kubernetes

class Service(created_at, updated_at, creator, version, package_id, package_revision, bot, use_user_jwt, init_input, versions, module_name, name, url, id, active, driver_id, secrets, runtime: KubernetesRuntime, queue_length_limit, run_execution_as_process: bool, execution_timeout, drain_time, on_reset: OnResetAction, type: ServiceType, project_id, is_global, max_attempts, package, client_api: ApiClient, revisions=None, project=None, repositories=NOTHING)[source]

Bases: BaseEntity

Service object

activate_slots(project_id: Optional[str] = None, task_id: Optional[str] = None, dataset_id: Optional[str] = None, org_id: Optional[str] = None, user_email: Optional[str] = None, slots=None, role=None, prevent_override: bool = True, visible: bool = True, icon: str = 'fas fa-magic', **kwargs) object[source]

Activate service slots

Parameters
  • project_id (str) – project id

  • task_id (str) – task id

  • dataset_id (str) – dataset id

  • org_id (str) – org id

  • user_email (str) – user email

  • slots (list) – list of entities.PackageSlot

  • role (str) – user role: MemberOrgRole.ADMIN, MemberOrgRole.OWNER, MemberOrgRole.MEMBER

  • prevent_override (bool) – True to prevent override

  • visible (bool) – visible

  • icon (str) – icon

  • kwargs – all additional arguments

Returns

list of user setting for activated slots

Return type

list

Example:

service.activate_slots(project_id='project_id',
                        slots=List[entities.PackageSlot],
                        icon='fas fa-magic')
checkout()[source]

Checkout

Returns

delete()[source]

Delete Service object

Returns

True

Return type

bool

execute(execution_input=None, function_name=None, resource=None, item_id=None, dataset_id=None, annotation_id=None, project_id=None, sync=False, stream_logs=True, return_output=True)[source]

Execute a function on an existing service

Parameters
  • execution_input (List[FunctionIO] or dict) – input dictionary or list of FunctionIO entities

  • function_name (str) – function name to run

  • resource (str) – input type.

  • item_id (str) – optional - item id as input to function

  • dataset_id (str) – optional - dataset id as input to function

  • annotation_id (str) – optional - annotation id as input to function

  • project_id (str) – resource’s project

  • sync (bool) – if true, wait for function to end

  • stream_logs (bool) – prints logs of the new execution. only works with sync=True

  • return_output (bool) – if True and sync is True - will return the output directly

Returns

execution object

Return type

dtlpy.entities.execution.Execution

Example:

service.execute(function_name='function_name', item_id='item_id', project_id='project_id')
classmethod from_json(_json: dict, client_api: ApiClient, package=None, project=None, is_fetched=True)[source]

Build a service entity object from a json

Parameters
Returns

service object

Return type

dtlpy.entities.service.Service

log(size=None, checkpoint=None, start=None, end=None, follow=False, text=None, execution_id=None, function_name=None, replica_id=None, system=False, view=True, until_completed=True)[source]

Get service logs

Parameters
  • size (int) – size

  • checkpoint (dict) – the information from the last point checked in the service

  • start (str) – iso format time

  • end (str) – iso format time

  • follow (bool) – if true, keep streaming future logs

  • text (str) – text

  • execution_id (str) – execution id

  • function_name (str) – function name

  • replica_id (str) – replica id

  • system (bool) – system

  • view (bool) – if true, print out all the logs

  • until_completed (bool) – wait until completed

Returns

ServiceLog entity

Return type

ServiceLog

Example:

service.log()
open_in_web()[source]

Open the service in web platform

Returns

pause()[source]
Returns

resume()[source]
Returns

status()[source]

Get Service status

Returns

status json

Return type

dict

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update(force=False)[source]

Update Service changes to platform

Parameters

force (bool) – force update

Returns

Service entity

Return type

dtlpy.entities.service.Service

class ServiceType(value)[source]

Bases: str, Enum

The type of the service (SYSTEM).

State – Description

SYSTEM – Dataloop internal service

Bot

class Bot(created_at, updated_at, name, last_name, username, avatar, email, role, type, org, id, project, client_api=None, users=None, bots=None, password=None)[source]

Bases: User

Bot entity

delete()[source]

Delete the bot

Returns

True

Return type

bool

classmethod from_json(_json, project, client_api, bots=None)[source]

Build a Bot entity object from a json

Parameters
  • _json – _json response from host

  • project – project entity

  • client_api – ApiClient entity

  • bots – Bots repository

Returns

Bot object

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

Trigger

class BaseTrigger(id, url, created_at, updated_at, creator, name, active, type, scope, is_global, input, function_name, service_id, webhook_id, pipeline_id, special, project_id, spec, operation, service, project, client_api: ApiClient, op_type='service', repositories=NOTHING)[source]

Bases: BaseEntity

Trigger Entity

delete()[source]

Delete Trigger object

Returns

True

classmethod from_json(_json, client_api, project, service=None)[source]

Build a trigger entity object from a json

Parameters
Returns

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update()[source]

Update Trigger object

Returns

Trigger entity

class CronTrigger(id, url, created_at, updated_at, creator, name, active, type, scope, is_global, input, function_name, service_id, webhook_id, pipeline_id, special, project_id, spec, operation, service, project, client_api: ApiClient, op_type='service', repositories=NOTHING, start_at=None, end_at=None, cron=None)[source]

Bases: BaseTrigger

classmethod from_json(_json, client_api, project, service=None)[source]

Build a trigger entity object from a json

Parameters
  • _json – platform json

  • client_api – ApiClient entity

  • project – project entity

  • service – service entity

Returns

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

class Trigger(id, url, created_at, updated_at, creator, name, active, type, scope, is_global, input, function_name, service_id, webhook_id, pipeline_id, special, project_id, spec, operation, service, project, client_api: ApiClient, op_type='service', repositories=NOTHING, filters=None, execution_mode=TriggerExecutionMode.ONCE, actions=TriggerAction.CREATED, resource=TriggerResource.ITEM)[source]

Bases: BaseTrigger

Trigger Entity

classmethod from_json(_json, client_api, project, service=None)[source]

Build a trigger entity object from a json

Parameters
Returns

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

class TriggerAction(value)[source]

Bases: str, Enum

An enumeration.

class TriggerExecutionMode(value)[source]

Bases: str, Enum

An enumeration.

class TriggerResource(value)[source]

Bases: str, Enum

An enumeration.

class TriggerType(value)[source]

Bases: str, Enum

An enumeration.

Execution

class Execution(id, url, creator, created_at, updated_at, input, output, feedback_queue, status, status_log, sync_reply_to, latest_status, function_name, duration, attempts, max_attempts, to_terminate: bool, trigger_id, service_id, project_id, service_version, package_id, package_name, client_api: ApiClient, service, project=None, repositories=NOTHING, pipeline: Optional[dict] = None)[source]

Bases: BaseEntity

Service execution entity

classmethod from_json(_json, client_api, project=None, service=None, is_fetched=True)[source]
Parameters
increment()[source]

Increment attempts

Returns

logs(follow=False)[source]

Print logs for execution

Parameters

follow – keep streaming future logs

progress_update(status: Optional[ExecutionStatus] = None, percent_complete: Optional[int] = None, message: Optional[str] = None, output: Optional[str] = None, service_version: Optional[str] = None)[source]

Update Execution Progress

Parameters
  • status (str) – ExecutionStatus

  • percent_complete (int) – percent complete

  • message (str) – message to update the progress state

  • output (str) – output

  • service_version (str) – service version

Returns

Service execution object

rerun(sync: bool = False)[source]

Re-run

Returns

Execution object

terminate()[source]

Terminate execution

Returns

execution object

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

update()[source]

Update execution changes to platform

Returns

execution entity

wait()[source]

Wait for execution

Returns

Service execution object

class ExecutionStatus(value)[source]

Bases: str, Enum

An enumeration.
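For orientation, a minimal sketch of working with an execution; the service id, function name, and item id below are placeholders:

import dtlpy as dl

service = dl.services.get(service_id='my-service-id')
# create an execution of the function 'run' on a single item
execution = service.executions.create(function_name='run', item_id='my-item-id')
# block until the execution finishes, then inspect it
execution = execution.wait()
print(execution.latest_status)
execution.logs(follow=False)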

Pipeline

class Pipeline(id, name, creator, org_id, connections, created_at, updated_at, start_nodes, project_id, composition_id, url, preview, description, revisions, project, client_api: ApiClient, repositories=NOTHING)[source]

Bases: BaseEntity

Pipeline object

delete()[source]

Delete pipeline object

Returns

True

execute(execution_input=None)[source]

Execute a pipeline and return the pipeline execution.

Parameters

execution_input – list of dl.FunctionIO objects, or a dict of pipeline input, for example {‘item’: ‘item_id’}

Returns

entities.PipelineExecution object

classmethod from_json(_json, client_api, project, is_fetched=True)[source]

Turn platform representation of pipeline into a pipeline entity

Parameters
  • _json (dict) – platform representation of pipeline

  • client_api (dl.ApiClient) – ApiClient entity

  • project (dtlpy.entities.project.Project) – project entity

  • is_fetched (bool) – is Entity fetched from Platform

Returns

Pipeline entity

Return type

dtlpy.entities.pipeline.Pipeline

install()[source]

install pipeline

Returns

Composition entity

open_in_web()[source]

Open the pipeline in web platform

Returns

pause()[source]

pause pipeline

Returns

Composition entity

reset(stop_if_running: bool = False)[source]

Resets pipeline counters

Parameters

stop_if_running (bool) – If the pipeline is installed it will stop the pipeline and reset the counters.

Returns

bool

set_start_node(node: PipelineNode)[source]

Set the start node of the pipeline

Parameters

node (PipelineNode) – node to be the start node

stats()[source]

Get pipeline counters

Returns

PipelineStats

Return type

dtlpy.entities.pipeline.PipelineStats

to_json()[source]

Turn Pipeline entity into a platform representation of Pipeline

Returns

platform json of pipeline

Return type

dict

update()[source]

Update pipeline changes to platform

Returns

pipeline entity
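For orientation, a minimal sketch of fetching, installing, and executing a pipeline; the project name, pipeline name, and item id are placeholders:

import dtlpy as dl

project = dl.projects.get(project_name='My-First-Project')
pipeline = project.pipelines.get(pipeline_name='my-pipeline')
pipeline.install()
# execute with a dict input, as described in execute() above
pipeline_execution = pipeline.execute(execution_input={'item': 'my-item-id'})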

Pipeline Execution

class PipelineExecution(id, nodes, executions, status, created_at, updated_at, pipeline_id, max_attempts, pipeline, client_api: ApiClient, repositories=NOTHING)[source]

Bases: BaseEntity

Pipeline Execution object

classmethod from_json(_json, client_api, pipeline, is_fetched=True)[source]

Turn platform representation of pipeline_execution into a pipeline_execution entity

Parameters
  • _json (dict) – platform representation of pipeline execution

  • client_api (dl.ApiClient) – ApiClient entity

  • pipeline (dtlpy.entities.pipeline.Pipeline) – Pipeline entity

  • is_fetched (bool) – is Entity fetched from Platform

Returns

Pipeline execution entity

Return type

dtlpy.entities.pipeline_execution.PipelineExecution

to_json()[source]

Turn PipelineExecution entity into a platform representation of PipelineExecution

Returns

platform json of pipeline execution

Return type

dict

Other

Pages

class PagedEntities(client_api: ApiClient, page_offset, page_size, filters, items_repository, has_next_page=False, total_pages_count=0, items_count=0, service_id=None, project_id=None, order_by_type=None, order_by_direction=None, execution_status=None, execution_resource_type=None, execution_resource_id=None, execution_function_name=None, list_function=None, items=[])[source]

Bases: object

Pages object

get_page(page_offset=None, page_size=None)[source]

Get page

Parameters
  • page_offset – page offset

  • page_size – page size

go_to_page(page=0)[source]

Brings specified page of items from host

Parameters

page – page number

Returns

next_page()[source]

Brings the next page of items from host

Returns

prev_page()[source]

Brings the previous page of items from host

Returns

process_result(result)[source]
Parameters

result – json object

return_page(page_offset=None, page_size=None)[source]

Return page

Parameters
  • page_offset – page offset

  • page_size – page size
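For orientation, a minimal sketch of paging through items; the dataset id is a placeholder:

import dtlpy as dl

dataset = dl.datasets.get(dataset_id='my-dataset-id')
pages = dataset.items.list()
print('total items: {}, total pages: {}'.format(pages.items_count, pages.total_pages_count))
# iterate all items across all pages
for item in pages.all():
    print(item.name)
# or jump to a specific page
pages.go_to_page(page=0)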

Base Entity

class EntityScopeLevel(value)[source]

Bases: str, Enum

An enumeration.

Command

class Command(id, url, status, created_at, updated_at, type, progress, spec, error, client_api: ApiClient, repositories=NOTHING)[source]

Bases: BaseEntity

Command entity

abort()[source]

abort command

Returns

classmethod from_json(_json, client_api, is_fetched=True)[source]

Build a Command entity object from a json

Parameters
  • _json – _json response from host

  • client_api – ApiClient entity

  • is_fetched – is Entity fetched from Platform

Returns

Command object

in_progress()[source]

Check if command is still in one of the in progress statuses

Returns

True if command still in progress

Return type

bool

to_json()[source]

Returns platform _json format of object

Returns

platform json format of object

Return type

dict

wait(timeout=0, step=None, backoff_factor=0.1)[source]

Wait for Command to finish

Parameters
  • timeout (int) – seconds to wait until TimeoutError is raised; if 0, wait until done

  • step (int) – int, seconds between polling

  • backoff_factor (float) – A backoff factor to apply between attempts after the second try

Returns

Command object

class CommandsStatus(value)[source]

Bases: str, Enum

An enumeration.
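For orientation, a minimal sketch of waiting on a command; the command id is a placeholder (commands are returned by long-running operations such as dataset clone or export):

import dtlpy as dl

command = dl.commands.get(command_id='my-command-id')
if command.in_progress():
    # 0 - wait until done
    command = command.wait(timeout=0)
print(command.status, command.progress)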

Directory Tree

class DirectoryTree(_json)[source]

Bases: object

Dataset DirectoryTree

class SingleDirectory(value, directory_tree, children=None)[source]

Bases: object

DirectoryTree single directory

Utilities

converter

class Converter(concurrency=6, return_error_filepath=False)[source]

Bases: object

Annotation Converter

attach_agent_progress(progress: Progress, progress_update_frequency: Optional[int] = None)[source]

Attach agent progress.

Parameters
  • progress (Progress) – the progress object that follows the work

  • progress_update_frequency (int) – progress update frequency in percentages

convert(annotations, from_format: str, to_format: str, conversion_func=None, item=None)[source]

Convert annotation list or single annotation.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • item (dtlpy.entities.item.Item) – item entity

  • annotations (list or AnnotationCollection) – annotations list to convert

  • from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • conversion_func (Callable) – Custom conversion service

Returns

the annotations

convert_dataset(dataset, to_format: str, local_path: str, conversion_func=None, filters=None, annotation_filter=None)[source]

Convert entire dataset.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • dataset (dtlpy.entities.dataset.Dataset) – dataset entity

  • to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • local_path (str) – path to save the result to

  • conversion_func (Callable) – Custom conversion service

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filter parameters

  • annotation_filter (dtlpy.entities.filters.Filters) – Filter entity

Returns

the error log file path if there are errors, and the COCO json if the format is COCO

convert_directory(local_path: str, to_format: AnnotationFormat, from_format: AnnotationFormat, dataset, conversion_func=None)[source]

Convert annotation files in entire directory.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • local_path (str) – path to the directory

  • to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • dataset (dtlpy.entities.dataset.Dataset) – dataset entity

  • conversion_func (Callable) – Custom conversion service

Returns

the error log file path if there are errors

convert_file(to_format: str, from_format: str, file_path: str, save_locally: bool = False, save_to: Optional[str] = None, conversion_func=None, item=None, pbar=None, upload: bool = False, **_)[source]

Convert file containing annotations.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • file_path (str) – path of the file to convert

  • pbar (tqdm) – tqdm object that follows the work (progress bar)

  • upload (bool) – if True upload

  • save_locally (bool) – If True, save locally

  • save_to (str) – path to save the result to

  • conversion_func (Callable) – Custom conversion service

  • item (dtlpy.entities.item.Item) – item entity

Returns

annotation list, errors

static custom_format(annotation, conversion_func, i_annotation=None, annotations=None, from_format=None, item=None, **_)[source]

Custom convert function.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • item (dtlpy.entities.item.Item) – item entity

Returns

converted Annotation

from_coco(annotation, **kwargs)[source]

Convert from COCO format to DATALOOP format. Use this as conversion_func param for functions that ask for this param.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • annotation – annotations to convert

  • kwargs – additional params

Returns

converted Annotation entity

Return type

dtlpy.entities.annotation.Annotation

static from_voc(annotation, **_)[source]

Convert from VOC format to DATALOOP format. Use this as conversion_func for functions that ask for this param.

Prerequisites: You must be an owner or developer to use this method.

Parameters

annotation – annotations to convert

Returns

converted Annotation entity

Return type

dtlpy.entities.annotation.Annotation

from_yolo(annotation, item=None, **kwargs)[source]

Convert from YOLO format to DATALOOP format. Use this as conversion_func param for functions that ask for this param.

Prerequisites: You must be an owner or developer to use this method.

Parameters
Returns

converted Annotation entity

Return type

dtlpy.entities.annotation.Annotation

save_to_file(save_to, to_format, annotations, item=None)[source]

Save annotations to a file.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • save_to (str) – path to save the result to

  • to_format – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • annotations (list) – annotation list to convert

  • item (dtlpy.entities.item.Item) – item entity

static to_coco(annotation, item=None, **_)[source]

Convert from DATALOOP format to COCO format. Use this as conversion_func param for functions that ask for this param.

Prerequisites: You must be an owner or developer to use this method.

Parameters
Returns

converted Annotation

Return type

dict

static to_voc(annotation, item=None, **_)[source]

Convert from DATALOOP format to VOC format. Use this as conversion_func param for functions that ask for this param.

Prerequisites: You must be an owner or developer to use this method.

Parameters
Returns

converted Annotation

Return type

dict

to_yolo(annotation, item=None, **_)[source]

Convert from DATALOOP format to YOLO format. Use this as conversion_func param for functions that ask for this param.

Prerequisites: You must be an owner or developer to use this method.

Parameters
Returns

converted Annotation

Return type

tuple

upload_local_dataset(from_format: AnnotationFormat, dataset, local_items_path: Optional[str] = None, local_labels_path: Optional[str] = None, local_annotations_path: Optional[str] = None, only_bbox: bool = False, filters=None, remote_items=None)[source]

Convert and upload local dataset to dataloop platform.

Prerequisites: You must be an owner or developer to use this method.

Parameters
  • from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP

  • dataset (dtlpy.entities.dataset.Dataset) – dataset entity

  • local_items_path (str) – path to items to upload

  • local_annotations_path (str) – path to annotations to upload

  • local_labels_path (str) – path to labels to upload

  • only_bbox (bool) – only for coco datasets, if True upload only bbox

  • filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filter parameters

  • remote_items (list) – list of the items to upload

Returns

the error log file path if there are errors
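For orientation, a minimal sketch of converting a whole dataset with the Converter; the dataset id and output path are placeholders:

import dtlpy as dl

dataset = dl.datasets.get(dataset_id='my-dataset-id')
converter = dl.Converter()
converter.convert_dataset(dataset=dataset,
                          to_format=dl.AnnotationFormat.YOLO,
                          local_path=r'C:/converted_annotations')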

Tutorials

Getting Started With Dataloop SDK

Getting started with Dataloop SDK

Dataloop SDK Overview

Dataloop provides an end-to-end platform that supports the entire AI lifecycle, from development to production. By leveraging both a data management and annotation platform, deep learning data generation is streamlined, resulting in accelerated automated pipeline production and reduced engineering time and costs.

The Dataloop platform is built upon an extensive Python SDK that provides full control over your projects and code. It allows you to automate CRUD (Create, Read, Update, Delete) operations within the platform for:

  • Projects

  • Datasets

  • Items

  • Annotations

  • Metadata

About this guide

The Getting Started guide provides the developer with an efficient SDK on-boarding experience and covers the following:

  1. Installing the prerequisite software

  2. Login to the platform through SDK

  3. Create a project

  4. Get existing project

  5. Create Dataset

  6. Get Dataset

  7. Upload items

  8. Get items

  9. Annotate item (labels and classification)

  10. Upload annotation

  11. Filter items

  12. Working with Item Metadata

  13. Create Task

  14. Logout

Installing Prerequisite Software

The Dataloop SDK requires several prerequisite software packages to be installed on your system before it can be used.

:information_source: The scope of this guide does not cover detailed external software installation issues. Please use the provided software vendor website links for further installation information and troubleshooting related to your OS.

Python

Python 3.6 or later must be installed in order to use the SDK.

To download Python:
  1. Visit https://www.python.org/downloads/

  2. From the Downloads page, select your desired OS and proceed with the download.

  3. Once the download is complete, you can proceed to install the software.

Dataloop SDK Package
pip

The SDK package requires pip to be installed on your system. pip is the package installer for Python. If Python was installed from python.org as described above, pip should already be installed.

You can check if pip is installed on your system.

To verify if pip exists on your system:

Run the following from the Command Line:

pip --version

If pip isn’t already installed, you can bootstrap it from the standard library.

To bootstrap pip:

Run the following from the Command Line:

python3 -m ensurepip --default-pip
DTLPY Package

Once you have verified that pip is installed, the Dataloop SDK Package can be installed.

To install the Dataloop SDK Package:

Run the following from the Command Line:

pip install dtlpy

Once the SDK Package is successfully installed, a confirmation message is displayed:

Successfully installed dtlpy-1.64.9
SDK Login

Once the Dataloop SDK Package is installed, you can login to the SDK.

To log in to the Dataloop SDK:

  1. Open a Python Shell.

  2. Run the following Python command:

import dtlpy as dl
dl.login()

Login tokens expire after 24 hours; therefore, the following expression can be added before the login command:

if dl.token_expired():
    dl.login()

A web browser login screen is displayed:


  3. Enter your credentials, or alternatively log in using a Google account.

Once your credentials have been verified a confirmation message is displayed:


Machine-to-Machine Login

Long-running SDK jobs require API authentication. The M2M flow allows machines to obtain a valid, signed JWT (authentication token) and automatically refresh it, without the need for a web browser login.

M2M Login is recommended when you want to:

  • run commands on the platform without an ongoing internet connection

  • run API commands directly from an external system to Dataloop

:information_source: This can be done with your email and password (signup with a password), or using project bots (which is NOT in the scope of this tutorial).

dl.login_m2m(email=email, password=password)
Datasets

In Dataloop, a dataset is a collection of items (files), their respective metadata, and annotations. Datasets have a file system structure and are organized into folders and subfolders at multiple levels.

There are 3 types of datasets:

  1. Master - The original dataset which manages the actual binaries.

  2. Clone - Contains pointers to original files, which enables management of virtual items that do not replicate the binaries of the underlying storage once cloned or copied. When cloning a dataset, users can decide if the new copy will overwrite the original metadata and annotations.

  3. Merge - Several datasets can be merged into one, allowing multiple annotations to be combined into the same dataset.

Creating a New Dataset

Before a new dataset can be created, at least one project must exist.

To create a new project:

Run the following command to create a new project named My-First-Project:

project = dl.projects.create(project_name='My-First-Project')

The new project is created.

The new project must be selected prior to creating a new associated dataset.

To select the new project:

Run the following command to select the new project created in the above step:

project = dl.projects.get(project_name='My-First-Project')

The new project is selected.

A project can also be referenced in the above command via its unique project_id.

To select the new project using a project_id:

Run the following command to select the new project by referencing the project_id:

project = dl.projects.get(project_id='e4a5e5b3-a22a-4b59-9b76-30417a0859d9')

The new project is selected.

The new dataset can now be created.

To create a new dataset:

Run the following command to create a new dataset named My-First-Dataset associated with the project My-First-Project:

project.datasets.create(dataset_name='My-First-Dataset')

Confirmation of the successfully created dataset is displayed:

Dataset(id='632c24ae3444a86f029acb47', url='https://gate.dataloop.ai/api/v1/datasets/632c1194120a7571664d0de3', name='My-First-Dataset', creator='JohnDoe@gmail.com', items_count=0, expiration_options=None, index_driver='v1', created_at='2022-09-22T07:41:08.324Z')

:information_source: Your Dataset ID will differ from the example above.

Uploading items

Items (files) can be uploaded to datasets in a file system structure and are organized into folders and subfolders. Individual items or entire folders can be uploaded.

Before items can be uploaded, the dataset to which the items will be uploaded must be selected.

To select the dataset:

Run the following command to initialize a new instance (dataset) of the new dataset (My-First-Dataset) in order to upload items:

dataset = project.datasets.get(dataset_name='My-First-Dataset')

Confirmation of the new instance of the selected dataset is displayed:

Dataset(id='632c24ae3444a86f029acb47', url='https://gate.dataloop.ai/api/v1/datasets/632c1194120a7571664d0de3', name='My-First-Dataset', creator='JohnDoe@gmail.com', items_count=0, expiration_options=None, index_driver='v1', created_at='2022-09-22T07:41:08.324Z')

If the selected dataset does not exist the following error message is displayed:

dtlpy.exceptions.NotFound: ('404', "Dataset not found. Name: 'My-First-Dataset'")

Once the dataset instance has been successfully initialized, items can be uploaded.

The structure of the Upload Item Command is:

dataset.items.upload(local_path='/path/to/file.extension')

:information_source: Directory paths look different in Windows and Linux; Windows paths require an r prefix (raw string) at the beginning.

To upload an item to a dataset:
  1. Create a local directory in your file explorer. For this example, C:\UploadDemo is used.

  2. Run the following command to upload an image file from a local directory:

dataset.items.upload(local_path=r'C:\UploadDemo\test1.jpg')

:warning: Ensure the path and file exists before running the command.

Confirmation of the completed upload is displayed:

Upload Items: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.54it/s]
Item(dataset_url='https://gate.dataloop.ai/api/v1/datasets/632c24ae3444a86f029acb47', created_at='2022-09-22T10:18:03.000Z', dataset_id='632c24ae3444a86f029acb47', filename='/test1.jpg', name='test1.jpg', type='file', id='632dadf7b28a0c0da317dfc8', spec=None, creator='JohnDoe@gmail.com', _description=None, annotations_count=0)

The Item ID of the uploaded file is 632dadf7b28a0c0da317dfc8. This ID is used when Listing/Getting items (see Getting Items).

:information_source: Your Item ID will differ from the example above.

If the item to upload is not found, the following error message is displayed:

dtlpy.exceptions.NotFound: ('404', 'Unknown local path: C:\\UploadDemo\\test1.jpg')

:information_source: By default, files are uploaded to the root directory. Items can be uploaded to an existing folder within a dataset using the remote_path argument (Not in the scope of this guide).
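For reference only, a minimal sketch of such an upload; it assumes the local file exists, and the remote folder is created in the dataset if it does not exist:

dataset.items.upload(local_path=r'C:\UploadDemo\test1.jpg', remote_path='/folder_name')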

Exercise 1
  1. Write the commands to Upload a 2nd image (test2.jpg) file item to My-First-Dataset.

Getting Items

Items can be retrieved from a dataset individually using the item ID. Alternatively, all items can be retrieved using a loop.

Getting a Single Item

The command structure of Getting a Single Item is:

item = dataset.items.get(item_id='my_item_id')
item.print()
To get a single item:
  1. Run the following command to set an instance of a single item object (item_1) from the dataset (My-First-Dataset) by specifying an item ID:

item_1 = dataset.items.get(item_id='632c365b6002b1266e007830')

:information_source: Your Item ID will differ from the example above.

  2. Run the following command to print the specified item:

item_1.print()

The item details are displayed including the following:

  • Filename

  • Creator of the item

  • Created timestamp

  • Dataset ID of the item

Exercise 2
  1. Write the commands to print the details of the 2nd uploaded item (Test2). Name the item object item_2

:bulb: Remember: The ID of the item (Test2) must be identified first.

Getting All Items

All item details in a dataset can be printed using a loop.

To get all items:

Run the following command to loop through the dataset and print all item details:

pages = dataset.items.list()
for item in pages.all():
    item.print()

or:

pages = dataset.items.list()
for page in pages:
    for item in page:
        item.print()

All dataset item details are displayed.

Annotating Items

Dataset items are annotated using Labels. A Label is composed of various Label Settings and Instructions that are defined by a dataset’s Recipe. For example, an image or video file item can contain 1 label defined as a Classification to categorize the entire image and multiple other labels defined as Point Markers to identify specific objects in an image/video file item.

Classification

Classifications are used to categorize an entire image or scene (in the case of a video file). For example, a Classification label can be used to classify product images under categories, subcategories, and characteristics, such as men’s clothes, polo shirts, etc.

The SDK can add Classification labels to an Item using 2 steps.

  1. Adding a label to a dataset’s Recipe.

  2. Adding the label to an item as a Classification.

To Add a Classification Label to a Dataset Recipe:
  1. Run the following command to add a Label (Person) to the My-First-Dataset dataset recipe.

dataset.add_label(label_name='Person')

The label is created and its Properties are displayed.

[Label(tag='Person', display_data={}, color='#0214a7', display_label='Person', attributes=[], children=[])]
  2. Run the following commands to Annotate and Upload the label (Person) as a Classification to the item (item_1):

builder = item_1.annotations.builder()
builder.add(annotation_definition=dl.Classification(label='Person'))
item_1.annotations.upload(builder)

The label is annotated as a Classification to item_1.

Point Markers

A Point Marker is used to identify specific objects in an image or video item. For example, an image of a person’s face can contain multiple Point Marker labels specifying the person’s eyes, mouth, ears, etc.

Point Marker commands accept 2 coordinate input parameters (x,y) which specify where the label is plotted on the image.

The SDK can add Point Marker labels to an Item using 2 steps.

  1. Adding a label to a dataset’s Recipe.

  2. Adding the label to an item as a Point Marker.

To Add/Upload a Point Marker Label to a Dataset Recipe:
  1. Run the following command to add a Label (Ear) to the My-First-Dataset dataset recipe.

dataset.add_label(label_name='Ear')

The label is created and its Properties are displayed.

[Label(tag='Ear', display_data={}, color='#0214a7', display_label='Ear', attributes=[], children=[])]
  2. Run the following commands to Annotate and Upload the label (Ear) as 2 Point Markers to the item (item_1):

builder = item_1.annotations.builder()
builder.add(annotation_definition=dl.Point(x=80, y=80, label='Ear'))
builder.add(annotation_definition=dl.Point(x=120, y=120, label='Ear'))
item_1.annotations.upload(builder)

The label is annotated as 2 Point Markers to item_1.

:information_source: Other Label Types include Box, Cube, Polygon etc.
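For reference, a Box label can be added in the same two-step way; this sketch assumes the label ‘Person’ already exists in the Recipe (added earlier) and uses placeholder coordinates:

builder = item_1.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100, label='Person'))
item_1.annotations.upload(builder)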

Exercise 3
  1. Annotate 3 items (use item_2 from Exercise 2) with the Classification of ‘Face’.

  2. Annotate 2 random Point Marker annotations with the label ‘Eye’ to an item (use item_2 from Exercise 2).

:bulb: Remember: The label must first be added to the Recipe of the dataset.

Working with Filters

The SDK supports the filtering of item data. You can filter items by creating Filter Queries that define the Parameters of the filter. For example, you can create a Filter Query that filters item data on a specific field name, or by an item’s annotation label.

Multiple Parameters can be added to a Filter Query, for example, you can include a parameter that filters for all items that include Point Marker Annotation types that are Labelled as ’Ear’.

Creating Filters

The first step is to create a Filter Query.

To Create a Filter Query:
  1. Run the following command to create a Filter Query named my_filter:

my_filter = dl.Filters()

The Filter Query is created.

Once the Filter Query is created, Filter Parameters can be added.

To Add a Filter Parameter:
  1. Run the following command to add a Filter Parameter to my_filter that filters for all items that include Point Marker Annotation types:

my_filter.add_join(field='type', values='point')

The Filter Parameter is created.

:information_source: Other Fields can be used as Filter Parameters including id, dataset_id, etc.
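For example, a minimal sketch of filtering on the filename field; the folder path is a placeholder:

name_filter = dl.Filters()
# all items under the folder (recursive)
name_filter.add(field='filename', values='/folder_name/**')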

Additional Filter Parameters can be added to the Filter Query.

To add additional filter parameters:

  1. Run the following command to add an additional Filter Parameter to my_filter that filters for all items that include a Label value of ‘Ear’:

my_filter.add_join(field='label', values='Ear')

The Additional Filter Parameter is added.

The created Filter Query can be applied to the dataset and displayed.

To Apply the Filter Query:
  1. Run the following commands to Apply the Filter Query to the dataset and display the filtered item(s):

pages = dataset.items.list(filters=my_filter)
for item in pages.all():
    item.print()

The Filter Query is applied to the dataset and the filtered item(s) details are displayed:

Iterate Pages:   0%|                                                                                                                                                                        | 0/1 [00:00<?, ?it/s]Item(dataset_url='https://gate.dataloop.ai/api/v1/datasets/632c24ae3444a86f029acb47', created_at='2022-09-23T13:00:39.000Z', dataset_id='632c24ae3444a86f029acb47', filename='/test1.jpg', name='test1.jpg', type='file', id='632dadf7b28a0c0da317dfc8', spec=None, creator='JohnDoe@gmail.com', _description=None, annotations_count=7)
Iterate Pages: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 169.70it/s]
>>>
Using Filters to Replace Data

Filters can be used to Replace existing item data. For example, you can Create and Apply a Filter Query that returns a subset of item data that includes a particular Classification such as ‘Person’ and Replace it with another value, such as ‘Adult’, across the entire subset.

The first step is to Create a new Filter Query with a Filter Parameter that filters for all items that include a Label value of ‘Person’.

To Create the Replacement Filter Query:
  1. Run the following commands to create the Replacement Filter Query and Filter Parameter:

person_filter = dl.Filters(resource=dl.FILTERS_RESOURCE_ITEM)
person_filter.add_join(field='label', values='Person')

The Replacement Filter Query and Filter Parameter are created.

The new label can be added with the value ‘Adult’.

  2. Run the following commands to create the new label:

dataset.add_label(label_name='Adult')
pages = dataset.items.list(filters=person_filter)

The new label is created.

The existing label can be deleted and replaced with the new label.

  3. Run the following commands to delete the existing label and add the new label:

import dtlpy as dl

person_ann_filter = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
person_ann_filter.add(field='label', values='Person')

for item in pages.all():
    item.annotations.delete(filters=person_ann_filter)
    annotations = item.annotations.builder()
    annotations.add(annotation_definition=dl.Classification(label='Adult'))
    item.annotations.upload(annotations)

All instances of the old label are replaced in each item with the new label.

Exercise 4
  1. Create and Apply a Filter Query (use item_2 from Exercise 3) that filters items and returns all items that include Point Marker Annotations that are labeled ‘Eye’.

  2. Create and Apply a Filter Query (use item_2 from Exercise 3) that filters the items with the ‘Face’ classification, deletes the label, and replaces it with the label ‘Person’.

Working with Item Metadata

Metadata is a dictionary attribute used with items, annotations, and other entities of the Dataloop system, for example, Recipes.

You can add any keys and values to both item and annotation user metadata sections using the SDK. This user metadata can be used for data filtering, sorting, etc.
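For example, once a user metadata field exists (such as the dateTime field created below), items can be filtered on it; the value here is a placeholder:

filters = dl.Filters()
filters.add(field='metadata.user.dateTime', values='2022-09-22T10:18:03')  # exact-match value
pages = dataset.items.list(filters=filters)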

Adding a New User Metadata Field to an Item

The following example will demonstrate adding a new user metadata field named dateTime to the item named test1, which in this case has an item ID = 632dadf7b28a0c0da317dfc8

:information_source: Your Item ID will differ from the example above. See Get a Single Item.

The first step is to import the datetime module.

To Import the datetime Module:

Run the following commands to import the datetime module:

import datetime

The datetime module is imported.

An instance of item test1 can be created.

Create an instance of item test1 named item_1:

item_1 = dataset.items.get(item_id='632dadf7b28a0c0da317dfc8')

An instance of item test1 named item_1 is created.

The current date can be assigned to a new field in the item’s metadata named dateTime, and the item can be updated.

To Assign the Current Date to a New Metadata Field:

Run the following commands to assign the date to a new metadata field and update the item:

now = datetime.datetime.now().isoformat()
# create the 'user' metadata section on the item
item_1.metadata['user'] = dict()
# add the new field to the item's user metadata
item_1.metadata['user']['dateTime'] = now
# update the item
item_1 = item_1.update()

The date is assigned to the new metadata field and the item is updated.

Metadata fields can also be created for a subset of items at once using filters.

To Create Metadata fields for Multiple Items using Filters:

Run the following commands to create metadata fields for a subset of items that include the label ‘Person’ using a filter:

filters = dl.Filters()
filters.add_join(field='label', values='Person')
now = datetime.datetime.now().isoformat()
dataset.items.update(filters=filters, update_values={'user': {'dateTime': now}})

The date is assigned to the new metadata field and all items that include the label ‘Person’ are updated.

Exercise 5
  1. For the filtered items with the classification ‘Adult’ from Exercise 4, add a new field called ‘date’ in the item’s user metadata and assign it the current date.

Creating Tasks

A Task is used to initiate annotations. A Task requires defining the included data items, the assignee(s), and other options such as due date, etc.

To Create a Task

Run the following commands to create a Task containing items with the label ‘Person’ (from the previous example).

task = dataset.tasks.create(task_name='test',
                            due_date=datetime.datetime(day=15, month=7, year=2022).timestamp(),
                            assignee_ids=['JohnDoe@gmail.com'],
                            filters=filters)

The task is created.

Exercise 6
  1. Create a Task that contains the items from Exercise 5, i.e. all the items filtered for the classification ‘Adult’.

Logging out

To Logout:

Run the following command to log out of the SDK:

dl.logout()

Data Management Tutorial

Tutorials for data management

Cloud Storage

External Storage Dataset

If you already have your data managed and organized on a cloud storage service, such as GCS/S3/Azure, you may want to utilize it with Dataloop, and not upload the binaries and create duplicates.

Cloud Storage Integration

Access & Permissions - Creating an integration with GCS/S3/Azure cloud requires adding a key/secret with the following permissions:

  • List (Mandatory) - allows Dataloop to list all the items in the storage.

  • Get (Mandatory) - get the items and perform pre-process functionalities like thumbnails, item info, etc.

  • Put / Write (Mandatory) - lets you upload your items directly to the external storage from the Dataloop platform.

  • Delete - lets you delete your items directly from the external storage using the Dataloop platform.

Create Integration With GCS
Creating an integration with GCS requires a JSON file with the GCS configuration.
import json
import dtlpy as dl
if dl.token_expired():
    dl.login()
organization = dl.organizations.get(organization_name='my-org')
with open(r"C:\gcsfile.json", 'r') as f:
    gcs_json = json.load(f)
gcs_to_string = json.dumps(gcs_json)
organization.integrations.create(name='gcsintegration',
                                 integrations_type=dl.ExternalStorage.GCS,
                                 options={'key': '',
                                          'secret': '',
                                          'content': gcs_to_string})
Create Integration With Amazon S3
import dtlpy as dl
if dl.token_expired():
    dl.login()
organization = dl.organizations.get(organization_name='my-org')
organization.integrations.create(name='S3integration', integrations_type=dl.ExternalStorage.S3,
                                 options={'key': "my_key", 'secret': "my_secret"})
Create Integration With Azure
import dtlpy as dl
if dl.token_expired():
    dl.login()
organization = dl.organizations.get(organization_name='my-org')
organization.integrations.create(name='azureintegration',
                                 integrations_type=dl.ExternalStorage.AZUREBLOB,
                                 options={'key': 'my_key',
                                          'secret': 'my_secret',
                                          'clientId': 'my_clientId',
                                          'tenantId': 'my_tenantId'})
External Storage Driver

Once you have an integration, you can set up a driver, which adds a specific bucket (and optionally a specific path/folder) as a storage resource.

Create Drivers in the Platform (browser)
# param name: the driver name
# param driver_type: ExternalStorage.S3, ExternalStorage.GCS , ExternalStorage.AZUREBLOB
# param integration_id: the integration id
# param bucket_name: the external bucket name
# param project_id: the project id
# param allow_external_delete: true to allow deleting files in the external storage when they are deleted in Dataloop
# param region: relevant only for s3 - the bucket region
# param storage_class: relevant only for s3
# param path: Optional. By default, path is the root folder. Path is case sensitive.
# return: driver object
import dtlpy as dl
project = dl.projects.get('project_name')
driver = project.drivers.create(name='driver_name',
                                driver_type=dl.ExternalStorage.S3,
                                integration_id='integration_id',
                                bucket_name='bucket_name',
                                allow_external_delete=True,
                                region='eu-west-1',
                                storage_class="",
                                path="")

Once the integration and drivers are ready, you can create a Dataloop Dataset and sync all the data:

# create a dataset from a driver name, you can also create by the driver ID
import dtlpy as dl
project: dl.Project
dataset = project.datasets.create(dataset_name='my-dataset-name',
                                  driver=driver)
dataset.sync()

Dataset Binding with AWS

We will create an AWS Lambda to continuously sync a bucket with Dataloop’s dataset

If you want to catch events from the AWS bucket and update the Dataloop Dataset, you need to set up a Lambda. The Lambda will catch the AWS bucket events and reflect them into the Dataloop Platform.

We have prepared an environment zip file with our SDK for python3.8, so you don’t need to create anything else to use dtlpy in the lambda.

NOTE: For any other custom use (e.g. other python version or more packages) try creating your own layer (We used this tutorial and the python:3.8 docker image).

Create the Lambda
  1. Create a new Lambda

  2. The default timeout is 3[s], so we’ll need to change it to 1[m]: Configuration → General configuration → Edit → Timeout

  3. Copy the following code:

import os
import urllib.parse
# Set dataloop path to tmp (to read/write from the lambda)
os.environ["DATALOOP_PATH"] = "/tmp"
import dtlpy as dl
DATASET_ID = ''
DTLPY_USERNAME = ''
DTLPY_PASSWORD = ''
def lambda_handler(event, context):
    dl.login_m2m(email=DTLPY_USERNAME, password=DTLPY_PASSWORD)
    dataset = dl.datasets.get(dataset_id=DATASET_ID,
                              fetch=False  # to avoid GET the dataset each time
                              )
    for record in event['Records']:
        # Get the bucket name
        bucket = record['s3']['bucket']['name']
        # Get the file name
        filename = urllib.parse.unquote_plus(record['s3']['object']['key'], encoding='utf-8')
        if 'ObjectRemoved' in record['eventName']:
            # On delete event - delete the item from Dataloop
            try:
                dtlpy_filename = '/' + filename
                filters = dl.Filters(field='filename', values=dtlpy_filename)
                dataset.items.delete(filters=filters)
            except Exception as e:
                raise e
        elif 'ObjectCreated' in record['eventName']:
            # On create event - add a new item to the Dataset
            try:
                # upload the file
                path = 'external://' + filename
                # dataset.items.upload(local_path=path, overwrite=True) # if overwrite is required
                dataset.items.upload(local_path=path)
            except Exception as e:
                raise e
Add a Layer to the Lambda

We have created an AWS Layer with the Dataloop SDK ready. Click here to download the zip file. Because the layer’s size is larger than 50MB you cannot use it directly (an AWS restriction); you need to upload it to a bucket first. Once uploaded, create a new layer for the dtlpy env:

  1. Go to the layers screen and click “Add Layer”.

  2. Choose a name (dtlpy-env).

  3. Use the link to the bucket layer.zip.

  4. Select the env (x86_64, python3.8).

  5. Click “Create” at the bottom of the page.

Go back to your lambda and add the layer:

  1. Select “Add Layer”.

  2. Choose “Custom layer” and select the Layer you’ve added and the version.

  3. Click “Add” at the bottom.

Create the Bucket Events

Go to the bucket you are using, and create the event:

  1. Go to Properties → Event notifications → Create event notification

  2. Choose a name for the Event

  3. For Event types choose: All object create events, All object delete events

  4. Destination - Lambda function → Choose from your Lambda functions → choose the function you built → SAVE

Deploy and you’re good to go!

Dataset Binding with Azure

We will create an Azure Function App to continuously sync a blob with Dataloop’s dataset

If you want to catch events from the Azure blob and update the Dataloop Dataset, you need to set up a blob function. The function will catch the blob storage events and reflect them into the Dataloop Platform.

If you are familiar with Azure Function App, you can just use our integration function below.

We assume you already have an Azure account with resource group and storage account. If you don’t, follow the Azure docs and create them.

Create the Blob Function
  1. Create a Container in the created Storage account

    • Public access level -> Container OR Blob. NOTE: this container should be used as the external storage for the Dataloop dataset.

  2. Go back to Resource group and click Create -> Function App

    • Choose Subscription, your Resource group, Name and Region

    • Publish -> Code

    • Runtime stack -> Python

    • Version -> <=3.7

In VS Code, follow the instructions in the Azure docs to configure your environment and deploy the function:

  1. Configure your environment

  2. Sign in to Azure

  3. Create your local project

    • in Select a template for your project’s first function choose -> Azure Blob Storage trigger

    • in Storage account select your Storage account

    • in Resource group select your Resource group

    • Set the ‘Create new Azure Blob Storage trigger’ to your container name (used in the Dataloop platform)

    • open the code file

    • add dtlpy to the requirements.txt file

    • add “disabled”: false to the function.json file

    • add the function code to the __init__.py file

import azure.functions as func
import dtlpy as dl
import os
os.environ["DATALOOP_PATH"] = "/tmp"
dataset_id = os.environ.get('DATASET_ID')
dtlpy_username = os.environ.get('DTLPY_USERNAME')
dtlpy_password = os.environ.get('DTLPY_PASSWORD')
def main(myblob: func.InputStream):
    dl.login_m2m(email=dtlpy_username, password=dtlpy_password)
    dataset = dl.datasets.get(dataset_id=dataset_id,
                              fetch=False  # to avoid GET the dataset each time
                              )
    # remove the Container name from the path
    path_parser = myblob.name.split('/')
    file_name = '/'.join(path_parser[1:])
    file_name = 'external://' + file_name
    dataset.items.upload(local_path=file_name)
  1. Deploy the code to the function app you created.

  2. In VS code go to view tab -> Command Palette -> Azure Functions: Upload Local Settings

  3. Go to the Function App -> Select your function -> Configuration (under the Settings section) and add the 3 secret vars: DATASET_ID, DTLPY_USERNAME, DTLPY_PASSWORD

Done! Now your storage blob will be synced with the Dataloop dataset

Dataset Binding with Google Cloud Storage

We will create a GCS cloud function to continuously sync a bucket with Dataloop’s Dataset

If you want to catch events from the GCS bucket and update the Dataloop Dataset, you need to set up a Cloud function. The function will catch the GCS bucket events and reflect them into the Dataloop Platform.

Create the cloud function
  1. Create a cloud function for the create event (you must add the environment variables DATASET_ID, DTLPY_USERNAME and DTLPY_PASSWORD)

  2. Add dtlpy to the requirements.txt

  3. Copy the following code to the main file:

import dtlpy as dl
import os
dataset_id = os.environ.get('DATASET_ID')
dtlpy_username = os.environ.get('DTLPY_USERNAME')
dtlpy_password = os.environ.get('DTLPY_PASSWORD')
def create_gcs(event, context):
    """Triggered by a change to a Cloud Storage bucket.
    Args:
         event (dict): Event payload.
         context (google.cloud.functions.Context): Metadata for the event.
    """
    file = event
    dl.login_m2m(email=dtlpy_username, password=dtlpy_password)
    dataset = dl.datasets.get(dataset_id=dataset_id,
                              fetch=False  # to avoid GET the dataset each time
                              )
    file_name = 'external://' + file['name']
    dataset.items.upload(local_path=file_name)
  4. Create another function for the delete event, with this code:

import dtlpy as dl
import os
dataset_id = os.environ.get('DATASET_ID')
dtlpy_username = os.environ.get('DTLPY_USERNAME')
dtlpy_password = os.environ.get('DTLPY_PASSWORD')
def delete_gcs(event, context):
    """Triggered by a change to a Cloud Storage bucket.
    Args:
         event (dict): Event payload.
         context (google.cloud.functions.Context): Metadata for the event.
    """
    file = event
    dl.login_m2m(email=dtlpy_username, password=dtlpy_password)
    dataset = dl.datasets.get(dataset_id=dataset_id,
                              fetch=False  # to avoid GET the dataset each time
                              )
    file_name = file['name']
    dataset.items.delete(filename=file_name)

Deploy and you’re good to go!

Manage Datasets

Datasets are buckets in the Dataloop system that hold a collection of data items of any type, regardless of their storage location (on Dataloop storage or external cloud storage).

Create Dataset

You can create datasets within a project. There are no limits to the number of datasets a project can have, which correlates with data versioning where datasets can be cloned and merged.

dataset = project.datasets.create(dataset_name='my-dataset-name')
Create Dataset With Cloud Storage Driver

If you’ve created an integration and driver to your cloud storage, you can create a dataset connected to that driver. A single integration (for example: S3) can have multiple drivers (per bucket or even per folder), so you need to specify that.

project = dl.projects.get(project_name='my-project-name')
# Get your drivers list
project.drivers.list().print()
# Create a dataset from a driver name. You can also create by the driver ID.
dataset = project.datasets.create(driver='my_driver_name', dataset_name="my_dataset_name")
Retrieve Datasets

You can read all datasets that exist in a project, and then access the datasets by their ID (or name).

datasets = project.datasets.list()
dataset = project.datasets.get(dataset_id='my-dataset-id')
Create Directory

A dataset can have multiple directories, allowing you to manage files by context, such as upload time, working batch, source, etc.

dataset.items.make_dir(directory="/directory/name")
Deep Copy a Folder to Another Dataset

You can create a clone of a folder into a new dataset, but if you want to actually move a folder with files that are stored in the Dataloop system between datasets, you’ll need to download the files and upload them again to the destination dataset.

copy_annotations = True
flat_copy = False  # if true, it copies all dir files and sub dir files to the destination folder without sub directories
source_folder = '/source_folder'
destination_folder = '/destination_folder'
source_project_name = 'source_project_name'
source_dataset_name = 'source_dataset_name'
destination_project_name = 'destination_project_name'
destination_dataset_name = 'destination_dataset_name'
# Get source project dataset
project = dl.projects.get(project_name=source_project_name)
dataset_from = project.datasets.get(dataset_name=source_dataset_name)
source_folder = source_folder.rstrip('/')
# Filter to get all files of a specific folder
filters = dl.Filters()
filters.add(field='filename', values=source_folder + '/**')  # Get all items in folder (recursive)
pages = dataset_from.items.list(filters=filters)
# Get destination project and dataset
project = dl.projects.get(project_name=destination_project_name)
dataset_to = project.datasets.get(dataset_name=destination_dataset_name)
# Go over all pages and copy each file from src to dst
for page in pages:
    for item in page:
        # Download item (without save to disk)
        buffer = item.download(save_locally=False)
        # Give the item's name to the buffer
        if flat_copy:
            buffer.name = item.name
        else:
            buffer.name = item.filename[len(source_folder) + 1:]
        # Upload item
        print("Going to add {} to {} dir".format(buffer.name, destination_folder))
        new_item = dataset_to.items.upload(local_path=buffer, remote_path=destination_folder)
        if not isinstance(new_item, dl.Item):
            print('The file {} could not be uploaded to {}'.format(buffer.name, destination_folder))
            continue
        print("{} has been uploaded".format(new_item.filename))
        if copy_annotations:
            new_item.annotations.upload(item.annotations.list())

Data Versioning

Dataloop’s powerful data versioning provides you with unique tools for data management - clone, merge, slice & dice your files to create multiple versions for various applications. Sample use cases include:

  • Golden training sets management

  • Reproducibility (dataset training snapshot)

  • Experimentation (creating subsets of different kinds)

  • Task/Assignment management

  • Data Version “Snapshot” - use our versioning feature as a way to save data (items, annotations, metadata) before any major process. For example, a snapshot can serve as a roll-back mechanism to the original datasets in case of any error, without losing the data.

Clone Datasets

Cloning a dataset creates a new dataset with the same files as the original. Files are actually a reference to the original binary and not a new copy of the original, so your cloud data remains safe and protected. When cloning a dataset, you can add a destination dataset, remote file path, and more…

dataset = project.datasets.get(dataset_id='my-dataset-id')
dataset.clone(clone_name='clone-name',
              filters=None,
              with_items_annotations=True,
              with_metadata=True,
              with_task_annotations_status=True)
Merge Datasets

Dataset merging outcome depends on how similar or different the datasets are.

  • Cloned Datasets - items, annotations, and metadata will be merged. This means that you will see annotations from different datasets on the same item.

  • Different datasets (not clones) with similar recipes - items will be summed up, which will cause duplication of similar items.

  • Datasets with different recipes - Datasets with different default recipes cannot be merged. Use the ‘Switch recipe’ option on dataset level (3-dots action button) to match recipes between datasets and be able to merge them.

dataset_ids = ["dataset-1-id", "dataset-2-id"]
project_ids = ["dataset-1-project-id", "dataset-2-project-id"]
dataset_merge = dl.datasets.merge(merge_name="my_dataset-merge",
                                  project_ids=project_ids,
                                  dataset_ids=dataset_ids,
                                  with_items_annotations=True,
                                  with_metadata=False,
                                  with_task_annotations_status=False)

Upload & Manage Data & Metadata

Upload Specific Files

When you have specific files you want to upload, you can upload them all into a dataset using this script:

import dtlpy as dl
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
dataset.items.upload(local_path=[r'C:/home/project/images/John Morris.jpg',
                                 r'C:/home/project/images/John Benton.jpg',
                                 r'C:/home/project/images/Liu Jinli.jpg'],
                     remote_path='/folder_name')  # Remote path is optional, images will go to the root directory by default
Upload All Files in a Folder

If you want to upload all files from a folder, you can do that by just specifying the folder name:

import dtlpy as dl
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
dataset.items.upload(local_path=r'C:/home/project/images',
                     remote_path='/folder_name')  # Remote path is optional, images will go to the root directory by default
Upload Items with Metadata

You can upload items as a table using a Pandas DataFrame that will let you upload items with info (annotations, metadata such as confidence, filename, etc.) attached to them.

import pandas
import dtlpy as dl
dataset = dl.datasets.get(dataset_id='id')  # Get dataset
to_upload = list()
# First item and info attached:
to_upload.append({'local_path': r"E:\TypesExamples\000000000064.jpg",  # Local path to image
                  'local_annotations_path': r"E:\TypesExamples\000000000776.json",  # Local path to annotation file
                  'remote_path': "/first",  # Remote directory of uploaded image
                  'remote_name': 'f.jpg',  # Remote name of image
                  'item_metadata': {'user': {'dummy': 'fir'}}})  # Metadata for the created item
# Second item and info attached:
to_upload.append({'local_path': r"E:\TypesExamples\000000000776.jpg",  # Local path to image
                  'local_annotations_path': r"E:\TypesExamples\000000000776.json",  # Local path to annotation file
                  'remote_path': "/second",  # Remote directory of uploaded image
                  'remote_name': 's.jpg',  # Remote name of image
                  'item_metadata': {'user': {'dummy': 'sec'}}})  # Metadata for the created item
df = pandas.DataFrame(to_upload)  # Make data into DF table
items = dataset.items.upload(local_path=df,
                             overwrite=True)  # Upload DF to platform

Upload & Manage Annotations

import dtlpy as dl
item = dl.items.get(item_id="")
annotation = item.annotations.get(annotation_id="")
annotation.metadata["user"] = True
annotation.update()
Upload User Metadata

To upload annotations from JSON and include the user metadata, add the parameter local_annotations_path to the dataset.items.upload function, like so:

project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
dataset.items.upload(local_path=r'<items path>',
                     local_annotations_path=r'<annotation json file path>',
                     item_metadata=dl.ExportMetadata.FROM_JSON,
                     overwrite=True)
Convert Annotations To COCO Format
converter = dl.Converter()
converter.upload_local_dataset(
    from_format=dl.AnnotationFormat.COCO,
    dataset=dataset,
    local_items_path=r'C:/path/to/items',
    # Please make sure the names of the items are the same as written in the COCO JSON file
    local_annotations_path=r'C:/path/to/annotations/file/coco.json'
)
Upload an Entire Directory and its Corresponding Dataloop JSON Annotations
# Local path to the items folder
# If you wish to upload items with your directory tree, use: r'C:/home/project/images_folder'
local_items_path = r'C:/home/project/images_folder/*'
# Local path to the corresponding annotations - make sure the file names fit
local_annotations_path = r'C:/home/project/annotations_folder'
dataset.items.upload(local_path=local_items_path,
                     local_annotations_path=local_annotations_path)
Upload Annotations To Video Item

Uploading annotations to video items needs to account for annotations that span multiple frames and for toggling visibility (occlusion). In this example, we will use the following CSV file. In this file there is a single ‘person’ box annotation that begins on frame 20, disappears on frame 41, reappears on frame 51, and ends on frame 90.

Video_annotations_example.CSV

import pandas as pd
# Read CSV file
df = pd.read_csv(r'C:/file.csv')
# Get item
item = dataset.items.get(item_id='my_item_id')
builder = item.annotations.builder()
# Read line by line from the csv file
for i_row, row in df.iterrows():
    # Create box annotation from csv rows and add it to a builder
    builder.add(annotation_definition=dl.Box(top=row['top'],
                                             left=row['left'],
                                             bottom=row['bottom'],
                                             right=row['right'],
                                             label=row['label']),
                object_visible=row['visible'],  # Toggle annotation visibility per frame (e.g. occlusion)
                object_id=row['annotation id'],  # Id that groups frames belonging to the same annotation
                frame_num=row['frame'])
# Upload all created annotations
item.annotations.upload(annotations=builder)
Upload Annotations In VTT Format

The Dataloop builder supports VTT files for uploading Web Video Text Tracks (WebVTT) for video transcription.

project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# local path to item
local_item_path = r'/Users/local/path/to/item.png'
# local path to vtt
local_vtt_path = r'/Users/local/path/to/subtitles.vtt'
# upload item
item = dataset.items.upload(local_path=local_item_path)
# upload VTT file - wait until the item finishes uploading
builder = item.annotations.builder()
builder.from_vtt_file(filepath=local_vtt_path)
item.annotations.upload(builder)
Upload Audio Annotation to an Audio File
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
item = dataset.items.get(filepath='/my_item.mp4')
# Using annotation builder
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Subtitle(label='<label>',
                                              text='<text>'),
            start_time='<start>',
            end_time='<end>')
# Upload the annotation to the item
item.annotations.upload(annotations=builder)
Set Attributes On Annotations

You can set attributes on annotations in the platform using the SDK. Since Dataloop deprecated the legacy attributes mechanism, attributes are referred to as version ‘2.0’ and must be enabled as such first.

Free Text Attribute
dl.use_attributes_2(True)
annotation.attributes.update({"ID of the attribute": "value of the attribute"})
annotation = annotation.update(system_metadata=True)
Range Attributes (Slider in UI)
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": number_on_range})
annotation = annotation.update(system_metadata=True)
CheckBox Attribute (Multiple choice)
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": ["selection", "selection"]})
annotation = annotation.update(system_metadata=True)
Radio Button Attribute (Single Choice)
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": "selection"})
annotation = annotation.update(system_metadata=True)
Yes/No Attribute
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": True / False})
annotation = annotation.update(system_metadata=True)
Show Annotations Over Image

After uploading items and annotations with their metadata, you might want to view some of them and perform visual validation.

To see only the annotations, use the show option of the annotation type.

# Use the show function for all annotation types
box = dl.Box()
# Must provide all inputs
box.show(image='',
         thickness='',
         with_text='',
         height='',
         width='',
         annotation_format='',
         color='')

To see the item itself with all annotations drawn on it, use annotation.show.

# Must input an image or height and width
annotation.show(image='',
                height='', width='',
                annotation_format='dl.ViewAnnotationOptions.*',
                thickness='',
                with_text='')
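
For instance, a minimal sketch that draws a single annotation over its item; the item ID is a placeholder, and the thickness and format values are illustrative:

import dtlpy as dl
item = dl.items.get(item_id='my-item-id')
# Take the first annotation on the item
annotation = item.annotations.list().annotations[0]
# Download the image as an ndarray and draw the annotation over it
image = item.download(save_locally=False, to_array=True)
marked = annotation.show(image=image,
                         thickness=3,
                         with_text=True,
                         annotation_format=dl.ViewAnnotationOptions.MASK)

The returned array can then be saved or displayed, for example with PIL as shown in the download section below.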
Download Data, Annotations & Metadata

The item ID for a specific file can be found in the platform UI: click BROWSE on a dataset, select the file, and the file information will be displayed in the right-side panel. The item ID is shown there and can be copied in a single click.
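
Alternatively, you can fetch the item, and with it the ID, directly with the SDK; the remote filepath below is a hypothetical example:

item = dataset.items.get(filepath='/folder_name/file.jpg')  # hypothetical remote path
print(item.id)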

Download Items and Annotations

Download dataset items and annotations to your computer in two separate folders. The available annotation download options are listed in dl.ViewAnnotationOptions:

  1. JSON: Downloads JSON files with the Dataloop annotation format.

  2. MASK: Saves a PNG image file with the RGB annotation drawn.

  3. INSTANCE: Saves a PNG with the annotation label ID as the pixel value.

  4. ANNOTATION_ON_IMAGE: Saves a PNG with the annotation drawn on top of the image.

  5. VTT: Saves subtitle annotations in VTT format.

  6. OBJECT_ID: Saves a PNG with the object ID as the pixel value.

dataset.download(local_path=r'C:/home/project/images',  # The default value is ".dataloop" folder
                 annotation_options=dl.VIEW_ANNOTATION_OPTIONS_JSON)

NOTE: The annotation option can also be a list to download multiple options:

dataset.download(local_path=r'C:/home/project/images',  # The default value is ".dataloop" folder
                 annotation_options=[dl.VIEW_ANNOTATION_OPTIONS_MASK,
                                     dl.VIEW_ANNOTATION_OPTIONS_JSON,
                                     dl.ViewAnnotationOptions.INSTANCE])
Filter by Item and/or Annotation
  • Items filter - download filtered items based on multiple parameters, like their directory. You can also download items based on different filters; learn all about item filters here.

  • Annotation filter - download filtered annotations based on multiple parameters, like their label. You can also download item annotations based on different filters; learn all about annotation filters here. This example will download items and JSONs from a dog folder, keeping only annotations with the label ‘dog’.

# Filter items from the '/dog_name' directory
item_filters = dl.Filters(resource='items', field='dir', values='/dog_name')
# Filter items with dog annotations
annotation_filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION, field='label', values='dog')
dataset.download(local_path=r'C:/home/project/images',  # The default value is ".dataloop" folder
                 filters=item_filters,
                 annotation_filters=annotation_filters,
                 annotation_options=dl.VIEW_ANNOTATION_OPTIONS_JSON)
Filter by Annotations
  • Annotation filter - download filtered annotations based on multiple parameters, like their label. You can also download item annotations based on different filters; learn all about annotation filters here.

item = dataset.items.get(item_id="item_id")  # Get item from dataset to be able to view the dataset colors on Mask
# Filter items with dog annotations
annotation_filters = dl.Filters(resource='annotations', field='label', values='dog')
item.download(local_path=r'C:/home/project/images',  # the default value is ".dataloop" folder
              annotation_filters=annotation_filters,
              annotation_options=dl.VIEW_ANNOTATION_OPTIONS_JSON)
Download Annotations in COCO/YOLO/VOC Format
  • Items filter - download filtered items based on multiple parameters, like their directory. You can also download items based on different filters; learn all about item filters here.

  • Annotation filter - download filtered annotations based on multiple parameters, like their label. You can also download item annotations based on different filters; learn all about annotation filters here.

This example will download annotations in COCO format from a dog items folder, keeping only annotations with the label ‘dog’ (edit the script to switch to YOLO/VOC).

# Filter items from the '/dog_name' directory
item_filters = dl.Filters(resource='items', field='dir', values='/dog_name')
# Filter items with dog annotations
annotation_filters = dl.Filters(resource='annotations', field='label', values='dog')
converter = dl.Converter()
converter.convert_dataset(dataset=dataset,
                          # Use the converter of choice
                          # to_format='yolo',
                          # to_format='voc',
                          to_format='coco',
                          local_path=r'C:/home/coco_annotations',
                          filters=item_filters,
                          annotation_filters=annotation_filters)
# Param export_version will be set to ExportVersion.V1 by default.
dataset.download(local_path='/path',
                 annotation_options='json',
                 export_version=dl.ExportVersion.V2)

To download an item directly into memory as an image array:

from PIL import Image
item = dl.items.get(item_id='my-item-id')
array = item.download(save_locally=False, to_array=True)
# Check out the downloaded Ndarray with these commands - optional
image = Image.fromarray(array)
image.save(r'C:/home/project/images.jpg')

Advanced SDK Filters

To access the filters entity click here.

Filter Operators

To understand more about filter operators please click here.

When adding a filter, several operators are available for use:

Equal

eq -> equal (or dl.FiltersOperations.EQUAL)

For example, filter items from a specific folder directory.

import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance
filters = dl.Filters()
# Filter only items from a specific folder directory
filters.add(field='dir', values='/DatasetFolderName', operator=dl.FILTERS_OPERATIONS_EQUAL)
# optional - return results sorted by ascending file name 
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Not Equal

ne -> not equal (or dl.FiltersOperations.NOT_EQUAL)

In this example, you will get all items that do not have ONLY a ‘cat’ label.

Note
This operator is a better fit for filters with a single value because, for example, this filter will still return items that have both ‘cat’ and ‘dog’ labels. View an example of the solution here.

filters = dl.Filters()
# Filter out annotations with the label 'cat'
filters.add_join(field='label', values='cat', operator=dl.FILTERS_OPERATIONS_NOT_EQUAL)
# optional - return results sorted by ascending file name 
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in the dataset: {}'.format(pages.items_count))
Greater Than

gt -> greater than (or dl.FiltersOperations.GREATER_THAN)

In this example, you will get items whose height (in pixels) is greater than the given value.

filters = dl.Filters()
# Filter images whose height is greater than the given value
height_number_in_pixels = 600  # example threshold (pixels)
filters.add(field='metadata.system.height', values=height_number_in_pixels,
            operator=dl.FILTERS_OPERATIONS_GREATER_THAN)
# optional - return results sorted by ascending file name 
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Less Than

lt -> less than (or dl.FiltersOperations.LESS_THAN)

In this example, you will get items whose width (in pixels) is less than the given value.

filters = dl.Filters()
# Filter images whose width is less than the given value
width_number_in_pixels = 600  # example threshold (pixels)
filters.add(field='metadata.system.width', values=width_number_in_pixels, operator=dl.FILTERS_OPERATIONS_LESS_THAN)
# optional - return results sorted by ascending file name 
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
In a List

in -> is in a list (or dl.FiltersOperations.IN). When using this operator, values should be a list.

In this example, you will get items with either a ‘dog’ or a ‘cat’ label.

filters = dl.Filters()
# Filter items with dog OR cat labels
filters.add_join(field='label', values=['dog', 'cat'], operator=dl.FILTERS_OPERATIONS_IN)
# optional - return results sorted by ascending file name 
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Exist

The filter operator FILTERS_OPERATIONS_EXISTS checks whether a field exists. The following example checks if there is an item with user metadata:

filters = dl.Filters()
filters.add(field='metadata.user', values=True, operator=dl.FILTERS_OPERATIONS_EXISTS)
dataset.items.list(filters=filters)
SDK defaults

By default, filters exclude hidden items and directories, and note annotations that serve as issues. If you wish to change this behavior, you may do the following:

filters = dl.Filters(use_defaults=False)
Hidden Items and Directories

If you wish to only show hidden items & directories in your filters, use this code:

filters = dl.Filters()
filters.add(field='type', values='dir')
# or
filters.pop(field='type')
Delete a Filter
filters = dl.Filters()
# For example, if you added the following filter:
filters.add(field='to-delete-field', values='value')
# Use this command to delete the filter
filters.pop(field='to-delete-field')
# or for items by their annotations
filters.pop_join(field='to-delete-annotation-field')
Full Examples
How to filter items that were created between specific dates?

In this example, you will get all of the items that were created in 2018.

import datetime, time
filters = dl.Filters()
# -- time filters -- must be in ISO format and in UTC (offset from local time); convert using the datetime package as follows:
earlier_timestamp = datetime.datetime(year=2018, month=1, day=1, hour=0, minute=0, second=0,
                                      tzinfo=datetime.timezone(
                                          datetime.timedelta(seconds=-time.timezone))).isoformat()
later_timestamp = datetime.datetime(year=2019, month=1, day=1, hour=0, minute=0, second=0,
                                    tzinfo=datetime.timezone(
                                        datetime.timedelta(seconds=-time.timezone))).isoformat()
filters.add(field='createdAt', values=earlier_timestamp, operator=dl.FiltersOperations.GREATER_THAN)
filters.add(field='createdAt', values=later_timestamp, operator=dl.FiltersOperations.LESS_THAN)
# Both conditions must hold, so keep the default AND method
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items that don’t have a specific label?

In this example, you will get all items that do not have a ‘cat’ label AT ALL.

Note
This filter will NOT return items that have both 'cat' and 'dog' labels.
# Get all items
all_items = set([item.id for item in dataset.items.list().all()])
# Get all items WITH the label cat
filters = dl.Filters()
filters.add_join(field='label', values='cat')
cat_items = set([item.id for item in dataset.items.list(filters=filters).all()])
# Get the difference between the sets. This will give you a list of the items with no cat
no_cat_items = all_items.difference(cat_items)
print('Number of filtered items in dataset: {}'.format(len(no_cat_items)))
# Iterate through the IDs and print the matching item
for item_id in no_cat_items:
    print(dataset.items.get(item_id=item_id))

To access the filters entity click here.

The Dataloop Query Language - DQL

Using The Dataloop Query Language, you may navigate through massive amounts of data.

You can filter, sort, and update your metadata with it.

Filters

Using filters, you can filter items and get a generator of the filtered items. The filters entity is used to build such filters.

Filters - Field & Value

Filter your items or annotations using the parameters in the JSON code that represents their data within our system. Access your item/annotation JSON using to_json().

Field

Field refers to the attributes you filter by.

For example, “dir” would be used if you wish to filter items by their folder/directory.

Value

Value refers to the input by which you want to filter. For example, “/new_folder” can be the directory/folder name where the items you wish to filter are located.

Sort - Field & Value
Field

Field refers to the field you sort your items/annotations list by. For example, if you sort by filename, you will get the item list sorted in alphabetical order by filename. See the full list of the available fields here.

Value

Value refers to the list order direction. Either ascending or descending.
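
For example, a minimal sketch combining a filter field/value pair with a sort field and direction (the directory name is illustrative, and dataset is assumed from the surrounding examples):

filters = dl.Filters()
# Filter field + value: keep only items under '/new_folder'
filters.add(field='dir', values='/new_folder')
# Sort field + value: order the results by filename, descending
filters.sort_by(field='filename', value=dl.FiltersOrderByDirection.DESCENDING)
pages = dataset.items.list(filters=filters)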

Filter Annotations

Filter annotations by the annotations’ JSON fields. In this example, you will get all of the note annotations in the dataset, sorted by label.

Note

See all of the items iterator options on the Iterator of Items page.

import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance with annotation resource
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# Filter example - only note annotations
filters.add(field='type', values='note')
# optional - return results sorted by descending label 
filters.sort_by(field='label', value=dl.FiltersOrderByDirection.DESCENDING)
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
# Iterate through the annotations - Go over all annotations and print the properties
for page in pages:
    for annotation in page:
        annotation.print()
Filter Annotations by the Annotations’ Item

add_join - filter Annotations by the annotations’ items’ JSON fields. For example, filter only box annotations from image items.

Note
See all of the items iterator options on the Iterator of Items page.
# Create filters instance
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# Filter all box annotations
filters.add(field='type', values='box')
# AND filter annotations by their items - only items that are of mimetype image
# Meaning you will get 'box' annotations of all image items
filters.add_join(field='metadata.system.mimetype', values="image*")
# optional - return results sorted by descending creation date
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered annotations list in a page object
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Filters Method - “Or” and “And”
Filters Operators
For more advanced filters operators visit the Advanced SDK Filters page.
And

If you wish to filter annotations with the “and” logical operator, you can do so by specifying which filters will be checked with “and”.

AND is the default value and can be used without specifying the method.
In this example, you will get a list of annotations in the dataset of the type box and label car.
# Create filters instance with annotation resource
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
filters.add(field='type', values='box', method=dl.FiltersMethod.AND)
filters.add(field='label', values='car', method=dl.FiltersMethod.AND)
# optional - return results sorted by ascending creation date
filters.sort_by(field='createdAt')
# Get filtered annotations list
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Or

If you wish to filter annotations with the “or” logical operator, you can do so by specifying which filters will be checked with “or”. In this example, you will get a list of the dataset’s annotations that are either a ‘box’ or a ‘point’ type.

filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# filters with or
filters.add(field='type', values='box', method=dl.FiltersMethod.OR)
filters.add(field='type', values='point', method=dl.FiltersMethod.OR)
# optional - return results sorted by descending creation date
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered annotations list
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Delete Filtered Items

In this example, you will delete annotations that were created on 30/8/2020 at 8:17 AM.

filters = dl.Filters()
# set annotation resource
filters.resource = dl.FiltersResource.ANNOTATION
# Example - created on 30/8/2020 at 8:17 AM
filters.add(field='createdAt', values="2020-08-30T08:17:08.000Z")
dataset.annotations.delete(filters=filters)
Annotation Filtering Fields
More Filter Options
Use a dot to access nested fields in the annotation JSON. For example, use field='metadata.system.status' to filter by the annotation's status.
{
    "id": "5f576f660bb2fb455d79ffdf",
    "datasetId": "5e368bee106a76a61cf05282",
    "type": "segment",
    "label": "Planet",
    "attributes": [],
    "coordinates": [
        [
            {
                "x": 856.25,
                "y": 1031.2499999999995
            },
            {
                "x": 1081.25,
                "y": 1631.2499999999995
            },
            {
                "x": 485.41666666666663,
                "y": 1735.4166666666665
            },
            {
                "x": 497.91666666666663,
                "y": 1172.9166666666665
            }
        ]
    ],
    "metadata": {
        "system": {
            "status": null,
            "startTime": 0,
            "endTime": 1,
            "frame": 0,
            "endFrame": 1,
            "snapshots_": [
                {
                    "fixed": true,
                    "type": "transition",
                    "frame": 0,
                    "objectVisible": true,
                    "data": [
                        [
                            {
                                "x": 856.25,
                                "y": 1031.2499999999995
                            },
                            {
                                "x": 1081.25,
                                "y": 1631.2499999999995
                            },
                            {
                                "x": 485.41666666666663,
                                "y": 1735.4166666666665
                            },
                            {
                                "x": 497.91666666666663,
                                "y": 1172.9166666666665
                            }
                        ]
                    ],
                    "label": "Planet",
                    "attributes": []
                }
            ],
            "automated": false,
            "isOpen": false,
            "system": false
        },
        "user": {}
    },
    "creator": "user@dataloop.ai",
    "createdAt": "2020-09-08T11:47:50.576Z",
    "updatedBy": "user@dataloop.ai",
    "updatedAt": "2020-09-08T11:47:50.576Z",
    "itemId": "5f572f4423a69b8c83408f12",
    "url": "https://gate.dataloop.ai/api/v1/annotations/5f576f660bb2fb455d79ffdf",
    "item": "https://gate.dataloop.ai/api/v1/items/5f572f4423a69b8c83408f12",
    "dataset": "https://gate.dataloop.ai/api/v1/datasets/5e368bee106a76a61cf05282",
    "hash": "11fdc816804faf0f7266b40d1cb67aff38e5c10d"
}
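
For instance, a short sketch based on the JSON above that keeps only manually created annotations (the field path uses the dot notation described earlier; dataset is assumed from the previous examples):

filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# Keep only annotations that were not created by automation
filters.add(field='metadata.system.automated', values=False)
pages = dataset.annotations.list(filters=filters)
print('Number of manual annotations in dataset: {}'.format(pages.items_count))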
Full Examples
How to filter annotations by their label?
filters = dl.Filters()
# set resource
filters.resource = dl.FiltersResource.ANNOTATION
filters.add(field='label', values='your_label_value')
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Advanced Filtering Operators

Explore advanced filtering options on this page.

To access the filters entity click here.

Filter Items

Filter items by the item’s JSON fields. In this example, you will get all annotated items in a dataset, sorted by filename.

Note
See all of the items iterator options on the Iterator of Items page.
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance
filters = dl.Filters()
# Filter only annotated items
filters.add(field='annotated', values=True)
# optional - return results sorted by ascending file name 
filters.sort_by(field="filename")
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Filter Items by the Items’ Annotations

add_join - filter items by the items’ annotations JSON fields. For example, filter only items with ‘box’ annotations.

Note
See all of the items iterator options on the Iterator of Items page.
filters = dl.Filters()
# Filter all approved items
filters.add(field='metadata.system.annotationStatus', values="approved")
# AND filter items by their annotation - only items with 'box' annotations
# Meaning you will get approved items with 'box' annotations
filters.add_join(field='type', values='box')
# optional - return results sorted by descending creation date 
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Filters Method - “Or” and “And”
Filters Operators
For more advanced filters operators visit the Advanced SDK Filters page.
And

If you wish to filter items with the “and” logical operator, you can do so by specifying which filters will be checked with “and”.

AND is the default value and can be used without specifying the method.
In this example, you will get a list of annotated items with user metadata of the field "is_automated" and value True.
filters = dl.Filters()  # filters with and
filters.add(field='annotated', values=True, method=dl.FiltersMethod.AND)
filters.add(field='metadata.user.is_automated', values=True, method=dl.FiltersMethod.AND)
# optional - return results sorted by ascending file name
filters.sort_by(field='name')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Or

If you wish to filter items with the “or” logical operator, you can do so by specifying which filters will be checked with “or”. In this example, you will get a list of items that are in either the “folder1” or “folder2” directory.

filters = dl.Filters()
# filters with or
filters.add(field='dir', values='/folderName1', method=dl.FiltersMethod.OR)
filters.add(field='dir', values='/folderName2', method=dl.FiltersMethod.OR)
# optional - return results sorted by descending directory name
filters.sort_by(field='dir', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Update User Metadata of Filtered Items

Update Filtered Items - the ‘update_values’ argument must be a dictionary. The dictionary will only update user metadata. Understand more about user metadata here: https://github.com/dataloop-ai/dtlpy-documentation/blob/main/tutorials/data_management/working_with_metadata/chapter.md. In this example, you will update/add user metadata (with the field “BlackDogs” and value True) to items in a specific folder ‘dogs’ that have the attribute ‘black’.

filters = dl.Filters()
# For example -  filter only items in a specific folder - like 'dogs'
filters.add(field='dir', values='/dogs')
# For example - filter items by their annotation - only items with 'black' attribute
filters.add_join(field='attributes', values='black')
# To add the field 'BlackDogs' with the value True to all filtered items
# This field will be added to the user metadata
# Create the update order
update_values = {'BlackDogs': True}
# update
pages = dataset.items.update(filters=filters, update_values=update_values)
Delete Filtered Items

In this example, you will delete items that were created on 30/8/2020 at 8:17 AM.

filters = dl.Filters()
# Filter items created at a specific time
filters.add(field='createdAt', values="2020-08-30T08:17:08.000Z")
dataset.items.delete(filters=filters)
Item Filtering Fields
More Filter Options
Use a dot to access nested fields in the item JSON. For example, use field='metadata.system.originalname' to filter by the item's original name.
{
    "id": "5f4b60848ced1d50c3df114a",
    "datasetId": "5f4b603d9825b9f191bbd3b3",
    "createdAt": "2020-08-30T08:17:08.000Z",
    "dir": "/new_folder",
    "filename": "/new_folder/optional.jpg",
    "type": "file",
    "hidden": false,
    "metadata": {
        "system": {
            "originalname": "file",
            "size": 3290035,
            "encoding": "7bit",
            "mimetype": "image/jpeg",
            "annotationStatus": [
                "completed"
            ],
            "refs": [
                {
                    "type": "task",
                    "id": "5f4b61f8f81ab6238c331bd2"
                },
                {
                    "type": "assignment",
                    "id": "5f4b61f8f81ab60508331bd3"
                }
            ],
            "executionLogs": {
                "image-metadata-extractor": {
                    "default_module": {
                        "run": {
                            "5f4b60841b892d82eaa2d95b": {
                                "progress": 100,
                                "status": "success"
                            }
                        }
                    }
                }
            },
            "exif": {},
            "height": 2734,
            "width": 4096,
            "statusLog": [
                {
                    "status": "completed",
                    "timestamp": "2020-08-30T14:54:17.014Z",
                    "creator": "user@dataloop.ai",
                    "action": "created"
                }
            ],
            "isBinary": true
        }
    },
    "name": "optional.jpg",
    "url": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a",
    "dataset": "https://gate.dataloop.ai/api/v1/datasets/5f4b603d9825b9f191bbd3b3",
    "annotationsCount": 18,
    "annotated": "discarded",
    "stream": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/stream",
    "thumbnail": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/thumbnail",
    "annotations": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/annotations"
}
Full Examples
How to filter items by their annotations label?
filters = dl.Filters()
filters.add_join(field='label', values='your_label_value')
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
How to filter items by completed and approved status?
filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values=["completed", "approved"])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items by completed status (including items that are also approved)?
filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values="completed")
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items by only completed status?
filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values=["completed"])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter unassigned items?
filters = dl.Filters()
filters.add(field='metadata.system.refs', values=[])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items by a specific folder?
filters = dl.Filters()
filters.add(field='dir', values="/folderName")
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Get all items named foo.bar
filters = dl.Filters()
filters.add(field='name', values='foo.bar.*')
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
Sort files of size 0-5 MB by name, in ascending order
filters = dl.Filters()
filters.add(field='metadata.system.size', values='0', operator='gt')  # size in bytes
filters.add(field='metadata.system.size', values='5242880', operator='lt')  # 5 MB = 5242880 bytes
filters.sort_by(field='filename', value=dl.FILTERS_ORDERBY_DIRECTION_ASCENDING)
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
Sort with multiple fields: Sort Items by labels ascending and createdAt descending
filters = dl.Filters()
# set annotation resource
filters.resource = dl.FiltersResource.ANNOTATION
# return results sorted by ascending label
filters.sort_by(field='label', value=dl.FILTERS_ORDERBY_DIRECTION_ASCENDING)
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
Advanced Filtering Operators

Explore advanced filtering options on this page.

Response to DQL Query

A typical response to a DQL query will look like the following:

{
    "totalItemsCount": number,
    "items": Array,
    "totalPagesCount": number,
    "hasNextPage": boolean,
}
# A possible result:
{
    "totalItemsCount": 2,
    "totalPagesCount": 1,
    "hasNextPage": false,
    "items": [
        {
            "id": "5d0783852dbc15306a59ef6c",
            "createdAt": "2019-06-18T23:29:15.775Z",
            "filename": "/5546670769_8df950c6b6.jpg",
            "type": "file"
                    // ...
        },
        {
            "id": "5d0783852dbc15306a59ef6d",
            "createdAt": "2019-06-19T23:29:15.775Z",
            "filename": "/5551018983_3ce908ac98.jpg",
            "type": "file"
                    // ...
        }
    ]
}

Pagination

Pages

We use pages instead of a list when we have an object that contains a lot of information.

The page object divides a large list into pages (with a default of 1000 items) in order to save time when going over the items.

It is the same as what we display in the annotation platform; see an example here.

You can redefine the number of items on a page with the page_size attribute. When we go over the items, we use nested loops to first go over the pages and then over the items in each page.

Iterator of Items

You can create a generator of items with different filters.

import dtlpy as dl
# Get the project    
project = dl.projects.get(project_name='project_name')
# Get the dataset
dataset = project.datasets.get(dataset_name='dataset_name')
# Get items in pages (1000 items per page)
filters = dl.Filters()
filters.add(field='filename', values='/your/file/path.mimetype')
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
# Go over all items and print their properties
for i_page, page in enumerate(pages):
    print('{} items in page {}'.format(len(page), i_page))
    for item in page:
        item.print()

A Page entity iterator also allows reverse iteration for cases in which you want to change items during the iteration:

# Go over all items and print their properties
for i_page, page in enumerate(reversed(pages)):
    print('{} items in page {}'.format(len(page), i_page))

If you want to iterate through all items within your filter, you can also do so without going through them page by page:

for item in pages.all():
    print(item.name)

If you are planning to run some processing on each item, it is faster to use multiple threads (or processes) for parallel computation. The following uses a ThreadPoolExecutor with 32 workers to process the items in parallel:

from concurrent.futures import ThreadPoolExecutor
def single_item(item):
    # do some work on item
    print(item.filename)
    return True
with ThreadPoolExecutor(max_workers=32) as executor:
    executor.map(single_item, pages.all())

Let’s compare the runtimes to see that the threaded version is faster:

from concurrent.futures import ThreadPoolExecutor
import time
tic = time.time()
for item in pages.all():
    # do stuff on item
    time.sleep(1)
print('One by one took {:.2f}[s]'.format(time.time() - tic))
def single_item(item):
    # do stuff on item
    time.sleep(1)
    return True
tic = time.time()
with ThreadPoolExecutor(max_workers=32) as executor:
    executor.map(single_item, pages.all())
print('Using threads took {:.2f}[s]'.format(time.time() - tic))

Visualizing the progress with tqdm progress bar:

import tqdm
import time
from concurrent.futures import ThreadPoolExecutor
pbar = tqdm.tqdm(total=pages.items_count)
def single_item(item):
    # do stuff on item
    time.sleep(1)
    pbar.update()
    return True
with ThreadPoolExecutor(max_workers=32) as executor:
    executor.map(single_item, pages.all())
Set page_size

The following example sets the page_size to 50:

# Create filters instance
filters = dl.Filters()
# Get filtered item list in a page object, where the starting page is 1
pages = dataset.items.list(filters=filters, page_offset=1, page_size=50)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
# Print items from page 1
print('Length of first page: {}'.format(len(pages.items)))

Working with Metadata

import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
User Metadata

As a powerful tool to manage data based on your categories and information, you can add any keys and values to both the item’s and annotations’ user-metadata sections using the Dataloop SDK. Then, you can use your user-metadata for data filtering, sorting, etc.

Note
When adding metadata to the same item, the new metadata might overwrite existing metadata. To avoid overwriting a field or the entire metadata, use the list data type.

Metadata Data Types

Metadata is a dictionary attribute used with items, annotations, and other entities of the Dataloop system (task, recipe, and more). As such, it can be used with string, number, boolean, list or null types.

String
item.metadata['user']['MyKey'] = 'MyValue'
annotation.metadata['user']['MyKey'] = 'MyValue'
Number
item.metadata['user']['MyKey'] = 3
annotation.metadata['user']['MyKey'] = 3
Boolean
item.metadata['user']['MyKey'] = True
annotation.metadata['user']['MyKey'] = True
Null – add metadata with no information
item.metadata['user']['MyKey'] = None
annotation.metadata['user']['MyKey'] = None
List
# add metadata of a list (can contain elements of different types).
item.metadata['user']['MyKey'] = ["A", 2, False]
annotation.metadata['user']['MyKey'] = ["A", 2, False]
Add new metadata to a list without losing existing data
item.metadata['user']['MyKey'].append(3)
item = item.update()
annotation.metadata['user']['MyKey'].append(3)
annotation = annotation.update()
Add metadata to an item’s user metadata
# upload and claim item
item = dataset.items.upload(local_path=r'C:/home/project/images/item.mimetype')
# or get item
item = dataset.items.get(item_id='write-your-id-number')
# modify metadata
item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
# update and reclaim item
item = item.update()
Modify an existing user metadata field
# upload and claim item
item = dataset.items.upload(local_path=r'C:/home/project/images/item.mimetype')
# or get item
item = dataset.items.get(item_id='write-your-id-number')
# modify metadata
if 'user' not in item.metadata:
    item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
# update and reclaim item
item = item.update()

The item in the platform should now have a ‘user’ section in its metadata with the field ‘MyKey’ and the value ‘MyValue’.

Add metadata to annotations’ user metadata
# Get annotation
annotation = dl.annotations.get(annotation_id='my-annotation-id')
# modify metadata
annotation.metadata['user'] = dict()
annotation.metadata['user']['red'] = True
# update and reclaim annotation
annotation = annotation.update()

The annotation in the platform should now have a ‘user’ section in its metadata with the field ‘red’ and the value True.

Filter items by user metadata
1. Get your dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
2. Add metadata to an item

You can also add metadata to filtered items

# upload and claim item
item = dataset.items.upload(local_path=r'C:/home/project/images/item.mimetype')
# or get item
item = dataset.items.get(item_id='write-your-id-number')
# modify metadata
item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
# update and reclaim item
item = item.update()
3. Create a filter
filters = dl.Filters()
# set resource - optional - default is item
filters.resource = dl.FiltersResource.ITEM
4. Filter by your written key
filters.add(field='metadata.user.MyKey', values='MyValue')
5. Get filtered items
pages = dataset.items.list(filters=filters)
# Go over all items and print their properties
for page in pages:
    for item in page:
        item.print()

FaaS Tutorial

Tutorials for FaaS

FaaS Interactive Tutorial

Concept

Dataloop Function-as-a-Service (FaaS) is a compute service that automatically runs your code based on time patterns or in response to trigger events.

You can use Dataloop FaaS to extend other Dataloop services with custom logic. Altogether, FaaS serves as a flexible unit that extends your capabilities in the Dataloop platform and lets you automate processes.

With Dataloop FaaS, you simply upload your code and create your functions. Following that, you can define a time interval or specify a resource event for triggering the function. When a trigger event occurs, the FaaS platform launches and manages the compute resources, and executes the function.

You can configure the compute settings according to your preferences (machine types, concurrency, timeout, etc.) or use the default settings.
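
For example, a minimal sketch of overriding the default compute settings at deployment time; my_function is a hypothetical function, and the pod type, concurrency, and autoscaler values are illustrative assumptions:

service = project.services.deploy(func=my_function,  # hypothetical function object
                                  service_name='my-configured-service',
                                  runtime=dl.KubernetesRuntime(
                                      pod_type=dl.InstanceCatalog.REGULAR_S,  # machine type
                                      concurrency=10,  # executions per replica
                                      autoscaler=dl.KubernetesRabbitmqAutoscaler(
                                          min_replicas=0,
                                          max_replicas=2)))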

Use Cases

Pre-annotation processing: Resize, video assembler, video disassembler

Post-annotation processing: Augmentation, crop box-annotations, auto-parenting

ML models: Auto-detection

QA models: Auto QA, consensus model, majority vote model

Introduction

This tutorial will help you get started with FaaS.

  1. Prerequisites

  2. Basic use case: Single function

  • Deploy a function as a service

  • Execute the service manually and view the output

  3. Advanced use case: Multiple functions

  • Deploy several functions as a package

  • Deploy a service of the package

  • Set trigger events to the functions

  • Execute the functions and view the output and logs

First, log in to the platform by running the following Python code in the terminal or your IDE:

import dtlpy as dl
if dl.token_expired():
    dl.login()

Your browser will open a login screen, allowing you to enter your credentials or log in with Google. Once the “Login Successful” tab appears, you may close it.

This tutorial requires a project. You can create a new project, or alternatively use an existing one:

# Create a new project
project = dl.projects.create(project_name='project-sdk-tutorial')
# Use an existing project
project = dl.projects.get(project_name='project-sdk-tutorial')

Let’s create a dataset to work with and upload a sample item to it:

dataset = project.datasets.create(dataset_name='dataset-sdk-tutorial')
item = dataset.items.upload(
    local_path=[
        'https://raw.githubusercontent.com/dataloop-ai/dtlpy-documentation/main/assets/images/hamster.jpg'],
    remote_path='/folder_name')
# Remote path is optional, images will go to the main directory by default

Basic Use Case: Single Function

Create and Deploy a Sample Function

Below is an image-manipulation function in Python that converts an RGB image to grayscale. The function receives a single item, which later can be used as a trigger to invoke the function:

def rgb2gray(item: dl.Item):
    """
    Function to convert RGB image to GRAY
    Will also add a modality to the original item
    :param item: dl.Item to convert
    :return: None
    """
    import numpy as np
    import cv2
    buffer = item.download(save_locally=False)
    bgr = cv2.imdecode(np.frombuffer(buffer.read(), np.uint8), -1)
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    gray_item = item.dataset.items.upload(local_path=gray,
                                          remote_path='/gray' + item.dir,
                                          remote_name=item.name)
    # add modality
    item.modalities.create(name='gray',
                           ref=gray_item.id)
    item.update(system_metadata=True)

You can now deploy the function as a service using the Dataloop SDK. Once the service is ready, you may execute the available function on any input:

project = dl.projects.get(project_name='project-sdk-tutorial')
service = project.services.deploy(func=rgb2gray,
                                  service_name='grayscale-item-service')
Execute the function

An execution means running the function on a service with specific inputs (arguments). The execution input will be provided to the function that the execution runs.

Now that the service is up, it can be executed manually (on-demand) or automatically, based on a set trigger (time/event). As part of this tutorial, we will demonstrate how to manually run the “RGB to Gray” function.
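
A minimal sketch of such a manual (on-demand) execution, assuming the service and item from the previous steps:

execution = service.execute(function_name='rgb2gray',
                            item_id=item.id,
                            project_id=project.id)
# Wait for the execution to finish and check its status
execution = execution.wait()
print(execution.latest_status)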

To see the item we uploaded, run the following code:

item.open_in_web()

Multiple Functions and Modules

Multiple Functions
Create and Deploy a Package of Several Functions

First, log in to the Dataloop platform:

import dtlpy as dl
if dl.token_expired():
    dl.login()

Let’s define the project and dataset you will work with in this tutorial. To create a new project and dataset:

project = dl.projects.create(project_name='project-sdk-tutorial')
project.datasets.create(dataset_name='dataset-sdk-tutorial')

To use an existing project and dataset:

project = dl.projects.get(project_name='project-sdk-tutorial')
dataset = project.datasets.get(dataset_name='dataset-sdk-tutorial')
Write your code

The following code consists of two image-manipulation methods:

  • RGB to grayscale over an image

  • CLAHE Histogram Equalization over an image - Contrast Limited Adaptive Histogram Equalization (CLAHE) to equalize images

To proceed with this tutorial, copy the following code and save it as a main.py file.

import dtlpy as dl
import cv2
import numpy as np
class ImageProcess(dl.BaseServiceRunner):
    @staticmethod
    def rgb2gray(item: dl.Item):
        """
        Function to convert RGB image to GRAY
        Will also add a modality to the original item
        :param item: dl.Item to convert
        :return: None
        """
        buffer = item.download(save_locally=False)
        bgr = cv2.imdecode(np.frombuffer(buffer.read(), np.uint8), -1)
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        gray_item = item.dataset.items.upload(local_path=gray,
                                              remote_path='/gray' + item.dir,
                                              remote_name=item.name)
        # add modality
        item.modalities.create(name='gray',
                               ref=gray_item.id)
        item.update(system_metadata=True)
    @staticmethod
    def clahe_equalization(item: dl.Item):
        """
        Function to perform histogram equalization (CLAHE)
        Will add a modality to the original item
        Based on opencv https://docs.opencv.org/4.x/d5/daf/tutorial_py_histogram_equalization.html
        :param item: dl.Item to convert
        :return: None
        """
        buffer = item.download(save_locally=False)
        bgr = cv2.imdecode(np.frombuffer(buffer.read(), np.uint8), -1)
        # create a CLAHE object (Arguments are optional).
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
        lab_planes = list(cv2.split(lab))  # cv2.split returns a tuple; convert to a mutable list
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        lab_planes[0] = clahe.apply(lab_planes[0])
        lab = cv2.merge(lab_planes)
        bgr_equalized = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
        bgr_equalized_item = item.dataset.items.upload(local_path=bgr_equalized,
                                                       remote_path='/equ' + item.dir,
                                                       remote_name=item.name)
        # add modality
        item.modalities.create(name='equ',
                               ref=bgr_equalized_item.id)
        item.update(system_metadata=True)
Define the module

Multiple functions may be defined in a single package under a “module” entity. This way you will be able to use a single codebase for various services.

Here, we will create a module containing the two functions we discussed. The main.py file you saved is defined as the module entry point. Later, you will specify its directory file path.

modules = [dl.PackageModule(name='image-processing-module',
                            entry_point='main.py',
                            class_name='ImageProcess',
                            functions=[dl.PackageFunction(name='rgb2gray',
                                                          description='Converting RGB to gray',
                                                          inputs=[dl.FunctionIO(type=dl.PackageInputType.ITEM,
                                                                                name='item')]),
                                       dl.PackageFunction(name='clahe_equalization',
                                                          description='CLAHE histogram equalization',
                                                          inputs=[dl.FunctionIO(type=dl.PackageInputType.ITEM,
                                                                                name='item')])
                                       ])]
Push the package

When you deployed the service in the previous tutorial (“Single Function”), a module and a package were automatically generated.

Now we will explicitly create and push the module as a package in the Dataloop FaaS library (application hub). For that, please specify the source path (src_path) of the main.py file you saved, and then run the following code:

src_path = 'functions/opencv_functions'
project = dl.projects.get(project_name='project-sdk-tutorial')
package = project.packages.push(package_name='image-processing',
                                modules=modules,
                                src_path=src_path)
Deploy a service

Now that the package is ready, it can be deployed to the Dataloop platform as a service. To create a service from a package, you need to define which module the service will serve. Notice that a service can only contain a single module. All the module functions will be automatically added to the service.

Multiple services can be deployed from a single package. Each service can get its own configuration: a different module and settings (computing resources, triggers, UI slots, etc.).

In our example, there is only one module in the package. Let’s deploy the service:

service = package.services.deploy(service_name='image-processing',
                                  runtime=dl.KubernetesRuntime(concurrency=32),
                                  module_name='image-processing-module')
Trigger the service

Once the service is up, we can configure a trigger to automatically run the service functions. When you bind a trigger to a function, that function will execute when the trigger fires. The trigger is defined by a given time pattern or by an event in the Dataloop system.

An event-based trigger is related to a combination of resource and action. A resource can be any entity in our system (item, dataset, annotation, etc.), and the associated action defines the change in the resource that will fire the trigger (update, create, delete). You can only have one resource per trigger.

The resource object that triggered the function will be passed as the function’s parameter (input).

Let’s set a trigger in the event a new item is created:

filters = dl.Filters()
filters.add(field='datasetId', values=dataset.id)
trigger = service.triggers.create(name='image-processing2',
                                  function_name='clahe_equalization',
                                  execution_mode=dl.TriggerExecutionMode.ONCE,
                                  resource=dl.TriggerResource.ITEM,
                                  actions=dl.TriggerAction.CREATED,
                                  filters=filters)

In the defined filters we specified a dataset. Once a new item is uploaded (created) in this dataset, the CLAHE function will be executed for this item. You can also add filters to specify the item type (image, video, JSON, directory, etc.) or a certain format (jpeg, jpg, WebM, etc.).

A separate trigger must be set for each function in your service. Now, we will define a trigger for the second function in the module, rgb2gray. Each time an item is updated, the rgb2gray function will be invoked:

trigger = service.triggers.create(name='image-processing-rgb',
                                  function_name='rgb2gray',
                                  execution_mode=dl.TriggerExecutionMode.ALWAYS,
                                  resource=dl.TriggerResource.ITEM,
                                  actions=dl.TriggerAction.UPDATED,
                                  filters=filters)

To trigger the function only once (only on the first item update), set TriggerExecutionMode.ONCE instead of TriggerExecutionMode.ALWAYS.

Execute the function

Now we can upload (“create”) an image to our dataset to trigger the service. The function clahe_equalization will be invoked:

item = dataset.items.upload(
    local_path=['https://github.com/dataloop-ai/dtlpy-documentation/raw/main/assets/images/hamster.jpg'])

The remote path is optional; by default, images are uploaded to the dataset's root directory.
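If you do want to control the destination folder, pass remote_path; a minimal sketch (the folder name is illustrative):

# Upload to a specific (illustrative) remote folder instead of the root directory
item = dataset.items.upload(
    local_path=['https://github.com/dataloop-ai/dtlpy-documentation/raw/main/assets/images/hamster.jpg'],
    remote_path='/processed-inputs')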


Review the function’s logs

You can review the execution log history to check that your execution succeeded:

service.log()

The transformed image will be saved in your dataset. Once you see in the log that the execution succeeded, you may open the item to see its transformation:

item.open_in_web()
Pause the service

We recommend pausing the service you created for this tutorial so it will not be triggered:

service.pause()

Congratulations! You have successfully created, deployed, and tested Dataloop functions!

Multiple Modules

You can define multiple different modules in a package. A typical use case for multiple modules is a single code base that serves a number of services (for different applications). For example, having a single YoloV4 codebase, but creating different modules for training, inference, etc.

When creating a service from that package, you will need to define which module the service will serve (a service can only serve a single module, with all its functions). For example, to push a two-module package, you will need two entry points, one for each module. This is how you define the modules:

modules = [
    dl.PackageModule(
        name='first-module',
        entry_point='first_module_main.py',
        functions=[
            dl.PackageFunction(
                name='run',
                inputs=[dl.FunctionIO(name='item',
                                      type=dl.PackageInputType.ITEM)]
            )
        ]
    ),
    dl.PackageModule(
        name='second-module',
        entry_point='second_module_main.py',
        functions=[
            dl.PackageFunction(
                name='run',
                inputs=[dl.FunctionIO(name='item',
                                      type=dl.PackageInputType.ITEM)]
            )
        ]
    )
]

Create the package with your modules

package = project.packages.push(package_name='two-modules-test',
                                modules=modules,
                                src_path='<path to where the entry point is located>'
                                )

You will pass these modules as a parameter to packages.push(). After that, when you deploy the package, you will need to specify the module name. Note: a service can only implement one module.

service = package.deploy(
    module_name='first-module',
    service_name='first-module-test-service'
)

UI Slots

Define a UI slot in the platform

UI slots can be created for any function, making it possible to invoke the function through a button click. Binding functions to UI slots enables you to manually trigger them on selected items, annotations, or tasks.

Dataloop currently supports the following UI slots:

  1. Item as a resource:

    1. Dataset Browser

    2. Annotation Studio

  2. Annotation as a resource:

    1. Annotation Studio

  3. Task as a resource:

    1. Tasks browser

Let’s define a UI button for the “RGB to Gray” function. For that, we create a slot entity in the SDK, which can later be activated from the UI to quickly invoke the function.

In this case, the input for the RGB function is an item, so the slot resource should be an item as well (i.e. SlotDisplayScopeResource.ITEM). As a result, the function will be accessible in the annotation studio under the “Applications” dropdown:

import dtlpy as dl
slots = [
    dl.PackageSlot(module_name='image-processing',
                   function_name='rgb2gray',
                   display_name='RGB2GRAY',
                   post_action=dl.SlotPostAction(type=dl.SlotPostActionType.NO_ACTION),
                   display_scopes=[
                       dl.SlotDisplayScope(
                           resource=dl.SlotDisplayScopeResource.ITEM,
                           panel=dl.UiBindingPanel.ALL,
                           filters={})])
]

Once the function finishes executing, you can decide what happens with its output. Currently, 3 post-action types are available for UI slots:

  1. SlotPostActionType.DOWNLOAD - Download the output, available only for item output.

  2. SlotPostActionType.DRAW_ANNOTATION - Available only for annotation output. Draw the annotation on the item.

  3. SlotPostActionType.NO_ACTION - Take no further action.

Additionally, you can use filters to specify which items are eligible for the slot (e.g. filtered by item type, item format, etc.).
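As a further illustration, here is a hedged sketch of a slot for a function that outputs annotations, using the DRAW_ANNOTATION post-action; the function name detect_faces is hypothetical and the variable is named separately so it does not interfere with the tutorial's slots:

# Illustrative slot for a (hypothetical) annotation-output function
annotation_slots = [
    dl.PackageSlot(module_name='image-processing',
                   function_name='detect_faces',  # hypothetical function returning annotations
                   display_name='Detect Faces',
                   post_action=dl.SlotPostAction(type=dl.SlotPostActionType.DRAW_ANNOTATION),
                   display_scopes=[
                       dl.SlotDisplayScope(
                           resource=dl.SlotDisplayScopeResource.ITEM,
                           panel=dl.UiBindingPanel.ALL,
                           filters={})])
]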

Update the Package and Service with the Slot

Now you can update your package and service with the new slot you added:

# Update package with the slot
package.slots = slots
package = package.update()
# Update service with the new package version
service.package_revision = package.version
service.update()
Activate the UI slot

To make the UI slot visible to users in the platform, you need to activate the slots:

package.services.activate_slots(service=service,
                                project_id=project.id,
                                slots=slots)

Notice that clicking on the UI slot button will trigger the service only if it is active.

Pause the service

We recommend pausing the service you created for this tutorial so it will not be triggered:

service.pause()

Congratulations! You have successfully created, deployed, and tested Dataloop functions!

Executions Control

Execution Termination

Sometimes, when we run long-term executions such as model training, we need the option to terminate the execution. This is facilitated using termination checkpoints. To make an execution stoppable, set checkpoints in the code that check whether the execution has received a termination request, and if it has, raise the termination exception. This allows you to save work that was already done before terminating. For example:

class ServiceRunner(dl.BaseServiceRunner):
    def detect(self, item: dl.Item):
        # Do some work
        foo = 0
        self.kill_event()
        # Do some more work
        bar = 1
        self.kill_event()
        # Sleep for a while
        import time
        time.sleep(1)
        # And... done!
        return

Each time “kill_event” is called, the service runner checks whether this execution has received a termination request. To terminate such an execution, we use:

execution.terminate()
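The execution object can be fetched by its ID if it is not already at hand; a minimal sketch with a placeholder ID:

# Fetch the execution by its ID (placeholder) and request termination
execution = dl.executions.get(execution_id='<execution-id>')
execution.terminate()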
Execution Timeout

You can tell an execution to stop after a given number of seconds with the timeout parameter (the default is 1 hour). In case of a service reset, such as on timeout or a service update, if there are running executions the service will wait for the execution timeout before resetting. The timeout must be a natural number (int).

service.execution_timeout = 60  # 1 minute

You can decide what to do with executions that have experienced a timeout. There are 2 options for timeout handling:

  1. Mark execution as failed

  2. Retry

service.on_reset = 'failed'
service.on_reset = 'rerun'
# The service must be updated after changing these attributes
service.update()

Using Secrets in a Function

You can use organization integration secrets (key-value) as environment variables inside the function's environment. First you'll need to add the integrations (UI only), then simply add the integration IDs when deploying the service:

package: dl.Package
package.services.deploy(name='with-integrations',
                        secrets=['integrationId'])

Inside the service you can access the values using the os package:

import os
print(os.environ['INTEGRATION_NAME'])

FaaS Docker Image

Dataloop enables you to deploy a custom Docker image in the FaaS module, to enrich your application with literally anything required for your project. Deploying a Docker image is as easy as providing the Docker image path when deploying the service:

service = package.deploy(service_name='my-service',
                         runtime=dl.KubernetesRuntime(
                             runner_image='docker.io/python:3.8'
                         ))

or if you want to update an existing service:

service = dl.services.get('service-id-or-name')
service.runtime.runner_image = 'python:3.8'
service.update()
Our Docker Image

Our list of Docker images is publicly available on Docker Hub. You can see the environment of each image in its Dockerfile.

Public Docker Images

You can use any public docker image, and on runtime, the Dataloop agent will install:

  1. Package requirements

  2. dtlpy package (version as defined on the service)

  3. dtlpy-agent (same version as the SDK)

For example, using docker.io/python:3.9.13 will run the function with Python 3.9.

Build Your Own Docker Image

If you need a different environment or some apt-get installations, you can create any Docker image and use it directly. You will need to set the HOME directory to /tmp and install the Python packages with --user (or as USER 1000). For instance:

FROM dockerhub.io/dataloopai/dtlpy-agent:latest.gpu.cuda11.5.py3.8.opencv  
  
RUN apt update && apt install -y zip ffmpeg  
  
USER 1000  
ENV HOME=/tmp  
RUN pip3 install --user \  
    dtlpy==1.54.10 \  
    dtlpy-agent==1.54.10 \  
    torch \  
    torchvision \  
    imgaug \  
    scikit-image==0.17.2  
Using Private Docker Registry

To connect a private registry, add the Docker container registry credentials as an Organization Secret (UI only), and then simply use the runner image when deploying the service.
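For example, a minimal sketch (the registry URL and image name are illustrative), assuming the credentials secret is already configured:

# Deploy with a runner image hosted on a private registry (URL is illustrative)
service = package.deploy(service_name='private-image-service',
                         runtime=dl.KubernetesRuntime(
                             runner_image='registry.example.com/my-team/my-image:1.0'
                         ))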

Example: Model Annotations Service

This tutorial demonstrates creating and deploying a service that pre-annotates items before manual annotation work is performed, as part of an active-learning process.

Service Code

Your code can perform any action you need to execute as part of pre-annotating items, for example:

  • Access a remote server/API to retrieve annotations

  • Run your algorithm or ML model as a service in Dataloop FaaS

In this example, we use a simple face detection algorithm based on OpenCV (cv2) and a Caffe model:

import numpy as np
import os
import cv2
import dtlpy as dl
class ServiceRunner:
    def __init__(self,
                 model_filename: str,
                 prototxt_filename: str,
                 min_confidence: float):
        prototxt = os.path.join(os.getcwd(), prototxt_filename)
        weights = os.path.join(os.getcwd(), model_filename)
        print("[INFO] loading model...")
        self.net = cv2.dnn.readNetFromCaffe(prototxt, weights)
        self.min_confidence = min_confidence
    def detect(self, item: dl.Item):
        print("[INFO] downloading image...")
        filename = item.download()
        try:
            # load the input image and construct an input blob for the image
            # by resizing to a fixed 300x300 pixels and then normalizing it
            print("[INFO] opening image...")
            image = cv2.imread(filename)
            (h, w) = image.shape[:2]
            blob = cv2.dnn.blobFromImage(cv2.resize(image,
                                                    (300, 300)), 1.0,
                                         (300, 300),
                                         (104.0, 177.0, 123.0))
            # pass the blob through the network and obtain the detections and
            # predictions
            print("[INFO] computing object detections...")
            self.net.setInput(blob)
            detections = self.net.forward()
            # create annotation builder to add annotations to item
            print("[INFO] uploading annotations...")
            builder = item.annotations.builder()
            # loop over the detections
            for i in range(0, detections.shape[2]):
                # extract the confidence (i.e., probability) associated with the
                # prediction
                confidence = detections[0, 0, i, 2]
                # filter out weak detections by ensuring the `confidence` is
                # greater than the minimum confidence
                if confidence > self.min_confidence:
                    # compute the (x, y)-coordinates of the bounding box for the
                    # object
                    box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                    (startX, startY, endX, endY) = box.astype("int")
                    # draw the bounding box of the face along with the associated
                    # probability
                    builder.add(
                        annotation_definition=dl.Box(
                            top=startY,
                            left=startX,
                            right=endX,
                            bottom=endY,
                            label='person'
                        ),
                        model_info={
                            'name': 'Caffe',
                            'confidence': confidence
                        }
                    )
            # upload annotations
            builder.upload()
            print("[INFO] Done!")
        finally:
            os.remove(filename)
Define the module

In this example, we load the model in the init method, which runs only once at deployment time, saving us time by not loading the model on each execution. The init inputs are attributes that we want the service to keep for its entire lifetime. In this case, these are the model and weights files we want the service to use, and the confidence threshold for accepting detections.

module = dl.PackageModule(
    init_inputs=[
        dl.FunctionIO(name='model_filename', type=dl.PackageInputType.STRING),
        dl.FunctionIO(name='prototxt_filename', type=dl.PackageInputType.STRING),
        dl.FunctionIO(name='min_confidence', type=dl.PackageInputType.FLOAT)
    ],
    functions=[
        dl.PackageFunction(
            name='detect',
            description='OpenCV face detection using Caffe model',
            inputs=[
                dl.FunctionIO(name='item', type=dl.PackageInputType.ITEM)
            ]
        )
    ]
)
Model and weights files

The function uses 2 files containing the model and its weights for inferencing detections. These files must be located in the same folder as the entry point. To get these files, download them from: https://storage.googleapis.com/dtlpy/model_assets/faas-tutorial/model_weights.zip

Package Requirements

Our package's codebase uses 2 Python libraries that are not standard ones. Therefore, we need to make sure they are pre-installed before running the entry point. One way to do so is to use a custom Docker image (see https://dataloop.ai/docs/service-runtime#customimage). The other way is to add a requirements.txt file to the package codebase. To do so, simply add the following requirements.txt file in the same folder as the entry point (main.py):

numpy==1.18
opencv-python==3.4

Push the Package

Make sure you have the following files in one directory:

  • main.py

  • requirements.txt

  • res10_300x300_ssd_iter_140000.caffemodel

  • deploy.prototxt.txt

Run this to push your package:

package = project.packages.push(
    src_path='<path to folder containing the codebase>',
    package_name='face-detector',
    modules=[module]
)

Deploy The Service

The package is now ready to be deployed as a service in the Dataloop platform. Whenever executed, your package will run as a service on the default instance type. Review the service configuration to adjust it to your needs, for example:

  • Change the instance type to use stronger instances with more memory, CPU and GPU

  • Increase auto-scaling to handle larger loads

  • Increase timeouts to allow longer execution time

service = package.deploy(
    service_name='face-detector',
    init_input=[
        dl.FunctionIO(name='model_filename',
                      type=dl.PackageInputType.STRING,
                      value='res10_300x300_ssd_iter_140000.caffemodel'),
        dl.FunctionIO(name='prototxt_filename',
                      type=dl.PackageInputType.STRING,
                      value='deploy.prototxt.txt'),
        dl.FunctionIO(name='min_confidence',
                      type=dl.PackageInputType.FLOAT,
                      value=0.5)
    ],
    # concurrency=1 means that only one execution can run at a time (no parallel executions)
    runtime=dl.KubernetesRuntime(concurrency=1)
)

Trigger the Service

Once the service is deployed, we can create a trigger to run it automatically when a certain event occurs. In our example, we trigger the face-detection service whenever an item is uploaded to the platform. Consider using other triggers or different ways to run your service:

  • Add the service to a FaaS node in a pipeline, before annotation tasks

  • Use a DQL trigger to run only on specific datasets, or in specific tasks

  • Run the service when an item is updated

filters = dl.Filters(resource=dl.FiltersResource.ITEM)
filters.add(field='metadata.system.mimetype', values='image*')
trigger = service.triggers.create(
    name='face-detector',
    function_name="detect",
    resource=dl.TriggerResource.ITEM,
    actions=dl.TriggerAction.CREATED,
    filters=filters
)

Uploading Model Weights as Artifacts

Large data files such as ML model weights can be too big to include in a package. These and other large files can be uploaded as artifacts, and downloaded at runtime inside the service:

self.package.artifacts.download(artifact_name=artifact_filename,
                                local_path=full_weight_path)
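To make the weights available for download, upload them first; a minimal sketch, assuming the artifacts repository accepts a local file path (the path is illustrative):

# Upload a local weights file (illustrative path) as a package artifact
artifact = package.artifacts.upload(filepath='/path/to/model_weights.caffemodel')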

Task Workflows

Tutorials for workforce management

Creating Tasks

Tasks are created in the Dataloop platform to initiate annotation or QA work. Creating a task requires defining the data items to be included, the assignees working on the task, and various options such as workload, custom statuses and more.

Create A Task (Annotation task or QA task) Using Filter

The following example demonstrates creating a task from an items filter. The script includes 2 examples: filtering an entire folder/directory, and filtering by item annotation status.

import dtlpy as dl
import datetime
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
# Create a task with all items in a specific folder
filters = dl.Filters(field='<dir>', values='</my/folder/directory>')
# filter items without annotations
filters = dl.Filters(field='<annotated>', values=False)
# Create annotation task with filters
task = dataset.tasks.create(
    task_name='<task_name>',
    due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
    assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'],
    # The items will be divided equally between assignments
    filters=filters  # filter by folder directory or use other filters
)
# Create QA task with filters
qa_task = dataset.tasks.create_qa_task(task=task,
                                       due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
                                       assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'],
                                       filters=filters  # this filter is for "completed items"
                                       )
List of Items

Create a task from a list of items. The items will be divided equally between the annotators' assignments:

import dtlpy as dl
import datetime
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
items = dataset.items.list()
items_list = [item for item in items.all()]
task = dataset.tasks.create(
    task_name='<task_name>',
    due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
    assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'],
    # The items will be divided equally between assignments
    items=items_list
)
Entire Dataset

Create a task from all items in a dataset. The items will be divided equally between the annotators' assignments:

import dtlpy as dl
import datetime
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
# Create annotation task
task = dataset.tasks.create(
    task_name='<task_name>',
    due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
    assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>']
    # The items will be divided equally between assignments
)

Add items to an existing task

Adding items to an existing task will create new assignments (for new assignee/s).

By Filters
import dtlpy as dl
import datetime
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
filters = dl.Filters(field='<metadata.system.refs>', values=[])  # filter on unassigned items
# Add the filtered items to the task
task.add_items(
    filters=filters,  # filter by folder directory or use other filters
    assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'])
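You can also add an explicit list of items; a minimal sketch, assuming add_items accepts an items list the same way tasks.create does:

# Add a specific list of items to the existing task
items = dataset.items.list()
items_list = [item for item in items.all()]
task.add_items(items=items_list,
               assignee_ids=['<annotator1@dataloop.ai>'])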

Managing Tasks & Assignments

Get Task
import dtlpy as dl
# Get task by ID
task = dl.tasks.get(task_id='<my-task-id>')
# Get task by name - in a project
project = dl.projects.get(project_name='<project_name>')
task = project.tasks.get(task_name='<my-task-name>')
# Get task by name - in a Dataset
dataset = project.datasets.get(dataset_name='<dataset_name>')
task = dataset.tasks.get(task_name='<my-task-name>')
# Get all tasks (list) in a project
tasks = project.tasks.list()
# Get all tasks (list) in a dataset
tasks = dataset.tasks.list()
Get Assignments
# Get assignment by assignment ID
assignment = dl.assignments.get(assignment_id='<my-assignment-id>')
# Get assignment by name – in a project
project = dl.projects.get(project_name='<project_name>')
assignment = project.assignments.get(assignment_name='<my-assignment-name>')
# Get assignment by name – in a dataset
dataset = project.datasets.get(dataset_name='<dataset_name>')
assignment = dataset.assignments.get(assignment_name='<my-assignment-name>')
# Get assignment by name – in a task
task = project.tasks.get(task_name='<my-task-name>')
assignment = task.assignments.get(assignment_name='<my-assignment-name>')
# Get assignments list - in a project
assignments = project.assignments.list()
# Get assignments list - in a dataset
assignments = dataset.assignments.list()
# Get assignments list - in a task
assignments = task.assignments.list()
Get Assignment Items
assignment_items = assignment.get_items()
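Assuming get_items returns a pages object (as item listings do elsewhere in the SDK), you can iterate over the assignment's items:

# Iterate the assignment's items
for item in assignment_items.all():
    print(item.name)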

Redistribute and Reassign Assignments

import dtlpy as dl
import datetime
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
task = dl.tasks.get(task_id='<my-task-id>')
assignment = task.assignments.get(assignment_name='<my-assignment-name>')
Redistribute

Redistributing an assignment means distributing its items among any combination of assignees. The process is identical for annotation and QA tasks.

# load is the workload percentage for each annotator
assignment.redistribute(dl.Workload([dl.WorkloadUnit(assignee_id='<annotator1@dataloop.ai>', load=50),
                                     dl.WorkloadUnit(assignee_id='<annotator2@dataloop.ai>', load=50)]))
Reassign

Reassigning an assignment changes the assignee from its original one to another.

assignment.reassign(assignee_ids=['<annotator1@dataloop.ai>'])
Delete Task and Assignments
Delete Task
Note
When a task is deleted, all its assignments will be deleted as well.
task.delete()
Delete Assignment
assignment.delete()

Image Annotations

Tutorials for creating all types of image annotations

Setup

This tutorial guides you through the process of using the Dataloop SDK to create and upload annotations into items. The tutorial includes chapters covering the different tools, and the last chapter includes various more advanced scripts.

import dtlpy as dl
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
Initiation

Using the annotation definitions classes you can create, edit, view and upload platform annotations. Each annotation init receives the coordinates for the specific type, label, and optional attributes.

Optional Plotting

Before updating items with annotations, you can optionally plot the annotation you created and review it before uploading it. This applies to all annotations described in the following section.

import matplotlib.pyplot as plt
plt.figure()
plt.imshow(builder.show())
for annotation in builder:
    plt.figure()
    plt.imshow(annotation.show())
    plt.title(annotation.label)

Classification, Point and Pose

Classification

Classify a single item

# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Classify
builder.add(annotation_definition=dl.Classification(label=label))
# Upload classification to the item
item.annotations.upload(builder)
Classify Multiple Items

Classifying multiple items requires using an Items entity with a filter.

# multiple items classification using filter
...
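A minimal sketch of one way to do this, filtering items by an illustrative folder and classifying each item in the results:

# Filter items in an (illustrative) folder and classify each one
filters = dl.Filters(field='dir', values='/my/folder')
pages = dataset.items.list(filters=filters)
for item in pages.all():
    builder = item.annotations.builder()
    builder.add(annotation_definition=dl.Classification(label='my-label'))
    item.annotations.upload(builder)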
Create a Point Annotation
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create point annotation with label and attribute
builder.add(annotation_definition=dl.Point(x=100,
                                           y=100,
                                           label='my-label',
                                           attributes={'color': 'red'}))
# Upload point to the item
item.annotations.upload(builder)
Pose Annotation

Pose annotations are a collection of points that follow a certain template, for example a ‘skeleton’ for tracking key points on people appearing in image or video items. Templates are created in the Dataloop platform, in the instructions settings of a recipe.

# Pose annotation is based on pose template. Create the pose template from the platform UI and use it in the script by its ID
template_id = recipe.get_annotation_template_id(template_name="my_template_name")
# Get item
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Define the Pose parent annotation and upload it to the item
parent_annotation = item.annotations.upload(
    dl.Annotation.new(annotation_definition=dl.Pose(label='my_parent_label',
                                                    template_id=template_id,
                                                    # instance_id is optional
                                                    instance_id=None)))[0]
# Add child points
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Point(x=x,
                                           y=y,
                                           label='my_point_label'),
            parent_id=parent_annotation.id)
builder.upload()

Bounding Box and Cuboid

Create Box Annotation
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create box annotation with label
builder.add(annotation_definition=dl.Box(top=10,
                                         left=10,
                                         bottom=100,
                                         right=100,
                                         label='my-label'))
# Upload box to the item
item.annotations.upload(builder)
Create a Rotated Bounding Box Annotation

A rotated box is created by setting its top-left and bottom-right coordinates, and providing its rotation angle.

# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create rotated box annotation with label and angle
builder.add(annotation_definition=dl.Box(top=10,
                                         left=10,
                                         bottom=100,
                                         right=100,
                                         angle=80,
                                         label='my-label'))
# Upload box to the item
item.annotations.upload(builder)
Convert Semantic Segmentation to Bounding Box

Convert all semantic segmentation annotations in an item into box annotations:

annotations = item.annotations.list()
builder = item.annotations.builder()
# run over all annotation in item
for annotation in annotations:
    if annotation.type == dl.AnnotationType.SEGMENTATION:
        print("Found binary annotation - id:", annotation.id)
        builder.add(annotation_definition=annotation.annotation_definition.to_box())
item.annotations.upload(annotations=builder)
Create Cuboid (3D Box) Annotation

Create a cuboid annotation in one of two ways:

# A. Provide the front and back rectangles and the angle of the cuboid
builder.add(annotation_definition=dl.Cube.from_boxes_and_angle(label="label",
                                                               front_top=100,
                                                               front_left=100,
                                                               front_right=300,
                                                               front_bottom=300,
                                                               back_top=200,
                                                               back_left=200,
                                                               back_right=400,
                                                               back_bottom=400,
                                                               angle=0
                                                               ))
# B. Provide all 8 points of the cuboid
builder.add(annotation_definition=dl.Cube(label="label",
                                          # front top left point coordinates
                                          front_tl=[200, 200],
                                          # front top right point coordinates
                                          front_tr=[500, 250],
                                          # front bottom right point coordinates
                                          front_br=[500, 550],
                                          # front bottom left point coordinates
                                          front_bl=[200, 500],
                                          # back top left point coordinates
                                          back_tl=[300, 300],
                                          # back top right point coordinates
                                          back_tr=[600, 350],
                                          # back bottom right point coordinates
                                          back_br=[600, 650],
                                          # back bottom left point coordinates
                                          back_bl=[300, 600]
                                          ))
item.annotations.upload(builder)

Polygon and Polyline

Create Single Polygon/Polyline Annotation
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create polygon annotation with label
# with array of points: [[x1, y1], [x2, y2], ..., [xn, yn]]
builder.add(annotation_definition=dl.Polygon(geo=[[100, 50],
                                                  [80, 120],
                                                  [110, 130]],
                                             label='my-label'))
# create Polyline annotation with label
builder.add(annotation_definition=dl.Polyline(geo=[[100, 50],
                                                   [80, 120],
                                                   [110, 130]],
                                              label='my-label'))
# Upload polygon to the item
item.annotations.upload(builder)
Create Multiple Polygons from Mask
annotations = item.annotations.list()
mask_annotation = annotations[0]
builder = item.annotations.builder()
builder.add(dl.Polygon.from_segmentation(mask_annotation.geo,
                                         max_instances=2,
                                         label=mask_annotation.label))
item.annotations.upload(builder)
Convert Mask Annotations to Polygon

More about the from_segmentation() function can be found in the SDK documentation.

annotations = item.annotations.list()
builder = item.annotations.builder()
# run over all annotation in item
for annotation in annotations:
    if annotation.type == dl.AnnotationType.SEGMENTATION:
        print("Found binary annotation - id:", annotation.id)
        builder.add(dl.Polygon.from_segmentation(mask=annotation.annotation_definition.geo,
                                                 # binary mask of the annotation
                                                 label=annotation.label,
                                                 max_instances=None))
        annotation.delete()
item.annotations.upload(annotations=builder)
Convert Polygon Annotation to Mask

More about the from_polygon() function can be found in the SDK documentation. This script uses the cv2 module; please make sure it is installed.

if annotation.type == dl.AnnotationType.POLYGON:
    print("Found polygon annotation - id:", annotation.id)
    builder.add(dl.Segmentation.from_polygon(geo=annotation.annotation_definition.geo,
                                             # polygon coordinates of the annotation
                                             label=annotation.label,
                                             shape=img.size[::-1]  # (h,w)
                                             ))
    # remove the original polygon annotation
    annotation.delete()
item.annotations.upload(annotations=builder)

Ellipse and Item Description

Create Ellipse Annotation
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create ellipse annotation with label. Params: x and y for the center, rx and ry for the radii, and a rotation angle
builder.add(annotation_definition=dl.Ellipse(x=x,
                                              y=y,
                                              rx=rx,
                                              ry=ry,
                                              angle=angle,
                                              label=label))
# Upload the ellipse to the item
item.annotations.upload(builder)
Item Description

Item description is added as a “system annotation” and serves as a way to save information about the item that can be seen by anyone accessing it.

# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Add description (update if already exists)- if text is empty it will remove the description from the item
item.set_description(text="this is item description")

Segmentation

Init Segmentation

Each annotation init receives the coordinates for the specific type, label, and optional attributes. A binary mask should be exactly the same dimensions as the image item, with 0 for background and 1 for the annotation.

annotations_definition = dl.Segmentation(geo=geo, label=label)
Create a Semantic Segmentation Annotation
import numpy as np
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create semantic segmentation mask with label and attribute
mask = np.zeros(shape=(item.height, item.width), dtype=np.uint8)
# mark some part in the middle
mask[50:100, 200:250] = 1
# Add annotations of type segmentation
builder.add(annotation_definition=dl.Segmentation(geo=mask,
                                                  label='my-label'))
# Optional: Plot all of the annotations you created before uploading them to the platform
import matplotlib.pyplot as plt
plt.figure()
plt.imshow(builder.show())
for annotation in builder:
    plt.figure()
    plt.imshow(annotation.show())
    plt.title(annotation.label)
# Upload semantic segmentation to the item
item.annotations.upload(builder)
Convert Mask to Polygon

The Dataloop SDK includes a function to convert a semantic mask to a polygon annotation, which is often easier to edit and work with in the UI. The following example filters for items with semantic mask annotations and converts them into polygon annotations.

filters = dl.Filters()
# set resource
filters.resource = 'items'
# add filter - only files
filters.add(field='type', values='file')
# add annotation filters - only items with 'binary' annotations
filters.add_join(field='type', values='binary')
# get results
pages = dataset.items.list(filters=filters)
# run over all items in page
for page in pages:
    for item in page:
        print('item=' + item.id)
        annotations = item.annotations.list()
        builder = item.annotations.builder()
        # run over all annotation in item
        for annotation in annotations:
            # print(annotation)
            if annotation.type == 'binary':
                print("Found binary annotation - id:", annotation.id)
                builder.add(dl.Polygon.from_segmentation(mask=annotation.annotation_definition.geo,
                                                         # binary mask of the annotation
                                                         label=annotation.label,
                                                         max_instances=None))
                annotation.delete()
        item.annotations.upload(annotations=builder)
Convert Polygon to Mask

The Dataloop SDK also includes a function to convert a polygon annotation into a semantic mask annotation. The following example filters for items with polygon annotations and converts them into semantic mask annotations. This script uses the cv2 module; please make sure it is installed.

from PIL import Image
filters = dl.Filters()
# set resource
filters.resource = 'items'
# add filter - only files
filters.add(field='type', values='file')
# add annotation filters - only items with polygon annotations
filters.add_join(field='type', values='segment')
# get results
pages = dataset.items.list(filters=filters)
# run over all items in page
for page in pages:
    for item in page:
        print('item=' + item.id)
        annotations = item.annotations.list()
        item = dataset.items.get(item_id=item.id)
        buffer = item.download(save_locally=False)
        img = Image.open(buffer)
        builder = item.annotations.builder()
        # run over all annotation in item
        for annotation in annotations:
            # print(annotation)
            if annotation.type == 'segment':
                print("Found polygon annotation - id:", annotation.id)
                builder.add(dl.Segmentation.from_polygon(geo=annotation.annotation_definition.geo,
                                                         # polygon coordinates of the annotation
                                                         label=annotation.label,
                                                         shape=img.size[::-1]  # (h,w)
                                                         ))
                annotation.delete()
        item.annotations.upload(annotations=builder)
Create Semantic Segmentation from Image Mask and Upload

The following script creates semantic mask annotations based on the RGB colors of an image item and uploads them to the Dataloop platform. Please note that directory paths look different on Windows and Linux; Linux paths do not require the 'r' prefix. Make sure to install the OpenCV package, version 3.4.8.x, to use with this script (pip install "opencv-python==3.4.8.*").

from PIL import Image
import numpy as np
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# image filepath
image_filepath = r'C:/home/images/with_family.png'
# annotations filepath - RGB with color for each label
annotations_filepath = r'C:/home/masks/with_family.png'
# upload item to root directory
item = dataset.items.upload(local_path=image_filepath,
                            remote_path='/')
# read mask from file
mask = np.array(Image.open(annotations_filepath))
# get unique color (labels)
unique_colors = np.unique(mask.reshape(-1, mask.shape[2]), axis=0)
# init dataloop annotations builder
builder = item.annotations.builder()
# for each label - create a dataloop mask annotation
for i, color in enumerate(unique_colors):
    print(color)
    if i == 0:
        # ignore background
        continue
    # get mask of same color
    class_mask = np.all(color == mask, axis=2)
    # add annotation to builder
    builder.add(annotation_definition=dl.Segmentation(geo=class_mask,
                                                      label=str(i)))
# upload all annotations (once, after the loop)
item.annotations.upload(builder)

Advanced Tutorials

Copy Annotations Between Items

By taking the annotations of one item and uploading them to another, we can copy annotations between items. Iterating over all items in a filter allows us to copy annotations from one item to multiple items, for example video snapshots containing the same object.

# Set the source item with the annotations we want to copy
project = dl.projects.get(project_name='second-project_name')
dataset = project.datasets.get(dataset_name='second-dataset_name')
item = dataset.items.get(item_id='first-id-number')
annotations = item.annotations.list()
# Set the target item where we want to copy to. If located on a different Project or Dataset, set these accordingly
item = dataset.items.get(item_id='second-id-number')
item.annotations.upload(annotations=annotations)
# Copy the annotation into multiple items, based on a filter entity. In this example, the filter is based on directory
filters = dl.Filters()
filters.add(field='filename', values='/fighting/**')  # take files from the directory only (recursive)
filters.add(field='type', values='file')  # only files
pages = dataset.items.list(filters=filters)
for page in pages:
    for item in page:
        # upload annotations
        item.annotations.upload(annotations=annotations)
Show Images & Annotations

This script uses the cv2 module; please make sure it is installed.

from PIL import Image
import numpy as np
# Get item
item = dataset.items.get(item_id='write-your-id-number')
# download item as a buffer
buffer = item.download(save_locally=False)
# open image
image = Image.open(buffer)
# download annotations
annotations = item.annotations.show(width=image.size[0],
                                    height=image.size[1],
                                    thickness=3)
annotations = Image.fromarray(annotations.astype(np.uint8))
# show the annotations and the image separately
annotations.show()
image.show()
# Show the annotations with the image
image.paste(annotations, (0, 0), annotations)
image.show()
Show Annotations from JSON file (Dataloop format)

Please note that directory paths look different on Windows and Linux; Linux paths do not require the 'r' prefix.

from PIL import Image
import json
import numpy as np
import dtlpy as dl
with open(r'C:/home/project/images/annotation.json', 'r') as f:
    data = json.load(f)
for annotation in data['annotations']:
    annotations = dl.Annotation.from_json(annotation)
    mask = annotations.show(width=640,
                            height=480,
                            thickness=3,
                            color=(255, 0, 0))
    mask = Image.fromarray(mask.astype(np.uint8))
    mask.show()
Count total number of annotations

The following script counts the number of annotations in a filter. The filter can be set to any context - a dataset, a folder, or any other specific criteria. In the following example, it is set to a dataset.

# Create annotations filters instance
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
filters.page_size = 0
# Count the annotations
annotations_count = dataset.annotations.list(filters=filters).items_count
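The same approach can count annotations matching narrower criteria; for example, counting annotations with a specific (illustrative) label:

# Count only annotations carrying an (illustrative) label
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
filters.add(field='label', values='person')
filters.page_size = 0
label_count = dataset.annotations.list(filters=filters).items_count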
Parenting Annotations

Parenting establishes a relation between 2 annotations, and is set via the parent_id parameter. The Dataloop system will reject an attempt to set circular parenting. The following script demonstrates setting a parenting relation while uploading/creating annotations:

builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
                                         label='my-parent-label'))
# upload parent annotation
annotations = item.annotations.upload(annotations=builder)
# create the child annotation
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
                                         label='my-child-label'),
            parent_id=annotations[0].id)
# upload annotations to item
item.annotations.upload(annotations=builder)

The following script demonstrates setting a parenting relation on existing annotations:

# create and upload parent annotation
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
                                         label='my-parent-label'))
parent_annotation = item.annotations.upload(annotations=builder)[0]
# create and upload child annotation
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
                                         label='my-child-label'))
child_annotation = item.annotations.upload(annotations=builder)[0]
# set the child parent ID to the parent
child_annotation.parent_id = parent_annotation.id
# update the annotation
child_annotation.update(system_metadata=True)
Change Annotations’ Label

The following example creates a new label in the recipe (an optional step, you can also use an existing label), then applies it to all annotations in a certain filter.

# Create a new label
dataset.add_label(label_name='newLabel', color=(2, 43, 123))
# Filter annotations with the "oldLabel" label.
filters = dl.Filters()
filters.resource = dl.FiltersResource.ANNOTATION
filters.add(field='label', values='oldLabel')
pages = dataset.annotations.list(filters=filters)
# Change the label of the annotations - for every annotation we filtered, change its label to "newLabel"
for annotation in pages.all():
    annotation.label = 'newLabel'
    annotation.update()

Video Annotations

Tutorials for annotating videos

In this tutorial we create and upload annotations into a video item. Video annotations differ from image annotations, since they span over frames and need to be set with their scope. This script uses the cv2 module; please make sure it is installed.

Setup

import dtlpy as dl
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
item = dataset.items.get(filepath='/my_item.mp4')

Create a Single Annotation

Create a single annotation for a video item and upload it:

annotation = dl.Annotation.new(item=item)
# Span the annotation over 100 frames. Change this or use a different approach based on your context
for i_frame in range(100):
    # go over 100 frame
    annotation.add_frame(annotation_definition=dl.Box(top=2 * i_frame,
                                                      left=2 * (i_frame + 10),
                                                      bottom=2 * (i_frame + 50),
                                                      right=2 * (i_frame + 100),
                                                      label="my-label"),
                         frame_num=i_frame,  # set the frame for the annotation
                         )
# upload to platform
annotation.upload()

Adding Multiple Annotations Using Annotation Builder

The following script demonstrates adding 10 annotations to each frame:

# create annotation builder
builder = item.annotations.builder()
for i_frame in range(100):
    # go over 100 frames
    for i_detection in range(10):
        # for each frame we have 10 different detections (location is just for the example)
        builder.add(annotation_definition=dl.Box(top=2 * i_frame,
                                                 left=2 * i_detection,
                                                 bottom=2 * i_frame + 10,
                                                 right=2 * i_detection + 100,
                                                 label="my-label"),
                    # set the frame for the annotation
                    frame_num=i_frame,
                    # need to input the element id to create the connection between frames
                    object_id=i_detection + 1,
                    )
# Upload the annotations to platform
item.annotations.upload(builder)

Read Frames of an Annotation

The following example reads all the frames an annotation exists in, i.e. the frame range the annotation spans:

for annotation in item.annotations.list():
    print(annotation.object_id)
    for key in annotation.frames:
        frame = annotation.frames[key]
        print(frame.left, frame.right, frame.top, frame.bottom)

Create Frame Snapshots from Video

One of Dataloop's video utilities enables creating frame snapshots from a video item every X frames (frame_interval). FFmpeg needs to be installed on your system; it can be downloaded from the official FFmpeg website.

dl.utilities.Videos.video_snapshots_generator(item=item, frame_interval=30)

Play An Item In Video Player

Play a video item, with its annotations and labels, in a video player:

from dtlpy.utilities.videos.video_player import VideoPlayer
VideoPlayer(project_name=project_name,
            dataset_name=dataset_name,
            item_filepath=item_filepath)

Show Annotations in a Specified Frame

import matplotlib.pyplot as plt
# Get from platform
annotations = item.annotations.list()
# Plot the annotations in frame 55 of the created annotations
frame_annotation = annotations.get_frame(frame_num=55)
plt.figure()
plt.imshow(frame_annotation.show())
plt.title(frame_annotation.label)
# Play video with the Dataloop video player
annotations.video_player()

Recipe and Ontology

Tutorials for managing ontologies, labels, and recipes

Recipe and Ontology Concepts

The Dataloop Recipe & Ontology concepts are detailed in our documentation. In short:

  • Ontology - an entity that contains labels and attributes. An attribute is linked to a label

  • Recipe - An entity that ties an ontology with labeling instructions

    • Linked with an ontology

    • Labeling tools (e.g. box, polygon etc)

    • Optional PDF instructions

    • And more…

In this chapter, we will create an ontology and populate it with labels.

Preparing - Entities setup

import dtlpy as dl
if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Get recipe from list
recipe = dataset.recipes.list()[0]
# Or get specific recipe:
recipe = dataset.recipes.get(recipe_id='id')
# Get ontology from list or create it using the "Create Ontology" script
ontology = recipe.ontologies.list()[0]
# Or get specific ontology:
ontology = recipe.ontologies.get(ontology_id='id')
# Print entities:
recipe.print()
ontology.print()

Create an Ontology

project = dl.projects.get(project_name='project_name')
ontology = project.ontologies.create(title="your_created_ontology_title",
                                     labels=[dl.Label(tag="Chameleon", color=(255, 0, 0))])

Labels

The ontology uses the ‘Labels’ entity, which is a Python list object, so you can use Python list methods such as sort(). Be sure to call ontology.update() after each list action.

ontology.add_labels(label_list=['Shark', 'Whale', 'Animal.Donkey'], update_ontology=True)

Labels can be added with a branched hierarchy, to facilitate sub-labels at up to 5 levels. A label hierarchy is created by adding ‘.’ between parent and child labels. Following the example above, this script will get the Donkey label:

child_label = ontology.labels[-1].children[0]
print(child_label.tag, child_label.rgb)

Attributes

An attribute describes a label without having to add more labels. For example, “Car” is a label, but its color is an attribute. You can add multiple attributes to the ontology and map them to labels, for example creating the “color” attribute once but having multiple labels use it. Attributes can be multiple-selection (e.g. checkbox), single-selection (radio button), a value on a slider, a yes/no question, or free text. An attribute can be set as mandatory, so annotators have to answer it before they can complete the item.

Add attributes to the ontology

The following example adds one attribute of each type, all as mandatory attributes:

  • Multiple-choice attribute

  • Single-choice attributes

  • Slider attribute

  • Yes/no question attribute

  • Free text attribute

# checkbox attribute
ontology.update_attributes(key='color',
                           title='Choose a color',
                           attribute_type=dl.AttributesTypes.CHECKBOX,
                           values=['red', 'blue', 'green'],
                           scope=['<label1>', '<label2>'])
# radio button attribute
ontology.update_attributes(key='occluded',
                           title='Level of occlusion',
                           attribute_type=dl.AttributesTypes.RADIO_BUTTON,
                           values=['no', 'mid', 'high'],
                           scope=['*'])
# slider attribute
ontology.update_attributes(key='height',
                           title='Persons height[cm]',
                           attribute_type=dl.AttributesTypes.SLIDER,
                           attribute_range=dl.AttributesRange(0, 200, 10),
                           scope=['*'])
# yes/no attribute
ontology.update_attributes(key='female',
                           title='Is mosquito female?',
                           attribute_type=dl.AttributesTypes.YES_NO,
                           scope=['*'])
# free text attribute
ontology.update_attributes(key='age',
                           title='How old is the person',
                           attribute_type=dl.AttributesTypes.FREE_TEXT,
                           scope=['*'])

Read Ontology Attributes

Read and print all the ontology attributes:

print(ontology.metadata['attributes'])
keys = [att['key'] for att in ontology.metadata['attributes']]

Get all labels (including children):

print(ontology.labels_flat_dict)

Since a recipe is linked to an ontology, it can be used to make changes to labels and attributes. When the recipe is set as the default for a dataset, the same applies to the dataset entity: it can be used to change the labels and attributes that are ultimately linked to it through the recipe and its ontology.

Working With Recipes

# Get recipe from a list
recipe = dataset.recipes.list()[0]
# Get recipe by ID - ID can be retrieved from the page URL when opening the recipe in the platform
recipe = dataset.recipes.get(recipe_id='your-recipe-id')
# Delete recipe - applies only for deleted datasets
dataset.recipes.get(recipe_id='your-recipe-id').delete()

Cloning Recipes

When you want to create a new recipe that’s only slightly different from an existing recipe, it can be easier to start by cloning the original recipe and then making changes to its clone. The shallow parameter controls the clone: if True, the new recipe links to the existing ontology; if False, all ontologies linked to the recipe are cloned as well.

dataset = project.datasets.get(dataset_name="myDataSet")
recipe = dataset.recipes.get(recipe_id="recipe_id")
recipe2 = recipe.clone(shallow=False)

View Dataset Labels

# as objects
labels = dataset.labels
# as instance map
labels = dataset.instance_map

Add Labels by Dataset

Working with dataset labels can be done one by one or as a list. The Dataset entity documentation details all label options.

# Add multiple labels
dataset.add_labels(label_list=['person', 'animal', 'object'])
# Add single label with specific color and attributes
dataset.add_label(label_name='person', color=(34, 6, 231))
# Add single label with a thumbnail/icon
dataset.add_label(label_name='person', icon_path='/home/project/images/icon.jpg')

Add Labels Using Label Object

# Create Labels list using Label object
labels = [
    dl.Label(tag='Donkey', color=(255, 100, 0)),
    dl.Label(tag='Mammoth', color=(34, 56, 7)),
    dl.Label(tag='Bird', color=(100, 14, 150))
]
# Add Labels to Dataset
dataset.add_labels(label_list=labels)
# or you can also create a recipe from the label list
recipe = dataset.recipes.create(recipe_name='My-Recipe-name', labels=labels)

Add a Label and Sub-Labels

label = dl.Label(tag='Fish',
                 color=(34, 6, 231),
                 children=[dl.Label(tag='Shark',
                                    color=(34, 6, 231)),
                           dl.Label(tag='Salmon',
                                    color=(34, 6, 231))]
                 )
dataset.add_labels(label_list=label)
# or you can also create a recipe from the label
recipe = dataset.recipes.create(recipe_name='My-Recipe-name', labels=[label])

Add Hierarchy Labels with Nested

Different options for hierarchy label creation.

# Option A
# add parent label
labels = dataset.add_label(label_name="animal", color=(123, 134, 64))
# add child label
labels = dataset.add_label(label_name="animal.Dog", color=(45, 34, 164))
# add grandchild label
labels = dataset.add_label(label_name="animal.Dog.poodle")
# Option B: only if you don't have attributes
# parent and grandparent (animal and dog) will be generated automatically
labels = dataset.add_label(label_name="animal.Dog.poodle")
# Option C: with the Big Dict
nested_labels = [
    {'label_name': 'animal.Dog',
     'color': '#220605',
     'children': [{'label_name': 'poodle',
                   'color': '#298345'},
                  {'label_name': 'labrador',
                   'color': '#298651'}]},
    {'label_name': 'animal.cat',
     'color': '#287605',
     'children': [{'label_name': 'Persian',
                   'color': '#298345'},
                  {'label_name': 'Balinese',
                   'color': '#298651'}]}
]
# Add Labels to the dataset:
labels = dataset.add_labels(label_list=nested_labels)

Delete Labels by Dataset

dataset.delete_labels(label_names=['Cat', 'Dog'])

Update Label Features

# update an existing label; fails if the label does not exist
dataset.update_label(label_name='Cat', color="#000080")
# update a label; if it does not exist, add it (upsert)
dataset.update_label(label_name='Cat', color="#fcba03", upsert=True)

Model Management

Tutorials for creating and managing models and snapshots

Model Management

Quick Overview

Dataloop’s model management allows machine learning engineers to manage their research and production processes in one centralized place.

Models are run using a combination of Packages, Datasets, and Artifacts.

Model architectures are pushed to the cloud via Packages. Packages are bundles of code that contain the codebase required for the model to run. Datasets include the images used for training or inference, and they also indicate which images belong to each dataset subset (e.g. train/validation/test, or any other division of your dataset that serves specific model training objectives).

Models can come from ready-to-go packages of open source algorithms (e.g. ResNet, Yolo). Models can also be created from pre-trained models for fine-tuning or transfer learning.

You can also upload your own models and compare model performance by viewing training metrics.

All models can be integrated into the Dataloop platform, connected to the UI via buttons or slots, or added to pipelines.

Introduction

In this tutorial we will cover the required Dataloop entities to create, compare, restore, manage, and deploy model training sessions and trained models.

Components of a Model (diagram: https://github.com/dataloop-ai/dtlpy-documentation/blob/model_mgmt_3/assets/images/model_management/model_diagram.png)

Package and Model Entities
Package

We will use the Package entity to save the architecture of the model (e.g. Yolov5, Inception, SVM) and the model algorithm code.

  • In “online” mode (see “Offline vs online mode” below), Packages should include a Model Adapter to create the Dataloop API

Model algorithms that come as-is can be found in the AI Library. All public packages listed in the AI Library are pretrained and include the model algorithm code and default configurations. Users can download the codebase of any package pushed to the cloud.
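
For example, a hedged sketch of pulling a package’s codebase to a local directory (the package name and path are placeholders, and pull() is assumed per dtlpy’s Packages API):

package = project.packages.get(package_name='<package_name>')
package.pull(version='latest', local_path='/path/to/local/dir')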

Model

Using the Package (code), Dataset and Ontology (data and labels), and configuration (a dictionary), we can now create a Model.

The Model contains the weights and any other artifacts needed to load the trained model and run inference.

A model can also be cloned to be a starting point for a new model (for fine-tuning or transfer learning).
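
For example, a hedged sketch of cloning an existing model as a starting point for fine-tuning (the ID and names are placeholders):

model = dl.models.get(model_id='<model_id>')
new_model = model.clone(model_name='my-finetuned-model',
                        dataset=dataset,
                        project_id=project.id)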

Additional Package components

Some users may want to further customize their models, such as uploading their own model weights or creating their own custom model. This can be achieved with Artifacts and a Model Adapter.

Artifacts and Codebase

Artifacts are any additional files necessary for a given model to run on the cloud. For example, if a user wanted to upload their own weights to create a pre-trained model, the weights file would be included as an Artifact. These artifacts can be uploaded via one of the following (see the sketch after this list):

  1. local directory or path

  2. dl.Item

  3. Git repository

  4. other link
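
A hedged sketch of two of the options above (the path and URL are placeholders; dl.LocalArtifact appears later in this tutorial, and dl.LinkArtifact is assumed to be available in your dtlpy version):

import dtlpy as dl
# 1. from a local weights file
local_artifact = dl.LocalArtifact(local_path='/path/to/model.pth')
# 4. from an external link
link_artifact = dl.LinkArtifact(url='https://example.com/model.pth', filename='model.pth')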

The Model Adapter

The Model Adapter is a Python class that creates a single API between Dataloop’s platform and your model. The Model Adapter class contains standardized methods that make it possible to integrate models into other parts of the Dataloop platform. Model Adapters expose the following model functions:

  1. train

  2. predict

  3. load/save model weights

  4. annotation conversion (if needed)

Model comparison

All models can be viewed in one place, and different model versions can be compared and evaluated with user-selected metrics.

Offline vs online mode

Model management can be used in two modes: offline (for local model training) or online (for integration into the Dataloop platform).

In “offline” mode, users can run and train models on their local machine using local data, and can compare model configurations and metrics on the platform. “Offline” requires minimal platform integration, and can be used after dl.Package and dl.Model entities are created. This mode allows only visualizing metrics exported from the (local) model training session.

In “offline” mode, code and weights are not saved anywhere in the Dataloop platform. Only model metrics are saved and viewable at a later time.
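
For example, a hedged sketch of exporting metrics from a local training loop so they can be viewed on the platform (dl.PlotSample and model.metrics follow recent dtlpy versions; the model ID and loss values are placeholders):

model = dl.models.get(model_id='<model_id>')
for epoch, loss in enumerate([0.9, 0.6, 0.4]):
    model.metrics.create(samples=[dl.PlotSample(figure='loss',
                                                legend='train',
                                                x=epoch,
                                                y=loss)],
                         dataset_id=model.dataset_id)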

An example of model metrics (screenshot: https://github.com/dataloop-ai/dtlpy-documentation/blob/model_mgmt_3/assets/images/model_management/metrics_example.png)

In “online” mode, models can be trained and deployed anywhere on the platform. For example, you can easily create a button interface that uses your model to run inference on a new data item and view the result on the platform.

To do this, you need to create a ModelAdapter class and implement the required functions to build the Dataloop API.

“Online” mode also includes all the platform features mentioned above in “offline” mode.

Create your own Package and Model

You can use your own model on the platform by creating Package and Model entities, and then using a model adapter to create an API with Dataloop.

The first step is to create a model adapter class. The example here inherits from dl.BaseModelAdapter, which contains all the Dataloop methods required to interact with the Package and Model. You must implement the following methods in the model adapter class for the integration to work: load, save, train, predict.

In this example, the adapter is defined in a script called “adapter_script.py” and is separate from the rest of the code on this page. This script will load a model from a saved model weights file in the root directory called ‘model.pth’.

import dtlpy as dl
import torch
import os
@dl.Package.decorators.module(name='model-adapter',
                              description='Model Adapter for my model',
                              init_inputs={'model_entity': dl.Model})
class SimpleModelAdapter(dl.BaseModelAdapter):
    def load(self, local_path, **kwargs):
        print('loading a model')
        self.model = torch.load(os.path.join(local_path, 'model.pth'))
    def save(self, local_path, **kwargs):
        print('saving a model to {}'.format(local_path))
        torch.save(self.model, os.path.join(local_path, 'model.pth'))
    def train(self, data_path, output_path, **kwargs):
        print('running a training session')
    def predict(self, batch, **kwargs):
        print('predicting batch of size: {}'.format(len(batch)))
        preds = self.model(batch)
        return preds

NOTE: The code above is an example for a torch model adapter. This example will NOT run if copied as-is. For working examples please refer to the examples in the Dataloop Github.

To create our Package entity, we first need to pack our package code into a dl.ItemCodebase, retrieve the metadata, and indicate where the entry point to the package is. If you’re creating a Package with code from Git, change the codebase type to dl.GitCodebase.

import os
import dtlpy as dl
from adapter_script import SimpleModelAdapter
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
codebase = project.codebases.pack(directory='<path to local dir>')
# codebase: dl.GitCodebase = dl.GitCodebase(git_url='github.com/mygit', git_tag='v25.6.93')
metadata = dl.Package.get_ml_metadata(cls=SimpleModelAdapter,
                                      default_configuration={'weights_filename': 'model.pth',
                                                             'input_size': 256},
                                      output_type=dl.AnnotationType.CLASSIFICATION
                                      )
module = dl.PackageModule.from_entry_point(entry_point='adapter_script.py')

Then we can push the package and all its parts to the cloud. To change the computing configurations, see the Dataloop docs for the Instance Catalog.

package = project.packages.push(package_name='My-Package',
                                src_path=os.getcwd(),
                                package_type='ml',
                                codebase=codebase,
                                modules=[module],
                                is_global=False,
                                service_config={
                                    'runtime': dl.KubernetesRuntime(pod_type=dl.INSTANCE_CATALOG_GPU_K80_S,
                                                                    autoscaler=dl.KubernetesRabbitmqAutoscaler(
                                                                        min_replicas=0,
                                                                        max_replicas=1),
                                                                    concurrency=1).to_json()},
                                metadata=metadata)

Now you can create a model and upload pretrained model weights with an Artifact Item. Here, the Artifact Item is where the saved model weights are stored. You can upload any weights file here and name it according to the ‘weights_filename’ in the configuration.

artifact = dl.LocalArtifact(local_path='<path to weights>')
model = package.models.create(model_name='tutorial-model',
                              description='first model we are uploading',
                              tags=['pretrained', 'tutorial'],
                              dataset_id=None,
                              configuration={'weights_filename': 'model.pth'
                                             },
                              project_id=package.project.id,
                              model_artifacts=[artifact],
                              labels=['car', 'fish', 'pizza']
                              )

Finally, build the model adapter and call one of the adapter’s methods to see that your custom model works. If you’ve entered a dataset_id when creating the model, you can also train the model on that dataset.

adapter = package.build()
adapter.load_from_model(model_entity=model)
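
For example, a hedged usage sketch that runs the adapter’s prediction on a platform item (the item ID is a placeholder; predict_items comes from dl.BaseModelAdapter):

item = dl.items.get(item_id='<item_id>')
adapter.predict_items(items=[item])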

Dataloop Dataloader

A dl.Dataset image and annotation generator for training and for item visualization

We can visualize the data with augmentations for debugging and exploration. After that, we will use the Data Generator as an input to the training functions.

from dtlpy.utilities import DatasetGenerator
import dtlpy as dl
dataset = dl.datasets.get(dataset_id='611b86e647fe2f865323007a')
datagen = DatasetGenerator(data_path='train',
                           dataset_entity=dataset,
                           annotation_type=dl.AnnotationType.BOX)

Object Detection Examples

We can visualize a random item from the dataset:

for i in range(5):
    datagen.visualize()

Or get the same item using its index:

for i in range(5):
    datagen.visualize(10)

Adding augmentations using the imgaug package:

from imgaug import augmenters as iaa
import numpy as np
augmentation = iaa.Sequential([
    iaa.Resize({"height": 256, "width": 256}),
    # iaa.Superpixels(p_replace=(0, 0.5), n_segments=(10, 50)),
    iaa.flip.Fliplr(p=0.5),
    iaa.flip.Flipud(p=0.5),
    iaa.GaussianBlur(sigma=(0.0, 0.8)),
])
tfs = [
    augmentation,
    np.copy,
    # transforms.ToTensor()
]
datagen = DatasetGenerator(data_path='train',
                           dataset_entity=dataset,
                           annotation_type=dl.AnnotationType.BOX,
                           transforms=tfs)
datagen.visualize()
datagen.visualize(10)

All of the Data Generator options (from the function docstring):

:param dataset_entity: dl.Dataset entity
:param annotation_type: dl.AnnotationType - type of annotation to load from the annotated dataset
:param filters: dl.Filters - filtering entity to filter the dataset items
:param data_path: Path to Dataloop annotations (root to “item” and “json”).
:param overwrite:
:param label_to_id_map: dict - {label_string: id} dictionary
:param transforms: Optional transform to be applied on a sample. list or torchvision.Transform
:param num_workers:
:param shuffle: Whether to shuffle the data (default: True). If set to False, sorts the data in alphanumeric order.
:param seed: Optional random seed for shuffling and transformations.
:param to_categorical: convert label id to categorical format
:param class_balancing: if True - performs random over-sampling with class ids as the target to balance training data
:param return_originals: bool - If True, ALSO returns images and annotations before transformations (for debug)
:param ignore_empty: bool - If True, generator will NOT collect items without annotations
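
For example, a hedged sketch that combines the filters and ignore_empty options to load only annotated items from a specific remote folder (the folder path is a placeholder):

filters = dl.Filters(field='dir', values='/train')
datagen = DatasetGenerator(data_path='train',
                           dataset_entity=dataset,
                           filters=filters,
                           ignore_empty=True,
                           annotation_type=dl.AnnotationType.BOX)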

The output of a single element is a dictionary holding all the relevant information. The keys for the DataGen above are: ['image_filepath', 'item_id', 'box', 'class', 'labels', 'annotation_filepath', 'image', 'annotations', 'orig_image', 'orig_annotations']

print(list(datagen[0].keys()))

We’ll add the flag to return the original items to better understand what the augmentations look like. Let’s set the flag and plot:

import matplotlib.pyplot as plt
datagen = DatasetGenerator(data_path='train',
                           dataset_entity=dataset,
                           annotation_type=dl.AnnotationType.BOX,
                           return_originals=True,
                           shuffle=False,
                           transforms=tfs)
fig, ax = plt.subplots(2, 2)
for i in range(2):
    item_element = datagen[np.random.randint(len(datagen))]
    ax[i, 0].imshow(item_element['image'])
    ax[i, 0].set_title('After Augmentations')
    ax[i, 1].imshow(item_element['orig_image'])
    ax[i, 1].set_title('Before Augmentations')

Segmentation Examples

First, we’ll load a semantic dataset and view some images and the output structure:

dataset = dl.datasets.get(dataset_id='6197985a104eb81cb728e4ac')
datagen = DatasetGenerator(data_path='semantic',
                           dataset_entity=dataset,
                           transforms=tfs,
                           return_originals=True,
                           annotation_type=dl.AnnotationType.SEGMENTATION)
for i in range(5):
    datagen.visualize()

Visualize original vs augmented image and annotations mask:

fig, ax = plt.subplots(2, 4)
for i in range(2):
    item_element = datagen[np.random.randint(len(datagen))]
    ax[i, 0].imshow(item_element['orig_image'])
    ax[i, 0].set_title('Original Image')
    ax[i, 1].imshow(item_element['orig_annotations'])
    ax[i, 1].set_title('Original Annotations')
    ax[i, 2].imshow(item_element['image'])
    ax[i, 2].set_title('Augmented Image')
    ax[i, 3].imshow(item_element['annotations'])
    ax[i, 3].set_title('Augmented Annotations')

Convert to a 3D one-hot encoding to visualize the binary mask per label. We will plot only the first 8 labels (there might be more on the item):

item_element = datagen[np.random.randint(len(datagen))]
annotations = item_element['annotations']
unique_labels = np.unique(annotations)
one_hot_annotations = np.arange(len(datagen.id_to_label_map)) == annotations[..., None]
print('unique label indices in the item: {}'.format(unique_labels))
print('unique labels in the item: {}'.format([datagen.id_to_label_map[i] for i in unique_labels]))
plt.figure()
plt.imshow(item_element['image'])
fig = plt.figure()
for i_label_ind, label_ind in enumerate(unique_labels[:8]):
    ax = fig.add_subplot(2, 4, i_label_ind + 1)
    ax.imshow(one_hot_annotations[:, :, label_ind])
    ax.set_title(datagen.id_to_label_map[label_ind])

Setting a Label Map

One of the inputs to the DatasetGenerator is ‘label_to_id_map’. This variable can be used to change the label mapping for the annotations and allows using the dataset ontology in a greater variety of cases. For example, you can map multiple labels to a single id, or add a default value for all the unlabeled pixels in segmentation annotations. This is what the annotation looks like without any mapping:

# project = dl.projects.get(project_name='Semantic')
# dataset = project.datasets.get(dataset_name='Hamster')
# dataset.items.upload(local_path='assets/images/hamster.jpg',
#                      local_annotations_path='assets/images/hamster.json')
dataset = dl.datasets.get(dataset_id='621ddc855c2a3d151451ec58')
datagen = DatasetGenerator(data_path='semantic',
                           dataset_entity=dataset,
                           return_originals=True,
                           overwrite=True,
                           annotation_type=dl.AnnotationType.SEGMENTATION)
datagen.visualize()
data_item = datagen[0]
plt.imshow(data_item['annotations'])
print('BG value: {}'.format(data_item['annotations'][0, 0]))

Now, we’ll map both the ‘cat’ and ‘dog’ labels to a single id (1) and set a default of 0 for everything else (the background):

dataset = dl.datasets.get(dataset_id='6197985a104eb81cb728e4ac')
label_to_id_map = {'cat': 1,
                   'dog': 1,
                   '$default': 0}
dataloader = DatasetGenerator(data_path='semantic',
                              dataset_entity=dataset,
                              transforms=tfs,
                              return_originals=True,
                              label_to_id_map=label_to_id_map,
                              annotation_type=dl.AnnotationType.SEGMENTATION)
for i in range(5):
    dataloader.visualize()

Batch Size and collate_fn

If batch_size is not None, the returned structure will be a list with batch_size data items. Setting a collate function will convert the returned structure to a tensor of the desired kind. The default collate converts everything to ndarrays. We also have tensorflow and torch collate functions to convert to the corresponding tensors.

dataset = dl.datasets.get(dataset_id='611b86e647fe2f865323007a')
datagen = DatasetGenerator(data_path='train',
                           dataset_entity=dataset,
                           batch_size=10,
                           annotation_type=dl.AnnotationType.BOX)
batch = datagen[0]
print('type: {}, len: {}'.format(type(batch), len(batch)))
print('single element in the list: {}'.format(batch[0]['image']))
# with collate
from dtlpy.utilities.dataset_generators import collate_default
datagen = DatasetGenerator(data_path='train',
                           dataset_entity=dataset,
                           collate_fn=collate_default,
                           batch_size=10,
                           annotation_type=dl.AnnotationType.BOX)
batch = datagen[0]
print('type: {}, len: {}, shape: {}'.format(type(batch['images']), len(batch['images']), batch['images'].shape))
