Dataloop’s SDK and CLI documentation¶
Drive your AI to production with end-to-end data management, automation pipelines and a quality-first data labeling platform
Command Line Interface¶
Options:
CLI for Dataloop
usage: dlp [-h] [-v]
{shell,upgrade,logout,login,login-token,login-secret,login-m2m,init,checkout-state,help,version,api,projects,datasets,items,videos,app,services,triggers,deploy,generate,packages,ls,pwd,cd,mkdir,clear,exit}
...
Positional Arguments¶
- operation
Possible choices: shell, upgrade, logout, login, login-token, login-secret, login-m2m, init, checkout-state, help, version, api, projects, datasets, items, videos, app, services, triggers, deploy, generate, packages, ls, pwd, cd, mkdir, clear, exit
supported operations
Named Arguments¶
- -v, --version
dtlpy version
Default: False
Sub-commands:¶
shell¶
Open interactive Dataloop shell
dlp shell [-h]
upgrade¶
Update dtlpy package
dlp upgrade [-h] [-u ]
optional named arguments¶
- -u, --url
Package url. default ‘dtlpy’
logout¶
Logout
dlp logout [-h]
login¶
Login using web Auth0 interface
dlp login [-h]
login-token¶
Login by passing a valid token
dlp login-token [-h] -t
required named arguments¶
- -t, --token
valid token
login-secret¶
Login client id and secret
dlp login-secret [-h] [-e ] [-p ] [-i ] [-s ]
required named arguments¶
- -e, --email
user email
- -p, --password
user password
- -i, --client-id
client id
- -s, --client-secret
client secret
login-m2m¶
Login client id and secret
dlp login-m2m [-h] [-e ] [-p ] [-i ] [-s ]
required named arguments¶
- -e, --email
user email
- -p, --password
user password
- -i, --client-id
client id
- -s, --client-secret
client secret
init¶
Initialize a .dataloop context
dlp init [-h]
checkout-state¶
Print checkout state
dlp checkout-state [-h]
help¶
Get help
dlp help [-h]
version¶
DTLPY SDK version
dlp version [-h]
api¶
Connection and environment
dlp api [-h] {info,setenv} ...
Positional Arguments¶
- api
Possible choices: info, setenv
gate operations
Sub-commands:¶
info¶
Print api information
dlp api info [-h]
setenv¶
Set platform environment
dlp api setenv [-h] -e
- -e, --env
working environment
projects¶
Operations with projects
dlp projects [-h] {ls,create,checkout,web} ...
Positional Arguments¶
- projects
Possible choices: ls, create, checkout, web
projects operations
Sub-commands:¶
ls¶
List all projects
dlp projects ls [-h]
create¶
Create a new project
dlp projects create [-h] [-p ]
- -p, --project-name
project name
checkout¶
checkout a project
dlp projects checkout [-h] [-p ]
- -p, --project-name
project name
web¶
Open in web browser
dlp projects web [-h] [-p ]
- -p, --project-name
project name
datasets¶
Operations with datasets
dlp datasets [-h] {web,ls,create,checkout} ...
Positional Arguments¶
- datasets
Possible choices: web, ls, create, checkout
datasets operations
Sub-commands:¶
web¶
Open in web browser
dlp datasets web [-h] [-p ] [-d ]
- -p, --project-name
project name
- -d, --dataset-name
dataset name
ls¶
List of datasets in project
dlp datasets ls [-h] [-p ]
- -p, --project-name
project name. Default taken from checked out (if checked out)
create¶
Create a new dataset
dlp datasets create [-h] -d [-p ] [-c]
- -d, --dataset-name
dataset name
- -p, --project-name
project name. Default taken from checked out (if checked out)
- -c, --checkout
checkout the new dataset
Default: False
checkout¶
checkout a dataset
dlp datasets checkout [-h] [-d ] [-p ]
- -d, --dataset-name
dataset name
- -p, --project-name
project name. Default taken from checked out (if checked out)
items¶
Operations with items
dlp items [-h] {web,ls,upload,download} ...
Positional Arguments¶
- items
Possible choices: web, ls, upload, download
items operations
Sub-commands:¶
web¶
Open in web browser
dlp items web [-h] [-r ] [-p ] [-d ]
- -r, --remote-path
remote path
- -p, --project-name
project name
- -d, --dataset-name
dataset name
ls¶
List of items in dataset
dlp items ls [-h] [-p ] [-d ] [-o ] [-r ] [-t ]
- -p, --project-name
project name. Default taken from checked out (if checked out)
- -d, --dataset-name
dataset name. Default taken from checked out (if checked out)
- -o, --page
page number (integer)
Default: 0
- -r, --remote-path
remote path
- -t, --type
Item type
upload¶
Upload directory to dataset
dlp items upload [-h] -l [-p ] [-d ] [-r ] [-f ] [-lap ] [-ow]
- -l, --local-path
local path
- -p, --project-name
project name. Default taken from checked out (if checked out)
- -d, --dataset-name
dataset name. Default taken from checked out (if checked out)
- -r, --remote-path
remote path to upload to. default: /
- -f, --file-types
Comma separated list of file types to upload, e.g “.jpg,.png”. default: all
- -lap, --local-annotations-path
Path for local annotations to upload with items
- -ow, --overwrite
Overwrite existing item
Default: False
download¶
Download dataset to a local directory
dlp items download [-h] [-p ] [-d ] [-ao ] [-aft ] [-afl ] [-r ] [-ow]
[-t] [-wt] [-th ] [-l ] [-wb]
- -p, --project-name
project name. Default taken from checked out (if checked out)
- -d, --dataset-name
dataset name. Default taken from checked out (if checked out)
- -ao, --annotation-options
which annotation to download. options: json,instance,mask
- -aft, --annotation-filter-type
annotation type filter when downloading annotations. options: box,segment,binary etc
- -afl, --annotation-filter-label
labels filter when downloading annotations.
- -r, --remote-path
remote path to upload to. default: /
- -ow, --overwrite
Overwrite existing item
Default: False
- -t, --not-items-folder
Download WITHOUT ‘items’ folder
Default: False
- -wt, --with-text
Annotations will have text in mask
Default: False
- -th, --thickness
Annotation line thickness
Default: “1”
- -l, --local-path
local path
- -wb, --without-binaries
Don’t download item binaries
Default: False
videos¶
Operations with videos
dlp videos [-h] {play,upload} ...
Positional Arguments¶
- videos
Possible choices: play, upload
videos operations
Sub-commands:¶
play¶
Play video
dlp videos play [-h] [-l ] [-p ] [-d ]
- -l, --item-path
Video remote path in platform. e.g /dogs/dog.mp4
- -p, --project-name
project name. Default taken from checked out (if checked out)
- -d, --dataset-name
dataset name. Default taken from checked out (if checked out)
upload¶
Upload a single video
dlp videos upload [-h] -f -p -d [-r ] [-sc ] [-ss ] [-st ] [-e]
- -f, --filename
local filename to upload
- -p, --project-name
project name
- -d, --dataset-name
dataset name
- -r, --remote-path
remote path
Default: “/”
- -sc, --split-chunks
Video splitting parameter: Number of chunks to split
- -ss, --split-seconds
Video splitting parameter: Seconds of each chuck
- -st, --split-times
Video splitting parameter: List of seconds to split at. e.g 600,1800,2000
- -e, --encode
encode video to mp4, remove bframes and upload
Default: False
app¶
Operations with application
dlp app [-h] {init,pack,publish,update,install,pull} ...
Positional Arguments¶
- app
Possible choices: init, pack, publish, update, install, pull
application operations
Sub-commands:¶
init¶
Initialize the structure in order to deploy a dpk
dlp app init [-h] [--name NAME] [--description DESCRIPTION]
[--categories CATEGORIES] [--icon ICON] [--scope SCOPE]
- --name
the name of the app
- --description
the description of the app
- --categories
the categories of the app (comma seperated)
- --icon
the icon of the app
- --scope
the scope of the app (default is organization)
pack¶
Pack the project as dpk file
dlp app pack [-h]
publish¶
Publish the app
dlp app publish [-h] --project-name PROJECT_NAME
- --project-name
The name of the project
update¶
Update the app
dlp app update [-h] --app-name APP_NAME --new-version NEW_VERSION
--project-name PROJECT_NAME
- --app-name
Locates the app by the name
- --new-version
Sets the new version of the specified app
- --project-name
The name of the project
install¶
Install the app to the platform
dlp app install [-h] --dpk-id DPK_ID [--project-name PROJECT_NAME]
[--org-id ORG_ID]
- --dpk-id
The id of the dpk
- --project-name
The name of the project
- --org-id
The name of the org
pull¶
Pull the app from the marketplace
dlp app pull [-h] --dpk-name APP_NAME
- --dpk-name
The name of the dpk
services¶
Operations with services
dlp services [-h] {execute,tear-down,ls,log,delete} ...
Positional Arguments¶
- services
Possible choices: execute, tear-down, ls, log, delete
services operations
Sub-commands:¶
execute¶
Create an execution
dlp services execute [-h] [-f FUNCTION_NAME] [-s SERVICE_NAME]
[-pr PROJECT_NAME] [-as] [-i ITEM_ID] [-d DATASET_ID]
[-a ANNOTATION_ID] [-in INPUTS]
- -f, --function-name
which function to run
- -s, --service-name
which service to run
- -pr, --project-name
Project name
- -as, --async
Async execution
Default: True
- -i, --item-id
Item input
- -d, --dataset-id
Dataset input
- -a, --annotation-id
Annotation input
- -in, --inputs
Dictionary string input
Default: “{}”
tear-down¶
tear-down service of service.json file
dlp services tear-down [-h] [-l LOCAL_PATH] [-pr PROJECT_NAME]
- -l, --local-path
path to service.json file
- -pr, --project-name
Project name
ls¶
List project’s services
dlp services ls [-h] [-pr PROJECT_NAME] [-pkg PACKAGE_NAME]
- -pr, --project-name
Project name
- -pkg, --package-name
Package name
log¶
Get services log
dlp services log [-h] [-pr PROJECT_NAME] [-f SERVICE_NAME] [-t START]
- -pr, --project-name
Project name
- -f, --service-name
Project name
- -t, --start
Log start time
delete¶
Delete Service
dlp services delete [-h] [-f SERVICE_NAME] [-p PROJECT_NAME]
[-pkg PACKAGE_NAME]
- -f, --service-name
Service name
- -p, --project-name
Project name
- -pkg, --package-name
Package name
triggers¶
Operations with triggers
dlp triggers [-h] {create,delete,ls} ...
Positional Arguments¶
- triggers
Possible choices: create, delete, ls
triggers operations
Sub-commands:¶
create¶
Create a Service Trigger
dlp triggers create [-h] -r RESOURCE -a ACTIONS [-p PROJECT_NAME]
[-pkg PACKAGE_NAME] [-f SERVICE_NAME] [-n NAME]
[-fl FILTERS] [-fn FUNCTION_NAME]
- -r, --resource
Resource name
- -a, --actions
Actions
- -p, --project-name
Project name
- -pkg, --package-name
Package name
- -f, --service-name
Service name
- -n, --name
Trigger name
- -fl, --filters
Json filter
Default: “{}”
- -fn, --function-name
Function name
Default: “run”
delete¶
Delete Trigger
dlp triggers delete [-h] -t TRIGGER_NAME [-f SERVICE_NAME] [-p PROJECT_NAME]
[-pkg PACKAGE_NAME]
- -t, --trigger-name
Trigger name
- -f, --service-name
Service name
- -p, --project-name
Project name
- -pkg, --package-name
Package name
ls¶
List triggers
dlp triggers ls [-h] [-pr PROJECT_NAME] [-pkg PACKAGE_NAME] [-s SERVICE_NAME]
- -pr, --project-name
Project name
- -pkg, --package-name
Package name
- -s, --service-name
Service name
deploy¶
deploy with json file
dlp deploy [-h] [-f JSON_FILE] [-p PROJECT_NAME]
required named arguments¶
- -f
Path to json file
- -p
Project name
generate¶
generate a json file
dlp generate [-h] [--option PACKAGE_TYPE] [-p PACKAGE_NAME]
optional named arguments¶
- --option
cataluge of examples
- -p, --package-name
Package name
packages¶
Operations with packages
dlp packages [-h] {ls,push,deploy,test,checkout,delete} ...
Positional Arguments¶
- packages
Possible choices: ls, push, deploy, test, checkout, delete
package operations
Sub-commands:¶
ls¶
List packages
dlp packages ls [-h] [-p PROJECT_NAME]
- -p, --project-name
Project name
push¶
Create package in platform
dlp packages push [-h] [-src ] [-cid ] [-pr ] [-p ] [-c]
- -src, --src-path
Revision to deploy if selected True
- -cid, --codebase-id
Revision to deploy if selected True
- -pr, --project-name
Project name
- -p, --package-name
Package name
- -c, --checkout
checkout the new package
Default: False
deploy¶
Deploy package to platform
dlp packages deploy [-h] [-p ] [-pr ] [--module-name ] [-c]
- -p, --package-name
Package name
- -pr, --project-name
Project name
- --module-name
Package module name
Default: “default_module”
- -c, --checkout
checkout the new package
Default: False
test¶
Tests that Package locally using mock.json
dlp packages test [-h] [-c ] [-f ]
- -c, --concurrency
Revision to deploy if selected True
Default: 10
- -f, --function-name
Function to test
Default: “run”
checkout¶
checkout a package
dlp packages checkout [-h] [-p ]
- -p, --package-name
package name
delete¶
Delete Package
dlp packages delete [-h] [-pkg PACKAGE_NAME] [-p PROJECT_NAME]
- -pkg, --package-name
Package name
- -p, --project-name
Project name
ls¶
List directories
dlp ls [-h]
pwd¶
Get current working directory
dlp pwd [-h]
cd¶
Change current working directory
dlp cd [-h] dir
Positional Arguments¶
- dir
mkdir¶
Make directory
dlp mkdir [-h] name
Positional Arguments¶
- name
clear¶
Clear shell
dlp clear [-h]
exit¶
Exit interactive shell
dlp exit [-h]
Repositories¶
Organizations¶
- class Organizations(client_api: ApiClient)[source]¶
Bases:
object
Organizations Repository
Read our documentation and SDK documentation to learn more about Organizations in the Dataloop platform.
- add_member(email: str, role: MemberOrgRole = MemberOrgRole.MEMBER, organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None)[source]¶
Add members to your organization. Read about members and groups here.
Prerequisities: To add members to an organization, you must be an owner in that organization.
You must provide at least ONE of the following params: organization, organization_name, or organization_id.
- Parameters
- Returns
True if successful or error if unsuccessful
- Return type
Example:
success = dl.organizations.add_member(email='user@domain.com', organization_id='organization_id', role=dl.MemberOrgRole.MEMBER)
- cache_action(organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None, mode=CacheAction.APPLY, pod_type=PodType.SMALL)[source]¶
Add or remove Cache for the org
Prerequisites: You must be an organization owner
You must provide at least ONE of the following params: organization, organization_name, or organization_id.
- Parameters
- Returns
True if success
- Return type
Example:
success = dl.organizations.enable_cache(organization_id='organization_id', mode=dl.CacheAction.APPLY)
- delete_member(user_id: str, organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None, sure: bool = False, really: bool = False) bool [source]¶
Delete member from the Organization.
Prerequisites: Must be an organization owner to delete members.
You must provide at least ONE of the following params: organization_id, organization_name, organization.
- Parameters
- Returns
True if success and error if not
- Return type
Example:
success = dl.organizations.delete_member(user_id='user_id', organization_id='organization_id', sure=True, really=True)
- get(organization_id: Optional[str] = None, organization_name: Optional[str] = None, fetch: Optional[bool] = None) Organization [source]¶
Get Organization object to be able to use it in your code.
Prerequisites: You must be a superuser to use this method.
You must provide at least ONE of the following params: organization_name or organization_id.
- Parameters
- Returns
Organization object
- Return type
Example:
org = dl.organizations.get(organization_id='organization_id')
- list() List[Organization] [source]¶
Lists all the organizations in Dataloop.
Prerequisites: You must be a superuser to use this method.
- Returns
List of Organization objects
- Return type
Example:
organizations = dl.organizations.list()
- list_groups(organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None)[source]¶
List all organization groups (groups that were created within the organization).
Prerequisites: You must be an organization owner to use this method.
You must provide at least ONE of the following params: organization, organization_name, or organization_id.
- Parameters
- Returns
groups list
- Return type
Example:
groups_list = dl.organizations.list_groups(organization_id='organization_id')
- list_integrations(organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None, only_available=False)[source]¶
List all organization integrations with external cloud storage.
Prerequisites: You must be an organization owner to use this method.
You must provide at least ONE of the following params: organization_id, organization_name, or organization.
- Parameters
- Returns
integrations list
- Return type
Example:
list_integrations = dl.organizations.list_integrations(organization='organization-entity', only_available=True)
- list_members(organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None, role: Optional[MemberOrgRole] = None)[source]¶
List all organization members.
Prerequisites: You must be an organization owner to use this method.
You must provide at least ONE of the following params: organization_id, organization_name, or organization.
- Parameters
- Returns
projects list
- Return type
Example:
list_members = dl.organizations.list_members(organization='organization-entity', role=dl.MemberOrgRole.MEMBER)
- update(plan: str, organization: Optional[Organization] = None, organization_id: Optional[str] = None, organization_name: Optional[str] = None) Organization [source]¶
Update an organization.
Prerequisites: You must be a superuser to update an organization.
You must provide at least ONE of the following params: organization, organization_name, or organization_id.
- Parameters
- Returns
organization object
- Return type
Example:
org = dl.organizations.update(organization='organization-entity', plan=dl.OrganizationsPlans.FREEMIUM)
- update_member(email: str, role: MemberOrgRole = MemberOrgRole.MEMBER, organization_id: Optional[str] = None, organization_name: Optional[str] = None, organization: Optional[Organization] = None)[source]¶
Update member role.
Prerequisites: You must be an organization owner to update a member’s role.
You must provide at least ONE of the following params: organization, organization_name, or organization_id.
- Parameters
- Returns
json of the member fields
- Return type
Example:
member_json = dl.organizations.update_member(email='user@domain.com', organization_id='organization_id', role=dl.MemberOrgRole.MEMBER)
Integrations¶
Integrations Repository
- class Integrations(client_api: ApiClient, org: Optional[Organization] = None, project: Optional[Project] = None)[source]¶
Bases:
object
Integrations Repository
The Integrations class allows you to manage data integrtion from your external storage (e.g., S3, GCS, Azure) into your Dataloop’s Dataset storage, as well as sync data in your Dataloop’s Datasets with data in your external storage.
For more information on Organization Storgae Integration see the Dataloop documentation and SDK External Storage.
- create(integrations_type: ExternalStorage, name: str, options: dict)[source]¶
Create an integration between an external storage and the organization.
Examples for options include: s3 - {key: “”, secret: “”}; gcs - {key: “”, secret: “”, content: “”}; azureblob - {key: “”, secret: “”, clientId: “”, tenantId: “”}; key_value - {key: “”, value: “”} aws-sts - {key: “”, secret: “”, roleArns: “”}
Prerequisites: You must be an owner in the organization.
- Parameters
- Returns
success
- Return type
Example:
project.integrations.create(integrations_type=dl.ExternalStorage.S3, name='S3ntegration', options={key: "Access key ID", secret: "Secret access key"})
- delete(integrations_id: str, sure: bool = False, really: bool = False) bool [source]¶
Delete integrations from the organization.
Prerequisites: You must be an organization owner to delete an integration.
- Parameters
- Returns
success
- Return type
Example:
project.integrations.delete(integrations_id='integrations_id', sure=True, really=True)
- get(integrations_id: str)[source]¶
Get organization integrations. Use this method to access your integration and be able to use it in your code.
Prerequisites: You must be an owner in the organization.
- Parameters
integrations_id (str) – integrations id
- Returns
Integration object
- Return type
Example:
project.integrations.get(integrations_id='integrations_id')
- list(only_available=False)[source]¶
List all the organization’s integrations with external storage.
Prerequisites: You must be an owner in the organization.
- Parameters
only_available (bool) – if True list only the available integrations.
- Returns
groups list
- Return type
Example:
project.integrations.list(only_available=True)
Projects¶
- class Projects(client_api: ApiClient, org=None)[source]¶
Bases:
object
Projects Repository
The Projects class allows the user to manage projects and their properties.
For more information on Projects see the Dataloop documentation and SDK documentation.
- add_member(email: str, project_id: str, role: MemberRole = MemberRole.DEVELOPER)[source]¶
Add a member to the project.
Prerequisites: You must be in the role of an owner to add a member to a project.
- Parameters
- Returns
dict that represent the user
- Return type
Example:
user_json = dl.projects.add_member(project_id='project_id', email='user@dataloop.ai', role=dl.MemberRole.DEVELOPER)
- checkout(identifier: Optional[str] = None, project_name: Optional[str] = None, project_id: Optional[str] = None, project: Optional[Project] = None)[source]¶
Checkout (switch) to a project to work on.
Prerequisites: All users can open a project in the web.
You must provide at least ONE of the following params: project_id, project_name.
- Parameters
identifier (str) – project name or partial id that you wish to switch
project_name (str) – The Name of the project
project_id (str) – The Id of the project
project (dtlpy.entities.project.Project) – project object
Example:
dl.projects.checkout(project_id='project_id')
- create(project_name: str, checkout: bool = False) Project [source]¶
Create a new project.
Prerequisites: Any user can create a project.
- Parameters
- Returns
Project object
- Return type
Example:
project = dl.projects.create(project_name='project_name')
- delete(project_name: Optional[str] = None, project_id: Optional[str] = None, sure: bool = False, really: bool = False) bool [source]¶
Delete a project forever!
Prerequisites: You must be in the role of an owner to delete a project.
- Parameters
- Returns
True if success, error if not
- Return type
Example:
is_deleted = dl.projects.delete(project_id='project_id', sure=True, really=True)
- get(project_name: Optional[str] = None, project_id: Optional[str] = None, checkout: bool = False, fetch: Optional[bool] = None, log_error=True) Project [source]¶
Get a Project object.
Prerequisites: You must be in the role of an owner to get a project object.
You must check out to a project or provide at least one of the following params: project_id, project_name
- Parameters
- Returns
Project object
- Return type
Example:
project = dl.projects.get(project_id='project_id')
- list() List[Project] [source]¶
Get the user’s project list
Prerequisites: You must be a superuser to list all users’ projects.
- Returns
List of Project objects
Example:
projects = dl.projects.list()
- list_members(project: Project, role: Optional[MemberRole] = None)[source]¶
Get a list of the project members.
Prerequisites: You must be in the role of an owner to list project members.
- Parameters
project (dtlpy.entities.project.Project) – Project object
role – The required role for the user. Use the enum dl.MemberRole
- Returns
list of the project members
- Return type
Example:
users_jsons_list = dl.projects.list_members(project_id='project_id', role=dl.MemberRole.DEVELOPER)
- open_in_web(project_name: Optional[str] = None, project_id: Optional[str] = None, project: Optional[Project] = None)[source]¶
Open the project in our web platform.
Prerequisites: All users can open a project in the web.
- Parameters
project_name (str) – The Name of the project
project_id (str) – The Id of the project
project (dtlpy.entities.project.Project) – project object
Example:
dl.projects.open_in_web(project_id='project_id')
- remove_member(email: str, project_id: str)[source]¶
Remove a member from the project.
Prerequisites: You must be in the role of an owner to delete a member from a project.
- Parameters
- Returns
dict that represents the user
- Return type
Example:
user_json = dl.projects.remove_member(project_id='project_id', email='user@dataloop.ai')
- update(project: Project, system_metadata: bool = False) Project [source]¶
Update a project information (e.g., name, member roles, etc.).
Prerequisites: You must be in the role of an owner to add a member to a project.
- Parameters
project (dtlpy.entities.project.Project) – project object
system_metadata (bool) – optional - True, if you want to change metadata system
- Returns
Project object
- Return type
Example:
project = dl.projects.delete(project='project_entity')
- update_member(email: str, project_id: str, role: MemberRole = MemberRole.DEVELOPER)[source]¶
Update member’s information/details in the project.
Prerequisites: You must be in the role of an owner to update a member.
- Parameters
- Returns
dict that represent the user
- Return type
Example:
user_json = = dl.projects.update_member(project_id='project_id', email='user@dataloop.ai', role=dl.MemberRole.DEVELOPER)
Datasets¶
Datasets Repository
- class Datasets(client_api: ApiClient, project: Optional[Project] = None)[source]¶
Bases:
object
Datasets Repository
The Datasets class allows the user to manage datasets. Read more about datasets in our documentation and SDK documentation.
- checkout(identifier: Optional[str] = None, dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, dataset: Optional[Dataset] = None)[source]¶
Checkout (switch) to a dataset to work on it.
Prerequisites: You must be an owner or developer to use this method.
You must provide at least ONE of the following params: dataset_id, dataset_name.
- Parameters
identifier (str) – project name or partial id that you wish to switch
dataset_name (str) – The Name of the dataset
dataset_id (str) – The Id of the dataset
dataset (dtlpy.entities.dataset.Dataset) – dataset object
Example:
project.datasets.checkout(dataset_id='dataset_id')
- clone(dataset_id: str, clone_name: str, filters: Optional[Filters] = None, with_items_annotations: bool = True, with_metadata: bool = True, with_task_annotations_status: bool = True)[source]¶
Clone a dataset. Read more about cloning datatsets and items in our documentation and SDK documentation.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
dataset_id (str) – id of the dataset you wish to clone
clone_name (str) – new dataset name
filters (dtlpy.entities.filters.Filters) – Filters entity or a query dict
with_items_annotations (bool) – true to clone with items annotations
with_metadata (bool) – true to clone with metadata
with_task_annotations_status (bool) – true to clone with task annotations’ status
- Returns
dataset object
- Return type
Example:
dataset = project.datasets.clone(dataset_id='dataset_id', clone_name='dataset_clone_name', with_metadata=True, with_items_annotations=False, with_task_annotations_status=False)
- create(dataset_name: str, labels=None, attributes=None, ontology_ids=None, driver: Optional[Driver] = None, driver_id: Optional[str] = None, checkout: bool = False, expiration_options: Optional[ExpirationOptions] = None, index_driver: Optional[IndexDriver] = None, recipe_id: Optional[str] = None) Dataset [source]¶
Create a new dataset
Prerequisites: You must be in the role of an owner or developer.
- Parameters
dataset_name (str) – The Name of the dataset
labels (list) – dictionary of {tag: color} or list of label entities
attributes (list) – dataset’s ontology’s attributes
ontology_ids (list) – optional - dataset ontology
driver (dtlpy.entities.driver.Driver) – optional - storage driver Driver object or driver name
driver_id (str) – optional - driver id
checkout (bool) – set the dataset as a default dataset object (cookies)
expiration_options (ExpirationOptions) – dl.ExpirationOptions object that contain definitions for dataset like MaxItemDays
index_driver (str) – dl.IndexDriver, dataset driver version
recipe_id (str) – optional - recipe id
- Returns
Dataset object
- Return type
Example:
dataset = project.datasets.create(dataset_name='dataset_name', ontology_ids='ontology_ids')
- delete(dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, sure: bool = False, really: bool = False)[source]¶
Delete a dataset forever!
Prerequisites: You must be an owner or developer to use this method.
Example:
is_deleted = project.datasets.delete(dataset_id='dataset_id', sure=True, really=True)
- directory_tree(dataset: Optional[Dataset] = None, dataset_name: Optional[str] = None, dataset_id: Optional[str] = None)[source]¶
Get dataset’s directory tree.
Prerequisites: You must be an owner or developer to use this method.
You must provide at least ONE of the following params: dataset, dataset_name, dataset_id.
- Parameters
dataset (dtlpy.entities.dataset.Dataset) – dataset object
dataset_name (str) – The Name of the dataset
dataset_id (str) – The Id of the dataset
- Returns
DirectoryTree
Example:
directory_tree = project.datasets.directory_tree(dataset='dataset_entity')
- static download_annotations(dataset: Dataset, local_path: Optional[str] = None, filters: Optional[Filters] = None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters: Optional[Filters] = None, overwrite: bool = False, thickness: int = 1, with_text: bool = False, remote_path: Optional[str] = None, include_annotations_in_output: bool = True, export_png_files: bool = False, filter_output_annotations: bool = False, alpha: Optional[float] = None, export_version=ExportVersion.V1) str [source]¶
Download dataset’s annotations by filters.
You may filter the dataset both for items and for annotations and download annotations.
Optional – download annotations as: mask, instance, image mask of the item.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
dataset (dtlpy.entities.dataset.Dataset) – dataset object
local_path (str) – local folder or filename to save to.
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
annotation_options (list) – type of download annotations: list(dl.ViewAnnotationOptions)
annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download
overwrite (bool) – optional - default = False to overwrite the existing files
thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1
with_text (bool) – optional - add text to annotations, default = False
remote_path (str) – DEPRECATED and ignored
include_annotations_in_output (bool) – default - False , if export should contain annotations
export_png_files (bool) – default - if True, semantic annotations should be exported as png files
filter_output_annotations (bool) – default - False, given an export by filter - determine if to filter out annotations
alpha (float) – opacity value [0 1], default 1
export_version (str) – exported items will have original extension in filename, V1 - no original extension in filenames
- Returns
local_path of the directory where all the downloaded item
- Return type
Example:
file_path = project.datasets.download_annotations(dataset='dataset_entity', local_path='local_path', annotation_options=dl.ViewAnnotationOptions, overwrite=False, thickness=1, with_text=False, alpha=1 )
- get(dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, checkout: bool = False, fetch: Optional[bool] = None) Dataset [source]¶
Get dataset by name or id.
Prerequisites: You must be an owner or developer to use this method.
You must provide at least ONE of the following params: dataset_id, dataset_name.
- Parameters
- Returns
Dataset object
- Return type
Example:
dataset = project.datasets.get(dataset_id='dataset_id')
- list(name=None, creator=None) List[Dataset] [source]¶
List all datasets.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
- Returns
List of datasets
- Return type
Example:
datasets = project.datasets.list(name='name')
- merge(merge_name: str, dataset_ids: list, project_ids: str, with_items_annotations: bool = True, with_metadata: bool = True, with_task_annotations_status: bool = True, wait: bool = True)[source]¶
Merge a dataset. See our SDK docs for more information.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
merge_name (str) – new dataset name
dataset_ids (list) – list id’s of the datatsets you wish to merge
project_ids (str) – the project id that include the datasets
with_items_annotations (bool) – true to merge with items annotations
with_metadata (bool) – true to merge with metadata
with_task_annotations_status (bool) – true to merge with task annotations’ status
wait (bool) – wait for the command to finish
- Returns
True if success
- Return type
Example:
success = project.datasets.merge(dataset_ids=['dataset_id1','dataset_id2'], merge_name='dataset_merge_name', with_metadata=True, with_items_annotations=False, with_task_annotations_status=False)
- open_in_web(dataset_name: Optional[str] = None, dataset_id: Optional[str] = None, dataset: Optional[Dataset] = None)[source]¶
Open the dataset in web platform.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
dataset_name (str) – The Name of the dataset
dataset_id (str) – The Id of the dataset
dataset (dtlpy.entities.dataset.Dataset) – dataset object
Example:
project.datasets.open_in_web(dataset_id='dataset_id')
- set_readonly(state: bool, dataset: Dataset)[source]¶
Set dataset readonly mode.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
state (bool) – state to update readonly mode
dataset (dtlpy.entities.dataset.Dataset) – dataset object
Example:
project.datasets.set_readonly(dataset='dataset_entity', state=True)
- sync(dataset_id: str, wait: bool = True)[source]¶
Sync dataset with external storage.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
- Returns
True if success
- Return type
Example:
success = project.datasets.sync(dataset_id='dataset_id')
- update(dataset: Dataset, system_metadata: bool = False, patch: Optional[dict] = None) Dataset [source]¶
Update dataset field.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
dataset (dtlpy.entities.dataset.Dataset) – dataset object
system_metadata (bool) – True, if you want to change metadata system
patch (dict) – Specific patch request
- Returns
Dataset object
- Return type
Example:
dataset = project.datasets.update(dataset='dataset_entity')
- upload_annotations(dataset, local_path, filters: Optional[Filters] = None, clean=False, remote_root_path='/', export_version=ExportVersion.V1)[source]¶
Upload annotations to dataset.
Example for remote_root_path: If the item filepath is a/b/item and remote_root_path is /a the start folder will be b instead of a
Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.
- Parameters
dataset (dtlpy.entities.dataset.Dataset) – dataset to upload to
local_path (str) – str - local folder where the annotations files is
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
clean (bool) – True to remove the old annotations
remote_root_path (str) – the remote root path to match remote and local items
export_version (str) – exported items will have original extension in filename, V1 - no original extension in filenames
Example:
project.datasets.upload_annotations(dataset='dataset_entity', local_path='local_path', clean=False, export_version=dl.ExportVersion.V1 )
Drivers¶
- class Drivers(client_api: ApiClient, project: Optional[Project] = None)[source]¶
Bases:
object
Drivers Repository
The Drivers class allows users to manage drivers that are used to connect with external storage. Read more about external storage in our documentation and SDK documentation.
- create(name: str, driver_type: ExternalStorage, integration_id: str, bucket_name: str, integration_type: ExternalStorage, project_id: Optional[str] = None, allow_external_delete: bool = True, region: Optional[str] = None, storage_class: str = '', path: str = '')[source]¶
Create a storage driver.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
name (str) – the driver name
driver_type (str) – ExternalStorage.S3, ExternalStorage.GCS, ExternalStorage.AZUREBLOB
integration_id (str) – the integration id
bucket_name (str) – the external bucket name
integration_type (str) – ExternalStorage.S3, ExternalStorage.GCS, ExternalStorage.AZUREBLOB, ExternalStorage.AWS_STS
project_id (str) – project id
allow_external_delete (bool) – true to allow deleting files from external storage when files are deleted in your Dataloop storage
region (str) – relevant only for s3 - the bucket region
storage_class (str) – rilevante only for s3
path (str) – Optional. By default path is the root folder. Path is case sensitive integration
- Returns
driver object
- Return type
Example:
project.drivers.create(name='driver_name', driver_type=dl.ExternalStorage.S3, integration_id='integration_id', bucket_name='bucket_name', project_id='project_id', region='ey-west-1')
- delete(driver_name: Optional[str] = None, driver_id: Optional[str] = None, sure: bool = False, really: bool = False)[source]¶
Delete a driver forever!
Prerequisites: You must be an owner or developer to use this method.
Example:
project.drivers.delete(dataset_id='dataset_id', sure=True, really=True)
- get(driver_name: Optional[str] = None, driver_id: Optional[str] = None) Driver [source]¶
Get a Driver object to use in your code.
Prerequisites: You must be in the role of an owner or developer.
You must provide at least ONE of the following params: driver_name, driver_id.
- Parameters
- Returns
Driver object
- Return type
Example:
project.drivers.get(driver_id='driver_id')
Items¶
- class Items(client_api: ApiClient, datasets: Optional[Datasets] = None, dataset: Optional[Dataset] = None, dataset_id=None, items_entity=None, project=None)[source]¶
Bases:
object
Items Repository
The Items class allows you to manage items in your datasets. For information on actions related to items see Organizing Your Dataset, Item Metadata, and Item Metadata-Based Filtering.
- clone(item_id: str, dst_dataset_id: str, remote_filepath: Optional[str] = None, metadata: Optional[dict] = None, with_annotations: bool = True, with_metadata: bool = True, with_task_annotations_status: bool = False, allow_many: bool = False, wait: bool = True)[source]¶
Clone item. Read more about cloning datatsets and items in our documentation and SDK documentation.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
item_id (str) – item to clone
dst_dataset_id (str) – destination dataset id
remote_filepath (str) – complete filepath
metadata (dict) – new metadata to add
with_annotations (bool) – clone annotations
with_metadata (bool) – clone metadata
with_task_annotations_status (bool) – clone task annotations status
allow_many (bool) – bool if True, using multiple clones in single dataset is allowed, (default=False)
wait (bool) – wait for the command to finish
- Returns
Item object
- Return type
Example:
dataset.items.clone(item_id='item_id', dst_dataset_id='dist_dataset_id', with_metadata=True, with_task_annotations_status=False, with_annotations=False)
- delete(filename: Optional[str] = None, item_id: Optional[str] = None, filters: Optional[Filters] = None)[source]¶
Delete item from platform.
Prerequisites: You must be in the role of an owner or developer.
You must provide at least ONE of the following params: item id, filename, filters.
- Parameters
filename (str) – optional - search item by remote path
item_id (str) – optional - search item by id
filters (dtlpy.entities.filters.Filters) – optional - delete items by filter
- Returns
True if success
- Return type
Example:
dataset.items.delete(item_id='item_id')
- download(filters: Optional[Filters] = None, items=None, local_path: Optional[str] = None, file_types: Optional[list] = None, save_locally: bool = True, to_array: bool = False, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters: Optional[Filters] = None, overwrite: bool = False, to_items_folder: bool = True, thickness: int = 1, with_text: bool = False, without_relative_path=None, avoid_unnecessary_annotation_download: bool = False, include_annotations_in_output: bool = True, export_png_files: bool = False, filter_output_annotations: bool = False, alpha: float = 1, export_version=ExportVersion.V1)[source]¶
Download dataset items by filters.
Filters the dataset for items and saves them locally.
Optional – download annotation, mask, instance, and image mask of the item.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
items (List[dtlpy.entities.item.Item] or dtlpy.entities.item.Item) – download Item entity or item_id (or a list of item)
local_path (str) – local folder or filename to save to.
file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]
save_locally (bool) – bool. save to disk or return a buffer
to_array (bool) – returns Ndarray when True and local_path = False
annotation_options (list) – download annotations options: list(dl.ViewAnnotationOptions)
annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download
overwrite (bool) – optional - default = False
to_items_folder (bool) – Create ‘items’ folder and download items to it
thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1
with_text (bool) – optional - add text to annotations, default = False
without_relative_path (bool) – bool - download items without the relative path from platform
avoid_unnecessary_annotation_download (bool) – default - False
include_annotations_in_output (bool) – default - False , if export should contain annotations
export_png_files (bool) – default - if True, semantic annotations should be exported as png files
filter_output_annotations (bool) – default - False, given an export by filter - determine if to filter out annotations
alpha (float) – opacity value [0 1], default 1
export_version (str) – exported items will have original extension in filename, V1 - no original extension in filenames
- Returns
generator of local_path per each downloaded item
- Return type
generator or single item
Example:
dataset.items.download(local_path='local_path', annotation_options=dl.ViewAnnotationOptions, overwrite=False, thickness=1, with_text=False, alpha=1, save_locally=True )
- get(filepath: Optional[str] = None, item_id: Optional[str] = None, fetch: Optional[bool] = None, is_dir: bool = False) Item [source]¶
Get Item object
Prerequisites: You must be in the role of an owner or developer.
- Parameters
- Returns
Item object
- Return type
Example:
dataset.items.get(item_id='item_id')
- get_all_items(filters: Optional[Filters] = None) [<class 'dtlpy.entities.item.Item'>] [source]¶
Get all items in dataset.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filters (dtlpy.entities.filters.Filters) – dl.Filters entity to filters items
- Returns
list of all items
- Return type
Example:
dataset.items.get_all_items()
- list(filters: Optional[Filters] = None, page_offset: Optional[int] = None, page_size: Optional[int] = None) PagedEntities [source]¶
List items in a dataset.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
page_offset (int) – start page
page_size (int) – page size
- Returns
Pages object
- Return type
Example:
dataset.items.list(page_offset=0, page_size=100)
- make_dir(directory, dataset: Optional[Dataset] = None) Item [source]¶
Create a directory in a dataset.
Prerequisites: All users.
- Parameters
directory (str) – name of directory
dataset (dtlpy.entities.dataset.Dataset) – dataset object
- Returns
Item object
- Return type
Example:
dataset.items.make_dir(directory='directory_name')
- move_items(destination: str, filters: Optional[Filters] = None, items=None, dataset: Optional[Dataset] = None) bool [source]¶
Move items to another directory. If directory does not exist we will create it
Prerequisites: You must be in the role of an owner or developer.
- Parameters
destination (str) – destination directory
filters (dtlpy.entities.filters.Filters) – optional - either this or items. Query of items to move
items – optional - either this or filters. A list of items to move
dataset (dtlpy.entities.dataset.Dataset) – dataset object
- Returns
True if success
- Return type
Example:
dataset.items.move_items(destination='directory_name')
- open_in_web(filepath=None, item_id=None, item=None)[source]¶
Open the item in web platform
Prerequisites: You must be in the role of an owner or developer or be an annotation manager/annotator with access to that item through task.
- Parameters
filepath (str) – item file path
item_id (str) – item id
item (dtlpy.entities.item.Item) – item entity
Example:
dataset.items.open_in_web(item_id='item_id')
- set_items_entity(entity)[source]¶
Set the item entity type to Artifact, Item, or Codebase.
- Parameters
entity (entities.Item, entities.Artifact, entities.Codebase) – entity type [entities.Item, entities.Artifact, entities.Codebase]
- update(item: Optional[Item] = None, filters: Optional[Filters] = None, update_values=None, system_update_values=None, system_metadata: bool = False)[source]¶
Update item metadata.
Prerequisites: You must be in the role of an owner or developer.
You must provide at least ONE of the following params: update_values, system_update_values.
- Parameters
item (dtlpy.entities.item.Item) – Item object
filters (dtlpy.entities.filters.Filters) – optional update filtered items by given filter
update_values – optional field to be updated and new values
system_update_values – values in system metadata to be updated
system_metadata (bool) – True, if you want to update the metadata system
- Returns
Item object
- Return type
Example:
dataset.items.update(item='item_entity')
- update_status(status: ItemStatus, items=None, item_ids=None, filters=None, dataset=None, clear=False)[source]¶
Update item status in task
Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned a task with the item.
You must provide at least ONE of the following params: items, item_ids, filters.
- Parameters
status (str) – ItemStatus.COMPLETED, ItemStatus.APPROVED, ItemStatus.DISCARDED
items (list) – list of items
item_ids (list) – list of items id
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
dataset (dtlpy.entities.dataset.Dataset) – dataset object
clear (bool) – to delete status
Example:
dataset.items.update_status(item_ids='item_id', status=dl.ItemStatus.COMPLETED)
- upload(local_path: str, local_annotations_path: ~typing.Optional[str] = None, remote_path: str = '/', remote_name: ~typing.Optional[str] = None, file_types: ~typing.Optional[~dtlpy.repositories.items.Items.list] = None, overwrite: bool = False, item_metadata: ~typing.Optional[dict] = None, output_entity=<class 'dtlpy.entities.item.Item'>, no_output: bool = False, export_version: str = ExportVersion.V1, item_description: ~typing.Optional[str] = None)[source]¶
Upload local file to dataset. Local filesystem will remain unchanged. If “*” at the end of local_path (e.g. “/images/*”) items will be uploaded without the head directory.
Prerequisites: Any user can upload items.
- Parameters
local_path (str) – list of local file, local folder, BufferIO, numpy.ndarray or url to upload
local_annotations_path (str) – path to dataloop format annotations json files.
remote_path (str) – remote path to save.
remote_name (str) – remote base name to save. when upload numpy.ndarray as local path, remote_name with .jpg or .png ext is mandatory
file_types (list) – list of file type to upload. e.g [‘.jpg’, ‘.png’]. default is all
item_metadata (dict) – metadata dict to upload to item or ExportMetadata option to export metadata from annotation file
overwrite (bool) – optional - default = False
output_entity – output type
no_output (bool) – do not return the items after upload
export_version (str) – exported items will have original extension in filename, V1 - no original extension in filenames
item_description (str) – add a string description to the uploaded item
- Returns
Output (generator/single item)
- Return type
generator or single item
Example:
dataset.items.upload(local_path='local_path', local_annotations_path='local_annotations_path', overwrite=True, item_metadata={'Hellow': 'Word'} )
Annotations¶
- class Annotations(client_api: ApiClient, item=None, dataset=None, dataset_id=None)[source]¶
Bases:
object
Annotations Repository
The Annotation class allows you to manage the annotations of data items. For information on annotations explore our documentation at: Classification SDK, Annotation Labels and Attributes, Show Video with Annotations.
- builder()[source]¶
Create Annotation collection.
- Prerequisites: You must have an item to be annotated. You must have the role of an owner or developer
or be assigned a task that includes that item as an annotation manager or annotator.
- Returns
Annotation collection object
- Return type
Example:
annotation_collection = item.annotations.builder()
- delete(annotation: Optional[Annotation] = None, annotation_id: Optional[str] = None, filters: Optional[Filters] = None) bool [source]¶
Remove an annotation from item.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
annotation (dtlpy.entities.annotation.Annotation) – Annotation object
annotation_id (str) – The id of the annotation
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
True/False
- Return type
Example:
is_deleted = item.annotations.delete(annotation_id='annotation_id')
- download(filepath: str, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.JSON, img_filepath: Optional[str] = None, height: Optional[float] = None, width: Optional[float] = None, thickness: int = 1, with_text: bool = False, alpha: float = 1)[source]¶
Save annotation to file.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
filepath (str) – Target download directory
annotation_format (str) – the format that want to download ,options: list(dl.ViewAnnotationOptions)
img_filepath (str) – img file path - needed for img_mask
height (float) – optional - image height
width (float) – optional - image width
thickness (int) – optional - line thickness, default=1
with_text (bool) – optional - draw annotation with text, default = False
alpha (float) – opacity value [0 1], default 1
- Returns
file path to where save the annotations
- Return type
Example:
file_path = item.annotations.download( filepath='file_path', annotation_format=dl.ViewAnnotationOptions.MASK, img_filepath='img_filepath', height=100, width=100, thickness=1, with_text=False, alpha=1)
- get(annotation_id: str) Annotation [source]¶
Get a single annotation.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
annotation_id (str) – The id of the annotation
- Returns
Annotation object or None
- Returns
Annotation object or None
- Return type
Example:
annotation = item.annotations.get(annotation_id='annotation_id')
- list(filters: Optional[Filters] = None, page_offset: Optional[int] = None, page_size: Optional[int] = None)[source]¶
List Annotations of a specific item. You must get the item first and then list the annotations with the desired filters.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
page_offset (int) – starting page
page_size (int) – size of page
- Returns
Pages object
- Return type
Example:
annotations = item.annotations.list(filters=dl.Filters( resource=dl.FiltersResource.ANNOTATION, field='type', values='box'), page_size=100, page_offset=0)
- show(image=None, thickness: int = 1, with_text: bool = False, height: Optional[float] = None, width: Optional[float] = None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.MASK, alpha: float = 1)[source]¶
Show annotations. To use this method, you must get the item first and then show the annotations with the desired filters. The method returns an array showing all the annotations.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
image (ndarray) – empty or image to draw on
thickness (int) – optional - line thickness, default=1
with_text (bool) – add label to annotation
height (float) – item height
width (float) – item width
annotation_format (str) – the format that want to show ,options: list(dl.ViewAnnotationOptions)
alpha (float) – opacity value [0 1], default 1
- Returns
ndarray of the annotations
- Return type
ndarray
Example:
image = item.annotations.show(image='nd array', thickness=1, with_text=False, height=100, width=100, annotation_format=dl.ViewAnnotationOptions.MASK, alpha=1)
- update(annotations, system_metadata=False)[source]¶
Update an existing annotation. For example, you may change the annotation’s label and then use the update method.
- Prerequisites: You must have an item that has been annotated. You must have the role of an owner or
developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
annotations (dtlpy.entities.annotation.Annotation) – Annotation object
system_metadata (bool) – bool - True, if you want to change metadata system
- Returns
True if successful or error if unsuccessful
- Return type
bool
Example:
annotations = item.annotations.update(annotation='annotation')
- update_status(annotation: Optional[Annotation] = None, annotation_id: Optional[str] = None, status: AnnotationStatus = AnnotationStatus.ISSUE) Annotation [source]¶
Set status on annotation.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager.
- Parameters
annotation (dtlpy.entities.annotation.Annotation) – Annotation object
annotation_id (str) – optional - annotation id to set status
status (str) – can be AnnotationStatus.ISSUE, APPROVED, REVIEW, CLEAR
- Returns
Annotation object
- Return type
Example:
annotation = item.annotations.update_status(annotation_id='annotation_id', status=dl.AnnotationStatus.ISSUE)
- upload(annotations) AnnotationCollection [source]¶
Upload a new annotation/annotations. You must first create the annotation using the annotation builder method.
Prerequisites: Any user can upload annotations.
- Parameters
annotations (List[dtlpy.entities.annotation.Annotation] or dtlpy.entities.annotation.Annotation) – list or
single annotation of type Annotation :return: list of annotation objects :rtype: entities.AnnotationCollection
Example:
annotations = item.annotations.upload(annotations='builder')
Recipes¶
- class Recipes(client_api: ApiClient, dataset: Optional[Dataset] = None, project: Optional[Project] = None, project_id: Optional[str] = None)[source]¶
Bases:
object
Recipes Repository
The Recipes class allows you to manage recipes and their properties. For more information on Recipes, see our documentation and SDK documentation.
- clone(recipe: Optional[Recipe] = None, recipe_id: Optional[str] = None, shallow: bool = False)[source]¶
Clone recipe.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
recipe (dtlpy.entities.recipe.Recipe) – Recipe object
recipe_id (str) – Recipe id
shallow (bool) – If True, link to existing ontology, clones all ontologies that are linked to the recipe as well
- Returns
Cloned ontology object
- Return type
Example:
dataset.recipes.clone(recipe_id='recipe_id')
- create(project_ids=None, ontology_ids=None, labels=None, recipe_name=None, attributes=None, annotation_instruction_file=None) Recipe [source]¶
Create a new Recipe. Note: If the param ontology_ids is None, an ontology will be created first.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
- Returns
Recipe entity
- Return type
Example:
dataset.recipes.create(recipe_name='My Recipe', labels=labels))
- delete(recipe_id: str, force: bool = False)[source]¶
Delete recipe from platform.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
- Returns
True if success
- Return type
Example:
dataset.recipes.delete(recipe_id='recipe_id')
- get(recipe_id: str) Recipe [source]¶
Get a Recipe object to use in your code.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
recipe_id (str) – recipe id
- Returns
Recipe object
- Return type
Example:
dataset.recipes.get(recipe_id='recipe_id')
- list(filters: Optional[Filters] = None) List[Recipe] [source]¶
List recipes for a dataset.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
list of all recipes
- Retype
list
Example:
dataset.recipes.list()
- open_in_web(recipe: Optional[Recipe] = None, recipe_id: Optional[str] = None)[source]¶
Open the recipe in web platform.
Prerequisites: All users.
- Parameters
recipe (dtlpy.entities.recipe.Recipe) – recipe entity
recipe_id (str) – recipe id
Example:
dataset.recipes.open_in_web(recipe_id='recipe_id')
- update(recipe: Recipe, system_metadata=False) Recipe [source]¶
Update recipe.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
recipe (dtlpy.entities.recipe.Recipe) – Recipe object
system_metadata (bool) – True, if you want to change metadata system
- Returns
Recipe object
- Return type
Example:
dataset.recipes.update(recipe='recipe_entity')
Ontologies¶
- class Ontologies(client_api: ApiClient, recipe: Optional[Recipe] = None, project: Optional[Project] = None, dataset: Optional[Dataset] = None)[source]¶
Bases:
object
Ontologies Repository
The Ontologies class allows users to manage ontologies and their properties. Read more about ontology in our SDK docs.
- create(labels, title=None, project_ids=None, attributes=None) Ontology [source]¶
Create a new ontology.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
- Returns
Ontology object
- Return type
Example:
recipe.ontologies.create(labels='labels_entity', title='new_ontology', project_ids='project_ids')
- delete(ontology_id)[source]¶
Delete Ontology from the platform.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
ontology_id – ontology id
- Returns
True if success
- Return type
Example:
recipe.ontologies.delete(ontology_id='ontology_id')
- delete_attributes(ontology_id, keys: list)[source]¶
Delete a bulk of attributes
- Parameters
- Returns
True if success
- Return type
Example:
ontology.delete_attributes(['1'])
- get(ontology_id: str) Ontology [source]¶
Get Ontology object to use in your code.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
ontology_id (str) – ontology id
- Returns
Ontology object
- Return type
Example:
recipe.ontologies.get(ontology_id='ontology_id')
- static labels_to_roots(labels)[source]¶
Converts labels dictionary to a list of platform representation of labels.
- Parameters
labels (dict) – labels dict
- Returns
platform representation of labels
- list(project_ids=None) List[Ontology] [source]¶
List ontologies for recipe
Prerequisites: You must be in the role of an owner or developer.
- Parameters
project_ids –
- Returns
list of all the ontologies
Example:
recipe.ontologies.list(project_ids='project_ids')
- update(ontology: Ontology, system_metadata=False) Ontology [source]¶
Update the Ontology metadata.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
ontology (dtlpy.entities.ontology.Ontology) – Ontology object
system_metadata (bool) – bool - True, if you want to change metadata system
- Returns
Ontology object
- Return type
Example:
recipe.ontologies.delete(ontology='ontology_entity')
- update_attributes(ontology_id: str, title: str, key: str, attribute_type: AttributesTypes, scope: Optional[list] = None, optional: Optional[bool] = None, values: Optional[list] = None, attribute_range: Optional[AttributesRange] = None)[source]¶
ADD a new attribute or update if exist
- Parameters
ontology_id (str) – ontology_id
title (str) – attribute title
key (str) – the key of the attribute must br unique
attribute_type (AttributesTypes) – dl.AttributesTypes your attribute type
scope (list) – list of the labels or * for all labels
optional (bool) – optional attribute
values (list) – list of the attribute values ( for checkbox and radio button)
attribute_range (dict or AttributesRange) – dl.AttributesRange object
- Returns
true in success
- Return type
Example:
ontology.update_attributes(key='1', title='checkbox', attribute_type=dl.AttributesTypes.CHECKBOX, values=[1,2,3])
Tasks¶
- class Tasks(client_api: ApiClient, project: Optional[Project] = None, dataset: Optional[Dataset] = None, project_id: Optional[str] = None)[source]¶
Bases:
object
Tasks Repository
The Tasks class allows the user to manage tasks and their properties. For more information, read in our SDK documentation about Creating Tasks, Redistributing and Reassigning Tasks, and Task Assignment.
- add_items(task: Optional[Task] = None, task_id=None, filters: Optional[Filters] = None, items=None, assignee_ids=None, query=None, workload=None, limit=None, wait=True) Task [source]¶
Add items to a Task.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
- Parameters
task (dtlpy.entities.task.Task) – task object
task_id (str) – the Id of the task
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
items (list) – list of items (item Ids or objects) to add to the task
assignee_ids (list) – list to assignee who works in the task
query (dict) – query to filter the items for the task
workload (list) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
limit (int) – the limit items that task can include
wait (bool) – wait until add items will to finish
- Returns
task entity
- Return type
Example:
dataset.tasks.add_items(task= 'task_entity', items = [items])
- create(task_name, due_date=None, assignee_ids=None, workload=None, dataset=None, task_owner=None, task_type='annotation', task_parent_id=None, project_id=None, recipe_id=None, assignments_ids=None, metadata=None, filters=None, items=None, query=None, available_actions=None, wait=True, check_if_exist: Filters = False, limit=None, batch_size=None, max_batch_workload=None, allowed_assignees=None, priority=TaskPriority.MEDIUM) Task [source]¶
Create a new Task (Annotation or QA).
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
- Parameters
task_name (str) – the name of the task
due_date (float) – date by which the task should be finished; for example, due_date=datetime.datetime(day=1, month=1, year=2029).timestamp()
assignee_ids (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails
workload (List[WorkloadUnit] List[WorkloadUnit]) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
dataset (entities.Dataset) – dataset object, the dataset that refer to the task
task_owner (str) – task owner. Provide user email
task_type (str) – task type “annotation” or “qa”
task_parent_id (str) – optional if type is qa - parent annotation task id
project_id (str) – the Id of the project where task will be created
recipe_id (str) – recipe id for the task
assignments_ids (list) – assignments ids to the task
metadata (dict) – metadata for the task
filters (entities.Filters) – dl.Filters entity to filter items for the task
items (List[entities.Item]) – list of items (item Id or objects) to insert to the task
query (dict DQL) – filter items for the task
available_actions (list) – list of available actions (statuses) that will be available for the task items; The default statuses are: “Completed” and “Discarded”
wait (bool) – wait until create task finish
check_if_exist (entities.Filters) – dl.Filters check if task exist according to filter
limit (int) – the limit items that the task can include
batch_size (int) – Pulling batch size (items), use with pulling allocation method. Restrictions - Min 3, max 100
max_batch_workload (int) – max_batch_workload: Max items in assignment, use with pulling allocation method. Restrictions - Min batchSize + 2, max batchSize * 2
allowed_assignees (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails
priority (entities.TaskPriority) – priority of the task options in entities.TaskPriority
- Returns
Task object
- Return type
Example:
dataset.tasks.create(task= 'task_entity', due_date = datetime.datetime(day= 1, month= 1, year= 2029).timestamp(), assignee_ids =[ 'annotator1@dataloop.ai', 'annotator2@dataloop.ai'])
- create_qa_task(task: Task, assignee_ids, due_date=None, filters=None, items=None, query=None, workload=None, metadata=None, available_actions=None, wait=True, batch_size=None, max_batch_workload=None, allowed_assignees=None, priority=TaskPriority.MEDIUM) Task [source]¶
Create a new QA Task.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
- Parameters
task (dtlpy.entities.task.Task) – the parent annotation task object
assignee_ids (list) – list the QA task assignees (contributors) that should be working on the task. Provide a list of users’ emails
due_date (float) – date by which the QA task should be finished; for example, due_date=datetime.datetime(day=1, month=1, year=2029).timestamp()
filters (entities.Filters) – dl.Filters entity to filter items for the task
items (List[entities.Item]) – list of items (item Id or objects) to insert to the task
query (dict DQL) – filter items for the task
workload (List[WorkloadUnit]) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
metadata (dict) – metadata for the task
available_actions (list) – list of available actions (statuses) that will be available for the task items; The default statuses are: “Approved” and “Discarded”
wait (bool) – wait until create task finish
batch_size (int) – Pulling batch size (items), use with pulling allocation method. Restrictions - Min 3, max 100
max_batch_workload (int) – Max items in assignment, use with pulling allocation method. Restrictions - Min batchSize + 2, max batchSize * 2
allowed_assignees (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails
priority (entities.TaskPriority) – priority of the task options in entities.TaskPriority
- Returns
task object
- Return type
Example:
dataset.tasks.create_qa_task(task= 'task_entity', due_date = datetime.datetime(day= 1, month= 1, year= 2029).timestamp(), assignee_ids =[ 'annotator1@dataloop.ai', 'annotator2@dataloop.ai'])
- delete(task: Optional[Task] = None, task_name: Optional[str] = None, task_id: Optional[str] = None, wait: bool = True)[source]¶
Delete the Task.
Prerequisites: You must be in the role of an owner or developer or annotation manager who created that task.
- Parameters
task (dtlpy.entities.task.Task) – the task object
task_name (str) – the name of the task
task_id (str) – the Id of the task
wait (bool) – wait until delete task finish
- Returns
True is success
- Return type
Example:
dataset.tasks.delete(task_id='task_id')
- get(task_name=None, task_id=None) Task [source]¶
Get a Task object to use in your code.
Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.
- Parameters
- Returns
task object
- Return type
Example:
dataset.tasks.get(task_id='task_id')
- get_items(task_id: Optional[str] = None, task_name: Optional[str] = None, dataset: Optional[Dataset] = None, filters: Optional[Filters] = None) PagedEntities [source]¶
Get the task items to use in your code.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
If a filters param is provided, you will receive a PagedEntity output of the task items. If no filter is provided, you will receive a list of the items.
- Parameters
task_id (str) – the Id of the task
task_name (str) – the name of the task
dataset (dtlpy.entities.dataset.Dataset) – dataset object that refer to the task
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
list of the items or PagedEntity output of items
- Return type
Example:
dataset.tasks.get_items(task_id= 'task_id')
- list(project_ids=None, status=None, task_name=None, pages_size=None, page_offset=None, recipe=None, creator=None, assignments=None, min_date=None, max_date=None, filters: Optional[Filters] = None) Union[List[Task], PagedEntities] [source]¶
List all tasks.
Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.
- Parameters
project_ids – search tasks by given list of project ids
status (str) – search tasks by a given task status
task_name (str) – search tasks by a given task name
pages_size (int) – pages size of the output generator
page_offset (int) – page offset of the output generator
recipe (dtlpy.entities.recipe.Recipe) – Search tasks that use a given recipe. Provide the required recipe object
creator (str) – search tasks created by a given creator (user email)
assignments (dtlpy.entities.assignment.Assignment recipe) – assignments object
min_date (double) – search all tasks created AFTER a given date, use a milliseconds format. For example: 1661780622008
max_date (double) – search all tasks created BEFORE a given date, use a milliseconds format. For example: 1661780622008
filters (dtlpy.entities.filters.Filters) – dl.Filters entity to filters tasks using DQL
- Returns
List of Task objects
Example:
dataset.tasks.list(project_ids='project_ids',pages_size=100, page_offset=0)
- open_in_web(task_name: Optional[str] = None, task_id: Optional[str] = None, task: Optional[Task] = None)[source]¶
Open the task in the web platform.
Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.
- Parameters
task_name (str) – the name of the task
task_id (str) – the Id of the task
task (dtlpy.entities.task.Task) – the task object
Example:
dataset.tasks.open_in_web(task_id='task_id')
- query(filters=None, project_ids=None)[source]¶
List all tasks by filter.
Prerequisites: You must be in the role of an owner or developer or annotation manager who has been assigned the task.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
project_ids (list) – list of project ids of the required tasks
- Returns
Paged entity - task pages generator
- Return type
Example:
dataset.tasks.query(project_ids='project_ids')
- remove_items(task: Optional[Task] = None, task_id=None, filters: Optional[Filters] = None, query=None, items=None, wait=True)[source]¶
remove items from Task.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
- Parameters
task (dtlpy.entities.task.Task) – task object
task_id (str) – the Id of the task
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
query (dict) – query to filter the items use it
items (list) – list of items to add to the task
wait (bool) – wait until remove items finish
- Returns
True if success and an error if failed
- Return type
Examples:
dataset.tasks.remove_items(task= 'task_entity', items = [items])
- set_status(status: str, operation: str, task_id: str, item_ids: List[str])[source]¶
Update an item status within a task.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
- Parameters
- Returns
True if success
- Return type
Example:
dataset.tasks.set_status(task_id= 'task_id', status='complete', operation='create')
- update(task: Optional[Task] = None, system_metadata=False) Task [source]¶
Update a Task.
Prerequisites: You must be in the role of an owner or developer or annotation manager who created that task.
- Parameters
task (dtlpy.entities.task.Task) – the task object
system_metadata (bool) – True, if you want to change metadata system
- Returns
Task object
- Return type
Example:
dataset.tasks.update(task='task_entity')
Assignments¶
- class Assignments(client_api: ApiClient, project: Optional[Project] = None, task: Optional[Task] = None, dataset: Optional[Dataset] = None, project_id=None)[source]¶
Bases:
object
Assignments Repository
The Assignments class allows users to manage assignments and their properties. Read more about Task Assignment in our SDK documentation.
- create(assignee_id: str, task: Optional[Task] = None, filters: Optional[Filters] = None, items: Optional[list] = None) Assignment [source]¶
Create a new assignment.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
assignee_id (str) – the email of the user that want to assign the assignment
task (dtlpy.entities.task.Task) – the task object that include the assignment
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
items (list) – list of items (item Id or objects) to insert to the assignment
- Returns
Assignment object
- Return type
dtlpy.entities.assignment.Assignment assignment
Example:
assignment = task.assignments.create(assignee_id='annotator1@dataloop.ai')
- get(assignment_name: Optional[str] = None, assignment_id: Optional[str] = None)[source]¶
Get Assignment object to use it in your code.
- Parameters
- Returns
Assignment object
- Return type
Example:
assignment = task.assignments.get(assignment_id='assignment_id')
- get_items(assignment: Optional[Assignment] = None, assignment_id=None, assignment_name=None, dataset=None, filters=None) PagedEntities [source]¶
Get all the items in the assignment.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
assignment (dtlpy.entities.assignment.Assignment) – assignment object
assignment_id – the Id of the assignment
assignment_name (str) – the name of the assignment
dataset (dtlpy.entities.dataset.Dataset) – dataset object, the dataset that refer to the assignment
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
pages of the items
- Return type
Example:
items = task.assignments.get_items(assignment_id='assignment_id')
- list(project_ids: Optional[list] = None, status: Optional[str] = None, assignment_name: Optional[str] = None, assignee_id: Optional[str] = None, pages_size: Optional[int] = None, page_offset: Optional[int] = None, task_id: Optional[int] = None) List[Assignment] [source]¶
Get Assignment list to be able to use it in your code.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
project_ids (list) – search assignment by given list of project ids
status (str) – search assignment by a given task status
assignment_name (str) – search assignment by a given assignment name
assignee_id (str) – the user email that assignee the assignment to it
pages_size (int) – pages size of the output generator
page_offset (int) – page offset of the output generator
task_id (str) – search assignment by given task id
- Returns
List of Assignment objects
- Return type
miscellaneous.List[dtlpy.entities.assignment.Assignment]
Example:
assignments = task.assignments.list(status='complete', assignee_id='user@dataloop.ai', pages_size=100, page_offset=0)
- open_in_web(assignment_name: Optional[str] = None, assignment_id: Optional[str] = None, assignment: Optional[str] = None)[source]¶
Open the assignment in the platform.
Prerequisites: All users.
- Parameters
assignment_name (str) – the name of the assignment
assignment_id (str) – the Id of the assignment
assignment (dtlpy.entities.assignment.Assignment) – assignment object
Example:
task.assignments.open_in_web(assignment_id='assignment_id')
- reassign(assignee_id: str, assignment: Optional[Assignment] = None, assignment_id: Optional[str] = None, task: Optional[Task] = None, task_id: Optional[str] = None, wait: bool = True)[source]¶
Reassign an assignment.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
assignee_id (str) – the email of the user that want to assign the assignment
assignment (dtlpy.entities.assignment.Assignment) – assignment object
assignment_id – the Id of the assignment
task (dtlpy.entities.task.Task) – task object
task_id (str) – the Id of the task that include the assignment
wait (bool) – wait until reassign assignment finish
- Returns
Assignment object
- Return type
Example:
assignment = task.assignments.reassign(assignee_ids='annotator1@dataloop.ai')
- redistribute(workload: Workload, assignment: Optional[Assignment] = None, assignment_id: Optional[str] = None, task: Optional[Task] = None, task_id: Optional[str] = None, wait: bool = True)[source]¶
Redistribute an assignment.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
Example:
- Parameters
workload (dtlpy.entities.assignment.Workload) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
assignment (dtlpy.entities.assignment.Assignment) – assignment object
assignment_id (str) – the Id of the assignment
task (dtlpy.entities.task.Task) – the task object that include the assignment
task_id (str) – the Id of the task that include the assignment
wait (bool) – wait until redistribute assignment finish
- Returns
Assignment object
- Return type
dtlpy.entities.assignment.Assignment assignment
assignment = task.assignments.redistribute(workload=dl.Workload([dl.WorkloadUnit(assignee_id="annotator1@dataloop.ai", load=50), dl.WorkloadUnit(assignee_id="annotator2@dataloop.ai", load=50)]))
- set_status(status: str, operation: str, item_id: str, assignment_id: str) bool [source]¶
Set item status within assignment.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
- Returns
True id success
- Return type
Example:
success = task.assignments.set_status(assignment_id='assignment_id', status='complete', operation='created', item_id='item_id')
- update(assignment: Optional[Assignment] = None, system_metadata: bool = False) Assignment [source]¶
Update an assignment.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
assignment (dtlpy.entities.assignment.Assignment assignment) – assignment entity
system_metadata (bool) – True, if you want to change metadata system
- Returns
Assignment object
- Return type
dtlpy.entities.assignment.Assignment assignment
Example:
assignment = task.assignments.update(assignment='assignment_entity', system_metadata=False)
Packages¶
- class LocalServiceRunner(client_api: ApiClient, packages, cwd=None, multithreading=False, concurrency=10, package: Optional[Package] = None, module_name='default_module', function_name='run', class_name='ServiceRunner', entry_point='main.py', mock_file_path=None)[source]¶
Bases:
object
Service Runner Class
- class Packages(client_api: ApiClient, project: Optional[Project] = None)[source]¶
Bases:
object
Packages Repository
The Packages class allows users to manage packages (code used for running in Dataloop’s FaaS) and their properties. Read more about Packages.
- build(package: Package, module_name=None, init_inputs=None, local_path=None, from_local=None)[source]¶
Instantiate a module from the package code. Returns a loaded instance of the runner class
- Parameters
package – Package entity
module_name – Name of the module to build the runner class
init_inputs (str) – dictionary of the class init variables (if exists). will be used to init the module class
local_path (str) – local path of the package (if from_local=False - codebase will be downloaded)
from_local (bool) – bool. if true - codebase will not be downloaded (only use local files)
- Returns
dl.BaseServiceRunner
- build_requirements(filepath) list [source]¶
Build a requirement list (list of packages your code requires to run) from a file path. The file listing the requirements MUST BE a txt file.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filepath – path of the requirements file
- Returns
a list of dl.PackageRequirement
- Return type
- static build_trigger_dict(actions, name='default_module', filters=None, function='run', execution_mode: TriggerExecutionMode = 'Once', type_t: TriggerType = 'Event')[source]¶
Build a trigger dictionary to trigger FaaS. Read more about FaaS Triggers.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
actions – list of dl.TriggerAction
name (str) – trigger name
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
function (str) – function name
execution_mode (str) – execution mode dl.TriggerExecutionMode
type_t (str) – trigger type dl.TriggerType
- Returns
trigger dict
- Return type
Example:
project.packages.build_trigger_dict(actions=dl.TriggerAction.CREATED, function='run', execution_mode=dl.TriggerExecutionMode.ONCE)
- static check_cls_arguments(cls, missing, function_name, function_inputs)[source]¶
Check class arguments. This method checks that the package function is correct.
Prerequisites: You must be in the role of an owner or developer.
- checkout(package: Optional[Package] = None, package_id: Optional[str] = None, package_name: Optional[str] = None)[source]¶
Checkout (switch) to a package.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package (dtlpy.entities.package.Package) – package entity
package_id (str) – package id
package_name (str) – package name
Example:
project.packages.checkout(package='package_entity')
- delete(package: Optional[Package] = None, package_name=None, package_id=None)[source]¶
Delete a Package object.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package (dtlpy.entities.package.Package) – package entity
package_id (str) – package id
package_name (str) – package name
- Returns
True if success
- Return type
Example:
project.packages.delete(package_name='package_name')
- deploy(package_id: Optional[str] = None, package_name: Optional[str] = None, package: Optional[Package] = None, service_name: Optional[str] = None, project_id: Optional[str] = None, revision: Optional[str] = None, init_input: Optional[Union[List[FunctionIO], FunctionIO, dict]] = None, runtime: Optional[Union[KubernetesRuntime, dict]] = None, sdk_version: Optional[str] = None, agent_versions: Optional[dict] = None, bot: Optional[Union[Bot, str]] = None, pod_type: Optional[InstanceCatalog] = None, verify: bool = True, checkout: bool = False, module_name: Optional[str] = None, run_execution_as_process: Optional[bool] = None, execution_timeout: Optional[int] = None, drain_time: Optional[int] = None, on_reset: Optional[str] = None, max_attempts: Optional[int] = None, force: bool = False, secrets: Optional[list] = None, **kwargs) Service [source]¶
Deploy a package. A service is required to run the code in your package.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package_id (str) – package id
package_name (str) – package name
package (dtlpy.entities.package.Package) – package entity
service_name (str) – service name
project_id (str) – project id
revision (str) – package revision - default=latest
init_input – config to run at startup
runtime (dict) – runtime resources
sdk_version (str) –
optional - string - sdk version
agent_versions (dict) –
dictionary - - optional -versions of sdk, agent runner and agent proxy
bot (str) – bot email
pod_type (str) – pod type dl.InstanceCatalog
verify (bool) – verify the inputs
checkout (bool) – checkout
module_name (str) – module name
run_execution_as_process (bool) – run execution as process
execution_timeout (int) – execution timeout
drain_time (int) – drain time
on_reset (str) – on reset
max_attempts (int) – Maximum execution retries in-case of a service reset
force (bool) – optional - terminate old replicas immediately
secrets (list) – list of the integrations ids
- Returns
Service object
- Return type
Example:
project.packages.deploy(service_name=package_name, execution_timeout=3 * 60 * 60, module_name=module.name, runtime=dl.KubernetesRuntime( concurrency=10, pod_type=dl.InstanceCatalog.REGULAR_S, autoscaler=dl.KubernetesRabbitmqAutoscaler( min_replicas=1, max_replicas=20, queue_length=20 ) ) )
- deploy_from_file(project, json_filepath)[source]¶
Deploy package and service from a JSON file.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
project (dtlpy.entities.project.Project) – project entity
json_filepath (str) – path of the file to deploy
- Returns
the package and the services
Example:
project.packages.deploy_from_file(project='project_entity', json_filepath='json_filepath')
- static generate(name=None, src_path: Optional[str] = None, service_name: Optional[str] = None, package_type='default_package_type')[source]¶
Generate a new package. Provide a file path to a JSON file with all the details of the package and service to generate the package.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
Example:
project.packages.generate(name='package_name', src_path='src_path')
- get(package_name: Optional[str] = None, package_id: Optional[str] = None, checkout: bool = False, fetch=None) Package [source]¶
Get Package object to use in your code.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
- Returns
Package object
- Return type
Example:
project.packages.get(package_id='package_id')
- list(filters: Optional[Filters] = None, project_id: Optional[str] = None) PagedEntities [source]¶
List project packages.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
project_id (str) – project id
- Returns
Paged entity
- Return type
Example:
project.packages.list()
- open_in_web(package: Optional[Package] = None, package_id: Optional[str] = None, package_name: Optional[str] = None)[source]¶
Open the package in the web platform.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package (dtlpy.entities.package.Package) – package entity
package_id (str) – package id
package_name (str) – package name
Example:
project.packages.open_in_web(package_id='package_id')
- pull(package: Package, version=None, local_path=None, project_id=None)[source]¶
Pull (download) the package to a local path.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package (dtlpy.entities.package.Package) – package entity
version (str) – the package version to pull
local_path – the path of where to save the package
project_id – the project id that include the package
- Returns
local path where the package pull
- Return type
Example:
project.packages.pull(package='package_entity', local_path='local_path')
- push(project: Optional[Project] = None, project_id: Optional[str] = None, package_name: Optional[str] = None, src_path: Optional[str] = None, codebase: Optional[Union[GitCodebase, ItemCodebase, FilesystemCodebase]] = None, modules: Optional[List[PackageModule]] = None, is_global: Optional[bool] = None, checkout: bool = False, revision_increment: Optional[str] = None, version: Optional[str] = None, ignore_sanity_check: bool = False, service_update: bool = False, service_config: Optional[dict] = None, slots: Optional[List[PackageSlot]] = None, requirements: Optional[List[PackageRequirement]] = None, package_type=None, metadata=None) Package [source]¶
Push your local package to the UI.
Prerequisites: You must be in the role of an owner or developer.
Project will be taken in the following hierarchy: project(input) -> project_id(input) -> self.project(context) -> checked out
- Parameters
project (dtlpy.entities.project.Project) – optional - project entity to deploy to. default from context or checked-out
project_id (str) – optional - project id to deploy to. default from context or checked-out
package_name (str) – package name
src_path (str) – path to package codebase
codebase (dtlpy.entities.codebase.Codebase) – codebase object
modules (list) – list of modules PackageModules of the package
is_global (bool) – is package is global or local
checkout (bool) – checkout package to local dir
revision_increment (str) – optional - str - version bumping method - major/minor/patch - default = None
version (str) – semver version f the package
ignore_sanity_check (bool) – NOT RECOMMENDED - skip code sanity check before pushing
service_update (bool) – optional - bool - update the service
:param dict service_config : Service object as dict. Contains the spec of the default service to create. :param list slots: optional - list of slots PackageSlot of the package :param list requirements: requirements - list of package requirements :param str package_type: default ‘faas’, options: ‘app’, ‘ml :param dict metadata: dictionary of system and user metadata
- Returns
Package object
- Return type
Example:
project.packages.push(package_name='package_name', modules=[module], version='1.0.0', src_path=os.getcwd() )
- revisions(package: Optional[Package] = None, package_id: Optional[str] = None)[source]¶
Get the package revisions history.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package (dtlpy.entities.package.Package) – package entity
package_id (str) – package id
Example:
project.packages.revisions(package='package_entity')
- test_local_package(cwd: Optional[str] = None, concurrency: Optional[int] = None, package: Optional[Package] = None, module_name: str = 'default_module', function_name: str = 'run', class_name: str = 'ServiceRunner', entry_point: str = 'main.py', mock_file_path: Optional[str] = None)[source]¶
Test local package in local environment.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
cwd (str) – path to the file
concurrency (int) – the concurrency of the test
package (dtlpy.entities.package.Package) – entities.package
module_name (str) – module name
function_name (str) – function name
class_name (str) – class name
entry_point (str) – the file to run like main.py
mock_file_path (str) – the mock file that have the inputs
- Returns
list created by the function that tested the output
- Return type
Example:
project.packages.test_local_package(cwd='path_to_package', package='package_entity', function_name='run')
- update(package: Package, revision_increment: Optional[str] = None) Package [source]¶
Update Package changes to the platform.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
package (dtlpy.entities.package.Package) –
revision_increment – optional - str - version bumping method - major/minor/patch - default = None
- Returns
Package object
- Return type
Example:
project.packages.delete(package='package_entity')
Codebases¶
- class Codebases(client_api: ApiClient, project: Optional[Project] = None, dataset: Optional[Dataset] = None, project_id: Optional[str] = None)[source]¶
Bases:
object
Codebase Repository
The Codebases class allows the user to manage codebases and their properties. The codebase is the code the user uploads for the user’s packages to run. Read more about codebase in our FaaS (function as a service).
- clone_git(codebase: Codebase, local_path: str)[source]¶
Clone code base
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
codebase (dtlpy.entities.codebase.Codebase) – codebase object
local_path (str) – local path
- Returns
path where the clone will be
- Return type
str
Example:
package.codebases.clone_git(codebase='codebase_entity', local_path='local_path')
- get(codebase_name: Optional[str] = None, codebase_id: Optional[str] = None, version: Optional[str] = None)[source]¶
Get a Codebase object to use in your code.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
Example:
package.codebases.get(codebase_name='codebase_name')
- static get_current_version(all_versions_pages, zip_md)[source]¶
This method returns the current version of the codebase and other versions found.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
all_versions_pages (codebase) – codebase object
zip_md – zipped file of codebase
- Returns
current version and all versions found of codebase
- Return type
Example:
package.codebases.get_current_version(all_versions_pages='codebase_entity', zip_md='path')
- list() PagedEntities [source]¶
List all codebases.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
Example:
package.codebases.list()
- Returns
Paged entity
- Return type
- list_versions(codebase_name: str)[source]¶
List all codebase versions.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
Example:
package.codebases.list_versions(codebase_name='codebase_name')
- pack(directory: str, name: Optional[str] = None, extension: str = 'zip', description: str = '', ignore_directories: Optional[List[str]] = None)[source]¶
Zip a local code directory and post to codebases.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
- Returns
Codebase object
- Return type
dtlpy.entities.codebase.Codebase
Example:
package.codebases.pack(directory='path_dir', name='codebase_name')
- pull_git(codebase, local_path)[source]¶
Pull (download) a codebase.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
codebase (dtlpy.entities.codebase.Codebase) – codebase object
local_path (str) – local path
- Returns
path where the Pull will be
- Return type
Example:
package.codebases.pull_git(codebase='codebase_entity', local_path='local_path')
- unpack(codebase: Optional[Codebase] = None, codebase_name: Optional[str] = None, codebase_id: Optional[str] = None, local_path: Optional[str] = None, version: Optional[str] = None)[source]¶
Unpack codebase locally. Download source code and unzip.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
- Returns
String (dirpath)
- Return type
Example:
package.codebases.unpack(codebase='codebase_entity', local_path='local_path')
Services¶
- class ServiceLog(_json: dict, service: Service, services: Services, start=None, follow=None, execution_id=None, function_name=None, replica_id=None, system=False)[source]¶
Bases:
object
Service Log
- class Services(client_api: ApiClient, project: Optional[Project] = None, package: Optional[Package] = None, project_id=None)[source]¶
Bases:
object
Services Repository
The Services class allows the user to manage services and their properties. Services are created from the packages users create. See our documentation for more information about services.
- activate_slots(service: Service, project_id: Optional[str] = None, task_id: Optional[str] = None, dataset_id: Optional[str] = None, org_id: Optional[str] = None, user_email: Optional[str] = None, slots: Optional[List[PackageSlot]] = None, role=None, prevent_override: bool = True, visible: bool = True, icon: str = 'fas fa-magic', **kwargs)[source]¶
Activate service slots (creates buttons in the UI that activate services).
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service (dtlpy.entities.service.Service) – service entity
project_id (str) – project id
task_id (str) – task id
dataset_id (str) – dataset id
org_id (str) – org id
user_email (str) – user email
slots (list) – list of entities.PackageSlot
role (str) – user role MemberOrgRole.ADMIN, MemberOrgRole.owner, MemberOrgRole.MEMBER
prevent_override (bool) – True to prevent override
visible (bool) – visible
icon (str) – icon
kwargs – all additional arguments
- Returns
list of user setting for activated slots
- Return type
Example:
setting = package.services.activate_slots(service='service_entity', project_id='project_id', slots=List[entities.PackageSlot], icon='fas fa-magic')
- checkout(service: Optional[Service] = None, service_name: Optional[str] = None, service_id: Optional[str] = None)[source]¶
Checkout (switch) to a service.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service (dtlpy.entities.service.Service) – Service entity
service_name (str) – service name
service_id (str) – service id
Example:
package.services.checkout(service_id='service_id')
- delete(service_name: Optional[str] = None, service_id: Optional[str] = None)[source]¶
Delete Service object
Prerequisites: You must be in the role of an owner or developer. You must have a package.
You must provide at least ONE of the following params: service_id, service_name.
Example:
is_deleted = package.services.delete(service_id='service_id')
- deploy(service_name: Optional[str] = None, package: Optional[Package] = None, bot: Optional[Union[Bot, str]] = None, revision: Optional[str] = None, init_input: Optional[Union[List[FunctionIO], FunctionIO, dict]] = None, runtime: Optional[Union[KubernetesRuntime, dict]] = None, pod_type: Optional[InstanceCatalog] = None, sdk_version: Optional[str] = None, agent_versions: Optional[dict] = None, verify: bool = True, checkout: bool = False, module_name: Optional[str] = None, project_id: Optional[str] = None, driver_id: Optional[str] = None, func: Optional[Callable] = None, run_execution_as_process: Optional[bool] = None, execution_timeout: Optional[int] = None, drain_time: Optional[int] = None, max_attempts: Optional[int] = None, on_reset: Optional[str] = None, force: bool = False, secrets: Optional[list] = None, **kwargs) Service [source]¶
Deploy service.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service_name (str) – name
package (dtlpy.entities.package.Package) – package entity
bot (str) – bot email
revision (str) – package revision of version
init_input – config to run at startup
runtime (dict) – runtime resources
pod_type (str) – pod type dl.InstanceCatalog
sdk_version (str) –
optional - string - sdk version
agent_versions (str) –
dictionary - - optional -versions of sdk
verify (bool) – if true, verify the inputs
checkout (bool) – if true, checkout (switch) to service
module_name (str) – module name
project_id (str) – project id
driver_id (str) – driver id
func (Callable) – function to deploy
run_execution_as_process (bool) – if true, run execution as process
execution_timeout (int) – execution timeout in seconds
drain_time (int) – drain time in seconds
max_attempts (int) – maximum execution retries in-case of a service reset
on_reset (str) – what happens on reset
force (bool) – optional - if true, terminate old replicas immediately
secrets (list) – list of the integrations ids
kwargs – list of additional arguments
- Returns
Service object
- Return type
Example:
service = package.services.deploy(service_name=package_name, execution_timeout=3 * 60 * 60, module_name=module.name, runtime=dl.KubernetesRuntime( concurrency=10, pod_type=dl.InstanceCatalog.REGULAR_S, autoscaler=dl.KubernetesRabbitmqAutoscaler( min_replicas=1, max_replicas=20, queue_length=20 ) ) )
- deploy_from_local_folder(cwd=None, service_file=None, bot=None, checkout=False, force=False) Service [source]¶
Deploy from local folder in local environment.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
- Returns
Service object
- Return type
Example:
service = package.services.deploy_from_local_folder(cwd='file_path', service_file='service_file')
- execute(service: Optional[Service] = None, service_id: Optional[str] = None, service_name: Optional[str] = None, sync: bool = False, function_name: Optional[str] = None, stream_logs: bool = False, execution_input=None, resource=None, item_id=None, dataset_id=None, annotation_id=None, project_id=None) Execution [source]¶
Execute a function on an existing service.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service (dtlpy.entities.service.Service) – service entity
service_id (str) – service id
service_name (str) – service name
sync (bool) – wait for function to end
function_name (str) – function name to run
stream_logs (bool) – prints logs of the new execution. only works with sync=True
execution_input – input dictionary or list of FunctionIO entities
resource (str) – dl.PackageInputType - input type.
item_id (str) – str - optional - input to function
dataset_id (str) – str - optional - input to function
annotation_id (str) – str - optional - input to function
project_id (str) – str - resource’s project
- Returns
entities.Execution
- Return type
Example:
execution = package.services.execute(service='service_entity', function_name='run', item_id='item_id', project_id='project_id')
- get(service_name=None, service_id=None, checkout=False, fetch=None) Service [source]¶
Get service to use in your code.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
- Returns
Service object
- Return type
Example:
service = package.services.get(service_id='service_id')
- list(filters: Optional[Filters] = None) PagedEntities [source]¶
List all services (services can be listed for a package or for a project).
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
Paged entity
- Return type
Example:
services = package.services.list()
- log(service, size=100, checkpoint=None, start=None, end=None, follow=False, text=None, execution_id=None, function_name=None, replica_id=None, system=False, view=True, until_completed=True)[source]¶
Get service logs.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service (dtlpy.entities.service.Service) – service object
size (int) – size
checkpoint (dict) – the information from the lst point checked in the service
start (str) – iso format time
end (str) – iso format time
follow (bool) – if true, keep stream future logs
text (str) – text
execution_id (str) – execution id
function_name (str) – function name
replica_id (str) – replica id
system (bool) – system
view (bool) – if true, print out all the logs
until_completed (bool) – wait until completed
- Returns
ServiceLog entity
- Return type
Example:
service_logs = package.services.log(service='service_entity')
- name_validation(name: str)[source]¶
Validation service name.
Prerequisites: You must be in the role of an owner or developer.
- Parameters
name (str) – service name
Example:
package.services.name_validation(name='name')
- open_in_web(service: Optional[Service] = None, service_id: Optional[str] = None, service_name: Optional[str] = None)[source]¶
Open the service in web platform
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service_name (str) – service name
service_id (str) – service id
service (dtlpy.entities.service.Service) – service entity
Example:
package.services.open_in_web(service_id='service_id')
- pause(service_name: Optional[str] = None, service_id: Optional[str] = None, force: bool = False)[source]¶
Pause service.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
You must provide at least ONE of the following params: service_id, service_name
- Parameters
- Returns
True if success
- Return type
Example:
success = package.services.pause(service_id='service_id')
- resume(service_name: Optional[str] = None, service_id: Optional[str] = None, force: bool = False)[source]¶
Resume service.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
You must provide at least ONE of the following params: service_id, service_name.
- Parameters
- Returns
json of the service
- Return type
Example:
service_json = package.services.resume(service_id='service_id')
- revisions(service: Optional[Service] = None, service_id: Optional[str] = None)[source]¶
Get service revisions history.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
You must provide at leats ONE of the following params: service, service_id
- Parameters
service (dtlpy.entities.service.Service) – Service entity
service_id (str) – service id
Example:
service_revision = package.services.revisions(service_id='service_id')
- status(service_name=None, service_id=None)[source]¶
Get service status.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
You must provide at least ONE of the following params: service_id, service_name
- Parameters
- Returns
status json
- Return type
Example:
status_json = package.services.status(service_id='service_id')
- update(service: Service, force: bool = False) Service [source]¶
Update service changes to platform.
Prerequisites: You must be in the role of an owner or developer. You must have a package.
- Parameters
service (dtlpy.entities.service.Service) – Service entity
force (bool) – optional - terminate old replicas immediately
- Returns
Service entity
- Return type
Example:
service = package.services.update(service='service_entity')
Bots¶
- class Bots(client_api: ApiClient, project: Project)[source]¶
Bases:
object
Bots Repository
The Bots class allows the user to manage bots and their properties. See our documentation for more information on bots.
- create(name: str, return_credentials: bool = False)[source]¶
Create a new Bot.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
Bot object
- Return type
Example:
service.bots.delete(name='bot', return_credentials=False)
- delete(bot_id: Optional[str] = None, bot_email: Optional[str] = None)[source]¶
Delete a Bot.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
You must provide at least ONE of the following params: bot_id, bot_email
- Parameters
- Returns
True if successful
- Return type
Example:
service.bots.delete(bot_id='bot_id')
- get(bot_email: Optional[str] = None, bot_id: Optional[str] = None, bot_name: Optional[str] = None)[source]¶
Get a Bot object.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
Bot object
- Return type
Example:
service.bots.get(bot_id='bot_id')
Triggers¶
- class Triggers(client_api: ApiClient, project: Optional[Project] = None, service: Optional[Service] = None, project_id: Optional[str] = None, pipeline: Optional[Pipeline] = None)[source]¶
Bases:
object
Triggers Repository
The Triggers class allows users to manage triggers and their properties. Triggers activate services. See our documentation for more information on triggers.
- create(service_id: Optional[str] = None, trigger_type: TriggerType = TriggerType.EVENT, name: Optional[str] = None, webhook_id=None, function_name='run', project_id=None, active=True, filters=None, resource: TriggerResource = TriggerResource.ITEM, actions: Optional[TriggerAction] = None, execution_mode: TriggerExecutionMode = TriggerExecutionMode.ONCE, start_at=None, end_at=None, inputs=None, cron=None, pipeline_id=None, pipeline=None, pipeline_node_id=None, root_node_namespace=None, **kwargs) BaseTrigger [source]¶
Create a Trigger. Can create two types: a cron trigger or an event trigger. Inputs are different for each type
Prerequisites: You must be in the role of an owner or developer. You must have a service.
Inputs for all types:
- Parameters
service_id (str) – Id of services to be triggered
trigger_type (str) – can be cron or event. use enum dl.TriggerType for the full list
name (str) – name of the trigger
webhook_id (str) – id for webhook to be called
function_name (str) – the function name to be called when triggered (must be defined in the package)
project_id (str) – project id where trigger will work
active (bool) – optional - True/False, default = True, if true trigger is active
Inputs for event trigger: :param dtlpy.entities.filters.Filters filters: optional - Item/Annotation metadata filters, default = none :param str resource: optional - Dataset/Item/Annotation/ItemStatus, default = Item :param str actions: optional - Created/Updated/Deleted, default = create :param str execution_mode: how many times trigger should be activated; default is “Once”. enum dl.TriggerExecutionMode
Inputs for cron trigger: :param start_at: iso format date string to start activating the cron trigger :param end_at: iso format date string to end the cron activation :param inputs: dictionary “name”:”val” of inputs to the function :param str cron: cron spec specifying when it should run. more information: https://en.wikipedia.org/wiki/Cron :param str pipeline_id: Id of pipeline to be triggered :param pipeline: pipeline entity to be triggered :param str pipeline_node_id: Id of pipeline root node to be triggered :param root_node_namespace: namespace of pipeline root node to be triggered
- Returns
Trigger entity
- Return type
Example:
service.triggers.create(name='triggername', execution_mode=dl.TriggerExecutionMode.ONCE, resource='Item', actions='Created', function_name='run', filters={'$and': [{'hidden': False}, {'type': 'file'}]} )
- delete(trigger_id=None, trigger_name=None)[source]¶
Delete Trigger object
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
True is successful error if not
- Return type
Example:
service.triggers.delete(trigger_id='trigger_id')
- get(trigger_id=None, trigger_name=None) BaseTrigger [source]¶
Get Trigger object
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
Trigger entity
- Return type
Example:
service.triggers.get(trigger_id='trigger_id')
- list(filters: Optional[Filters] = None) PagedEntities [source]¶
List triggers of a project, package, or service.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
Paged entity
- Return type
Example:
service.triggers.list()
- name_validation(name: str)[source]¶
This method validates the trigger name. If name is not valid, this method will return an error. Otherwise, it will not return anything.
- Parameters
name (str) – trigger name
- resource_information(resource, resource_type, action='Created')[source]¶
Returns which function should run on an item (based on global triggers).
Prerequisites: You must be a superuser to run this method.
- Parameters
resource – ‘Item’ / ‘Dataset’ / etc
resource_type – dictionary of the resource object
action – ‘Created’ / ‘Updated’ / etc.
Example:
service.triggers.resource_information(resource='Item', resource_type=item_object, action='Created')
- update(trigger: BaseTrigger) BaseTrigger [source]¶
Update trigger
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
trigger (dtlpy.entities.trigger.Trigger) – Trigger entity
- Returns
Trigger entity
- Return type
Example:
service.triggers.update(trigger='trigger_entity')
Executions¶
- class Executions(client_api: ApiClient, service: Optional[Service] = None, project: Optional[Project] = None)[source]¶
Bases:
object
Service Executions Repository
The Executions class allows the users to manage executions (executions of services) and their properties. See our documentation for more information about executions.
- create(service_id: Optional[str] = None, execution_input: Optional[list] = None, function_name: Optional[str] = None, resource: Optional[PackageInputType] = None, item_id: Optional[str] = None, dataset_id: Optional[str] = None, annotation_id: Optional[str] = None, project_id: Optional[str] = None, sync: bool = False, stream_logs: bool = False, return_output: bool = False, return_curl_only: bool = False, timeout: Optional[int] = None) Execution [source]¶
Execute a function on an existing service
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
service_id (str) – service id to execute on
execution_input (List[FunctionIO] or dict) – input dictionary or list of FunctionIO entities
function_name (str) – function name to run
resource (str) – input type.
item_id (str) – optional - item id as input to function
dataset_id (str) – optional - dataset id as input to function
annotation_id (str) – optional - annotation id as input to function
project_id (str) – resource’s project
sync (bool) – if true, wait for function to end
stream_logs (bool) – prints logs of the new execution. only works with sync=True
return_output (bool) – if True and sync is True - will return the output directly
return_curl_only (bool) – return the cURL of the creation WITHOUT actually do it
timeout (int) – int, seconds to wait until TimeoutError is raised. if <=0 - wait until done - by default wait take the service timeout
- Returns
execution object
- Return type
Example:
service.executions.create(function_name='function_name', item_id='item_id', project_id='project_id')
- get(execution_id: Optional[str] = None, sync: bool = False) Execution [source]¶
Get Service execution object
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
Service execution object
- Return type
Example:
service.executions.get(execution_id='execution_id')
- increment(execution: Execution)[source]¶
Increment the number of attempts that an execution is allowed to attempt to run a service that is not responding.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
execution (dtlpy.entities.execution.Execution) –
- Returns
int
- Return type
Example:
service.executions.increment(execution='execution_entity')
- list(filters: Optional[Filters] = None) PagedEntities [source]¶
List service executions
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
filters (dtlpy.entities.filters.Filters) – dl.Filters entity to filters items
- Returns
Paged entity
- Return type
Example:
service.executions.list()
- logs(execution_id: str, follow: bool = True, until_completed: bool = True)[source]¶
executions logs
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
executions logs
Example:
service.executions.logs(execution_id='execution_id')
- progress_update(execution_id: str, status: Optional[ExecutionStatus] = None, percent_complete: Optional[int] = None, message: Optional[str] = None, output: Optional[str] = None, service_version: Optional[str] = None)[source]¶
Update Execution Progress.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
Service execution object
- Return type
Example:
service.executions.progress_update(execution_id='execution_id', status='complete', percent_complete=100)
- rerun(execution: Execution, sync: bool = False)[source]¶
Rerun execution
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
execution (dtlpy.entities.execution.Execution) –
sync (bool) – wait for the execution to finish
- Returns
Execution object
- Return type
Example:
service.executions.rerun(execution='execution_entity')
- terminate(execution: Execution)[source]¶
Terminate Execution
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
execution (dtlpy.entities.execution.Execution) –
- Returns
execution object
- Return type
Example:
service.executions.terminate(execution='execution_entity')
- update(execution: Execution) Execution [source]¶
Update execution changes to platform
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
execution (dtlpy.entities.execution.Execution) – execution entity
- Returns
Service execution object
- Return type
Example:
service.executions.update(execution='execution_entity')
- wait(execution_id: str, timeout: Optional[int] = None)[source]¶
Get Service execution object.
Prerequisites: You must be in the role of an owner or developer. You must have a service.
- Parameters
- Returns
Service execution object
- Return type
Example:
service.executions.wait(execution_id='execution_id')
Pipelines¶
- class Pipelines(client_api: ApiClient, project: Optional[Project] = None)[source]¶
Bases:
object
Pipelines Repository
The Pipelines class allows users to manage pipelines and their properties. See our documentation for more information on pipelines.
- create(name: Optional[str] = None, project_id: Optional[str] = None, pipeline_json: Optional[dict] = None) Pipeline [source]¶
Create a new pipeline.
prerequisites: You must be an owner or developer to use this method.
- Parameters
- Returns
Pipeline object
- Return type
Example:
pipeline = project.pipelines.create(name='pipeline_name')
- delete(pipeline: Optional[Pipeline] = None, pipeline_name: Optional[str] = None, pipeline_id: Optional[str] = None)[source]¶
Delete Pipeline object.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
pipeline_id (str) – pipeline id
pipeline_name (str) – pipeline name
- Returns
True if success
- Return type
Example:
is_deleted = project.pipelines.delete(pipeline_id='pipeline_id')
- execute(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None, execution_input=None)[source]¶
Execute a pipeline and return the pipeline execution as an object.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
pipeline_id (str) – pipeline id
pipeline_name (str) – pipeline name
execution_input – list of the dl.FunctionIO or dict of pipeline input - example {‘item’: ‘item_id’}
- Returns
entities.PipelineExecution object
- Return type
Example:
pipeline_execution= project.pipelines.execute(pipeline='pipeline_entity', execution_input= {'item': 'item_id'} )
- get(pipeline_name=None, pipeline_id=None, fetch=None) Pipeline [source]¶
Get Pipeline object to use in your code.
prerequisites: You must be an owner or developer to use this method.
You must provide at least ONE of the following params: pipeline_name, pipeline_id.
- Parameters
- Returns
Pipeline object
- Return type
Example:
pipeline = project.pipelines.get(pipeline_id='pipeline_id')
- install(pipeline: Optional[Pipeline] = None, resume_option: Optional[PipelineResumeOption] = None)[source]¶
Install (start) a pipeline.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
resume_option (dtlpy.entities.pipeline.PipelineResumeOption) – optional - resume pipeline method (what to do with existing cycles)
- Returns
Composition object
Example:
project.pipelines.install(pipeline='pipeline_entity')
- list(filters: Optional[Filters] = None, project_id: Optional[str] = None) PagedEntities [source]¶
List project pipelines.
prerequisites: You must be an owner or developer to use this method.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
project_id (str) – project id
- Returns
Paged entity
- Return type
Example:
pipelines = project.pipelines.list()
- open_in_web(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None)[source]¶
Open the pipeline in web platform.
prerequisites: Must be owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
pipeline_id (str) – pipeline id
pipeline_name (str) – pipeline name
Example:
project.pipelines.open_in_web(pipeline_id='pipeline_id')
- pause(pipeline: Optional[Pipeline] = None, keep_triggers_active: Optional[bool] = None)[source]¶
Pause a pipeline.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
keep_triggers_active (bool) – Do we want the triggers to stay active and collect events
- Returns
Composition object
Example:
project.pipelines.pause(pipeline='pipeline_entity')
- reset(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None, stop_if_running: bool = False)[source]¶
Reset pipeline counters.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity - optional
pipeline_id (str) – pipeline_id - optional
pipeline_name (str) – pipeline_name - optional
stop_if_running (bool) – If the pipeline is installed it will stop the pipeline and reset the counters.
- Returns
bool
Example:
success = project.pipelines.reset(pipeline='pipeline_entity')
- stats(pipeline: Optional[Pipeline] = None, pipeline_id: Optional[str] = None, pipeline_name: Optional[str] = None)[source]¶
Get pipeline counters.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity - optional
pipeline_id (str) – pipeline_id - optional
pipeline_name (str) – pipeline_name - optional
- Returns
PipelineStats
- Return type
dtlpy.entities.pipeline.PipelineStats
Example:
pipeline_stats = project.pipelines.stats(pipeline='pipeline_entity')
- update(pipeline: Optional[Pipeline] = None) Pipeline [source]¶
Update pipeline changes to platform.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
- Returns
Pipeline object
- Return type
Example:
pipeline = project.pipelines.update(pipeline='pipeline_entity')
- update_settings(pipeline: Pipeline, settings: PipelineSettings)[source]¶
Update pipeline settings.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline (dtlpy.entities.pipeline.Pipeline) – pipeline entity
settings (dtlpy.entities.pipeline.PipelineSettings) – settings entity
- Returns
Pipeline object
- Return type
Example:
pipeline = project.pipelines.update_settings(pipeline='pipeline_entity', settings=dl.PipelineSettings(keep_triggers_active=True))
Pipeline Executions¶
- class PipelineExecutions(client_api: ApiClient, project: Optional[Project] = None, pipeline: Optional[Pipeline] = None)[source]¶
Bases:
object
PipelineExecutions Repository
The PipelineExecutions class allows users to manage pipeline executions. See our documentation for more information on pipelines.
- create(pipeline_id: Optional[str] = None, execution_input=None)[source]¶
Execute a pipeline and return the execute.
prerequisites: You must be an owner or developer to use this method.
- Parameters
pipeline_id – pipeline id
execution_input – list of the dl.FunctionIO or dict of pipeline input - example {‘item’: ‘item_id’}
- Returns
entities.PipelineExecution object
- Return type
Example:
pipeline_execution = pipeline.pipeline_executions.create(pipeline_id='pipeline_id', execution_input={'item': 'item_id'})
- get(pipeline_execution_id: str, pipeline_id: Optional[str] = None) PipelineExecution [source]¶
Get Pipeline Execution object
prerequisites: You must be an owner or developer to use this method.
- Parameters
- Returns
PipelineExecution object
- Return type
Example:
pipeline_executions = pipeline.pipeline_executions.get(pipeline_id='pipeline_id')
- list(filters: Optional[Filters] = None) PagedEntities [source]¶
List project pipeline executions.
prerequisites: You must be an owner or developer to use this method.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
Paged entity
- Return type
Example:
pipeline_executions = pipeline.pipeline_executions.list()
General Commands¶
- class Commands(client_api: ApiClient)[source]¶
Bases:
object
Service Commands repository
- get(command_id: Optional[str] = None, url: Optional[str] = None) Command [source]¶
Get Service command object
- wait(command_id, timeout=0, step=None, url=None, backoff_factor=0.1)[source]¶
Wait for command to finish
backoff_factor: A backoff factor to apply between attempts after the second try {backoff factor} * (2 ** ({number of total retries} - 1)) seconds. If the backoff_factor is 0.1, then
sleep()
will sleep for [0.0s, 0.2s, 0.4s, …] between retries. It will never be longer than 8 sec- Parameters
- Returns
Command object
Download Commands¶
Upload Commands¶
Entities¶
Organization¶
- class Organization(members: list, groups: list, account: dict, created_at, updated_at, id, name, logo_url, plan, owner, created_by, client_api: ApiClient, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Organization entity
- add_member(email, role: ~dtlpy.entities.organization.MemberOrgRole = <enum 'MemberOrgRole'>)[source]¶
Add members to your organization. Read about members and groups [here](https://dataloop.ai/docs/org-members-groups).
Prerequisities: To add members to an organization, you must be in the role of an “owner” in that organization.
- cache_action(mode=CacheAction.APPLY, pod_type=PodType.SMALL)[source]¶
Open the organizations in web platform
- delete_member(user_id: str, sure: bool = False, really: bool = False)[source]¶
Delete member from the Organization.
Prerequisites: Must be an organization “owner” to delete members.
- classmethod from_json(_json, client_api, is_fetched=True)[source]¶
Build a Project entity object from a json
- Parameters
- Returns
Organization object
- Return type
- list_groups()[source]¶
List all organization groups (groups that were created within the organization).
Prerequisites: You must be an organization “owner” to use this method.
- Returns
groups list
- Return type
- list_members(role: Optional[MemberOrgRole] = None)[source]¶
List all organization members.
Prerequisites: You must be an organization “owner” to use this method.
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
- update(plan: str)[source]¶
Update Organization.
Prerequisities: You must be an Organization superuser to update an organization.
- Parameters
plan (str) – OrganizationsPlans.FREEMIUM, OrganizationsPlans.PREMIUM
- Returns
organization object
- update_member(email: str, role: MemberOrgRole = MemberOrgRole.MEMBER)[source]¶
Update member role.
Prerequisities: You must be an organization “owner” to update a member’s role.
Integration¶
- class Integration(id, name, type, org, created_at, created_by, update_at, client_api: ApiClient, project=None)[source]¶
Bases:
BaseEntity
Integration object
- delete(sure: bool = False, really: bool = False) bool [source]¶
Delete integrations from the Organization
- classmethod from_json(_json: dict, client_api: ApiClient, is_fetched=True)[source]¶
Build a Integration entity object from a json
- Parameters
_json – _json response from host
client_api – ApiClient entity
is_fetched – is Entity fetched from Platform
- Returns
Integration object
Project¶
- class Project(contributors, created_at, creator, id, name, org, updated_at, role, account, is_blocked, feature_constraints, client_api: ApiClient, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Project entity
- add_member(email, role: MemberRole = MemberRole.DEVELOPER)[source]¶
Add a member to the project.
- classmethod from_json(_json, client_api, is_fetched=True)[source]¶
Build a Project object from a json
- Parameters
- Returns
Project object
- Return type
- list_members(role: Optional[MemberRole] = None)[source]¶
List the project members.
- Parameters
role – The required role for the user. Use the enum dl.MemberRole
- Returns
list of the project members
- Return type
- to_json()[source]¶
Returns platform _json format of project object
- Returns
platform json format of project object
- Return type
- update(system_metadata=False)[source]¶
Update the project
- Parameters
system_metadata (bool) – optional - True, if you want to change metadata system
- Returns
Project object
- Return type
- update_member(email, role: MemberRole = MemberRole.DEVELOPER)[source]¶
Update member’s information/details from the project.
User¶
- class User(created_at, updated_at, name, last_name, username, avatar, email, role, type, org, id, project, client_api=None, users=None)[source]¶
Bases:
BaseEntity
User entity
- classmethod from_json(_json, project, client_api, users=None)[source]¶
Build a User entity object from a json
- Parameters
_json (dict) – _json response from host
project (dtlpy.entities.project.Project) – project entity
client_api – ApiClient entity
users – Users repository
- Returns
User object
- Return type
Dataset¶
- class Dataset(id, url, name, annotated, creator, projects, items_count, metadata, directoryTree, export, expiration_options, index_driver, created_at, items_url, readable_type, access_level, driver, readonly, client_api: ApiClient, project=None, datasets=None, repositories=NOTHING, ontology_ids=None, labels=None, directory_tree=None, recipe=None, ontology=None)[source]¶
Bases:
BaseEntity
Dataset object
- add_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, recipe_id=None, ontology_id=None, icon_path=None)[source]¶
Add single label to dataset
Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.
- Parameters
label_name (str) – str - label name
color (tuple) – RGB color of the annotation, e.g (255,0,0) or ‘#ff0000’ for red
children – children (sub labels). list of sub labels of this current label, each value is either dict or dl.Label
attributes (list) – add attributes to the labels
display_label (str) – name that display label
label (dtlpy.entities.label.Label) – label object
recipe_id (str) – optional recipe id
ontology_id (str) – optional ontology id
icon_path (str) – path to image to be display on label
- Returns
label entity
- Return type
dtlpy.entities.label.Label
Example:
dataset.add_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
- add_labels(label_list, ontology_id=None, recipe_id=None)[source]¶
Add labels to dataset
Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.
- Parameters
- Returns
label entities
Example:
dataset.add_labels(label_list=label_list)
- clone(clone_name, filters=None, with_items_annotations=True, with_metadata=True, with_task_annotations_status=True)[source]¶
Clone dataset
Prerequisites: You must be in the role of an owner or developer.
- Parameters
clone_name (str) – new dataset name
filters (dtlpy.entities.filters.Filters) – Filters entity or a query dict
with_items_annotations (bool) – clone all item’s annotations
with_metadata (bool) – clone metadata
with_task_annotations_status (bool) – clone task annotations status
- Returns
dataset object
- Return type
Example:
dataset = dataset.clone(dataset_id='dataset_id', clone_name='dataset_clone_name', with_metadata=True, with_items_annotations=False, with_task_annotations_status=False)
- delete(sure=False, really=False)[source]¶
Delete a dataset forever!
Prerequisites: You must be an owner or developer to use this method.
- Parameters
- Returns
True is success
- Return type
Example:
is_deleted = dataset.delete(sure=True, really=True)
- delete_attributes(keys: list, recipe_id: Optional[str] = None, ontology_id: Optional[str] = None)[source]¶
Delete a bulk of attributes
- delete_labels(label_names)[source]¶
Delete labels from dataset’s ontologies
Prerequisites: You must be in the role of an owner or developer.
- Parameters
label_names – label object/ label name / list of label objects / list of label names
Example:
dataset.delete_labels(label_names=['myLabel1', 'Mylabel2'])
- download(filters=None, local_path=None, file_types=None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters=None, overwrite=False, to_items_folder=True, thickness=1, with_text=False, without_relative_path=None, alpha=1, export_version=ExportVersion.V1)[source]¶
Download dataset by filters. Filtering the dataset for items and save them local Optional - also download annotation, mask, instance and image mask of the item
Prerequisites: You must be in the role of an owner or developer.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
local_path (str) – local folder or filename to save to.
file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]
annotation_options (list) – type of download annotations: list(dl.ViewAnnotationOptions)
annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download
overwrite (bool) – optional - default = False to overwrite the existing files
to_items_folder (bool) – Create ‘items’ folder and download items to it
thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1
with_text (bool) – optional - add text to annotations, default = False
without_relative_path (bool) – bool - download items without the relative path from platform
alpha (float) – opacity value [0 1], default 1
export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames
- Returns
List of local_path per each downloaded item
Example:
dataset.download(local_path='local_path', annotation_options=[dl.ViewAnnotationOptions.JSON, dl.ViewAnnotationOptions.MASK], overwrite=False, thickness=1, with_text=False, alpha=1, save_locally=True )
- download_annotations(local_path=None, filters=None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters=None, overwrite=False, thickness=1, with_text=False, remote_path=None, include_annotations_in_output=True, export_png_files=False, filter_output_annotations=False, alpha=1, export_version=ExportVersion.V1)[source]¶
Download dataset by filters. Filtering the dataset for items and save them local Optional - also download annotation, mask, instance and image mask of the item
Prerequisites: You must be in the role of an owner or developer.
- Parameters
local_path (str) – local folder or filename to save to.
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
annotation_options (list(dtlpy.entities.annotation.ViewAnnotationOptions)) – download annotations options: list(dl.ViewAnnotationOptions)
annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download
overwrite (bool) – optional - default = False
thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1
with_text (bool) – optional - add text to annotations, default = False
remote_path (str) – DEPRECATED and ignored
include_annotations_in_output (bool) – default - False , if export should contain annotations
export_png_files (bool) – default - if True, semantic annotations should be exported as png files
filter_output_annotations (bool) – default - False, given an export by filter - determine if to filter out annotations
alpha (float) – opacity value [0 1], default 1
export_version (str) – exported items will have original extension in filename, V1 - no original extension in filenames
- Returns
local_path of the directory where all the downloaded item
- Return type
Example:
local_path = dataset.download_annotations(dataset='dataset_entity', local_path='local_path', annotation_options=[dl.ViewAnnotationOptions.JSON, dl.ViewAnnotationOptions.MASK], overwrite=False, thickness=1, with_text=False, alpha=1 )
- download_folder(folder_path, filters=None, local_path=None, file_types=None, annotation_options: Optional[ViewAnnotationOptions] = None, annotation_filters=None, overwrite=False, to_items_folder=True, thickness=1, with_text=False, without_relative_path=None, alpha=1, export_version=ExportVersion.V1)[source]¶
Download dataset folder. Optional - also download annotation, mask, instance and image mask of the item
Prerequisites: You must be in the role of an owner or developer.
- Parameters
folder_path (str) – the path of the folder that want to download
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
local_path (str) – local folder or filename to save to.
file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]
annotation_options (list) – type of download annotations: list(dl.ViewAnnotationOptions)
annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download
overwrite (bool) – optional - default = False to overwrite the existing files
to_items_folder (bool) – Create ‘items’ folder and download items to it
thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1
with_text (bool) – optional - add text to annotations, default = False
without_relative_path (bool) – bool - download items without the relative path from platform
alpha (float) – opacity value [0 1], default 1
export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames
- Returns
List of local_path per each downloaded item
Example:
dataset.download_folder(folder_path='folder_path' local_path='local_path', annotation_options=[dl.ViewAnnotationOptions.JSON, dl.ViewAnnotationOptions.MASK], overwrite=False, thickness=1, with_text=False, alpha=1, save_locally=True )
- classmethod from_json(project: Project, _json: dict, client_api: ApiClient, datasets=None, is_fetched=True)[source]¶
Build a Dataset entity object from a json
- Parameters
- Returns
Dataset object
- Return type
- static serialize_labels(labels_dict)[source]¶
Convert hex color format to rgb
- Parameters
labels_dict (dict) – dict of labels
- Returns
dict of converted labels
- set_readonly(state: bool)[source]¶
Set dataset readonly mode
Prerequisites: You must be in the role of an owner or developer.
- Parameters
state (bool) – state
Example:
dataset.set_readonly(state=True)
- switch_recipe(recipe_id=None, recipe=None)[source]¶
Switch the recipe that linked to the dataset with the given one
- Parameters
recipe_id (str) – recipe id
recipe (dtlpy.entities.recipe.Recipe) – recipe entity
Example:
dataset.switch_recipe(recipe_id='recipe_id')
- sync(wait=True)[source]¶
Sync dataset with external storage
Prerequisites: You must be in the role of an owner or developer.
Example:
success = dataset.sync()
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
- update(system_metadata=False)[source]¶
Update dataset field
Prerequisites: You must be an owner or developer to use this method.
- Parameters
system_metadata (bool) – bool - True, if you want to change metadata system
- Returns
Dataset object
- Return type
Example:
dataset = dataset.update()
- update_attributes(title: str, key: str, attribute_type, recipe_id: Optional[str] = None, ontology_id: Optional[str] = None, scope: Optional[list] = None, optional: Optional[bool] = None, values: Optional[list] = None, attribute_range=None)[source]¶
ADD a new attribute or update if exist
- Parameters
ontology_id (str) – ontology_id
title (str) – attribute title
key (str) – the key of the attribute must br unique
attribute_type (AttributesTypes) – dl.AttributesTypes your attribute type
scope (list) – list of the labels or * for all labels
optional (bool) – optional attribute
values (list) – list of the attribute values ( for checkbox and radio button)
attribute_range (dict or AttributesRange) – dl.AttributesRange object
- Returns
true in success
- Return type
Example:
dataset.update_attributes(ontology_id='ontology_id', key='1', title='checkbox', attribute_type=dl.AttributesTypes.CHECKBOX, values=[1,2,3])
- update_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, recipe_id=None, ontology_id=None, upsert=False, icon_path=None)[source]¶
Add single label to dataset
Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.
- Parameters
label_name (str) – str - label name
color (tuple) – color
children – children (sub labels)
attributes (list) – add attributes to the labels
display_label (str) – name that display label
label (dtlpy.entities.label.Label) – label
recipe_id (str) – optional recipe id
ontology_id (str) – optional ontology id
icon_path (str) – path to image to be display on label
- Returns
label entity
- Return type
dtlpy.entities.label.Label
Example:
dataset.update_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
- update_labels(label_list, ontology_id=None, recipe_id=None, upsert=False)[source]¶
Add labels to dataset
Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.
- Parameters
- Returns
label entities
- Return type
dtlpy.entities.label.Label
Example:
dataset.update_labels(label_list=label_list)
- upload_annotations(local_path, filters=None, clean=False, remote_root_path='/', export_version=ExportVersion.V1)[source]¶
Upload annotations to dataset.
Prerequisites: You must have a dataset with items that are related to the annotations. The relationship between the dataset and annotations is shown in the name. You must be in the role of an owner or developer.
- Parameters
local_path (str) – str - local folder where the annotations files is.
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
clean (bool) – bool - if True it remove the old annotations
remote_root_path (str) – str - the remote root path to match remote and local items
export_version (str) – V2 - exported items will have original extension in filename, V1 - no original extension in filenames
For example, if the item filepath is a/b/item and remote_root_path is /a the start folder will be b instead of a
Example:
dataset.upload_annotations(dataset='dataset_entity', local_path='local_path', clean=False, export_version=dl.ExportVersion.V1 )
- class ExpirationOptions(item_max_days: Optional[int] = None)[source]¶
Bases:
object
ExpirationOptions object
Driver¶
- class Driver(bucket_name, creator, allow_external_delete, allow_external_modification, created_at, region, path, type, integration_id, integration_type, metadata, name, id, client_api: ApiClient, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Driver entity
- delete(sure=False, really=False)[source]¶
Delete a driver forever!
Prerequisites: You must be an owner or developer to use this method.
- Parameters
- Returns
True if success
- Return type
Example:
driver.delete(sure=True, really=True)
Item¶
- class Item(annotations_link, dataset_url, thumbnail, created_at, dataset_id, annotated, metadata, filename, stream, name, type, url, id, hidden, dir, spec, creator, description, src_item, annotations_count, client_api: ApiClient, platform_dict, dataset, model, project, project_id, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Item object
- clone(dst_dataset_id=None, remote_filepath=None, metadata=None, with_annotations=True, with_metadata=True, with_task_annotations_status=False, allow_many=False, wait=True)[source]¶
Clone item
- Parameters
dst_dataset_id (str) – destination dataset id
remote_filepath (str) – complete filepath
metadata (dict) – new metadata to add
with_annotations (bool) – clone annotations
with_metadata (bool) – clone metadata
with_task_annotations_status (bool) – clone task annotations status
allow_many (bool) – bool if True, using multiple clones in single dataset is allowed, (default=False)
wait (bool) – wait for the command to finish
- Returns
Item object
- Return type
Example:
item.clone(item_id='item_id', dst_dataset_id='dist_dataset_id', with_metadata=True, with_task_annotations_status=False, with_annotations=False)
- download(local_path=None, file_types=None, save_locally=True, to_array=False, annotation_options: Optional[ViewAnnotationOptions] = None, overwrite=False, to_items_folder=True, thickness=1, with_text=False, annotation_filters=None, alpha=1, export_version=ExportVersion.V1)[source]¶
Download dataset by filters. Filtering the dataset for items and save them local Optional - also download annotation, mask, instance and image mask of the item
- Parameters
local_path (str) – local folder or filename to save to.
file_types (list) – a list of file type to download. e.g [‘video/webm’, ‘video/mp4’, ‘image/jpeg’, ‘image/png’]
save_locally (bool) – bool. save to disk or return a buffer
to_array (bool) – returns Ndarray when True and local_path = False
annotation_options (list) – download annotations options: list(dl.ViewAnnotationOptions)
annotation_filters (dtlpy.entities.filters.Filters) – Filters entity to filter annotations for download
overwrite (bool) – optional - default = False
to_items_folder (bool) – Create ‘items’ folder and download items to it
thickness (int) – optional - line thickness, if -1 annotation will be filled, default =1
with_text (bool) – optional - add text to annotations, default = False
alpha (float) – opacity value [0 1], default 1
export_version (str) – exported items will have original extension in filename, V1 - no original extension in filenames
- Returns
generator of local_path per each downloaded item
- Return type
generator or single item
Example:
item.download(local_path='local_path', annotation_options=dl.ViewAnnotationOptions.MASK, overwrite=False, thickness=1, with_text=False, alpha=1, save_locally=True )
- classmethod from_json(_json, client_api, dataset=None, project=None, model=None, is_fetched=True)[source]¶
Build an item entity object from a json
- Parameters
project (dtlpy.entities.project.Project) – project entity
_json (dict) – _json response from host
dataset (dtlpy.entities.dataset.Dataset) – dataset in which the annotation’s item is located
model (dtlpy.entities.dataset.Model) – the model entity if item is an artifact of a model
client_api (dlApiClient) – ApiClient entity
is_fetched (bool) – is Entity fetched from Platform
- Returns
Item object
- Return type
- move(new_path)[source]¶
Move item from one folder to another in Platform If the directory doesn’t exist it will be created
- set_description(text: str)[source]¶
Update Item description
- Parameters
text (str) – if None or “” description will be deleted
:return
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
- update(system_metadata=False)[source]¶
Update items metadata
- Parameters
system_metadata (bool) – bool - True, if you want to change metadata system
- Returns
Item object
- Return type
Item Link¶
Annotation¶
- class Annotation(id, url, item_url, item, item_id, creator, created_at, updated_by, updated_at, type, source, dataset_url, platform_dict, metadata, fps, hash=None, dataset_id=None, status=None, object_id=None, automated=None, item_height=None, item_width=None, label_suggestions=None, annotation_definition: Optional[BaseAnnotationDefinition] = None, frames=None, current_frame=0, end_frame=0, end_time=0, start_frame=0, start_time=0, dataset=None, datasets=None, annotations=None, Annotation__client_api=None, items=None, recipe_2_attributes=None)[source]¶
Bases:
BaseEntity
Annotations object
- add_frame(annotation_definition, frame_num=None, fixed=True, object_visible=True)[source]¶
Add a frame state to annotation
- Parameters
- Returns
True if success
- Return type
Example:
success = annotation.add_frame(frame_num=10, annotation_definition=dl.Box(top=10,left=10,bottom=100, right=100,label='labelName')) )
- add_frames(annotation_definition, frame_num=None, end_frame_num=None, start_time=None, end_time=None, fixed=True, object_visible=True)[source]¶
Add a frames state to annotation
Prerequisites: Any user can upload annotations.
- Parameters
annotation_definition – annotation type object - must be same type as annotation
frame_num (int) – first frame number
end_frame_num (int) – last frame number
start_time – starting time for video
end_time – ending time for video
fixed (bool) – is fixed
object_visible (bool) – does the annotated object is visible
- Returns
Example:
annotation.add_frames(frame_num=10, annotation_definition=dl.Box(top=10,left=10,bottom=100, right=100,label='labelName')) )
- delete()[source]¶
Remove an annotation from item
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Returns
True if success
- Return type
Example:
is_deleted = annotation.delete()
- download(filepath: str, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.JSON, height: Optional[float] = None, width: Optional[float] = None, thickness: int = 1, with_text: bool = False, alpha: float = 1)[source]¶
Save annotation to file
Prerequisites: Any user can upload annotations.
- Parameters
filepath (str) – local path to where annotation will be downloaded to
annotation_format (list) – options: list(dl.ViewAnnotationOptions)
height (float) – image height
width (float) – image width
thickness (int) – line thickness
with_text (bool) – get mask with text
alpha (float) – opacity value [0 1], default 1
- Returns
filepath
- Return type
Example:
filepath = annotation.download(filepath='filepath', annotation_format=dl.ViewAnnotationOptions.MASK)
- classmethod from_json(_json, item=None, client_api=None, annotations=None, is_video=None, fps=None, item_metadata=None, dataset=None, is_audio=None)[source]¶
Create an annotation object from platform json
- Parameters
_json (dict) – platform json
item (dtlpy.entities.item.Item) – item
client_api – ApiClient entity
annotations –
is_video (bool) – is video
fps – video fps
item_metadata – item metadata
dataset – dataset entity
is_audio (bool) – is audio
- Returns
annotation object
- Return type
- classmethod new(item=None, annotation_definition=None, object_id=None, automated=True, metadata=None, frame_num=None, parent_id=None, start_time=None, item_height=None, item_width=None, end_time=None)[source]¶
Create a new annotation object annotations
Prerequisites: Any user can upload annotations.
- Parameters
item (dtlpy.entities.item.Items) – item to annotate
annotation_definition – annotation type object
object_id (str) – object_id
automated (bool) – is automated
metadata (dict) – metadata
frame_num (int) – optional - first frame number if video annotation
parent_id (str) – add parent annotation ID
start_time – optional - start time if video annotation
item_height (float) – annotation item’s height
item_width (float) – annotation item’s width
end_time – optional - end time if video annotation
- Returns
annotation object
- Return type
Example:
annotation = annotation.new(item='item_entity, annotation_definition=dl.Box(top=10,left=10,bottom=100, right=100,label='labelName')) )
- set_frame(frame)[source]¶
Set annotation to frame state
Prerequisites: Any user can upload annotations.
Example:
success = annotation.set_frame(frame=10)
- show(image=None, thickness=None, with_text=False, height=None, width=None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.MASK, color=None, label_instance_dict=None, alpha=1, frame_num=None)[source]¶
Show annotations mark the annotation of the image array and return it
Prerequisites: Any user can upload annotations.
- Parameters
image – empty or image to draw on
thickness (int) – line thickness
with_text (bool) – add label to annotation
height (float) – height
width (float) – width
annotation_format (dl.ViewAnnotationOptions) – list(dl.ViewAnnotationOptions)
color (tuple) – optional - color tuple
label_instance_dict – the instance labels
alpha (float) – opacity value [0 1], default 1
frame_num (int) – for video annotation, show specific fame
- Returns
list or single ndarray of the annotations
Exampls:
image = annotation.show(image='ndarray', thickness=1, annotation_format=dl.VIEW_ANNOTATION_OPTIONS_MASK, )
- to_json()[source]¶
Convert annotation object to a platform json representatio
- Returns
platform json
- Return type
- update(system_metadata=False)[source]¶
Update an existing annotation in host.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
system_metadata – True, if you want to change metadata system
- Returns
Annotation object
- Return type
Example:
annotation = annotation.update()
- update_status(status: AnnotationStatus = AnnotationStatus.ISSUE)[source]¶
Set status on annotation
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager.
- Parameters
status (str) – can be AnnotationStatus.ISSUE, AnnotationStatus.APPROVED, AnnotationStatus.REVIEW, AnnotationStatus.CLEAR
- Returns
Annotation object
- Return type
Example:
annotation = annotation.update_status(status=dl.AnnotationStatus.ISSUE)
- class FrameAnnotation(annotation, annotation_definition, frame_num, fixed, object_visible, recipe_2_attributes=None, interpolation=False)[source]¶
Bases:
BaseEntity
FrameAnnotation object
- classmethod from_snapshot(annotation, _json, fps)[source]¶
new frame state to annotation
- Parameters
annotation – annotation
_json – annotation type object - must be same type as annotation
fps – frame number
- Returns
FrameAnnotation object
- classmethod new(annotation, annotation_definition, frame_num, fixed, object_visible=True)[source]¶
new frame state to annotation
- Parameters
annotation – annotation
annotation_definition – annotation type object - must be same type as annotation
frame_num – frame number
fixed – is fixed
object_visible – does the annotated object is visible
- Returns
FrameAnnotation object
- class ViewAnnotationOptions(value)[source]¶
-
The Annotations file types to download (JSON, MASK, INSTANCE, ANNOTATION_ON_IMAGE, VTT, OBJECT_ID).
State
Description
JSON
Dataloop json format
MASK
PNG file that contains drawing annotations on it
INSTANCE
An image file that contains 2D annotations
ANNOTATION_ON_IMAGE
The source image with the annotations drawing in it
VTT
An text file contains supplementary information about a web video
OBJECT_ID
An image file that contains 2D annotations
Collection of Annotation entities¶
- class AnnotationCollection(item=None, annotations=NOTHING, dataset=None, colors=None)[source]¶
Bases:
BaseEntity
Collection of Annotation entity
- add(annotation_definition, object_id=None, frame_num=None, end_frame_num=None, start_time=None, end_time=None, automated=True, fixed=True, object_visible=True, metadata=None, parent_id=None, model_info=None)[source]¶
Add annotations to collection
- Parameters
annotation_definition – dl.Polygon, dl.Segmentation, dl.Point, dl.Box etc
object_id – Object id (any id given by user). If video - must input to match annotations between frames
frame_num – video only, number of frame
end_frame_num – video only, the end frame of the annotation
start_time – video only, start time of the annotation
end_time – video only, end time of the annotation
automated –
fixed – video only, mark frame as fixed
object_visible – video only, does the annotated object is visible
metadata – optional- metadata dictionary for annotation
parent_id – set a parent for this annotation (parent annotation ID)
model_info – optional - set model on annotation {‘name’,:’’, ‘confidence’:0}
- Returns
- delete()[source]¶
Remove an annotation from item
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Returns
True if success
- Return type
Example:
is_deleted = builder.delete()
- download(filepath, img_filepath=None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.JSON, height=None, width=None, thickness=1, with_text=False, orientation=0, alpha=1)[source]¶
Save annotations to file
Prerequisites: Any user can upload annotations.
- Parameters
filepath (str) – path to save annotation
img_filepath (str) – img file path - needed for img_mask
annotation_format (dl.ViewAnnotationOptions) – how to show thw annotations. options: list(dl.ViewAnnotationOptions)
height (int) – height
width (int) – width
thickness (int) – thickness
with_text (bool) – add a text to the image
orientation (int) – the image orientation
alpha (float) – opacity value [0 1], default 1
- Returns
file path of the downlaod annotation
- Return type
Example:
filepath = builder.download(filepath='filepath', annotation_format=dl.ViewAnnotationOptions.MASK)
- from_instance_mask(mask, instance_map=None)[source]¶
convert annotation from instance mask format
- Parameters
mask – the mask annotation
instance_map – labels
- classmethod from_json(_json: list, item=None, is_video=None, fps=25, height=None, width=None, client_api=None, is_audio=None)[source]¶
Create an annotation collection object from platform json
- Parameters
- Returns
annotation object
- Return type
- from_vtt_file(filepath)[source]¶
convert annotation from vtt format
- Parameters
filepath (str) – path to the file
- get_frame(frame_num)[source]¶
Get frame
- Parameters
frame_num (int) – frame num
- Returns
AnnotationCollection
- show(image=None, thickness=None, with_text=False, height=None, width=None, annotation_format: ViewAnnotationOptions = ViewAnnotationOptions.MASK, label_instance_dict=None, color=None, alpha=1, frame_num=None)[source]¶
Show annotations according to annotation_format
Prerequisites: Any user can upload annotations.
- Parameters
image (ndarray) – empty or image to draw on
height (int) – height
width (int) – width
thickness (int) – line thickness
with_text (bool) – add label to annotation
annotation_format (dl.ViewAnnotationOptions) – how to show thw annotations. options: list(dl.ViewAnnotationOptions)
label_instance_dict (dict) – instance label map {‘Label’: 1, ‘More’: 2}
color (tuple) – optional - color tuple
alpha (float) – opacity value [0 1], default 1
frame_num (int) – for video annotation, show specific frame
- Returns
ndarray of the annotations
Example:
image = builder.show(image='ndarray', thickness=1, annotation_format=dl.VIEW_ANNOTATION_OPTIONS_MASK, )
- to_json()[source]¶
Convert annotation object to a platform json representation
- Returns
platform json
- Return type
- update(system_metadata=True)[source]¶
Update an existing annotation in host.
Prerequisites: You must have an item that has been annotated. You must have the role of an owner or developer or be assigned a task that includes that item as an annotation manager or annotator.
- Parameters
system_metadata – True, if you want to change metadata system
- Returns
Annotation object
- Return type
Example:
annotation = builder.update()
Annotation Definition¶
Box Annotation Definition¶
- class Box(left=None, top=None, right=None, bottom=None, label=None, attributes=None, description=None, angle=None)[source]¶
Bases:
BaseAnnotationDefinition
Box annotation object Can create a box using 2 point using: “top”, “left”, “bottom”, “right” (to form a box [(left, top), (right, bottom)]) For rotated box add the “angel”
- classmethod from_segmentation(mask, label, attributes=None)[source]¶
Convert binary mask to Polygon
- Parameters
mask – binary mask (0,1)
label – annotation label
attributes – annotations list of attributes
- Returns
Box annotations list to each separated segmentation
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Classification Annotation Definition¶
- class Classification(label, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Classification annotation object
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Cuboid Annotation Definition¶
- class Cube(label, front_tl, front_tr, front_br, front_bl, back_tl, back_tr, back_br, back_bl, angle=None, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Cube annotation object
- classmethod from_boxes_and_angle(front_left, front_top, front_right, front_bottom, back_left, back_top, back_right, back_bottom, label, angle=0, attributes=None)[source]¶
Create cuboid by given front and back boxes with angle the angle calculate fom the center of each box
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Item Description Definition¶
Ellipse Annotation Definition¶
- class Ellipse(x, y, rx, ry, angle, label, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Ellipse annotation object
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Note Annotation Definition¶
Point Annotation Definition¶
- class Point(x, y, label, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Point annotation object
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Polygon Annotation Definition¶
- class Polygon(geo, label, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Polygon annotation object
- classmethod from_segmentation(mask, label, attributes=None, epsilon=None, max_instances=1, min_area=0)[source]¶
Convert binary mask to Polygon
- Parameters
mask – binary mask (0,1)
label – annotation label
attributes – annotations list of attributes
epsilon – from opencv: specifying the approximation accuracy. This is the maximum distance between the original curve and its approximation. if 0 all points are returns
max_instances – number of max instances to return. if None all wil be returned
min_area – remove polygons with area lower thn this threshold (pixels)
- Returns
Polygon annotation
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Polyline Annotation Definition¶
- class Polyline(geo, label, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Polyline annotation object
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Pose Annotation Definition¶
- class Pose(label, template_id, instance_id=None, attributes=None, points=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
Classification annotation object
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Segmentation Annotation Definition¶
- class Segmentation(geo, label, attributes=None, description=None, color=None)[source]¶
Bases:
BaseAnnotationDefinition
Segmentation annotation object
- classmethod from_polygon(geo, label, shape, attributes=None)[source]¶
- Parameters
geo – list of x,y coordinates of the polygon ([[x,y],[x,y]…]
label – annotation’s label
shape – image shape (h,w)
attributes –
- Returns
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Audio Annotation Definition¶
Undefined Annotation Definition¶
- class UndefinedAnnotationType(type, label, coordinates, attributes=None, description=None)[source]¶
Bases:
BaseAnnotationDefinition
UndefinedAnnotationType annotation object
- show(image, thickness, with_text, height, width, annotation_format, color, alpha=1)[source]¶
Show annotation as ndarray :param image: empty or image to draw on :param thickness: :param with_text: not required :param height: item height :param width: item width :param annotation_format: options: list(dl.ViewAnnotationOptions) :param color: color :param alpha: opacity value [0 1], default 1 :return: ndarray
Similarity¶
- class Collection(type: CollectionTypes, name, items=None)[source]¶
Bases:
object
Base Collection Entity
- add(ref, type: SimilarityTypeEnum = SimilarityTypeEnum.ID)[source]¶
Add item to collection :param ref: :param type: url, id
- class CollectionItem(type: SimilarityTypeEnum, ref)[source]¶
Bases:
object
Base CollectionItem
- class MultiView(name, items=None)[source]¶
Bases:
Collection
Multi Entity
- property items¶
list of the collection items
- class MultiViewItem(type, ref)[source]¶
Bases:
CollectionItem
Single multi view item
- class Similarity(ref, name=None, items=None)[source]¶
Bases:
Collection
Similarity Entity
- property items¶
list of the collection items
- property target¶
Target item for similarity
- class SimilarityItem(type, ref, target=False)[source]¶
Bases:
CollectionItem
Single similarity item
Filter¶
- class Filters(field=None, values=None, operator: Optional[FiltersOperations] = None, method: Optional[FiltersMethod] = None, custom_filter=None, resource: FiltersResource = FiltersResource.ITEM, use_defaults=True, context=None, page_size=None)[source]¶
Bases:
object
Filters entity to filter items from pages in platform
- add(field, values, operator: Optional[FiltersOperations] = None, method: Optional[FiltersMethod] = None)[source]¶
Add filter
- Parameters
field (str) – Metadata field / attribute
values – field values
operator (dl.FiltersOperations) – optional - in, gt, lt, eq, ne
method (dl.FiltersMethod) – Optional - or/and
Example:
filter.add(field='metadata.user', values=['1','2'], operator=dl.FiltersOperations.IN)
- add_join(field, values, operator: Optional[FiltersOperations] = None, method: FiltersMethod = FiltersMethod.AND)[source]¶
join a query to the filter
- Parameters
Example:
filter.add_join(field='metadata.user', values=['1','2'], operator=dl.FiltersOperations.IN)
- open_in_web(resource)[source]¶
Open the filter in the platform data browser (in a new web browser)
- Parameters
resource (str) – dl entity to apply filter on. currently only supports dl.Dataset
- prepare(operation=None, update=None, query_only=False, system_update=None, system_metadata=False)[source]¶
To dictionary for platform call
- sort_by(field, value: FiltersOrderByDirection = FiltersOrderByDirection.ASCENDING)[source]¶
sort the filter
- Parameters
field (str) – field to sort by it
value (dl.FiltersOrderByDirection) – FiltersOrderByDirection.ASCENDING, FiltersOrderByDirection.DESCENDING
Example:
filter.sort_by(field='metadata.user', values=dl.FiltersOrderByDirection.ASCENDING)
Recipe¶
- class Recipe(id, creator, url, title, project_ids, description, ontology_ids, instructions, examples, custom_actions, metadata, ui_settings, client_api: ApiClient, dataset=None, project=None, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Recipe object
- add_instruction(annotation_instruction_file)[source]¶
Add instruction to recipe
- Parameters
annotation_instruction_file (str) – file path or url of the recipe instruction
- clone(shallow=False)[source]¶
Clone Recipe
- Parameters
shallow (bool) – If True, link ot existing ontology, clones all ontology that are link to the recipe as well
- Returns
Cloned ontology object
- Return type
- classmethod from_json(_json, client_api, dataset=None, project=None, is_fetched=True)[source]¶
Build a Recipe entity object from a json
- Parameters
_json (dict) – _json response from host
Dataset (dtlpy.entities.dataset.Dataset) – Dataset entity
project (dtlpy.entities.project.Project) – project entity
client_api (dl.ApiClient) – ApiClient entity
is_fetched (bool) – is Entity fetched from Platform
- Returns
Recipe object
- get_annotation_template_id(template_name)[source]¶
Get annotation template id by template name
- Parameters
template_name (str) –
- Returns
template id or None if does not exist
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
Ontology¶
- class Ontology(client_api: ApiClient, id, creator, url, title, labels, metadata, attributes, recipe=None, dataset=None, project=None, repositories=NOTHING, instance_map=None, color_map=None)[source]¶
Bases:
BaseEntity
Ontology object
- add_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, add=True, icon_path=None, update_ontology=False)[source]¶
Add a single label to ontology
- Parameters
label_name (str) – str - label name
color (tuple) – color
children – children (sub labels)
attributes (list) – attributes
display_label (str) – display_label
label (dtlpy.entities.label.Label) – label
add (bool) – to add or not
icon_path (str) – path to image to be display on label
update_ontology (bool) – update the ontology, default = False for backward compatible
- Returns
Label entity
- Return type
dtlpy.entities.label.Label
Example:
label = ontology.add_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
- add_labels(label_list, update_ontology=False)[source]¶
Adds a list of labels to ontology
- Parameters
- Returns
List of label entities added
Example:
labels = ontology.add_labels(label_list=label_list)
- delete_attributes(keys: list)[source]¶
Delete a bulk of attributes
Example:
success = ontology.delete_attributes(['1'])
- delete_labels(label_names)[source]¶
Delete labels from ontology
- Parameters
label_names – label object/ label name / list of label objects / list of label names
- Returns
- classmethod from_json(_json, client_api, recipe, dataset=None, project=None, is_fetched=True)[source]¶
Build an Ontology entity object from a json
- Parameters
is_fetched (bool) – is Entity fetched from Platform
project (dtlpy.entities.project.Project) – project entity
dataset (dtlpy.entities.dataset.Dataset) – dataset
_json (dict) – _json response from host
recipe (dtlpy.entities.recipe.Recipe) – ontology’s recipe
client_api (dl.ApiClient) – ApiClient entity
- Returns
Ontology object
- Return type
- property instance_map¶
instance mapping for creating instance mask
- Return dictionary {label
map_id}
- Return type
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
- update(system_metadata=False)[source]¶
Update items metadata
- Parameters
system_metadata (bool) – bool - True, if you want to change metadata system
- Returns
Ontology object
- update_attributes(title: str, key: str, attribute_type, scope: Optional[list] = None, optional: Optional[bool] = None, values: Optional[list] = None, attribute_range=None)[source]¶
ADD a new attribute or update if exist
- Parameters
title (str) – attribute title
key (str) – the key of the attribute must br unique
attribute_type (AttributesTypes) – dl.AttributesTypes your attribute type
scope (list) – list of the labels or * for all labels
optional (bool) – optional attribute
values (list) – list of the attribute values ( for checkbox and radio button)
attribute_range (dict or AttributesRange) – dl.AttributesRange object
- Returns
true in success
- Return type
- update_label(label_name, color=None, children=None, attributes=None, display_label=None, label=None, add=True, icon_path=None, upsert=False, update_ontology=False)[source]¶
Update a single label to ontology
- Parameters
label_name (str) – str - label name
color (tuple) – color
children – children (sub labels)
attributes (list) – attributes
display_label (str) – display_label
label (dtlpy.entities.label.Label) – label
add (bool) – to add or not
icon_path (str) – path to image to be display on label
update_ontology (bool) – update the ontology, default = False for backward compatible
upsert (bool) – if True will add in case it does not existing
- Returns
Label entity
- Return type
dtlpy.entities.label.Label
Example:
label = ontology.update_label(label_name='person', color=(34, 6, 231), attributes=['big', 'small'])
- update_labels(label_list, upsert=False, update_ontology=False)[source]¶
Update a list of labels to ontology
- Parameters
label_list (list) – list of labels [{“value”: {“tag”: “tag”, “displayLabel”: “displayLabel”, “color”: “#color”, “attributes”: [attributes]}, “children”: [children]}]
upsert (bool) – if True will add in case it does not existing
update_ontology (bool) – update the ontology, default = False for backward compatible
- Returns
List of label entities added
Example:
labels = ontology.update_labels(label_list=label_list)
Label¶
Task¶
- class Task(name, status, project_id, metadata, id, url, task_owner, item_status, creator, due_date, dataset_id, spec, recipe_id, query, assignmentIds, annotation_status, progress, for_review, issues, updated_at, created_at, available_actions, total_items, priority, client_api, current_assignments=None, assignments=None, project=None, dataset=None, tasks=None, settings=None)[source]¶
Bases:
object
Task object
- add_items(filters=None, items=None, assignee_ids=None, workload=None, limit=None, wait=True, query=None)[source]¶
Add items to Task
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
items (list) – list of items (item Ids or objects) to add to the task
assignee_ids (list) – list to assignee who works in the task
workload (list) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
limit (int) – the limit items that task can include
wait (bool) – wait until add items will to finish
query (dict) – query to filter the items for the task
- Returns
task entity
- Return type
- create_assignment(assignment_name, assignee_id, items=None, filters=None)[source]¶
Create a new assignment
- Parameters
assignment_name (str) – assignment name
assignee_id (str) – the assignment assignees (contributors) that should be working on the task. Provide a user email
items (List[entities.Item]) – list of items (item Id or objects) to insert to the task
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
Assignment object
- Return type
dtlpy.entities.assignment.Assignment assignment
Example:
assignment = task.create_assignment(assignee_id='annotator1@dataloop.ai')
- create_qa_task(due_date, assignee_ids, filters=None, items=None, query=None, workload=None, metadata=None, available_actions=None, wait=True, batch_size=None, max_batch_workload=None, allowed_assignees=None, priority=TaskPriority.MEDIUM)[source]¶
Create a new QA Task
- Parameters
due_date (float) – date by which the QA task should be finished; for example, due_date=datetime.datetime(day=1, month=1, year=2029).timestamp()
assignee_ids (list) – list the QA task assignees (contributors) that should be working on the task. Provide a list of users’ emails
filters (entities.Filters) – dl.Filters entity to filter items for the task
items (List[entities.Item]) – list of items (item Id or objects) to insert to the task
query (dict DQL) – filter items for the task
workload (List[WorkloadUnit]) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
metadata (dict) – metadata for the task
available_actions (list) – list of available actions (statuses) that will be available for the task items; The default statuses are: “Approved” and “Discarded”
wait (bool) – wait until create task finish
batch_size (int) – Pulling batch size (items), use with pulling allocation method. Restrictions - Min 3, max 100
max_batch_workload (int) – Max items in assignment, use with pulling allocation method. Restrictions - Min batchSize + 2, max batchSize * 2
allowed_assignees (list) – list the task assignees (contributors) that should be working on the task. Provide a list of users’ emails
priority (entities.TaskPriority) – priority of the task options in entities.TaskPriority
- Returns
task object
- Return type
Example:
task = task.create_qa_task(due_date = datetime.datetime(day= 1, month= 1, year= 2029).timestamp(), assignee_ids =[ 'annotator1@dataloop.ai', 'annotator2@dataloop.ai'])
- classmethod from_json(_json, client_api, project=None, dataset=None)[source]¶
Return the task object form the json
- Parameters
_json (dict) – platform json that describe the task
client_api – ApiClient object
project (dtlpy.entities.project.Project) – project object where task will create
dataset (dtlpy.entities.dataset.Dataset) – dataset object that refer to the task
- Returns
- get_items(filters=None)[source]¶
Get the task items
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
list of the items or PagedEntity output of items
- Return type
- remove_items(filters: Optional[Filters] = None, query=None, items=None, wait=True)[source]¶
remove items from Task.
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned to be owner of the annotation task.
- Parameters
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
query (dict) – query to filter the items use it
items (list) – list of items to add to the task
wait (bool) – wait until remove items finish
- Returns
True if success and an error if failed
- Return type
- set_status(status: str, operation: str, item_ids: List[str])[source]¶
Update item status within task
Assignment¶
- class Assignment(name, annotator, status, project_id, metadata, id, url, task_id, dataset_id, annotation_status, item_status, total_items, for_review, issues, client_api, task=None, assignments=None, project=None, dataset=None, datasets=None)[source]¶
Bases:
BaseEntity
Assignment object
- get_items(dataset=None, filters=None)[source]¶
Get all the items in the assignment
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
dataset (dtlpy.entities.dataset.Dataset) – dataset object, the dataset that refer to the assignment
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filters parameters
- Returns
pages of the items
- Return type
Example:
items = task.assignments.get_items()
- open_in_web()[source]¶
Open the assignment in web platform
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Returns
Example:
assignment.open_in_web()
- reassign(assignee_id, wait=True)[source]¶
Reassign an assignment
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
- Returns
Assignment object
- Return type
Example:
assignment = assignment.reassign(assignee_ids='annotator1@dataloop.ai')
- redistribute(workload, wait=True)[source]¶
Redistribute an assignment
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
workload (dtlpy.entities.assignment.Workload) – list of WorkloadUnit objects. Customize distribution (percentage) between the task assignees. For example: [dl.WorkloadUnit(annotator@hi.com, 80), dl.WorkloadUnit(annotator2@hi.com, 20)]
wait (bool) – wait until redistribute assignment finish
- Returns
Assignment object
- Return type
dtlpy.entities.assignment.Assignment assignment
Example:
assignment = assignment.redistribute(workload=dl.Workload([dl.WorkloadUnit(assignee_id="annotator1@dataloop.ai", load=50), dl.WorkloadUnit(assignee_id="annotator2@dataloop.ai", load=50)]))
- set_status(status: str, operation: str, item_id: str)[source]¶
Set item status within assignment
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
- Returns
True id success
- Return type
Example:
success = assignment.set_status(status='complete', operation='created', item_id='item_id')
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
- update(system_metadata=False)[source]¶
Update an assignment
Prerequisites: You must be in the role of an owner, developer, or annotation manager who has been assigned as owner of the annotation task.
- Parameters
system_metadata (bool) – True, if you want to change metadata system
- Returns
Assignment object
- Return type
dtlpy.entities.assignment.Assignment assignment
Example:
assignment = assignment.update(system_metadata=False)
Package¶
- class Package(_dict)[source]¶
Bases:
DlEntity
Package object
- build(module_name=None, init_inputs=None, local_path=None, from_local=None)[source]¶
Instantiate a module from the package code. Returns a loaded instance of the runner class
- Parameters
module_name – Name of the module to build the runner class
init_inputs (str) – dictionary of the class init variables (if exists). will be used to init the module class
local_path (str) – local path of the package (if from_local=False - codebase will be downloaded)
from_local (bool) – bool. if true - codebase will not be downloaded (only use local files)
- Returns
dl.BaseServiceRunner
- deploy(service_name=None, revision=None, init_input=None, runtime=None, sdk_version=None, agent_versions=None, verify=True, bot=None, pod_type=None, module_name=None, run_execution_as_process=None, execution_timeout=None, drain_time=None, on_reset=None, max_attempts=None, force=False, secrets: Optional[list] = None, **kwargs)[source]¶
Deploy package
- Parameters
service_name (str) – service name
revision (str) – package revision - default=latest
init_input – config to run at startup
runtime (dict) – runtime resources
sdk_version (str) –
optional - string - sdk version
agent_versions (dict) –
dictionary - - optional -versions of sdk, agent runner and agent proxy
bot (str) – bot email
pod_type (str) – pod type dl.InstanceCatalog
verify (bool) – verify the inputs
module_name (str) – module name
run_execution_as_process (bool) – run execution as process
execution_timeout (int) – execution timeout
drain_time (int) – drain time
on_reset (str) – on reset
max_attempts (int) – Maximum execution retries in-case of a service reset
force (bool) – optional - terminate old replicas immediately
secrets (list) – list of the integrations ids
- Returns
Service object
- Return type
Example:
- service: dl.Service = package.deploy(service_name=package_name,
execution_timeout=3 * 60 * 60, module_name=module.name, runtime=dl.KubernetesRuntime(
concurrency=10, pod_type=dl.InstanceCatalog.REGULAR_S, autoscaler=dl.KubernetesRabbitmqAutoscaler(
min_replicas=1, max_replicas=20, queue_length=20
)
- classmethod from_json(_json, client_api, project, is_fetched=True)[source]¶
Turn platform representation of package into a package entity
- Parameters
_json (dict) – platform representation of package
client_api (dl.ApiClient) – ApiClient entity
project (dtlpy.entities.project.Project) – project entity
is_fetched – is Entity fetched from Platform
- Returns
Package entity
- Return type
- static get_ml_metadata(cls=None, available_methods=None, output_type=AnnotationType.CLASSIFICATION, input_type='image', default_configuration: Optional[dict] = None)[source]¶
Create ML metadata for the package :param cls: ModelAdapter class, to get the list of available_methods :param available_methods: available user function on the adapter. [‘load’, ‘save’, ‘predict’, ‘train’, ‘evaluate’] :param output_type: annotation type the model create, e.g. dl.AnnotationType.CLASSIFICATION :param input_type: input file type the model gets, one of [‘image’, ‘video’, ‘txt’] :param default_configuration: default service configuration for the deployed services :return:
- pull(version=None, local_path=None) str [source]¶
Pull local package
Example:
path = package.pull(local_path='local_path')
- push(codebase: Optional[Union[GitCodebase, ItemCodebase]] = None, src_path: Optional[str] = None, package_name: Optional[str] = None, modules: Optional[list] = None, checkout: bool = False, revision_increment: Optional[str] = None, service_update: bool = False, service_config: Optional[dict] = None, package_type='faas')[source]¶
Push local package
- Parameters
codebase (dtlpy.entities.codebase.Codebase) – PackageCode object - defines how to store the package code
checkout (bool) – save package to local checkout
src_path (str) – location of pacjage codebase folder to zip
package_name (str) – name of package
modules (list) – list of PackageModule
revision_increment (str) – optional - str - version bumping method - major/minor/patch - default = None
service_update (bool) – optional - bool - update the service
:param dict service_config : Service object as dict. Contains the spec of the default service to create. :param str package_type: default is “faas”, one of “app”, “ml” :return: package entity :rtype: dtlpy.entities.package.Package
Example:
package = packages.push(package_name='package_name', modules=[module], version='1.0.0', src_path=os.getcwd())
- test(cwd=None, concurrency=None, module_name='default_module', function_name='run', class_name='ServiceRunner', entry_point='main.py')[source]¶
Test local package in local environment.
- Parameters
- Returns
list created by the function that tested the output
- Return type
Example:
package.test(cwd='path_to_package', function_name='run')
Package Function¶
Package Module¶
Slot¶
- class PackageSlot(module_name='default_module', function_name='run', display_name=None, display_scopes: Optional[list] = None, display_icon=None, post_action: SlotPostAction = NOTHING, default_inputs: Optional[list] = None, input_options: Optional[list] = None)[source]¶
Bases:
BaseEntity
Webhook object
Codebase¶
Service¶
- class InstanceCatalog(value)[source]¶
-
The Service Pode size.
State
Description
REGULAR_XS
regular pod with extra small size
REGULAR_S
regular pod with small size
REGULAR_M
regular pod with medium size
REGULAR_L
regular pod with large size
REGULAR_XL
regular pod with extra large size
HIGHMEM_XS
highmem pod with extra small size
HIGHMEM_S
highmem pod with small size
HIGHMEM_M
highmem pod with medium size
HIGHMEM_L
highmem pod with large size
HIGHMEM_XL
highmem pod with extra large size
GPU_K80_S
GPU pod with small size
GPU_K80_M
GPU pod with medium size
- class KubernetesAutuscalerType(value)[source]¶
-
The Service Autuscaler Type (RABBITMQ, CPU).
State
Description
RABBITMQ
Service Autuscaler will be in RABBITMQ
CPU
Service Autuscaler will be in in local CPU
- class OnResetAction(value)[source]¶
-
The Execution action when the service reset (RERUN, FAILED).
State
Description
RERUN
When the service resting rerun the execution
FAILED
When the service resting fail the execution
- class RuntimeType(value)[source]¶
-
Service culture Runtime (KUBERNETES).
State
Description
KUBERNETES
Service run in kubernetes culture
- class Service(created_at, updated_at, creator, version, package_id, package_revision, bot, use_user_jwt, init_input, versions, module_name, name, url, id, active, driver_id, secrets, runtime: KubernetesRuntime, queue_length_limit, run_execution_as_process: bool, execution_timeout, drain_time, on_reset: OnResetAction, type: ServiceType, project_id, is_global, max_attempts, package, client_api: ApiClient, revisions=None, project=None, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Service object
- activate_slots(project_id: Optional[str] = None, task_id: Optional[str] = None, dataset_id: Optional[str] = None, org_id: Optional[str] = None, user_email: Optional[str] = None, slots=None, role=None, prevent_override: bool = True, visible: bool = True, icon: str = 'fas fa-magic', **kwargs) object [source]¶
Activate service slots
- Parameters
project_id (str) – project id
task_id (str) – task id
dataset_id (str) – dataset id
org_id (str) – org id
user_email (str) – user email
slots (list) – list of entities.PackageSlot
role (str) – user role MemberOrgRole.ADMIN, MemberOrgRole.owner, MemberOrgRole.MEMBER
prevent_override (bool) – True to prevent override
visible (bool) – visible
icon (str) – icon
kwargs – all additional arguments
- Returns
list of user setting for activated slots
- Return type
Example:
setting = service.activate_slots(project_id='project_id', slots=List[entities.PackageSlot], icon='fas fa-magic')
- execute(execution_input=None, function_name=None, resource=None, item_id=None, dataset_id=None, annotation_id=None, project_id=None, sync=False, stream_logs=True, return_output=True)[source]¶
Execute a function on an existing service
- Parameters
execution_input (List[FunctionIO] or dict) – input dictionary or list of FunctionIO entities
function_name (str) – function name to run
resource (str) – input type.
item_id (str) – optional - item id as input to function
dataset_id (str) – optional - dataset id as input to function
annotation_id (str) – optional - annotation id as input to function
project_id (str) – resource’s project
sync (bool) – if true, wait for function to end
stream_logs (bool) – prints logs of the new execution. only works with sync=True
return_output (bool) – if True and sync is True - will return the output directly
- Returns
execution object
- Return type
Example:
execution = service.execute(function_name='function_name', item_id='item_id', project_id='project_id')
- classmethod from_json(_json: dict, client_api: ApiClient, package=None, project=None, is_fetched=True)[source]¶
Build a service entity object from a json
- Parameters
_json (dict) – platform json
client_api (dl.ApiClient) – ApiClient entity
package (dtlpy.entities.package.Package) – package entity
project (dtlpy.entities.project.Project) – project entity
is_fetched (bool) – is Entity fetched from Platform
- Returns
service object
- Return type
- log(size=None, checkpoint=None, start=None, end=None, follow=False, text=None, execution_id=None, function_name=None, replica_id=None, system=False, view=True, until_completed=True)[source]¶
Get service logs
- Parameters
size (int) – size
checkpoint (dict) – the information from the lst point checked in the service
start (str) – iso format time
end (str) – iso format time
follow (bool) – if true, keep stream future logs
text (str) – text
execution_id (str) – execution id
function_name (str) – function name
replica_id (str) – replica id
system (bool) – system
view (bool) – if true, print out all the logs
until_completed (bool) – wait until completed
- Returns
ServiceLog entity
- Return type
Example:
service_log = service.log()
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
- class ServiceType(value)[source]¶
-
The type of the service (SYSTEM).
State
Description
SYSTEM
Dataloop internal service
Bot¶
- class Bot(created_at, updated_at, name, last_name, username, avatar, email, role, type, org, id, project, client_api=None, users=None, bots=None, password=None)[source]¶
Bases:
User
Bot entity
Trigger¶
- class BaseTrigger(id, url, created_at, updated_at, creator, name, active, type, scope, is_global, input, function_name, service_id, webhook_id, pipeline_id, special, project_id, spec, operation, service, project, client_api: ApiClient, op_type='service', repositories=NOTHING)[source]¶
Bases:
BaseEntity
Trigger Entity
- classmethod from_json(_json, client_api, project, service=None)[source]¶
Build a trigger entity object from a json
- Parameters
_json (dict) – platform json
client_api (dl.ApiClient) – ApiClient entity
project (dtlpy.entities.project.Project) – project entity
service (dtlpy.entities.service.Service) – service entity
- Returns
- class CronTrigger(id, url, created_at, updated_at, creator, name, active, type, scope, is_global, input, function_name, service_id, webhook_id, pipeline_id, special, project_id, spec, operation, service, project, client_api: ApiClient, op_type='service', repositories=NOTHING, start_at=None, end_at=None, cron=None)[source]¶
Bases:
BaseTrigger
- class Trigger(id, url, created_at, updated_at, creator, name, active, type, scope, is_global, input, function_name, service_id, webhook_id, pipeline_id, special, project_id, spec, operation, service, project, client_api: ApiClient, op_type='service', repositories=NOTHING, filters=None, execution_mode=TriggerExecutionMode.ONCE, actions=TriggerAction.CREATED, resource=TriggerResource.ITEM)[source]¶
Bases:
BaseTrigger
Trigger Entity
- classmethod from_json(_json, client_api, project, service=None)[source]¶
Build a trigger entity object from a json
- Parameters
_json – platform json
client_api – ApiClient entity
project (dtlpy.entities.project.Project) – project entity
service (dtlpy.entities.service.Service) – service entity
- Returns
Execution¶
- class Execution(id, url, creator, created_at, updated_at, input, output, feedback_queue, status, status_log, sync_reply_to, latest_status, function_name, duration, attempts, max_attempts, to_terminate: bool, trigger_id, service_id, project_id, service_version, package_id, package_name, client_api: ApiClient, service, project=None, repositories=NOTHING, pipeline: Optional[dict] = None)[source]¶
Bases:
BaseEntity
Service execution entity
- classmethod from_json(_json, client_api, project=None, service=None, is_fetched=True)[source]¶
- Parameters
_json (dict) – platform json
client_api (dl.ApiClient) – ApiClient entity
project (dtlpy.entities.project.Project) – project entity
service (dtlpy.entities.service.Service) –
is_fetched – is Entity fetched from Platform
- progress_update(status: Optional[ExecutionStatus] = None, percent_complete: Optional[int] = None, message: Optional[str] = None, output: Optional[str] = None, service_version: Optional[str] = None)[source]¶
Update Execution Progress
Pipeline¶
- class Pipeline(id, name, creator, org_id, connections, settings: PipelineSettings, status: CompositionStatus, created_at, updated_at, start_nodes, project_id, composition_id, url, preview, description, revisions, project, client_api: ApiClient, original_settings: PipelineSettings, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Pipeline object
- execute(execution_input=None)[source]¶
execute a pipeline and return the execute
- Parameters
execution_input – list of the dl.FunctionIO or dict of pipeline input - example {‘item’: ‘item_id’}
- Returns
entities.PipelineExecution object
- classmethod from_json(_json, client_api, project, is_fetched=True)[source]¶
Turn platform representation of pipeline into a pipeline entity
- Parameters
_json (dict) – platform representation of package
client_api (dl.ApiClient) – ApiClient entity
project (dtlpy.entities.project.Project) – project entity
is_fetched (bool) – is Entity fetched from Platform
- Returns
Pipeline entity
- Return type
- install(resume_option: Optional[PipelineResumeOption] = None)[source]¶
install pipeline
- Returns
Composition entity
- pause(keep_triggers_active: Optional[bool] = None)[source]¶
pause pipeline
- Returns
Composition entity
- reset(stop_if_running: bool = False)[source]¶
Resets pipeline counters
- Parameters
stop_if_running (bool) – If the pipeline is installed it will stop the pipeline and reset the counters.
- Returns
bool
- set_start_node(node: PipelineNode)[source]¶
Set the start node of the pipeline
- Parameters
node (PipelineNode) – node to be the start node
- stats()[source]¶
Get pipeline counters
- Returns
PipelineStats
- Return type
dtlpy.entities.pipeline.PipelineStats
Pipeline Execution¶
- class PipelineExecution(id, nodes, executions, status, created_at, updated_at, pipeline_id, max_attempts, pipeline, client_api: ApiClient, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Package object
- classmethod from_json(_json, client_api, pipeline, is_fetched=True)[source]¶
Turn platform representation of pipeline_execution into a pipeline_execution entity
- Parameters
_json (dict) – platform representation of package
client_api (dl.ApiClient) – ApiClient entity
pipeline (dtlpy.entities.pipeline.Pipeline) – Pipeline entity
is_fetched (bool) – is Entity fetched from Platform
- Returns
Pipeline entity
- Return type
Other¶
Pages¶
- class PagedEntities(client_api: ApiClient, page_offset, page_size, filters, items_repository, has_next_page=False, total_pages_count=0, items_count=0, service_id=None, project_id=None, order_by_type=None, order_by_direction=None, execution_status=None, execution_resource_type=None, execution_resource_id=None, execution_function_name=None, list_function=None, items=[])[source]¶
Bases:
object
Pages object
- get_page(page_offset=None, page_size=None)[source]¶
Get page
- Parameters
page_offset – page offset
page_size – page size
Base Entity¶
Command¶
- class Command(id, url, status, created_at, updated_at, type, progress, spec, error, client_api: ApiClient, repositories=NOTHING)[source]¶
Bases:
BaseEntity
Com entity
- classmethod from_json(_json, client_api, is_fetched=True)[source]¶
Build a Command entity object from a json
- Parameters
_json – _json response from host
client_api – ApiClient entity
is_fetched – is Entity fetched from Platform
- Returns
Command object
- in_progress()[source]¶
Check if command is still in one of the in progress statuses
- Returns
True if command still in progress
- Return type
- to_json()[source]¶
Returns platform _json format of object
- Returns
platform json format of object
- Return type
Directory Tree¶
Utilities¶
converter¶
- class Converter(concurrency=6, return_error_filepath=False)[source]¶
Bases:
object
Annotation Converter
- attach_agent_progress(progress: Progress, progress_update_frequency: Optional[int] = None)[source]¶
Attach agent progress.
- Parameters
progress (Progress) – the progress object that follows the work
progress_update_frequency (int) – progress update frequency in percentages
- convert(annotations, from_format: str, to_format: str, conversion_func=None, item=None)[source]¶
Convert annotation list or single annotation.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
item (dtlpy.entities.item.Item) – item entity
annotations (list or AnnotationCollection) – annotations list to convert
from_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
conversion_func (Callable) – Custom conversion service
- Returns
the annotations
- convert_dataset(dataset, to_format: str, local_path: str, conversion_func=None, filters=None, annotation_filter=None)[source]¶
Convert entire dataset.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
dataset (dtlpy.entities.dataet.Dataset) – dataset entity
to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
local_path (str) – path to save the result to
conversion_func (Callable) – Custom conversion service
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filter parameters
annotation_filter (dtlpy.entities.filters.Filters) – Filter entity
- Returns
the error log file path if there are errors and the coco json if the format is coco
- convert_directory(local_path: str, to_format: AnnotationFormat, from_format: AnnotationFormat, dataset, conversion_func=None)[source]¶
Convert annotation files in entire directory.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
local_path (str) – path to the directory
to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
dataset (dtlpy.entities.dataset.Dataset) – dataset entity
conversion_func (Callable) – Custom conversion service
- Returns
the error log file path if there are errors
- convert_file(to_format: str, from_format: str, file_path: str, save_locally: bool = False, save_to: Optional[str] = None, conversion_func=None, item=None, pbar=None, upload: bool = False, **_)[source]¶
Convert file containing annotations.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
to_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
from_format (str) – AnnotationFormat to convert from – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
file_path (str) – path of the file to convert
pbar (tqdm) – tqdm object that follows the work (progress bar)
upload (bool) – if True upload
save_locally (bool) – If True, save locally
save_to (str) – path to save the result to
conversion_func (Callable) – Custom conversion service
item (dtlpy.entities.item.Item) – item entity
- Returns
annotation list, errors
- static custom_format(annotation, conversion_func, i_annotation=None, annotations=None, from_format=None, item=None, **_)[source]¶
Custom convert function.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation (dtlpy.entities.annotation.Annotation or dict) – annotations to convert
conversion_func (Callable) – Custom conversion service
i_annotation (int) – annotation index
annotations (list) – list of annotations
param str from_format: AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP :param dtlpy.entities.item.Item item: item entity :return: converted Annotation
- from_coco(annotation, **kwargs)[source]¶
Convert from COCO format to DATALOOP format. Use this as conversion_func param for functions that ask for this param.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation – annotations to convert
kwargs – additional params
- Returns
converted Annotation entity
- Return type
- static from_voc(annotation, **_)[source]¶
Convert from VOC format to DATALOOP format. Use this as conversion_func for functions that ask for this param.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation – annotations to convert
- Returns
converted Annotation entity
- Return type
- from_yolo(annotation, item=None, **kwargs)[source]¶
Convert from YOLO format to DATALOOP format. Use this as conversion_func param for functions that ask for this param.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation – annotations to convert
item (dtlpy.entities.item.Item) – item entity
kwargs – additional params
- Returns
converted Annotation entity
- Return type
- save_to_file(save_to, to_format, annotations, item=None)[source]¶
Save annotations to a file.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
save_to (str) – path to save the result to
to_format – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
annotations (list) – annotation list to convert
item (dtlpy.entities.item.Item) – item entity
- static to_coco(annotation, item=None, **_)[source]¶
Convert from DATALOOP format to COCO format. Use this as conversion_func param for functions that ask for this param.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation (dtlpy.entities.annotation.Annotation or dict) – annotations to convert
item (dtlpy.entities.item.Item) – item entity
**_ –
additional params
- Returns
converted Annotation
- Return type
- static to_voc(annotation, item=None, **_)[source]¶
Convert from DATALOOP format to VOC format. Use this as conversion_func param for functions that ask for this param.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation (dtlpy.entities.annotation.Annotation or dict) – annotations to convert
item (dtlpy.entities.item.Item) – item entity
**_ –
additional params
- Returns
converted Annotation
- Return type
- to_yolo(annotation, item=None, **_)[source]¶
Convert from DATALOOP format to YOLO format. Use this as conversion_func param for functions that ask for this param.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
annotation (dtlpy.entities.annotation.Annotation or dict) – annotations to convert
item (dtlpy.entities.item.Item) – item entity
**_ –
additional params
- Returns
converted Annotation
- Return type
- upload_local_dataset(from_format: AnnotationFormat, dataset, local_items_path: Optional[str] = None, local_labels_path: Optional[str] = None, local_annotations_path: Optional[str] = None, only_bbox: bool = False, filters=None, remote_items=None)[source]¶
Convert and upload local dataset to dataloop platform.
Prerequisites: You must be an owner or developer to use this method.
- Parameters
from_format (str) – AnnotationFormat to convert to – AnnotationFormat.COCO, AnnotationFormat.YOLO, AnnotationFormat.VOC, AnnotationFormat.DATALOOP
dataset (dtlpy.entities.dataset.Dataset) – dataset entity
local_items_path (str) – path to items to upload
local_annotations_path (str) – path to annotations to upload
local_labels_path (str) – path to labels to upload
only_bbox (bool) – only for coco datasets, if True upload only bbox
filters (dtlpy.entities.filters.Filters) – Filters entity or a dictionary containing filter parameters
remote_items (list) – list of the items to upload
- Returns
the error log file path if there are errors
Tutorials¶
Getting Started With Dataloop SDK¶
Getting started with Dataloop SDK
Dataloop SDK Overview¶
Dataloop provides an end-to-end platform that supports the entire AI lifecycle, from development to production. By leveraging both a data management and annotation platform, deep learning data generation is streamlined, resulting in accelerated automated pipeline production and reduced engineering time and costs.
The Dataloop platform is built upon an extensive Python SDK that provides full control over your projects and code. It allows you to automate CRUD (Create, Read, Update, Delete) operations within the platform for
Projects
Datasets
Items
Annotations
Metadata
About this guide¶
The Getting Started guide provides the developer with an efficient SDK on-boarding experience and covers the following:
Installing Prerequisite Software¶
The Dataloop SDK requires several prerequisite software packages to be installed on your system before it can be used.
:information_source: The scope of this guide does not cover detailed external software installation issues. Please use the provided software vendor website links for further installation information and troubleshooting related to your OS.
Python¶
Python 3.6 or later must be installed in order to use the SDK.
From the Downloads page, select your desired OS and proceed with the download.
Once the download is complete, you can proceed to install the software.
Dataloop SDK Package¶
pip¶
The SDK package requires pip to be installed on your system. pip Is the package installer for Python. If Python was installed from python.org as described above, pip should already be installed.
You can check if pip is installed on your system.
Run the following from the Command Line:
pip --version
If pip isn’t already installed, you can bootstrap it from the standard library.
Run the following from the Command Line:
python3 -m ensurepip --default-pip
DTLPY Package¶
Once you have verified that pip is installed, the Dataloop SDK Package can be installed.
Run the following from the Command Line:
pip install dtlpy
Once the SDK Package is successfully installed, a confirmation message is displayed:
Successfully installed dtlpy-1.64.9
SDK Login¶
Once the Dataloop SDK Package is installed, you can login to the SDK.
To log in to the Dataloop SDK:
Open a Python Shell.
Run the following Python command:
import dtlpy as dl
dl.login()
Login tokens expire after 24hours, therefore the following expression can be added at the start of the command:
if dl.token_expired():
dl.login()
A web browser login screen is displayed:
alt_text
Enter your credentials, or alternatively login using a Google account.
Once your credentials have been verified a confirmation message is displayed:
alt_text
Machine-to-Machine Login¶
Long-running SDK jobs require API authentication.The M2M flow allows machines to obtain valid, signed JWT (authentication token) and automatically refresh it, without the need for a web browser login.
M2M Login is recommended when you want to:- run commands on the platform without an ongoing internet connection- run API commands directly from an external system to Dataloop
:information_source: This can be done with your email and password (signup with a password), or using project bots (which is NOT is the scope of this tutorial).
dl.login_m2m(email=email, password=password)
Datasets¶
In Dataloop, a dataset is a collection of items (files), their respective metadata, and annotations. Datasets have a file system structure and are organized into folders and subfolders at multiple levels.
There are 3 types of datasets:
Master - The original dataset which manages the actual binaries.
Clone - Contains pointers to original files, which enables management of virtual items that do not replicate the binaries of the underlying storage once cloned or copied. When cloning a dataset, users can decide if the new copy will overwrite the original metadata and annotations.
Merge - Several datasets can be merged into one, allowing multiple annotations to be combined into the same dataset.
Creating a New Dataset¶
Before a new dataset can be created, at least one project must exist.
To create a new project:¶
Run the following command to create a new project named: My-First-Project:
project = dl.projects.create(project_name='My-First-Project')
The new project is created.
The new project must be selected prior to creating a new associated dataset.
To select the new project:¶
Run the following command to select the new project named created in the above step:
project = dl.projects.get(project_name='My-First-Project')
The new project is selected.
A project can also be referenced in the above command via its unique project_id.
To select the new project using a project_id:¶
Run The following command to select the new project by referencing the project_id:
project = dl.projects.get(project_id='e4a5e5b3-a22a-4b59-9b76-30417a0859d9')
The new project is selected.
The new dataset can now be created.
To create a new dataset:¶
Run the following command to create a new dataset named My-First-Dataset associated with the project My-First-Project:
project.datasets.create(dataset_name='My-First-Dataset')
Confirmation of the successfully created dataset is displayed:
Dataset(id='632c24ae3444a86f029acb47', url='https://gate.dataloop.ai/api/v1/datasets/632c1194120a7571664d0de3', name='My-First-Dataset', creator='JohnDoe@gmail.com', items_count=0, expiration_options=None, index_driver='v1', created_at='2022-09-22T07:41:08.324Z')
:information_source: Your Dataset ID will differ from the example above.
Uploading items¶
Items (files) can be uploaded to datasets in a file system structure and are organized into folders and subfolders. Individual items or entire folders can be uploaded.
Before items can be uploaded, the dataset to which the items will be uploaded must be selected.
To select the dataset:¶
Run the following command to initialize a new instance (dataset) of the new dataset (My-First-Dataset) in order to upload items:
dataset = project.datasets.get(dataset_name='My-First-Dataset')
Confirmation of the new instance of the selected dataset is displayed:
Dataset(id='632c24ae3444a86f029acb47', url='https://gate.dataloop.ai/api/v1/datasets/632c1194120a7571664d0de3', name='My-First-Dataset', creator='JohnDoe@gmail.com', items_count=0, expiration_options=None, index_driver='v1', created_at='2022-09-22T07:41:08.324Z')
If the selected dataset does not exist the following error message is displayed:
dtlpy.exceptions.NotFound: ('404', "Dataset not found. Name: 'My-First-Dataset')
Once the dataset instance has been successfully initialized, items can be uploaded.
The structure of the Upload Item Command is:
dataset.items.upload(local_path='/path/to/file.extension')
:information_source: directory paths look different in Windows and in Linux, Windows require an “r” at the beginning.
To upload an item to a dataset:¶
Create a local directory in your file explorer. For this example, C:\UploadDemo is used.
Run The following command to upload an image file from a local directory:
dataset.items.upload(local_path=r'C:\UploadDemo\test1.jpg')
:warning: Ensure the path and file exists before running the command.
Confirmation of the completed upload is displayed:
Upload Items: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.54it/s]
Item(dataset_url='https://gate.dataloop.ai/api/v1/datasets/'632c24ae3444a86f029acb47', created_at='2022-09-22T10:18:03.000Z', dataset_id='632c24ae3444a86f029acb47', filename='/test1.jpg', name='test.jpg', type='file', id='632dadf7b28a0c0da317dfc8', spec=None, creator='JohnDoe@gmail.com', _description=None, annotations_count=0)
The Item ID of the uploaded file is 632dadf7b28a0c0da317dfc8. This ID is used when Listing/Getting items ( See Getting Items).
:information_source: Your Item ID will differ from the example above.
If the item to upload is not found, the following error message is displayed:
dtlpy.exceptions.NotFound: ('404', 'Unknown local path: C:\\UploadDemo\\test1.jpg')
:information_source: By default, files are uploaded to the root directory. Items can be uploaded to an existing folder within a dataset using the remote_path argument (Not in the scope of this guide).
Exercise 1¶
Write the commands to Upload a 2nd image (test2.jpg) file item to My-First-Dataset.
Getting Items¶
Items can be retrieved from a dataset individually using the item ID. Alternatively, all items can be retrieved using a loop.
Getting a Single Item¶
The command structure of Getting a Single Item is:
item = dataset.items.get(item_id='my_item_id')
item.print()
Run the following command to set an instance of a single item object (item_1) from the dataset (My-First-Dataset) by specifying an item ID:
item_1 = dataset.items.get(item_id='632c365b6002b1266e007830')
:information_source: Your Item ID will differ from the example above.
Run the following command to print the specified item:
item_1.print()
The item details are displayed including the following:
Filename
Creator of the item
Created timestamp
Dataset ID of the item
Exercise 2¶
Write the commands to print the details of the 2nd uploaded item (Test2). Name the item object item_2
:bulb: Remember: The ID of the item (Test2) must be identified first.
Getting All Items¶
All item details in a dataset can be printed using a loop.
Run the following command to loop through the dataset and print all item details:
pages = dataset.items.list()
for item in pages.all():
item.print()
or:
pages = dataset.items.list()
for page in pages:
for item in page:
item.print()
All dataset item details are displayed.
Annotating Items¶
Dataset items are annotated using Labels. A Label is composed of various Label Settings and Instructions that are defined by a dataset’s Recipe. For example, an image or video file item can contain 1 label defined as a Classification to categorize the entire image and multiple other labels defined as Point Markers to identify specific objects in an image/video file item.
Classification¶
Classifications are used to categorize an entire image or scene (in the case of a video file). For example, a Classification label can be used to classify product images under categories, subcategories, and characteristics, such as men’s clothes, polo shirts, etc.
The SDK can add Classification labels to an Item using 2 steps.
Adding a label to a dataset’s Recipe.
Adding the label to an item as a Classification.
Run the following command to add a Label (Person) to the My-First-Dataset dataset recipe.
dataset.add_label(label_name='Person')
The label is created and its Properties are displayed.
[Label(tag='Person', display_data={}, color='#0214a7', display_label='Person', attributes=[], children=[])]
Run the following commands to Annotate and Upload the label (Person) as a Classification to the item (item_1):
builder = item_1.annotations.builder()
builder.add(annotation_definition=dl.Classification(label='Person'))
item_1.annotations.upload(builder)
The label is annotated as a Classification to item_1.
Point Markers¶
A Point Marker is used to identify specific objects in an image or video item. For example, an image of a person’s face can contain multiple Point Marker labels specifying the person’s eyes, mouth, ears, etc.
Point Marker commands accept 2 coordinate input parameters (x,y) which specify where the label is plotted on the image.
The SDK can add Point Marker labels to an Item using 2 steps.
Adding a label to a dataset’s Recipe.
Adding the label to an item as a Point Marker.
To Add/Upload a Point Marker Label to a Dataset Recipe:¶
Run the following command to add a Label (Ear) to the My-First-Dataset dataset recipe.
dataset.add_label(label_name='Ear')
The label is created its Properties are displayed.
[Label(tag='Ear', display_data={}, color='#0214a7', display_label='Person', attributes=[], children=[])]
Run the following commands to Annotate and Upload the label (Ear) as 2 Point Markers to the item (item_1):
builder = item_1.annotations.builder()
builder.add(annotation_definition=dl.Point(x=80, y=80, label='Ear'))
builder.add(annotation_definition=dl.Point(x=120, y=120, label='Ear'))
item_1.annotations.upload(builder)
The label is annotated as 2 Point Markers to item_1.
:information_source: Other Label Types include Box, Cube, Polygon etc.
Exercise 3¶
Annotate 3 items (use item_2 from Exercise 2) with the Classification of ‘Face’.
Annotate 2 random Point Marker annotations with the label ‘Eye’ to an item (use item_2 from Exercise 2).
:bulb: Remember: The label must first be added to the Recipe of the dataset.
Working with Filters¶
The SDK supports the filtering of item data. You can filter items by creating Filter Queries that define the Parameters of the filter. For example, you can create a Filter Query that filters item data on a specific field name, or by an item’s annotation label.
Multiple Parameters can be added to a Filter Query, for example, you can include a parameter that filters for all items that include Point Marker Annotation types that are Labelled as ’Ear’.
Creating Filters¶
The first step is to create a Filter Query.
Run the following command to create a Filter Query named my_filter
my_filter = dl.Filters()
The Filter Query is created.
Once the Filter Query is created, Filter Parameters can be added.
Run the following command to add a Filter Parameter to my_filter that filters for all items that include Point Marker Annotation types:
my_filter.add_join(field='type', values='point')
The Filter Parameter is created.
:information_source: Other Fields can be used as Filter Parameters including id, dataset_id, etc.
Additional Filter Parameters can be added to the Filter Query.
To add additional filter parameters:
Run the following command to add Additional Filter Parameter to my_filter that filters for all items that include a Label value of ‘Ear’.
my_filter.add_join(field='label', values='Ear')
The Additional Filter Parameter is added.
The created Filter Query can be applied to the dataset and displayed.
Run the following commands to Apply the Filter Query to the dataset and display the filtered item(s):
pages = dataset.items.list(filters=my_filter)
for item in pages.all():
item.print()
The Filter Query is applied to the datasetand the filtered item(s) details are displayed:
Iterate Pages: 0%| | 0/1 [00:00<?, ?it/s]Item(dataset_url='https://gate.dataloop.ai/api/v1/datasets/632c24ae3444a86f029acb47', created_at='2022-09-23T13:00:39.000Z', dataset_id='632c24ae3444a86f029acb47', filename='/test1.jpg', name='test1.jpg', type='file', id='632dadf7b28a0c0da317dfc8', spec=None, creator='JohnDoe@gmail.com', _description=None, annotations_count=7)
Iterate Pages: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 169.70it/s]
>>>
Using Filters to Replace Data¶
Filters can be used to Replace existing item data. For example, you can Create and Apply a Filter Query that returns a subset of item data that includes a particular Classification such as ‘Person’ and Replace it with another value, such as ‘Adult’, across the entire subset.
The first step is to Create a new Filter Query with a Filter Parameter that filters for all items that include a Label value of ‘Person’.
To Create the Replacement Filter Query:¶
Run the following commands to create the Replacement Filter Query and Filter Parameter:
person_filter = dl.Filters(resource=dl.FILTERS_RESOURCE_ITEM)
person_filter.add_join(field='label', values='Person')
The Replacement Filter Query and Filter Parameter are created.
The new label can be added with the value ‘Adult’.
Run the following commands to create the new label:
dataset.add_label(label_name='Adult')
pages = dataset.items.list(filters=person_filter)
The new label is created.
The existing label can be deleted and replaced with the new label.
Run the following commands to delete the existing label and Add the new label:
import dtlpy as dl
person_ann_filter = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
person_ann_filter.add(field='label', values='Person')
for item in pages.all():
item.annotations.delete(filters=person_ann_filter)
annotations = item.annotations.builder()
annotations.add(annotation_definition=dl.Classification(label='Adult'))
item.annotations.upload(annotations)
All instances of the old label are replaced in each item with the new label.
Exercise 4¶
Create and Apply a Filter Query (use item_2 from Exercise 3) that filters items and returns all items that include Point Marker Annotations that are labeled ‘Eye’.
Create and Apply a Filter Query (use item_2 from Exercise 3) that filters the items with the ‘Face’ classification, deletes the label, and replaces it with the label ‘Person’.
Working with Item Metadata¶
Metadata is a dictionary attribute used with items, annotations, and other entities of the Dataloop system, for example, Recipes.
You can add any keys and values to both item and annotation user metadata sections using the SDK. This user metadata can be used for data filtering, sorting, etc.
Adding a New User Metadata Field to an Item¶
The following example will demonstrate adding a new user metadata field named Date&Time to the item named test1, which in this case has an item ID = 632dadf7b28a0c0da317dfc8
:information_source: Your Item ID will differ from the example above. See Get a Single Item.
The first step is to import the datetime module.
Run the following commands to import the datetime module:
import datetime
The datetime module is imported.
An instance of item test1 can be created.
Create an instanceof item test1 named item_1.
item_1 = dataset.items.get(item_id='632dadf7b28a0c0da317dfc8')
An instance of item test1 named item_1 is created.
The current date can be assigned to a new field in the item’s metadata named Date&Time and the item can be updated.
Run the following commands to assign the date to a new metadata field and update the item:
now = datetime.datetime.now().isoformat()
# modify metadata for the item
item_1.metadata['user'] = dict()
# add it to the item's metadata
item_1.metadata['user']['dateTime'] = now
# update the item
item_1 = item_1.update()
The date is assigned to the new metadata field and the item is updated.
Metadata fields can also be created for a subset of items at once using filters.
Run the following commands to create metadata fields for a subset of items that include the label ‘Person’ using a filter:
filters = dl.Filters()
filters.add_join(field='label', values='Person')
now = datetime.datetime.now().isoformat()
dataset.items.update(filters=filters, update_values={'user': {'dateTime': now}})
The date is assigned to the new metadata field and all items that include the label ‘Person’ are updated.
Exercise 5¶
For the filtered items with the classification ‘Adult’ from Exercise 4, add a new field called ‘date’ In the item’s user metadata and assign it the current date.
Creating Tasks¶
A Task is used to initiate annotations. A Task requires defining the included data items, the assignee(s), and other options such as due date, etc.
To Create a Task¶
Run the following commands to create a Task containing items with the label ‘Person’ (from the previous example).
task = dataset.tasks.create(task_name='test',
due_date=datetime(day=15, month=7, year=2022).timestamp(),
assignee_ids=['JohnDoe@gmail.com'],
filters=filters)
The task is created.
Exercise 6¶
Create a Task that contains those items from Exercise 5, ie all the items filtered for the classification ‘Adult’
Logging out¶
To Logout Run the following command to Logout of the SDK:
dl.logout()
Data Management Tutorial¶
Tutorials for data management
Cloud Storage¶
External Storage Dataset¶
If you already have your data managed and organized on a cloud storage service, such as GCS/S3/Azure, you may want toutilize that with Dataloop, and not upload the binaries and create duplicates.
Cloud Storage Integration¶
Access & Permissions - Creating an integration with GCS/S2/Azure cloud requires adding a key/secret with the followingpermissions:
List (Mandatory) - allowing Dataloop to list all the items in the storage.Get (Mandatory) - get the items and perform pre-process functionalities like thumbnails, item info etc.Put / Write (Mandatory) - lets you upload your itemsdirectly to the external storage from the Dataloop platform.Delete - lets you delete your items directly from the external storage using the Dataloop platform.
Create Integration With GCS¶
Creating an integration GCS requires having JSON file with GCS configuration.¶
import dtlpy as dl
if dl.token_expired():
dl.login()
organization = dl.organizations.get(organization_name=org_name)
with open(r"C:\gcsfile.json", 'r') as f:
gcs_json = json.load(f)
gcs_to_string = json.dumps(gcs_json)
organization.integrations.create(name='gcsintegration',
integrations_type=dl.ExternalStorage.GCS,
options={'key': '',
'secret': '',
'content': gcs_to_string})
Create Integration With Amazon S3¶
import dtlpy as dl
if dl.token_expired():
dl.login()
organization = dl.organizations.get(organization_name='my-org')
organization.integrations.create(name='S3integration', integrations_type=dl.ExternalStorage.S3,
options={'key': "my_key", 'secret': "my_secret"})
Create Integration With Azure¶
import dtlpy as dl
if dl.token_expired():
dl.login()
organization = dl.organizations.get(organization_name='my-org')
organization.integrations.create(name='azureintegration',
integrations_type=dl.ExternalStorage.AZUREBLOB,
options={'key': 'my_key',
'secret': 'my_secret',
'clientId': 'my_clientId',
'tenantId': 'my_tenantId'})
External Storage Driver¶
Once you have an integration, you can set up a driver, which adds a specific bucket (and optionally with a specificpath/folder) as a storage resource.
Create Drivers in the Platform (browser)¶
# param name: the driver name
# param driver_type: ExternalStorage.S3, ExternalStorage.GCS , ExternalStorage.AZUREBLOB
# param integration_id: the integration id
# param bucket_name: the external bucket name
# param project_id:
# param allow_external_delete:
# param region: relevant only for s3 - the bucket region
# param storage_class: relevant only for s3
# param path: Optional. By default, path is the root folder. Path is case sensitive.
# return: driver object
import dtlpy as dl
project = dl.projects.get('prject_name')
driver = project.drivers.create(name='driver_name',
driver_type=dl.ExternalStorage.S3,
integration_id='integration_id',
bucket_name='bucket_name',
allow_external_delete=True,
region='eu-west-1',
storage_class="",
path="")
Once the integration and drivers are ready, you can create a Dataloop Dataset and sync all the data:
# create a dataset from a driver name, you can also create by the driver ID
import dtlpy as dl
project: dl.Project
dataset = project.datasets.create(dataset_name=dataset_name,
driver=driver)
dataset.sync()
Dataset Binding with AWS¶
We will create an AWS Lambda to continuously sync a bucket with Dataloop’s dataset
If you want to catch events from the AWS bucket and update the Dataloop Dataset you need to set up a Lambda.The Lambda will catch the AWS bucket events and will reflect them into the Dataloop Platform.
We have prepared an environment zip file with our SDK for python3.8, so you don’t need to create anything else to use dtlpy in the lambda.
NOTE: For any other custom use (e.g. other python version or more packages) try creating your own layer (We used this tutorial and the python:3.8 docker image).
Create the Lambda¶
Create a new Lambda
The default timeout is 3[s] so we’ll need to change to 1[m]:Configuration → General configuration → Edit → Timeout
Copy the following code:
import os
import urllib.parse
# Set dataloop path to tmp (to read/write from the lambda)
os.environ["DATALOOP_PATH"] = "/tmp"
import dtlpy as dl
DATASET_ID = ''
DTLPY_USERNAME = ''
DTLPY_PASSWORD = ''
def lambda_handler(event, context):
dl.login_m2m(email=DTLPY_USERNAME, password=DTLPY_PASSWORD)
dataset = dl.datasets.get(dataset_id=DATASET_ID,
fetch=False # to avoid GET the dataset each time
)
for record in event['Records']:
# Get the bucket name
bucket = record['s3']['bucket']['name']
# Get the file name
filename = urllib.parse.unquote_plus(record['s3']['object']['key'], encoding='utf-8')
if 'ObjectRemoved' in record['eventName']:
# On delete event - delete the item from Dataloop
try:
dtlpy_filename = '/' + filename
filters = dl.Filters(field='filename', values=dtlpy_filename)
dataset.items.delete(filters=filters)
except Exception as e:
raise e
elif 'ObjectCreated' in record['eventName']:
# On create event - add a new item to the Dataset
try:
# upload the file
path = 'external://' + filename
# dataset.items.upload(local_path=path, overwrite=True) # if overwrite is required
dataset.items.upload(local_path=path)
except Exception as e:
raise e
Add a Layer to the Lambda¶
We have created an AWS Layer with the Dataloop SDK ready. Click here to download the zip file.Because the layer’s size is larger than 50MB you cannot use it directly (AWS restrictions), but need to upload it to a bucket first.Once uploaded, create a new layer for the dtlpy env:
Go to the layers screen and “click Add Layer”.
add_layer
Choose a name (dtlpy-env).
Use the link to the bucket layer.zip.
Select the env (x86_64, python3.8).
Click “Create” and the bottom on the page.
Go back to your lambda and add the layer:
Select the “Add Layer”.
add_layer
Choose “Custom layer” and select the Layer you’ve added and the version.
click “Add” at the bottom.
Create the Bucket Events¶
Go to the bucket you are using, and create the event:
Go to Properties → Event notifications → Create event notification
Choose a name for the Event
For Event types choose: All object create events, All object delete events
Destination - Lambda function → Choose from your Lambda functions → choose the function you build → SAVE
Deploy and you’re good to go!
Dataset Binding with Azure¶
We will create an Azure Function App to continuously sync a blob with Dataloop’s dataset
If you want to catch events from the Azure blob and update the Dataloop Dataset you need to set up a blob function.The function will catch the blob storage events and will reflect them into the Dataloop Platform.
If you are familiar with Azure Function App, you can just use our integration function below.
We assume you already have an Azure account with resource group and storage account. If you don’t, follow the Azure docs and create them.
Create the Blob Function¶
Create a Container in the created Storage account
Public access level -> Container OR BlobNOTE this container should be used as the external storage for the Dataloop dataset.
Go back to Resource group and click Create -> Function App
Choose Subscription, your Resource group, Name and Region
Publish -> Code
Runtime stack -> Python
Version -> <=3.7
In VScode, flow the instructions in azure docs to configure your environment and deploy the function:
Configure your environment
Sign in to Azure
Create your local project
in Select a template for your project’s first function choose -> Azure Blob Storage trigger
in Storage account select your Storage account
in Resource group select your Resource group
Set the ‘Create new Azure Blob Storage trigger’ to your container name (used in the Dataloop platform)
add_layer
open the code file
add dtlpy to the requirements.txt file
add “disabled”: false to the function.json file
add a function code to __init__.py file
import azure.functions as func
import dtlpy as dl
import os
os.environ["DATALOOP_PATH"] = "/tmp"
dataset_id = os.environ.get('DATASET_ID')
dtlpy_username = os.environ.get('DTLPY_USERNAME')
dtlpy_password = os.environ.get('DTLPY_PASSWORD')
def main(myblob: func.InputStream):
dl.login_m2m(email=dtlpy_username, password=dtlpy_password)
dataset = dl.datasets.get(dataset_id=dataset_id,
fetch=False # to avoid GET the dataset each time
)
# remove th Container name from the path
path_parser = myblob.name.split('/')
file_name = '/'.join(path_parser[1:])
file_name = 'external://' + file_name
dataset.items.upload(local_path=file_name)
Deploy the code to the function app you created.
In VS code go to view tab -> Command Palette -> Azure Functions: Upload Local Settings
Go to the Function App -> Select your function -> Configuration (Under Settings section)* add the 3 secrets vars DATASET_ID, DTLPY_USERNAME, DTLPY_PASSWORD
Done! Now your storage blob will be synced with the Dataloop dataset
Dataset Binding with Google Cloud Storage¶
We will create a GCS cloud function to continuously sync a bucket with Dataloop’s Dataset
If you want to catch events from the GCS bucket and update the Dataloop Dataset you need to set up a Cloud function.The function will catch the GCS bucket events and will reflect them into the Dataloop Platform.
Create the cloud function¶
Create a cloud function for create event (must add the environment variables DATASET_ID, DTLPY_USERNAME and DTLPY_PASSWORD)
add_layer
add_layer
Add dtlpy to the requirements.txt
Copy the following code to the main file:
import dtlpy as dl
import os
dataset_id = os.environ.get('DATASET_ID')
dtlpy_username = os.environ.get('DTLPY_USERNAME')
dtlpy_password = os.environ.get('DTLPY_PASSWORD')
def create_gcs(event, context):
"""Triggered by a change to a Cloud Storage bucket.
Args:
event (dict): Event payload.
context (google.cloud.functions.Context): Metadata for the event.
"""
file = event
dl.login_m2m(email=dtlpy_username, password=dtlpy_password)
dataset = dl.datasets.get(dataset_id=dataset_id,
fetch=False # to avoid GET the dataset each time
)
file_name = 'external://' + file['name']
dataset.items.upload(local_path=file_name)
create another function for delete with delete event with this code
import dtlpy as dl
import os
dataset_id = os.environ.get('DATASET_ID')
dtlpy_username = os.environ.get('DTLPY_USERNAME')
dtlpy_password = os.environ.get('DTLPY_PASSWORD')
def delete_gcs(event, context):
"""Triggered by a change to a Cloud Storage bucket.
Args:
event (dict): Event payload.
context (google.cloud.functions.Context): Metadata for the event.
"""
file = event
dl.login_m2m(email=dtlpy_username, password=dtlpy_password)
dataset = dl.datasets.get(dataset_id=dataset_id,
fetch=False # to avoid GET the dataset each time
)
file_name = file['name']
dataset.items.delete(filename=file_name)
Deploy and you’re good to go!
Manage Datasets¶
Datasets are buckets in the dataloop system that hold a collection of data items of any type, regardless of theirstorage location (on Dataloop storage or external cloud storage).
Create Dataset¶
You can create datasets within a project. There are no limits to the number of dataset a project can have, whichcorrelates with data versioning where datasets can be cloned and merged.
dataset_name = 'my-dataset-name'
dataset = project.datasets.create(dataset_name=dataset_name)
Create Dataset With Cloud Storage Driver¶
If you’ve created an integration and driver to your cloud storage, you can create a dataset connected to that driver. Asingle integration (for example: S3) can have multiple drivers (per bucket or even per folder), so you need to specifythat.
project_name = 'my-project-name'
driver = 'my_driver_name'
dataset_name = 'my_dataset_name'
project = dl.projects.get(project_name=project_name)
# Get your drivers list
project.drivers.list().print()
# Create a dataset from a driver name. You can also create by the driver ID.
dataset = project.datasets.create(driver=driver, dataset_name=dataset_name)
Retrieve Datasets¶
You can read all datasets that exist in a project, and then access the datasets by their ID (or name).
dataset_id = 'my-dataset-id'
datasets = project.datasets.list()
dataset = project.datasets.get(dataset_id=dataset_id)
Create Directory¶
A dataset can have multiple directories, allowing you to manage files by context, such as upload time, working batch,source, etc.
directory = '/directory/name'
dataset.items.make_dir(directory=directory)
Deep Copy a Folder to Another Dataset¶
You can create a clone of a folder into a new dataset, but if you want to actually move between datasets a folder withfiles that are stored in the Dataloop system, you’ll need to download the files and upload again to the destination dataset.
source_folder = '/source_folder'
destination_folder = '/destination_folder'
source_project_name = 'source_project_name'
source_dataset_name = 'source_dataset_name'
destination_project_name = 'destination_project_name'
destination_dataset_name = 'destination_dataset_name'
copy_annotations = True
flat_copy = False # if true, it copies all dir files and sub dir files to the destination folder without sub directories
# Get source project dataset
project = dl.projects.get(project_name=source_project_name)
dataset_from = project.datasets.get(dataset_name=source_dataset_name)
source_folder = source_folder.rstrip('/')
# Filter to get all files of a specific folder
filters = dl.Filters()
filters.add(field='filename', values=source_folder + '/**') # Get all items in folder (recursive)
pages = dataset_from.items.list(filters=filters)
# Get destination project and dataset
project = dl.projects.get(project_name=destination_project_name)
dataset_to = project.datasets.get(dataset_name=destination_dataset_name)
# Go over all projects and copy file from src to dst
for page in pages:
for item in page:
# Download item (without save to disk)
buffer = item.download(save_locally=False)
# Give the item's name to the buffer
if flat_copy:
buffer.name = item.name
else:
buffer.name = item.filename[len(source_folder) + 1:]
# Upload item
print("Going to add {} to {} dir".format(buffer.name, destination_folder))
new_item = dataset_to.items.upload(local_path=buffer, remote_path=destination_folder)
if not isinstance(new_item, dl.Item):
print('The file {} could not be upload to {}'.format(buffer.name, destination_folder))
continue
print("{} has been uploaded".format(new_item.filename))
if copy_annotations:
new_item.annotations.upload(item.annotations.list())
Data Versioning¶
Dataloop’s powerful data versioning provides you with unique tools for data management - clone, merge, slice & dice your files, to create multiple versions for various applications. Sample use cases include:Golden training sets managementReproducibility (dataset training snapshot)Experimentation (creating subsets from different kinds)Task/Assignment managementData Version “Snapshot” - Use our versioning feature as a way to save data (items, annotations, metadata) before any major process. For example, a snapshot can serve as a roll-back mechanism to original datasets in case of any error without losing the data.
Clone Datasets¶
Cloning a dataset creates a new dataset with the same files as the original. Files are actually a reference to the original binary and not a new copy of the original, so your cloud data remains safe and protected. When cloning a dataset, you can add a destination dataset, remote file path, and more…
dataset_id = 'my-dataset-id'
clone_name = 'clone-name'
dataset = project.datasets.get(dataset_id=dataset_id)
dataset_clone = dataset.clone(clone_name=clone_name,
filters=None,
with_items_annotations=True,
with_metadata=True,
with_task_annotations_status=True)
Merge Datasets¶
Dataset merging outcome depends on how similar or different the datasets are.
Cloned Datasets - items, annotations, and metadata will be merged. This means that you will see annotations from different datasets on the same item.
Different datasets (not clones) with similar recipes - items will be summed up, which will cause duplication of similar items.
Datasets with different recipes - Datasets with different default recipes cannot be merged. Use the ‘Switch recipe’ option on dataset level (3-dots action button) to match recipes between datasets and be able to merge them.
dataset_ids = ["dataset-1-id", "dataset-2-id"]
project_ids = ["dataset-1-project-id", "dataset-2-project-id"]
merge_name = 'my_dataset-merge'
dataset_merge = dl.datasets.merge(merge_name=merge_name,
project_ids=project_ids,
dataset_ids=dataset_ids,
with_items_annotations=True,
with_metadata=False,
with_task_annotations_status=False)
Upload & Manage Data & Metadata¶
Upload Specific Files¶
When you have specific files you want to upload, you can upload them all into a dataset using this script:
project_name = 'project_name'
dataset_name = 'dataset_name'
local_path = [r'C:/home/project/images/John Morris.jpg',
r'C:/home/project/images/John Benton.jpg',
r'C:/home/project/images/Liu Jinli.jpg']
remote_path = '/folder_name'
import dtlpy as dl
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name=project_name)
dataset = project.datasets.get(dataset_name=dataset_name)
dataset.items.upload(local_path=local_path,
remote_path=remote_path) # Remote path is optional, images will go to the root directory by default
Upload All Files in a Folder¶
If you want to upload all files from a folder, you can do that by just specifying the folder name:
project_name = 'project_name'
dataset_name = 'dataset_name'
local_path = r'C:/home/project/images'
remote_path = '/folder_name'
import dtlpy as dl
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name=project_name)
dataset = project.datasets.get(dataset_name=dataset_name)
dataset.items.upload(local_path=local_path,
remote_path=remote_path) # Remote path is optional, images will go to the root directory by default
Upload Items From URL Links¶
You can provide Dataloop with the link to the item, and not necessarily the item itself.
dataset_name = 'dataset_name'
file_name = 'file_name.jpg'
dataset = project.datasets.get(dataset_name=dataset_name)
url_path = 'http://ww.some_website/beautiful_flower.jpg'
# Create link
link = dl.UrlLink(ref=url_path, mimetype='image', name=file_name)
# Upload link
item = dataset.items.upload(local_path=link)
You can open an item uploaded to Dataloop by opening it in a viewer.
item.open_in_web()
Upload Items with Metadata¶
You can upload items as a table using a Pandas DataFrame that will let you upload items with info (annotations, metadata such as confidence, filename, etc.) attached to it.
dataset_id = 'id'
# First item:
local_path1 = r"E:\TypesExamples\000000000064.jpg"
local_annotations_path1 = r"E:\TypesExamples\000000000776.json"
remote_path1 = '/first'
remote_name1 = 'f.jpg'
item_metadata1 = {'user': {'dummy': 'fir'}}
# Second item:
local_path2 = r"E:\TypesExamples\000000000776.jpg"
local_annotations_path2 = r"E:\TypesExamples\000000000776.json"
remote_path2 = "/second"
remote_name2 = 's.jpg'
item_metadata2 = {'user': {'dummy': 'sec'}}
import pandas
import dtlpy as dl
dataset = dl.datasets.get(dataset_id=dataset_id) # Get dataset
to_upload = list()
# First item and info attached:
to_upload.append({'local_path': local_path1, # Local path to image
'local_annotations_path': local_annotations_path1, # Local path to annotation file
'remote_path': remote_path1, # Remote directory of uploaded image
'remote_name': remote_name1, # Remote name of image
'item_metadata': item_metadata1}) # Metadata for the created item
# Second item and info attached:
to_upload.append({'local_path': local_path2, # Local path to image
'local_annotations_path': local_annotations_path2, # Local path to annotation file
'remote_path': remote_path2, # Remote directory of uploaded image
'remote_name': remote_name2, # Remote name of image
'item_metadata': item_metadata2}) # Metadata for the created item
df = pandas.DataFrame(to_upload) # Make data into DF table
items = dataset.items.upload(local_path=df,
overwrite=True) # Upload DF to platform
Upload & Manage Annotations¶
import dtlpy as dl
item = dl.items.get(item_id="")
annotation = item.annotations.get(annotation_id="")
annotation.metadata["user"] = True
annotation.update()
Upload User Metadata¶
To upload annotations from JSON and include the user metadata, add the parameter local_annotation_path to the dataset.items.upload function, like so:
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
dataset.items.upload(local_path=r'<items path>',
local_annotations_path=r'<annotation json file path>',
item_metadata=dl.ExportMetadata.FROM_JSON,
overwrite=True)
Convert Annotations To COCO Format¶
converter = dl.Converter()
converter.upload_local_dataset(
from_format=dl.AnnotationFormat.COCO,
dataset=dataset,
local_items_path=r'C:/path/to/items',
# Please make sure the names of the items are the same as written in the COCO JSON file
local_annotations_path=r'C:/path/to/annotations/file/coco.json'
)
Upload Entire Directory and their Corresponding Dataloop JSON Annotations¶
# Local path to the items folder
# If you wish to upload items with your directory tree use : r'C:/home/project/images_folder'
local_items_path = r'C:/home/project/images_folder/*'
# Local path to the corresponding annotations - make sure the file names fit
local_annotations_path = r'C:/home/project/annotations_folder'
dataset.items.upload(local_path=local_items_path,
local_annotations_path=local_annotations_path)
Upload Annotations To Video Item¶
Uploading annotations to video items needs to consider spanning between frames, and toggling visibility (occlusion). In this example, we will use the following CSV file.In this file there is a single ‘person’ box annotation that begins on frame number 20, disappears on frame number 41, reappears on frame number 51 and ends on frame number 90.
import pandas as pd
# Read CSV file
df = pd.read_csv(r'C:/file.csv')
# Get item
item = dataset.items.get(item_id='my_item_id')
builder = item.annotations.builder()
# Read line by line from the csv file
for i_row, row in df.iterrows():
# Create box annotation from csv rows and add it to a builder
builder.add(annotation_definition=dl.Box(top=row['top'],
left=row['left'],
bottom=row['bottom'],
right=row['right'],
label=row['label']),
object_visible=row['visible'], # Support hidden annotations on the visible row
object_id=row['annotation id'], # Numbering system that separates different annotations
frame_num=row['frame'])
# Upload all created annotations
item.annotations.upload(annotations=builder)
Upload Annotations In VTT Format¶
The Dataloop builder support VTT files, for uploading Web Text Tracks for video transcription.
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# local path to item
local_item_path = r'/Users/local/path/to/item.png'
# local path to vtt
local_vtt_path = r'/Users/local/path/to/subtitles.vtt'
# upload item
item = dataset.items.upload(local_path=local_item_path)
# upload VTT file - wait until the item finishs uploading
builder = item.annotations.builder()
builder.from_vtt_file(filepath=local_vtt_path)
item.annotations.upload(builder)
Upload Audio Annotation to an Audio File¶
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
item = dataset.items.get(filepath='/my_item.mp4')
# Using annotation builder
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Subtitle(label='<label>',
text='<text>'),
start_time='<start>',
end_time='<end>')
Set Attributes On Annotations¶
You can set attributes on annotations in hte platform using the SDK. Since Dataloop deprecated a legacy attributes mechanism, attributes are refered to as ‘2.0’ version and need to be set as such first.
Free Text Attribute¶
dl.use_attributes_2(True)
annotation.attributes.update({"ID of the attribute": "value of the attribute"})
annotation = annotation.update(True)
Range Attributes (Slider in UI)¶
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": number_on_range})
annotation = annotation.update(system_metadata=True)
CheckBox Attribute (Multiple choice)¶
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": ["selection", "selection"]})
annotation = annotation.update(system_metadata=True)
Yes/No Attribute¶
dl.use_attributes_2(True)
annotation.attributes.update({"<attribute-id>": True / False})
annotation = annotation.update(system_metadata=True)
Show Annotations Over Image¶
After uploading items and annotations with their metadata, you might want to see some of them and perform visual validation.
To see only the annotations, use the annotation type show option.
# Use the show function for all annotation types
box = dl.Box()
# Must provide all inputs
box.show(image='',
thickness='',
with_text='',
height='',
width='',
annotation_format='',
color='')
To see the item itself with all annotations, use the Annotations option.
# Must input an image or height and width
annotation.show(image='',
height='', width='',
annotation_format='dl.ViewAnnotationOptions.*',
thickness='',
with_text='')
Download Data, Annotations & Metadata¶
The item ID for a specific file can be found in the platform UI - Click BROWSE for a dataset, click on the selected file, and the file information will be displayed in the right-side panel. The item ID is detailed, and can be copied in a single click.
Download Items and Annotations¶
Download dataset items and annotations to your computer folder in two separate folders.To list the download annotation option use dl.ViewAnnotationOptions
:
JSON: Download json files with the Dataloop annotation format.
MASK: Save a PNG image file with the RGB annotation drawn.
INSTANCE: Saves a PNG with the annotation label ID as the pixel value.
ANNOTATION_ON_IMAGE: Saves a PNG with the annotation drawn on top of the image.
VTT: Save
subtitle
annotation type in a VTT format.OBJECT_ID: Save a PNG with the object ID as the pixel value.
dataset.download(local_path=r'C:/home/project/images', # The default value is ".dataloop" folder
annotation_options=dl.VIEW_ANNOTATION_OPTIONS_JSON)
NOTE: The annotation option can also be a list to download multiple options:
dataset.download(local_path=r'C:/home/project/images', # The default value is ".dataloop" folder
annotation_options=[dl.VIEW_ANNOTATION_OPTIONS_MASK,
dl.VIEW_ANNOTATION_OPTIONS_JSON,
dl.ViewAnnotationOptions.INSTANCE])
Filter by Item and/or Annotation¶
Items filter - download filtered items based on multiple parameters, like their directory.You can also download items based on different filters. Learn all about item filters here.
Annotation filter - download filtered annotations based on multiple parameters like their label.You can also download items annotations based on different filters, learn all about annotation filters here.This example will download items and JSONS from a dog folder of the label ‘dog’.
# Filter items from "folder_name" directory
item_filters = dl.Filters(resource='items', field='dir', values='/dog_name')
# Filter items with dog annotations
annotation_filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION, field='label', values='dog')
dataset.download(local_path=r'C:/home/project/images', # The default value is ".dataloop" folder
filters=item_filters,
annotation_filters=annotation_filters,
annotation_options=dl.VIEW_ANNOTATION_OPTIONS_JSON)
Filter by Annotations¶
Annotation filter - download filtered annotations based on multiple parameters like their label. You can also download items annotations based on different filters, learn all about annotation filters here.
item = dataset.items.get(item_id="item_id") # Get item from dataset to be able to view the dataset colors on Mask
# Filter items with dog annotations
annotation_filters = dl.Filters(resource='annotations', field='label', values='dog')
item.download(local_path=r'C:/home/project/images', # the default value is ".dataloop" folder
annotation_filters=annotation_filters,
annotation_options=dl.VIEW_ANNOTATION_OPTIONS_JSON)
Download Annotations in COCO/YOLO/VOC Format¶
Items filter - download filtered items based on multiple parameters like their directory. You can also download items based on different filters, learn all about item filters here.
Annotation filter - download filtered annotations based on multiple parameters like their label. You can also download items annotations based on different filters, learn all about annotation filters here.
This example will download COCO from a dog items folder of the label ‘dog’ (edit the script to change to YOLO/VOC).
# Filter items from "folder_name" directory
item_filters = dl.Filters(resource='items', field='dir', values='/dog_name')
# Filter items with dog annotations
annotation_filters = dl.Filters(resource='annotations', field='label', values='dog')
converter = dl.Converter()
converter.convert_dataset(dataset=dataset,
# Use the converter of choice
# to_format='yolo',
# to_format='voc',
to_format='coco',
local_path=r'C:/home/coco_annotations',
filters=item_filters,
annotation_filters=annotation_filters)
# Param export_version will be set to ExportVersion.V1 by default.
dataset.download(local_path='/path',
annotation_options='json',
export_version=dl.ExportVersion.V2)
from PIL import Image
item = dl.items.get(item_id='my-item-id')
array = item.download(save_locally=False, to_array=True)
# Check out the downloaded Ndarray with these commands - optional
image = Image.fromarray(array)
image.save(r'C:/home/project/images.jpg')
Advance SDK Filters¶
To access the filters entity click here.
Filter Operators¶
To understand more about filter operators please click here.
When adding a filter, several operators are available for use:
Equal¶
eq -> equal(or dl.FiltersOperation.EQUAL)
For example, filter items from a specific folder directory.
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance
filters = dl.Filters()
# Filter only items from a specific folder directory
filters.add(field='dir', values='/DatasetFolderName', operator=dl.FILTERS_OPERATIONS_EQUAL)
# optional - return results sorted by ascending file name
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Not Equal¶
ne -> not equal(or dl.FiltersOperation.NOT_EQUAL)
In this example, you will get all items that do not have ONLY a ‘cat’ label.NoteThis Operator is a better fit for filters of a single value because, for example, this filter will return items that have both ‘cat’ and ‘dog’ labels.View an example of the solution here.
filters = dl.Filters()
# Filter ONLY a cat label
filters.add_join(field='label', values='cat', operator=dl.FILTERS_OPERATIONS_NOT_EQUAL)
# optional - return results sorted by ascending file name
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in the dataset: {}'.format(pages.items_count))
Greater Than¶
gt -> greater than(or dl.FiltersOperation.GREATER_THAN)
You will get items with a greater height (in pixels) than the given value in this example.
filters = dl.Filters()
# Filter images with a bigger height size
filters.add(field='metadata.system.height', values=height_number_in_pixels,
operator=dl.FILTERS_OPERATIONS_GREATER_THAN)
# optional - return results sorted by ascending file name
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Less Than¶
lt -> less than(or dl.FiltersOperation.LESS_THAN)
You will get items with a width (in pixels) less than the given value in this example.
filters = dl.Filters()
# Filter images with a bigger height size
filters.add(field='metadata.system.width', values=width_number_in_pixels, operator=dl.FILTERS_OPERATIONS_LESS_THAN)
# optional - return results sorted by ascending file name
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
In a List¶
in -> is in a list (when using this expression, values should be a list).(or dl.FiltersOperation.IN)In this example, you will get items with dog OR cat labels.
filters = dl.Filters()
# Filter items with dog OR cat labels
filters.add_join(field='label', values=['dog', 'cat'], operator=dl.FILTERS_OPERATIONS_IN)
# optional - return results sorted by ascending file name
filters.sort_by(field='filename')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Exist¶
The filter param FILTERS_OPERATIONS_EXISTS checks if an attribute exists. The following example checks if there is an item with user metadata:
filters = dl.Filters()
filters.add(field='metadata.user', values=True, operator=dl.FILTERS_OPERATIONS_EXISTS)
dataset.items.list(filters=filters)
SDK defaults¶
Filters ignore SDK defaults like hidden items and directories or note annotations as issues.If you wish to change this behavior, you may do the following:
filters = dl.Filters(use_defaults=False)
Delete a Filter¶
filters = dl.Filters()
# For example, if you added the following filter:
filters.add(field='to-delete-field', values='value')
# Use this command to delete the filter
filters.pop(field='to-delete-field')
# or for items by their annotations
filters.pop_join(field='to-delete-annotation-field')
Full Examples¶
How to filter items that were created between specific dates?¶
In this example, you will get all of the items that were created in 2018.
import datetime, time
filters = dl.Filters()
# -- time filters -- must be in ISO format and in UTC (offset from local time). converting using datetime package as follows:
earlier_timestamp = datetime.datetime(year=2018, month=1, day=1, hour=0, minute=0, second=0,
tzinfo=datetime.timezone(
datetime.timedelta(seconds=-time.timezone))).isoformat()
later_timestamp = datetime.datetime(year=2019, month=1, day=1, hour=0, minute=0, second=0,
tzinfo=datetime.timezone(
datetime.timedelta(seconds=-time.timezone))).isoformat()
filters.add(field='createdAt', values=earlier_timestamp, operator=dl.FiltersOperations.GREATER_THAN)
filters.add(field='createdAt', values=later_timestamp, operator=dl.FiltersOperations.LESS_THAN)
# change method to OR
filters.method = dl.FiltersMethod.OR
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items that don’t have a specific label?¶
In this example, you will get all items that do not have a ‘cat’ label AT ALL.
This filter will NOT return items that have both 'cat' and 'dog' labels.
# Get all items
all_items = set([item.id for item in dataset.items.list().all()])
# Get all items WITH the label cat
filters = dl.Filters()
filters.add_join(field='label', values='cat')
cat_items = set([item.id for item in dataset.items.list(filters=filters).all()])
# Get the difference between the sets. This will give you a list of the items with no cat
no_cat_items = all_items.difference(cat_items)
print('Number of filtered items in dataset: {}'.format(len(no_cat_items)))
# Iterate through the ID's - Go over all ID's and print the matching item
for item_id in no_cat_items:
print(dataset.items.get(item_id=item_id))
To access the filters entity click here.
The Dataloop Query Language - DQL¶
Using The Dataloop Query Language, you may navigate through massive amounts of data.
You can filter, sort, and update your metadata with it.
Filters¶
Using filters, you can filter items and get a generator of the filtered items. The filters entity is used to build such filters.
Filters - Field & Value¶
Filter your items or annotations using the parameters in the JSON code that represent its data within our system.Access your item/annotation JSON using to_json()
.
Field refers to the attributes you filter by.
For example, “dir” would be used if you wish to filter items by their folder/directory.
Value refers to the input by which you want to filter.For example, “/new_folder” can be the directory/folder name where the items you wish to filter are located.
Sort - Field & Value¶
Field refers to the field you sort your items/annotations list by.For example, if you sort by filename, you will get the item list sorted in alphabetical order by filename.See the full list of the available fields here.
Value refers to the list order direction. Either ascending or descending.
Filter Annotations¶
Filter annotations by the annotations’ JSON fields.In this example, you will get all of the note annotations in the dataset sorted by the label.
See all of the items iterator options on the Iterator of Items page.
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance with annotation resource
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# Filter example - only note annotations
filters.add(field='type', values='note')
# optional - return results sorted by descending label
filters.sort_by(field='label', value=dl.FiltersOrderByDirection.DESCENDING)
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
# Iterate through the annotations - Go over all annotations and print the properties
for page in pages:
for annotation in page:
annotation.print()
Filter Annotations by the Annotations’ Item¶
add_join
- filter Annotations by the annotations’ items’ JSON fields. For example, filter only box annotations from image items.
See all of the items iterator options on the Iterator of Items page.
# Create filters instance
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# Filter all box annotations
filters.add(field='type', values='box')
# AND filter annotations by their items - only items that are of mimetype image
# Meaning you will get 'box' annotations of all image items
filters.add_join(field='metadata.system.mimetype', values="image*")
# optional - return results sorted by descending creation date
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered annotations list in a page object
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Filters Method - “Or” and “And”¶
For more advanced filters operators visit the Advanced SDK Filters page.
And¶
If you wish to filter annotations with the “and” logical operator, you can do so by specifying which filters will be checked with “and”.
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# set annotation resource
filters.add(field='type', values='box', method=dl.FiltersMethod.AND)
filters.add(field='label', values='car',
method=dl.FiltersMethod.AND) # optional - return results sorted by ascending creation date
filters.sort_by(field='createdAt')
# Get filtered annotations list
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Or¶
If you wish to filter annotations with the “or” logical operator, you can do so by specifying which filters will be checked with “or”.In this example, you will get a list of the dataset’s annotations that are either a ‘box’ or a ‘point’ type.
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
# filters with or
filters.add(field='type', values='/box', method=dl.FiltersMethod.OR)
filters.add(field='type', values='/point',
method=dl.FiltersMethod.OR) # optional - return results sorted by descending updated date
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered annotations list
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Delete Filtered Items¶
In this example, you will delete annotations that were created on 30/8/2020 at 8:17 AM.
filters = dl.Filters()
# set annotation resource
filters.resource = dl.FiltersResource.ANNOTATION
# Example - created on 30/8/2020 at 8:17 AM
filters.add(field='createdAt', values="2020-08-30T08:17:08.000Z")
dataset.annotations.delete(filters=filters)
Annotation Filtering Fields¶
More Filter Options¶
{
"id": "5f576f660bb2fb455d79ffdf",
"datasetId": "5e368bee106a76a61cf05282",
"type": "segment",
"label": "Planet",
"attributes": [],
"coordinates": [
[
{
"x": 856.25,
"y": 1031.2499999999995
},
{
"x": 1081.25,
"y": 1631.2499999999995
},
{
"x": 485.41666666666663,
"y": 1735.4166666666665
},
{
"x": 497.91666666666663,
"y": 1172.9166666666665
}
]
],
"metadata": {
"system": {
"status": null,
"startTime": 0,
"endTime": 1,
"frame": 0,
"endFrame": 1,
"snapshots_": [
{
"fixed": true,
"type": "transition",
"frame": 0,
"objectVisible": true,
"data": [
[
{
"x": 856.25,
"y": 1031.2499999999995
},
{
"x": 1081.25,
"y": 1631.2499999999995
},
{
"x": 485.41666666666663,
"y": 1735.4166666666665
},
{
"x": 497.91666666666663,
"y": 1172.9166666666665
}
]
],
"label": "Planet",
"attributes": []
}
],
"automated": false,
"isOpen": false,
"system": false
},
"user": {}
},
"creator": "user@dataloop.ai",
"createdAt": "2020-09-08T11:47:50.576Z",
"updatedBy": "user@dataloop.ai",
"updatedAt": "2020-09-08T11:47:50.576Z",
"itemId": "5f572f4423a69b8c83408f12",
"url": "https://gate.dataloop.ai/api/v1/annotations/5f576f660bb2fb455d79ffdf",
"item": "https://gate.dataloop.ai/api/v1/items/5f572f4423a69b8c83408f12",
"dataset": "https://gate.dataloop.ai/api/v1/datasets/5e368bee106a76a61cf05282",
"hash": "11fdc816804faf0f7266b40d1cb67aff38e5c10d"
}
Full Examples¶
How to filter annotations by their label?¶
filters = dl.Filters()
# set resource
filters.resource = dl.FiltersResource.ANNOTATION
filters.add(field='label', values='your_label_value')
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))
Advanced Filtering Operators¶
Explore advanced filtering options on this page.
To access the filters entity click here.
The Dataloop Query Language - DQL¶
Using The Dataloop Query Language, you may navigate through massive amounts of data.
You can filter, sort, and update your metadata with it.
Filters¶
Using filters, you can filter items and get a generator of the filtered items. The filters entity is used to build such filters.
Filters - Field & Value¶
Filter your items or annotations using the parameters in the JSON code that represent its data within our system.Access your item/annotation JSON using to_json()
.
Field refers to the attributes you filter by.
For example, “dir” would be used if you wish to filter items by their folder/directory.
Value refers to the input by which you want to filter.For example, “/new_folder” can be the directory/folder name where the items you wish to filter are located.
Sort - Field & Value¶
Field refers to the field you sort your items/annotations list by.For example, if you sort by filename, you will get the item list sorted in alphabetical order by filename.See the full list of the available fields here.
Value refers to the list order direction. Either ascending or descending.
Filter Items¶
Filter items by the item’s JSON fields.In this example, you will get all annotated items in a dataset sorted by the filename.
See all of the items iterator options on the Iterator of Items page.
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance
filters = dl.Filters()
# Filter only annotated items
filters.add(field='annotated', values=True)
# optional - return results sorted by ascending file name
filters.sort_by(field="filename")
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Filter Items by the Items’ Annotations¶
add_join
- filter items by the items’ annotations JSON fields. For example, filter only items with ‘box’ annotations.
See all of the items iterator options on the Iterator of Items page.
filters = dl.Filters()
# Filter all approved items
filters.add(field='metadata.system.annotationStatus', values="approved")
# AND filter items by their annotation - only items with 'box' annotations
# Meaning you will get approved items with 'box' annotations
filters.add_join(field='type', values='box')
# optional - return results sorted by descending creation date
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Filters Method - “Or” and “And”¶
For more advanced filters operators visit the Advanced SDK Filters page.
And¶
If you wish to filter annotations with the “and” logical operator, you can do so by specifying which filters will be checked with “and”.
filters = dl.Filters() # filters with and
filters.add(field='annotated', values=True, method=dl.FiltersMethod.AND)
filters.add(field='metadata.user.is_automated', values=True,
method=dl.FiltersMethod.AND) # optional - return results sorted by ascending file name
filters.sort_by(field='name')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Or¶
If you wish to filter annotations with the “or” logical operator, you can do so by specifying which filters will be checked with “or”.In this example, you will get a list of items that are either on “folder1” or “folder2” directories.
filters = dl.Filters()
# filters with or
filters.add(field='dir', values='/folderName1', method=dl.FiltersMethod.OR)
filters.add(field='dir', values='/folderName2',
method=dl.FiltersMethod.OR) # optional - return results sorted by descending directory name
filters.sort_by(field='dir', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Update User Metadata of Filtered Items¶
Update Filtered Items - The ‘update_value’ must be a dictionary.The dictionary will only update user metadata.Understand more about user metadata <a href=https://github.com/dataloop-ai/dtlpy-documentation/blob/main/tutorials/data_management/working_with_metadata/chapter.md/” target=”_blank”>here.In this example, you will update/add user metadata (with the field “BlackDogs” and value True), to items in a specific folder ‘dogs’ with an attribute ‘black’.
filters = dl.Filters()
# For example - filter only items in a specific folder - like 'dogs'
filters.add(field='dir', values='/dogs')
# For example - filter items by their annotation - only items with 'black' attribute
filters.add_join(field='attributes', values='black')
# to add filed BlackDogs to all filtered items and give value True
# this field will be added to user metadata
# create update order
update_values = {'BlackDogs': True}
# update
pages = dataset.items.update(filters=filters, update_values=update_values)
Delete Filtered Items¶
In this example, you will delete items that were created on 30/8/2020 at 8:17 AM.
filters = dl.Filters()
# For example - filter only annotated items
filters.add(field='createdAt', values="2020-08-30T08:17:08.000Z")
dataset.items.delete(filters=filters)
Item Filtering Fields¶
More Filter Options¶
{
"id": "5f4b60848ced1d50c3df114a",
"datasetId": "5f4b603d9825b9f191bbd3b3",
"createdAt": "2020-08-30T08:17:08.000Z",
"dir": "/new_folder",
"filename": "/new_folder/optional.jpg",
"type": "file",
"hidden": false,
"metadata": {
"system": {
"originalname": "file",
"size": 3290035,
"encoding": "7bit",
"mimetype": "image/jpeg",
"annotationStatus": [
"completed"
],
"refs": [
{
"type": "task",
"id": "5f4b61f8f81ab6238c331bd2"
},
{
"type": "assignment",
"id": "5f4b61f8f81ab60508331bd3"
}
],
"executionLogs": {
"image-metadata-extractor": {
"default_module": {
"run": {
"5f4b60841b892d82eaa2d95b": {
"progress": 100,
"status": "success"
}
}
}
}
},
"exif": {},
"height": 2734,
"width": 4096,
"statusLog": [
{
"status": "completed",
"timestamp": "2020-08-30T14:54:17.014Z",
"creator": "user@dataloop.ai",
"action": "created"
}
],
"isBinary": true
}
},
"name": "optional.jpg",
"url": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a",
"dataset": "https://gate.dataloop.ai/api/v1/datasets/5f4b603d9825b9f191bbd3b3",
"annotationsCount": 18,
"annotated": "discarded",
"stream": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/stream",
"thumbnail": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/thumbnail",
"annotations": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/annotations"
}
Full Examples¶
How to filter items by their annotations label?¶
filters = dl.Filters()
filters.add_join(field='label', values='your_label_value')
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
How to filter items by completed and approved status?¶
filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values=["completed", "approved"])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items by completed status (with items who are approved as well)?¶
filters = dl.Filters()
# set resource
filters.add(field='metadata.system.annotationStatus', values="completed")
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items by only completed status?¶
filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values=["completed"])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter unassigned items?¶
filters = dl.Filters()
filters.add(field='metadata.system.refs', values=[])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
How to filter items by a specific folder?¶
filters = dl.Filters()
filters.add(field='dir', values="/folderName")
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
Get all items named foo.bar¶
filters = dl.Filters()
filters.add(field='name', values='foo.bar.*')
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
Sort files of size 0-5 MB by name, in ascending order¶
filters = dl.Filters()
filters.add(field='metadata.system.size', values='0', operator='gt')
filters.add(field='metadata.system.size', values='5242880', operator='lt')
filters.sort_by(field='filename', value=dl.FILTERS_ORDERBY_DIRECTION_ASCENDING)
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
Sort with multiple fields: Sort Items by labels ascending and createdAt descending¶
filters = dl.Filters()
# set annotation resource
filters.resource = dl.FiltersResource.ANNOTATION
# return results sorted by descending label
filters.sort_by(field='label', value=dl.FILTERS_ORDERBY_DIRECTION_ASCENDING)
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
Advanced Filtering Operators¶
Explore advanced filtering options on this page.
Response to DQL Query¶
A typical response to a DQL query will look like the following:
{
"totalItemsCount": number,
"items": Array,
"totalPagesCount": number,
"hasNextPage": boolean,
}
# A possible result:
{
"totalItemsCount": 2,
"totalPagesCount": 1,
"hasNextPage": false,
"items": [
{
"id": "5d0783852dbc15306a59ef6c",
"createdAt": "2019-06-18T23:29:15.775Z",
"filename": "/5546670769_8df950c6b6.jpg",
"type": "file"
// ...
},
{
"id": "5d0783852dbc15306a59ef6d",
"createdAt": "2019-06-19T23:29:15.775Z",
"filename": "/5551018983_3ce908ac98.jpg",
"type": "file"
// ...
}
]
}
Pagination¶
Pages¶
We use pages instead of a list when we have an object that contains a lot of information.
The page object divides a large list into pages (with a default of 1000 items) in order to save time when going over the items.
It is the same as we display it in the annotation platform, see example here.
You can redefine the number of items on a page with the page_size attribute.When we go over the items we use nested loops to first go to the pages and then go over the items for each page.
Iterator of Items¶
You can create a generator of items with different filters.
import dtlpy as dl
# Get the project
project = dl.projects.get(project_name='project_name')
# Get the dataset
dataset = project.datasets.get(dataset_name='dataset_name')
# Get items in pages (1000 item per page)
filters = dl.Filters()
filters.add(field='filename', values='/your/file/path.mimetype')
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
# Go over all item and print the properties
for i_page, page in enumerate(pages):
print('{} items in page {}'.format(len(page), i_page))
for item in page:
item.print()
A Page entity iterator also allows reverse iteration for cases in which you want to change items during the iteration:
# Go over all item and print the properties
for i_page, page in enumerate(reversed(pages)):
print('{} items in page {}'.format(len(page), i_page))
If you want to iterate through all items within your filter, you can also do so without going through them page by page:
for item in pages.all():
print(item.name)
If you are planning to do some process on each item, it’s faster to use multi-threads (or multi-process) for parallel computation.The following uses ThreadPoolExecutor with 32 workers to process parallel batches of 32 items:
from concurrent.futures import ThreadPoolExecutor
def single_item(item):
# do some work on item
print(item.filename)
return True
with ThreadPoolExecutor(max_workers=32) as executor:
executor.map(single_item, pages.all())
Lets compare the runtime to see that now the process is faster:
from concurrent.futures import ThreadPoolExecutor
import time
tic = time.time()
for item in pages.all():
# do stuff on item
time.sleep(1)
print('One by one took {:.2f}[s]'.format(time.time() - tic))
def single_item(item):
# do stuff on item
time.sleep(1)
return True
tic = time.time()
with ThreadPoolExecutor(max_workers=32) as executor:
executor.map(single_item, pages.all())
print('Using threads took {:.2f}[s]'.format(time.time() - tic))
Visualizing the progress with tqdm progress bar:
import tqdm
pbar = tqdm.tqdm(total=pages.items_count)
def single_item(item):
# do stuff on item
time.sleep(1)
pbar.update()
return True
with ThreadPoolExecutor(max_workers=32) as executor:
executor.map(single_item, pages.all())
Set page_size¶
The following example sets the page_size to 50:
# Create filters instance
filters = dl.Filters()
# Get filtered item list in a page object, where the starting page is 1
pages = dataset.items.list(filters=filters, page_offset=1, page_size=50)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
# Print items from page 1
print('Length of first page: {}'.format(len(pages.items)))
Working with Metadata¶
import dtlpy as dl
project_name = 'project_name'
dataset_name = 'dataset_name'
# Get project and dataset
project = dl.projects.get(project_name=project_name)
dataset = project.datasets.get(dataset_name=dataset_name)
User Metadata¶
As a powerful tool to manage data based on your categories and information, you can add any keys and values to both the item’s and annotations’ user-metadata sections using the Dataloop SDK. Then, you can use your user-metadata for data filtering, sorting, etc.
NoteWhen adding metadata to the same item, the new metadata might overwrite existing metadata. To avoid overwriting a field or the entire metadata, use the list data type.
Metadata Data Types¶
Metadata is a dictionary attribute used with items, annotations, and other entities of the Dataloop system (task, recipe, and more). As such, it can be used with string, number, boolean, list or null types.
String¶
item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
annotation.metadata['user']['MyKey'] = 'MyValue'
Number¶
item.metadata['user']['MyKey'] = 3
annotation.metadata['user']['MyKey'] = 3
Boolean¶
item.metadata['user']['MyKey'] = True
annotation.metadata['user']['MyKey'] = True
Null – add metadata with no information¶
item.metadata['user']['MyKey'] = None
annotation.metadata['user']['MyKey'] = None
List¶
# add metadata of a list (can contain elements of different types).
item.metadata['user']['MyKey'] = ["A", 2, False]
annotation.metadata['user']['MyKey'] = ["A", 2, False]
Add new metadata to a list without losing existing data¶
item.metadata['user']['MyKey'].append(3)
item = item.update()
annotation.metadata['user']['MyKey'].append(3)
annotation = annotation.update()
Add metadata to an item’s user metadata¶
local_path = r'C:/home/project/images/item.mimetype'
item_id = 'write-your-id-number'
# upload and claim item
item = dataset.items.upload(local_path=local_path)
# or get item
item = dataset.items.get(item_id=item_id)
# modify metadata
item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
# update and reclaim item
item = item.update()
Modify an existing user metadata field¶
local_path = r'C:/home/project/images/item.mimetype'
item_id = 'write-your-id-number'
# upload and claim item
item = dataset.items.upload(local_path=local_path)
# or get item
item = dataset.items.get(item_id=item_id)
# modify metadata
if 'user' not in item.metadata:
item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
# update and reclaim item
item = item.update()
Item in platform should have section ‘user’ in metadata with field ‘MyKey’ and value ‘MyValue’.
Add metadata to annotations’ user metadata¶
annotation_id = 'my-annotation-id'
# Get annotation
annotation = dl.annotations.get(annotation_id=annotation_id)
# modify metadata
annotation.metadata['user'] = dict()
annotation.metadata['user']['red'] = True
# update and reclaim annotation
annotation = annotation.update()
annotation in platform should have section ‘user’ in metadata with field ‘red’ and value True
Filter items by user metadata¶
project_name = 'project_name'
dataset_name = 'dataset_name'
project = dl.projects.get(project_name=project_name)
dataset = project.datasets.get(dataset_name=dataset_name)
You can also add metadata to filtered items
local_path = r'C:/home/project/images/item.mimetype'
item_id = 'write-your-id-number'
# upload and claim item
item = dataset.items.upload(local_path=local_path)
# or get item
item = dataset.items.get(item_id=item_id)
# modify metadata
item.metadata['user'] = dict()
item.metadata['user']['MyKey'] = 'MyValue'
# update and reclaim item
item = item.update()
filters = dl.Filters()
# set resource - optional - default is item
filters.resource = dl.FiltersResource.ITEM
filters.add(field='metadata.user.Key', values='Value')
pages = dataset.items.list(filters=filters)
# Go over all item and print the properties
for page in pages:
for item in page:
item.print()
FaaS Tutorial¶
Tutorials for FaaS
FaaS Interactive Tutorial¶
Concept¶
Dataloop Function-as-a-Service (FaaS) is a compute service that automatically runs your code based on time patterns or in response to trigger events.
You can use Dataloop FaaS to extend other Dataloop services with custom logic. Altogether, FaaS serves as a super flexible unit that provides you with increased capabilities in the Dataloop platform and allows achieving any need while automating processes.
With Dataloop FaaS, you simply upload your code and create your functions. Following that, you can define a time interval or specify a resource event for triggering the function. When a trigger event occurs, the FaaS platform launches and manages the compute resources, and executes the function.
You can configure the compute settings according to your preferences (machine types, concurrency, timeout, etc.) or use the default settings.
Use Cases¶
Pre annotation processing: Resize, video assembler, video dissembler
Post annotation processing: Augmentation, crop box-annotations, auto-parenting
ML models: Auto-detection
QA models: Auto QA, consensus model, majority vote model
Introduction¶
This tutorial will help you get started with FaaS.
Prerequisites
Basic use case: Single function
Deploy a function as a service
Execute the service manually and view the output
Advance use case: Multiple functions
Deploy several functions as a package
Deploy a service of the package
Set trigger events to the functions
Execute the functions and view the output and logs
First, log in to the platform by running the following Python code in the terminal or your IDE:
import dtlpy as dl
if dl.token_expired():
dl.login()
Your browser will open a login screen, allowing you to enter your credentials or log in with Google. Once the “Login Successful” tab appears, you are allowed to close it.
This tutorial requires a project. You can create a new project, or alternatively use an existing one:
# Create a new project
project = dl.projects.create(project_name='project-sdk-tutorial')
# Use an existing project
project = dl.projects.get(project_name='project-sdk-tutorial')
Let’s create a dataset to work with and upload a sample item to it:
dataset = project.datasets.create(dataset_name='dataset-sdk-tutorial')
item = dataset.items.upload(
local_path=[
'https://raw.githubusercontent.com/dataloop-ai/dtlpy-documentation/main/assets/images/hamster.jpg'],
remote_path='/folder_name')
# Remote path is optional, images will go to the main directory by default
Basic Use Case: Single Function¶
Create and Deploy a Sample Function¶
Below is an image-manipulation function in Python to use for converting an RGB image to a grayscale image. The function receives a single item, which later can be used as a trigger to invoke the function:
def rgb2gray(item: dl.Item):
"""
Function to convert RGB image to GRAY
Will also add a modality to the original item
:param item: dl.Item to convert
:return: None
"""
import numpy as np
import cv2
buffer = item.download(save_locally=False)
bgr = cv2.imdecode(np.frombuffer(buffer.read(), np.uint8), -1)
gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
bgr_equalized_item = item.dataset.items.upload(local_path=gray,
remote_path='/gray' + item.dir,
remote_name=item.name)
# add modality
item.modalities.create(name='gray',
ref=bgr_equalized_item.id)
item.update(system_metadata=True)
You can now deploy the function as a service using Dataloop SDK. Once the service is ready, you may execute the available function on any input:
project = dl.projects.get(project_name='project-sdk-tutorial')
service = project.services.deploy(func=rgb2gray,
service_name='grayscale-item-service')
Execute the function¶
An execution means running the function on a service with specific inputs (arguments). The execution input will be provided to the function that the execution runs.
Now that the service is up, it can be executed manually (on-demand) or automatically, based on a set trigger (time/event). As part of this tutorial, we will demonstrate how to manually run the “RGB to Gray” function.
To see the item we uploaded, run the following code:
item.open_in_web()
Multiple Functions and Modules¶
Multiple Functions¶
Create and Deploy a Package of Several Functions¶
First, login to the Dataloop platform:
import dtlpy as dl
if dl.token_expired():
dl.login()
Let’s define the project and dataset you will work with in this tutorial.To create a new project and dataset:
project = dl.projects.create(project_name='project-sdk-tutorial')
project.datasets.create(dataset_name='dataset-sdk-tutorial')
To use an existing project and dataset:
project = dl.projects.get(project_name='project-sdk-tutorial')
dataset = project.datasets.get(dataset_name='dataset-sdk-tutorial')
Write your code¶
The following code consists of two image-manipulation methods:
RGB to grayscale over an image
CLAHE Histogram Equalization over an image - Contrast Limited Adaptive Histogram Equalization (CLAHE) to equalize images
To proceed with this tutorial, copy the following code and save it as a main.py file.
import dtlpy as dl
import cv2
import numpy as np
class ImageProcess(dl.BaseServiceRunner):
@staticmethod
def rgb2gray(item: dl.Item):
"""
Function to convert RGB image to GRAY
Will also add a modality to the original item
:param item: dl.Item to convert
:return: None
"""
buffer = item.download(save_locally=False)
bgr = cv2.imdecode(np.frombuffer(buffer.read(), np.uint8), -1)
gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
gray_item = item.dataset.items.upload(local_path=gray,
remote_path='/gray' + item.dir,
remote_name=item.name)
# add modality
item.modalities.create(name='gray',
ref=gray_item.id)
item.update(system_metadata=True)
@staticmethod
def clahe_equalization(item: dl.Item):
"""
Function to perform histogram equalization (CLAHE)
Will add a modality to the original item
Based on opencv https://docs.opencv.org/4.x/d5/daf/tutorial_py_histogram_equalization.html
:param item: dl.Item to convert
:return: None
"""
buffer = item.download(save_locally=False)
bgr = cv2.imdecode(np.frombuffer(buffer.read(), np.uint8), -1)
# create a CLAHE object (Arguments are optional).
lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
lab_planes = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
lab_planes[0] = clahe.apply(lab_planes[0])
lab = cv2.merge(lab_planes)
bgr_equalized = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
bgr_equalized_item = item.dataset.items.upload(local_path=bgr_equalized,
remote_path='/equ' + item.dir,
remote_name=item.name)
# add modality
item.modalities.create(name='equ',
ref=bgr_equalized_item.id)
item.update(system_metadata=True)
Define the module¶
Multiple functions may be defined in a single package under a “module” entity. This way you will be able to use a single codebase for various services.
Here, we will create a module containing the two functions we discussed. The “main.py” file you downloaded is defined as the module entry point. Later, you will specify its directory file path.
modules = [dl.PackageModule(name='image-processing-module',
entry_point='main.py',
class_name='ImageProcess',
functions=[dl.PackageFunction(name='rgb2gray',
description='Converting RGB to gray',
inputs=[dl.FunctionIO(type=dl.PackageInputType.ITEM,
name='item')]),
dl.PackageFunction(name='clahe_equalization',
description='CLAHE histogram equalization',
inputs=[dl.FunctionIO(type=dl.PackageInputType.ITEM,
name='item')])
])]
Push the package¶
When you deployed the service in the previous tutorial (“Single Function”), a module and a package were automatically generated.
Now we will explicitly create and push the module as a package in the Dataloop FaaS library (application hub). For that, please specify the source path (src_path) of the “main.py” file you downloaded, and then run the following code:
src_path = 'functions/opencv_functions'
project = dl.projects.get(project_name='project-sdk-tutorial')
package = project.packages.push(package_name='image-processing',
modules=modules,
src_path=src_path)
Deploy a service¶
Now that the package is ready, it can be deployed to the Dataloop platform as a service.To create a service from a package, you need to define which module the service will serve. Notice that a service can only contain a single module. All the module functions will be automatically added to the service.
Multiple services can be deployed from a single package. Each service can get its own configuration: a different module and settings (computing resources, triggers, UI slots, etc.).
In our example, there is only one module in the package. Let’s deploy the service:
service = package.services.deploy(service_name='image-processing',
runtime=dl.KubernetesRuntime(concurrency=32),
module_name='image-processing-module')
Trigger the service¶
Once the service is up, we can configure a trigger to automatically run the service functions. When you bind a trigger to a function, that function will execute when the trigger fires. The trigger is defined by a given time pattern or by an event in the Dataloop system.
Event based trigger is related to a combination of resource and action. A resource can be any entity in our system (item, dataset, annotation, etc.) and the associated action will define a change in the resource that will prompt the trigger (update, create, delete). You can only have one resource per trigger.
The resource object that triggered the function will be passed as the function’s parameter (input).
Let’s set a trigger in the event a new item is created:
filters = dl.Filters()
filters.add(field='datasetId', values=dataset.id)
trigger = service.triggers.create(name='image-processing2',
function_name='clahe_equalization',
execution_mode=dl.TriggerExecutionMode.ONCE,
resource=dl.TriggerResource.ITEM,
actions=dl.TriggerAction.CREATED,
filters=filters)
In the defined filters we specified a dataset. Once a new item is uploaded (created) in this dataset, the CLAHE function will be executed for this item. You can also add filters to specify the item type (image, video, JSON, directory, etc.) or a certain format (jpeg, jpg, WebM, etc.).
A separate trigger must be set for each function in your service.Now, we will define a trigger for the second function in the module rgb2gray. Each time an item is updated, invoke the rgb2gray function:
trigger = service.triggers.create(name='image-processing-rgb',
function_name='rgb2gray',
execution_mode=dl.TriggerExecutionMode.ALWAYS,
resource=dl.TriggerResource.ITEM,
actions=dl.TriggerAction.UPDATED,
filters=filters)
To trigger the function only once (only on the first item update), set TriggerExecutionMode.ONCE instead of TriggerExecutionMode.ALWAYS.
Execute the function¶
Now we can upload (“create”) an image to our dataset to trigger the service. The function clahe_equalization will be invoked:
item = dataset.items.upload(
local_path=['https://github.com/dataloop-ai/dtlpy-documentation/raw/main/assets/images/hamster.jpg'])
Remote path is optional, images will go to the main directory by default.
To see the original item, please click here.
Review the function’s logs¶
You can review the execution log history to check that your execution succeeded:
service.log()
The transformed image will be saved in your dataset.Once you see in the log that the execution succeeded, you may open the item to see its transformation:
item.open_in_web()
Pause the service:¶
We recommend pausing the service you created for this tutorial so it will not be triggered:
service.pause()
Congratulations! You have successfully created, deployed, and tested Dataloop functions!
Multiple Modules¶
You can define multiple different modules in a package. A typical use-case for multiple-modules is to have a single code base that can be used by a number of services (for different applications). For example, having a single YoloV4 codebase, but creating different modules for training, inference, etc.
When creating a service from that package, you will need to define which module the service will serve (a service can only serve a single module with all its functions). For example, to push a 2 module package, you will need to have 2 entry points, one for each module, and this is how you define the modules:
modules = [
dl.PackageModule(
name='first-module',
entry_point='first_module_main.py',
functions=[
dl.PackageFunction(
name='run',
inputs=[dl.FunctionIO(name='item',
type=dl.PackageInputType.ITEM)]
)
]
),
dl.PackageModule(
name='second-module',
entry_point='second_module_main.py',
functions=[
dl.PackageFunction(
name='run',
inputs=[dl.FunctionIO(name='item',
type=dl.PackageInputType.ITEM)]
)
]
)
]
Create the package with your modules
package = project.packages.push(package_name='two-modules-test',
modules=modules,
src_path='<path to where the entry point is located>'
)
You will pass these modules as a param to packages.push()After that, when you deploy the package, you will need to specify the module name:Note: A service can only implement one module.
service = package.deploy(
module_name='first-module',
service_name='first-module-test-service'
)
UI Slots¶
Define a UI slot in the platform¶
UI slots can be created for any function, making it possible to invoke the function through a button click.Binding functions to UI slots will enable you to manually trigger them on selected items, annotations, or tasks.
Dataloop currently supports the following UI slots:
Item as a resource:
Dataset Browser
Annotation Studio
Annotation as a resource:
Annotation Studio
Task as a resource:
Tasks browser
Let’s define a UI button for the “RGB to Gray” function.For that, we should create a slot entity in the SDK, that can be later activated from the UI to quickly invoke functions.
In this case, the input for the RGB function is an item, so the slot resource should be item as well (i.e. SlotDisplayScopeResource.ITEM).As a result, the function will be accessible in the annotations studio under “Applications” dropdown:
import dtlpy as dl
slots = [
dl.PackageSlot(module_name='image-processing',
function_name='rgb2gray',
display_name='RGB2GRAY',
post_action=dl.SlotPostAction(type=dl.SlotPostActionType.NO_ACTION),
display_scopes=[
dl.SlotDisplayScope(
resource=dl.SlotDisplayScopeResource.ITEM,
panel=dl.UiBindingPanel.ALL,
filters={})])
]
Once the function finishes executing, you can decide what the function outputs. Currently, 3 Post-Action outputs are available for UI slots:
SlotPostActionType.DOWNLOAD - Download the output, available only for item output.
SlotPostActionType.DRAW_ANNOTATION - Available only for annotation output. Draw the annotation on the item.
SlotPostActionType.NO_ACTION - Take no further actions
Additionally, you can use filters to specify which items in are eligible to be used in app (e.g. filtered by item type, item format, etc.).
Update the Package and Service with the Slot¶
Now you can update your package and service with the new slot you added:
# Update package with the slot
package.slots = slots
package = package.update()
# Update service with the new package version
service.package_revision = package.version
service.update()
Activate the UI slot¶
To make the UI slot visible to users in the platform, you need to activate the slots:
package.services.activate_slots(service=service,
project_id=project.id,
slots=slots)
Notice that clicking on the UI slot button will trigger the service only if it is active.
Pause the service:We recommend pausing the service you created for this tutorial so it will not be triggered:
service.pause()
Congratulations! You have successfully created, deployed, and tested Dataloop functions!
Executions Control¶
Execution Termination¶
Sometimes when we run long term executions, such as model training, we need the option to terminate the execution. This is facilitated using terminate at Checkpoint.To stop an execution set the code checkpoints to check if this execution received a termination and if it did, raise the Termination Exception.This allows you to save some work that was already done before terminating.For example:
class ServiceRunner(dl.BaseServiceRunner):
def detect(self, item: dl.Item):
# Do some work
foo = 0
self.kill_event()
# Do some more work
bar = 1
self.kill_event()
# Sleep for a while
import time
time.sleep(1)
# And... done!
return
Each time there is a “kill_event” the service runner checks to see if this execution received a termination request.To kill such execution we use
execution.terminate()
Execution Timeout¶
You can tell an execution to stop after a given number of seconds with the timeout parameter (the default time is 1 hour).In case a service reset, such as in timeout or service update, If there are running executions the service will wait for the execution timeout before resetting.The number have to be a natural number (int).
service.execution_timeout = 60 # 1 minute
You can decide what to do to executions that have experienced a timeout. There are 2 options of timeout handling:
Mark execution as failed
Retry
service.on_reset = 'failed'
service.on_reset = 'rerun'
# The service must be updated after changing these attributes
service.update()
Using Secrets in a Function¶
You can use organization integration secrets (key-value) as environment variable inside the function’s env.First you’ll need to add the integrations (UI only), then simply add the integrations’ ids whe deploying the service:
package: dl.Package
package.services.deploy(name='with-integrations',
secrets=['integrationId'])
Inside the service you can access the values using os package:
import os
print(os.environ['INTEGRATION_NAME'])
FaaS Docker Image¶
Dataloop enables you to deploy in the FaaS module a custom docker image, to enrich your application with literallyanything required for your project. Deploying a docker image is as easy as providing the Docker image path whendeploying the service:
service = package.deploy(service_name='my-service',
runtime=dl.KubernetesRuntime(
runner_image='docker.io/python:3.8'
))
or if you want to update an existing service:
service = dl.services.get('service-id-or-name')
service.runtime.runner_image = 'python:3.8'
service.update()
Our Docker Image¶
We have our list of docker images publicly available in DockerhubYou can see the env of each docker on the dockerfile here
Public Docker Images¶
You can use any public docker image, and on runtime, the Dataloop agent will install:
Package requirements
dtlpy package (version as defined on the service)
dtlpy-agent (same version as the SDK)
For example, using docker.io/python:3.9.13
will run the function with Python 3.9.
Build Your Own Docker Image¶
If you want other environment or need to add some apt-get installation, you can create any docker image and use it directly.You will need to set the HOME directory to /tmp
and install the python packages with –user (or as USER 1000).For instance:
FROM dockerhub.io/dataloopai/dtlpy-agent:latest.gpu.cuda11.5.py3.8.opencv
RUN apt update && apt install -y zip ffmpeg
USER 1000
ENV HOME=/tmp
RUN pip3 install --user \
dtlpy==1.54.10 \
dtlpy-agent==1.54.10 \
torch \
torchvision \
imgaug \
scikit-image==0.17.2
Using Private Docker Registry¶
To connect a private registry, you’ll need to add the docker container registry credentials as an Organization Secret (ONLY in the UI) and just create use the runner image.
Example: Model Annotations Service¶
This tutorial demonstrates creating and deploying a service that pre-annotates an items before manual annotation work is performed, as part of active-learning process.
Service Code¶
Your code can perform any action you need toe execute as part of pre-annotating items for example:
Access a remote server/API to retrieve annotations
Run your algorithm or ML model as a service in Dataloop FaaS
In this example we use a simple face detection algorithm that uses Cv2 and Caffe model
import numpy as np
import os
import cv2
import dtlpy as dl
class ServiceRunner:
def __init__(self,
model_filename: str,
prototxt_filename: str,
min_confidence: float):
prototxt = os.path.join(os.getcwd(), prototxt_filename)
weights = os.path.join(os.getcwd(), model_filename)
print("[INFO] loading model...")
self.net = cv2.dnn.readNetFromCaffe(prototxt, weights)
self.min_confidence = min_confidence
def detect(self, item: dl.Item):
print("[INFO] downloading image...")
filename = item.download()
try:
# load the input image and construct an input blob for the image
# by resizing to a fixed 300x300 pixels and then normalizing it
print("[INFO] opening image...")
image = cv2.imread(filename)
(h, w) = image.shape[:2]
blob = cv2.dnn.blobFromImage(cv2.resize(image,
(300, 300)), 1.0,
(300, 300),
(104.0, 177.0, 123.0))
# pass the blob through the network and obtain the detections and
# predictions
print("[INFO] computing object detections...")
self.net.setInput(blob)
detections = self.net.forward()
# create annotation builder to add annotations to item
print("[INFO] uploading annotations...")
builder = item.annotations.builder()
# loop over the detections
for i in range(0, detections.shape[2]):
# extract the confidence (i.e., probability) associated with the
# prediction
confidence = detections[0, 0, i, 2]
# filter out weak detections by ensuring the `confidence` is
# greater than the minimum confidence
if confidence > self.min_confidence:
# compute the (x, y)-coordinates of the bounding box for the
# object
box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
(startX, startY, endX, endY) = box.astype("int")
# draw the bounding box of the face along with the associated
# probability
builder.add(
annotation_definition=dl.Box(
top=startY,
left=startX,
right=endX,
bottom=endY,
label='person'
),
model_info={
'name': 'Caffe',
'confidence': confidence
}
)
# upload annotations
builder.upload()
print("[INFO] Done!")
finally:
os.remove(filename)
Define the module¶
In this example, we load the model in the init method, which runs only once at deployment time, saving us time bu not loading on each execution/This can inputs are attributes that we want the service to include for its entire lifetime. In this case, it’s the model and weights files we want the service to use and the confidence limit for accepting detections.
module = dl.PackageModule(
init_inputs=[
dl.FunctionIO(name='model_filename', type=dl.PackageInputType.STRING),
dl.FunctionIO(name='prototxt_filename', type=dl.PackageInputType.STRING),
dl.FunctionIO(name='min_confidence', type=dl.PackageInputType.FLOAT)
],
functions=[
dl.PackageFunction(
name='detect',
description='OpenCV face detection using Caffe model',
inputs=[
dl.FunctionIO(name='item', type=dl.PackageInputType.ITEM)
]
)
]
)
numpy == 1.18
opencv - python == 3.4
Model and weights files¶
The function uses 2 files containing the model and its weights for inferencing detections. We need to have these files at the same folder as the entry point.To get these files please download them here.https://storage.googleapis.com/dtlpy/model_assets/faas-tutorial/model_weights.zip
##Package RequirementsOur package’s codebase uses 2 Python libraries that are not standard ones. Therefore, we need to make sure they are pre-installed before running the entry point. One way to do so is to use a custom Docker Image (information on this process can be found here. The other way is to add a requirements.txt file to the package codebase. To do so, simply add the following requirements.txt file in the same folder of the entry point (main.py):https://dataloop.ai/docs/service-runtime#customimage
package = project.packages.push(
src_path='<path to folder containing the codebase>',
package_name='face-detector',
modules=[module]
)
Push the Package¶
Make sure you have the following files in one directory:
main.py
requirements.txt
res10_300x300_ssd_iter_140000.caffemodel
deploy.prototxt.txt
Run this to push your package:
service = package.deploy(
service_name='face-detector',
init_input=[
dl.FunctionIO(name='model_filename',
type=dl.PackageInputType.STRING,
value='res10_300x300_ssd_iter_140000.caffemodel'),
dl.FunctionIO(name='prototxt_filename',
type=dl.PackageInputType.STRING,
value='deploy.prototxt.txt'),
dl.FunctionIO(name='min_confidence',
type=dl.PackageInputType.FLOAT,
value=0.5)
],
runtime=dl.KubernetesRuntime(concurrency=1)
# The runtime argument Concurrency=1 means that only one execution can run at a time (no parallel executions).
)
Deploy The Service¶
The package is now ready to be deployed as a service in the Dataloop Platform.Whenever executed, your package will run as a service on default instance type. Review the service configuration to configure it to your needs, for example
Change instance-type to use stronger instances with more memory, CPU and GPU
Increase auto-scaling to handle larger loads
Increase timeouts to allow longer execution time
filters = dl.Filters(resource=dl.FiltersResource.ITEM)
filters.add(field='metadata.system.mimetype', values='image*')
trigger = service.triggers.create(
name='face-detector',
function_name="detect",
resource=dl.TriggerResource.ITEM,
actions=dl.TriggerAction.CREATED,
filters=filters
)
Trigger the Service¶
Once the service is deployed, we can create a trigger to run it automatically when a certain event occurs.In our example we trigger the face-detection service whenever an item is uploaded to the platform.Consider using other triggers or different ways to run you service:
Add the services to a FaaS-node in a pipeline, before annotation tasks
Use DQL trigger to run only on specific datasets, or in specific tasks
Run the service when an item is updated
self.package.artifacts.download(artifact_name=artifact_filename,
local_path=full_weight_path)
Uploading Model Weights as Artifacts¶
Large data files such as ML model weights can be too big to include in a package. These and other large files can be uploaded as artifact.
Task Workflows¶
Tutorials for workforce management
Creating Tasks¶
Tasks are created in the Dataloop platform to initiate annotation or QA work.It requires defining the data items to be included, the assignees working on the task, and various options such as work-load, custom-statuses and more.
Create A Task (Annotation task or QA task) Using Filter¶
The following example demonstrates creating a task from an items filter.The script includes 2 example, for filtering an entire folder/directory, and for filtering by item annotation status.
import dtlpy as dl
import datetime
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
# Create a task with all items in a specific folder
filters = dl.Filters(field='<dir>', values='</my/folder/directory>')
# filter items without annotations
filters = dl.Filters(field='<annotated>', values=False)
# Create annotation task with filters
task = dataset.tasks.create(
task_name='<task_name>',
due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'],
# The items will be divided equally between assignments
filters=filters # filter by folder directory or use other filters
)
# Create QA task with filters
qa_task = dataset.tasks.create_qa_task(task=task,
due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'],
filters=filters # this filter is for "completed items"
)
List of Items¶
Create a task from a list of items. The items will be divided equally between annotator’s assignments:
import dtlpy as dl
import datetime
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
items = dataset.items.list()
items_list = [item for item in items.all()]
task = dataset.tasks.create(
task_name='<task_name>',
due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'],
# The items will be divided equally between assignments
items=items_list
)
Entire Dataset¶
Create a task from all items in a dataset. The items will be divided equally between annotator’s assignments:
import dtlpy as dl
import datetime
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
# Create annotation task
task = dataset.tasks.create(
task_name='<task_name>',
due_date=datetime.datetime(day=1, month=1, year=2029).timestamp(),
assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>']
# The items will be divided equally between assignments
)
Add items to an existing task¶
Adding items to an existing task will create new assignments (for new assignee/s).
By Filters¶
import dtlpy as dl
import datetime
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
filters = dl.Filters(field='<metadata.system.refs>', values=[]) # filter on unassigned items
# Create annotation task
task.add_items(
filters=filters, # filter by folder directory or use other filters
assignee_ids=['<annotator1@dataloop.ai>', '<annotator2@dataloop.ai>'])
Managing Tasks & Assignments¶
Get Task¶
import dtlpy as dl
# Get task by ID
task = dl.tasks.get(task_id='<my-task-id>')
# Get task by name - in a project
project = dl.projects.get(project_name='<project_name>')
task = project.tasks.get(task_name='<my-task-name>')
# Get task by name - in a Dataset
dataset = project.datasets.get(dataset_name='<dataset_name>')
task = project.tasks.get(task_name='<my-task-name>')
# Get all tasks (list) in a project
tasks = project.tasks.list()
# Get all tasks (list( in a dataset
tasks = dataset.tasks.list()
Get Assignments¶
# Get assignment by assignment ID
assignment = dl.assignments.get(assignment_id='<my-assignment-id>')
# Get assignment by name – in a project
project = dl.projects.get(project_name='<project_name>')
assignment = project.assignments.get(assignment_name='<my-assignment-name>')
# Get assignment by name – in a dataset
dataset = project.datasets.get(dataset_name='<dataset_name>')
assignment = dataset.assignments.get(assignment_name='<my-assignment-name>')
# Get assignment by name – in a task
task = project.tasks.get(task_name='<my-task-name>')
assignment = task.assignments.get(assignment_name='<my-assignment-name>')
# Get assignments list - in a project
assignments = project.assignments.list()
# Get assignments list - in a dataset
assignments = dataset.assignments.list()
# Get assignments list - in a task
assignments = task.assignments.list()
Get Assignment Items¶
assignment_items = assignment.get_items()
Redistribute and Reassign Assignments¶
import dtlpy as dl
import datetime
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name>')
task = dl.tasks.get(task_id='<my-task-id>')
assignment = task.assignments.get(assignment_name='<my-assignment-name>')
Redistribute¶
Redistributing an assignment means to distribute the items among an combination of assignees.The process is identical to annotation and QA tasks.
# load is the workload percentage for each annotator
assignment.redistribute(dl.Workload([dl.WorkloadUnit(assignee_id='<annotator1@dataloop.ai>', load=50),
dl.WorkloadUnit(assignee_id='<annotator2@dataloop.ai>', load=50)]))
Reassign¶
Reassigning an assignment changes the assignee from its original one to another.
assignment.reassign(assignee_ids['<annotator1@dataloop.ai>'])
Delete Task and Assignments¶
When a task is deleted, all its assignments will be deleted as well.
task.delete()
assignment.delete()
Image Annotations¶
Tutorials for creating all types of image annotations
Setup¶
This tutorial guides you through the process using the Dataloop SDK to create and upload annotations into items.The tutorial includes chapters with different tools, and the last chapter includes various more advanced scripts
import dtlpy as dl
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
Initiation¶
Using the annotation definitions classes you can create, edit, view and upload platform annotations. Each annotation init receives the coordinates for the specific type, label, and optional attributes.
Optional Plotting¶
Before updating items with annotations, you can optionally plot the annotation you created and review it before uploading it. This applies to all annotations described in the following section.
import matplotlib.pyplot as plt
plt.figure()
plt.imshow(builder.show())
for annotation in builder:
plt.figure()
plt.imshow(annotation.show())
plt.title(annotation.label)
Classification Point and Pose¶
Classification¶
Classify a single item
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Classify
builder.add(annotation_definition=dl.Classification(label=label))
# Upload classification to the item
item.annotations.upload(builder)
Classify Multiple Items¶
Classifying multiple items requires using an Items entity with a filter.
# mutiple items classification using filter
...
Create a Point Annotation¶
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create point annotation with label and attribute
builder.add(annotation_definition=dl.Point(x=100,
y=100,
label='my-label',
attributes={'color': 'red'}))
# Upload point to the item
item.annotations.upload(builder)
Pose Annotation¶
Pose annotations are a collection of points that follows a certain template, for example a ‘skeleton’ for tracking key-point on people showing in image or video items.Templates are created in the Dataloop platform, at the instructions settings of a recipe.
# Pose annotation is based on pose template. Create the pose template from the platform UI and use it in the script by its ID
template_id = recipe.get_annotation_template_id(template_name="my_template_name")
# Get item
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Define the Pose parent annotation and upload it to the item
parent_annotation = item.annotations.upload(
dl.Annotation.new(annotation_definition=dl.Pose(label='my_parent_label',
template_id=template_id,
# instance_id is optional
instance_id=None)))[0]
# Add child points
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Point(x=x,
y=y,
label='my_point_label'),
parent_id=parent_annotation.id)
builder.upload()
Bounding Box and Cuboid¶
Create Box Annotation¶
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create box annotation with label
builder.add(annotation_definition=dl.Box(top=10,
left=10,
bottom=100,
right=100,
label='my-label'))
# Upload box to the item
item.annotations.upload(builder)
Create a Rotated Bounding Box Annotation¶
A rotated box is created by setting its top-left and bottom-right coordinates, and providing its rotation angle.
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create box annotation with label
builder.add(annotation_definition=dl.Box(top=10,
left=10,
bottom=100,
right=100,
angle=80,
label='my-label'))
# Upload box to the item
item.annotations.upload(builder)
Convert Semantic Segmentation to Bounding Box¶
Convert all semantic segmentation annotations in an item into box annotation
annotations = item.annotations.list()
builder = item.annotations.builder()
# run over all annotation in item
for annotation in annotations:
if annotation.type == dl.AnnotationType.SEGMENTATION:
print("Found binary annotation - id:", annotation.id)
builder.add(annotation_definition=annotation.annotation_definition.to_box())
item.annotations.upload(annotations=builder)
Create Cuboid (3D Box) Annotation¶
Create cuboid annotation in one of two ways :
# A.Bring front and back rectangles and the angel of the cuboid
builder.add(annotation_definition=dl.Cube.from_boxes_and_angle(label="label",
front_top=100,
front_left=100,
front_right=300,
front_bottom=300,
back_top=200,
back_left=200,
back_right=400,
back_bottom=400,
angle=0
))
# B.Bring all 8 points of the Cuboid
builder.add(annotation_definition=dl.Cube(label="label",
# front top left point coordinates
front_tl=[200, 200],
# front top right point coordinates
front_tr=[500, 250],
# front bottom right point coordinates
front_br=[500, 550],
# front bottom left point coordinates
front_bl=[200, 500],
# back top left point coordinates
back_tl=[300, 300],
# back top right point coordinates
back_tr=[600, 350],
# back bottom right point coordinates
back_br=[600, 650],
# back bottom left point coordinates
back_bl=[300, 600]
))
item.annotations.upload(builder)
Polygon and Polyline¶
Create Single Polygon/Polyline Annotation¶
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create polygon annotation with label
# with array of points: [[x1, y1], [x2, y2], ..., [xn, yn]]
builder.add(annotation_definition=dl.Polygon(geo=[[100, 50],
[80, 120],
[110, 130]],
label='my-label'))
# create Polyline annotation with label
builder.add(annotation_definition=dl.Polyline(geo=[[100, 50],
[80, 120],
[110, 130]],
label='my-label'))
# Upload polygon to the item
item.annotations.upload(builder)
Create Multiple Polygons from Mask¶
annotations = item.annotations.list()
mask_annotation = annotations[0]
builder = item.annotations.builder()
builder.add(dl.Polygon.from_segmentation(mask_annotation.geo,
max_instances=2,
label=mask_annotation.label))
item.annotations.upload(builder)
Convert Mask Annotations to Polygon¶
More about from_segmentation() function on here.
annotations = item.annotations.list()
builder = item.annotations.builder()
# run over all annotation in item
for annotation in annotations:
if annotation.type == dl.AnnotationType.SEGMENTATION:
print("Found binary annotation - id:", annotation.id)
builder.add(dl.Polygon.from_segmentation(mask=annotation.annotation_definition.geo,
# binary mask of the annotation
label=annotation.label,
max_instances=None))
annotation.delete()
item.annotations.upload(annotations=builder)
Convert Polygon Annotation to Mask¶
More about from_polygon() function on here.This script uses module CV2, please use this page to install it.
if annotation.type == dl.AnnotationType.POLYGON:
print("Found polygon annotation - id:", annotation.id)
builder.add(dl.Segmentation.from_polygon(geo=annotation.annotation_definition.geo,
# binary mask of the annotation
label=annotation.label,
shape=img.size[::-1] # (h,w)
))
annotation.delete()
item.annotations.upload(annotations=builder)
Ellipse and Item Description¶
Create Ellipse Annotation¶
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create ellipse annotation with label - With params for an ellipse; x and y for the center, rx, and ry for the radius and rotation angle:
builder.add(annotations_definition=dl.Ellipse(x=x,
y=y,
rx=rx,
ry=ry,
angle=angle,
label=label))
# Upload the ellipse to the item
item.annotations.upload(builder)
Item Description¶
Item description is added as a “system annotation”, and serves as a way to save information about the item, that can be seen by anyone accessing it.
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Add description (update if already exists)- if text is empty it will remove the description from the item
item.set_description(text="this is item description")
Segmentation¶
Init Segmentation¶
Each annotation init receives the coordinates for the specific type, label, and optional attributes. A binary mask should be exactly the same dimensions as the image item, with 0 for background and 1 for the annotation.
annotations_definition = dl.Segmentation(geo=geo, label=label)
Create a Semantic Segmentation Annotation¶
# Get item from the platform
item = dataset.items.get(filepath='/your-image-file-path.jpg')
# Create a builder instance
builder = item.annotations.builder()
# Create semantic segmentation mask with label and attribute
mask = np.zeros(shape=(item.height, item.width), dtype=np.uint8)
# mark some part in the middle
mask[50:100, 200:250] = 1
# Add annotations of type segmentation
builder.add(annotation_definition=dl.Segmentation(geo=mask,
label='my-label'))
# Optional: Plot all of the annotations you created before uploading them to the platform
import matplotlib.pyplot as plt
plt.figure()
plt.imshow(builder.show())
for annotation in builder:
plt.figure()
plt.imshow(annotation.show())
plt.title(annotation.label)
# Upload semantic segmentation to the item
item.annotations.upload(builder)
Convert Mask to Polygon¶
The Dataloop SDK includes a function to convert a semantic mask to a polygon annotation, which is often easier to edit and work with in the UI.The following example filters for items with semantic mask annotations, and converts them into Polygon annotations.
filters = dl.Filters()
# set resource
filters.resource = 'items'
# add filter - only files
filters.add(field='type', values='file')
# add annotation filters - only items with 'binary' annotations
filters.add_join(field='type', values='binary')
# get results
pages = dataset.items.list(filters=filters)
# run over all items in page
for page in pages:
for item in page:
print('item=' + item.id)
annotations = item.annotations.list()
builder = item.annotations.builder()
# run over all annotation in item
for annotation in annotations:
# print(annotation)
if annotation.type == 'binary':
print("Found binary annotation - id:", annotation.id)
builder.add(dl.Polygon.from_segmentation(mask=annotation.annotation_definition.geo,
# binary mask of the annotation
label=annotation.label,
max_instances=None))
annotation.delete()
item.annotations.upload(annotations=builder)
Convert Polygon to Mask¶
The Dataloop SDK also includes a function to convert a Polygon annotation into semantic mask annotation.The following example filters for items with Polygon annotations, and converts them into semantic mask annotations.This script uses module CV2, please make sure it is installed.
from PIL import Image
filters = dl.Filters()
# set resource
filters.resource = 'items'
# add filter - only files
filters.add(field='type', values='file')
# add annotation filters - only items with polygon annotations
filters.add_join(field='type', values='segment')
# get results
pages = dataset.items.list(filters=filters)
# run over all items in page
for page in pages:
for item in page:
print('item=' + item.id)
annotations = item.annotations.list()
item = dataset.items.get(item_id=item.id)
buffer = item.download(save_locally=False)
img = Image.open(buffer)
builder = item.annotations.builder()
# run over all annotation in item
for annotation in annotations:
# print(annotation)
if annotation.type == 'segment':
print("Found polygon annotation - id:", annotation.id)
builder.add(dl.Segmentation.from_polygon(geo=annotation.annotation_definition.geo,
# binary mask of the annotation
label=annotation.label,
shape=img.size[::-1] # (h,w)
))
annotation.delete()
item.annotations.upload(annotations=builder)
Create Semantic Segmentation from Image Mask and Upload¶
The following script creates a semantic mask based on RGB colors of an image item and upload them to the Dataloop platformPlease notice that directory paths look different in OS and Linux and does not require “r” at the beginningMake sure to use install OpenCV package to version 3.4.8.x with the scriptpip install opencv-python == 3 .4.8.latest
from PIL import Image
import numpy as np
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# image filepath
image_filepath = r'C:/home/images/with_family.png'
# annotations filepath - RGB with color for each label
annotations_filepath = r'C:/home/masks/with_family.png'
# upload item to root directory
item = dataset.items.upload(local_path=image_filepath,
remote_path='/')
# read mask from file
mask = np.array(Image.open(annotations_filepath))
# get unique color (labels)
unique_colors = np.unique(mask.reshape(-1, mask.shape[2]), axis=0)
# init dataloop annotations builder
builder = item.annotations.builder()
# for each label - create a dataloop mask annotation
for i, color in enumerate(unique_colors):
print(color)
if i == 0:
# ignore background
continue
# get mask of same color
class_mask = np.all(color == mask, axis=2)
# add annotation to builder
builder.add(annotation_definition=dl.Segmentation(geo=class_mask,
label=str(i)))
# upload all annotations
item.annotations.upload(builder)
Advance Tutorials¶
Copy Annotations Between Items¶
By setting annotations entity from one item, and uploading it into another, we can copy annotations between items. Running through all items in a filter allows us to copy from one item into multiple items, for example video snapshots with the same object.
# Set the source item with the annotations we want to copy
project = dl.projects.get(project_name='second-project_name')
dataset = project.datasets.get(dataset_name='second-dataset_name')
item = dataset.items.get(item_id='first-id-number')
annotations = item.annotations.list()
# Set the target item where we want to copy to. If located on a different Project or Dataset, set these accordingly
item = dataset.items.get(item_id='second-id-number')
item.annotations.upload(annotations=annotations)
# Copy the annotation into multiple items, based on a filter entity. In this example, the filter is based on directory
filters = dl.Filters()
filters.add(field='filename', values='/fighting/**') # take files from the directory only (recursive)
filters.add(field='type', values='file') # only files
pages = dataset.items.list(filters=filters)
for page in pages:
for item in page:
# upload annotations
item.annotations.upload(annotations=annotations)
Show Images & Annotations¶
This script uses module CV2, please use this page to install it.
from PIL import Image
# Get item
item = dataset.items.get(item_id='write-your-id-number')
# download item as a buffer
buffer = item.download(save_locally=False)
# open image
image = Image.open(buffer)
# download annotations
annotations = item.annotations.show(width=image.size[0],
height=image.size[1],
thickness=3)
annotations = Image.fromarray(annotations.astype(np.uint8))
# show the annotations and the image separately
annotations.show()
image.show()
# Show the annotations with the image
image.paste(annotations, (0, 0), annotations)
image.show()
Show Annotations from JSON file (Dataloop format)¶
Please notice that directory paths look different in OS and Linux and does not require “r” at the beginning
from PIL import Image
import json
with open(r'C:/home/project/images/annotation.json', 'r') as f:
data = json.load(f)
for annotation in data['annotations']:
annotations = dl.Annotation.from_json(annotation)
mask = annotations.show(width=640,
height=480,
thickness=3,
color=(255, 0, 0))
mask = Image.fromarray(mask.astype(np.uint8))
mask.show()
Count total number of annotations¶
The following script counts the number of annotations in a filter. The filter can be set to any context - Dataset, folder or any specific criteria. In the following example, it is set to a dataset.
# Create annotations filters instance
filters = dl.Filters(resource=dl.FiltersResource.ANNOTATION)
filters.page_size = 0
# Count the annotations
annotations_count = dataset.annotations.list(filters=filters).items_count
Parenting Annotations¶
Parenting establishes a relation between 2 annotations, executed by setting the parent_id parameter. The Dataloop system will reject an attempt to set circular parenting.The following script demonstrate setting parenting relation while uploading/creating annotations
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
label='my-parent-label'))
# upload parent annotation
annotations = item.annotations.upload(annotations=builder)
# create the child annotation
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
label='my-child-label'),
parent_id=annotations[0].id)
# upload annotations to item
item.annotations.upload(annotations=builder)
The following script demonstrate setting parenting relation on existing annotations:
# create and upload parent annotation
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
label='my-parent-label'))
parent_annotation = item.annotations.upload(annotations=builder)[0]
# create and upload child annotation
builder = item.annotations.builder()
builder.add(annotation_definition=dl.Box(top=10, left=10, bottom=100, right=100,
label='my-child-label'))
child_annotation = item.annotations.upload(annotations=builder)[0]
# set the child parent ID to the parent
child_annotation.parent_id = parent_annotation.id
# update the annotation
child_annotation.update(system_metadata=True)
Change Annotations’ Label¶
The following example creates a new label in the recipe (an optional step, you can also use an existing label), then applies it to all annotations in a certain filter.
# Create a new label
dataset.add_label(label_name='newLabel', color=(2, 43, 123))
# Filter annotations with the "oldLabel" label.
filters = dl.Filters()
filters.resource = dl.FiltersResource.ANNOTATION
filters.add(field='label', values='oldLabel')
pages = dataset.annotations.list(filters=filters)
# Change the Label of the Annotations - For every annotation we filtered out, Change it's Label to the "newLabel".
for annotation in pages.all():
annotation.label = 'newLabel'
annotation.update()
Video Annotations¶
Tutorials for annotating videos
In this tutorial we create and upload annotations into a video item. Video annotations differ from image annotations since they span over frames, and need to be set with their scope.This script uses module CV2, please use this page to install it.
Setup¶
import dtlpy as dl
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
item = dataset.items.get(filepath='/my_item.mp4')
Create A Single annotation¶
Create a single annotations for a video item and upload it
annotation = dl.Annotation.new(item=item)
# Span the annotation over 100 frames. Change this or use a different approach based on your context
for i_frame in range(100):
# go over 100 frame
annotation.add_frame(annotation_definition=dl.Box(top=2 * i_frame,
left=2 * (i_frame + 10),
bottom=2 * (i_frame + 50),
right=2 * (i_frame + 100),
label="my-label"),
frame_num=i_frame, # set the frame for the annotation
)
# upload to platform
annotation.upload()
Adding Multiple Annotations Using Annotation Builder¶
The following scripts demonstrate adding 10 annotations into each frame
# create annotation builder
builder = item.annotations.builder()
for i_frame in range(100):
# go over 100 frames
for i_detection in range(10):
# for each frame we have 10 different detections (location is just for the example)
builder.add(annotation_definition=dl.Box(top=2 * i_frame,
left=2 * i_detection,
bottom=2 * i_frame + 10,
right=2 * i_detection + 100,
label="my-label"),
# set the frame for the annotation
frame_num=i_frame,
# need to input the element id to create the connection between frames
object_id=i_detection + 1,
)
# Upload the annotations to platform
item.annotations.upload(builder)
Read Frames of an Annotation¶
The following example reads all the frames an annotation exist in, e.g. the frame range an annotation spans over.
for annotation in item.annotations.list():
print(annotation.object_id)
for key in annotation.frames:
frame = annotation.frames[key]
print(frame.left, frame.right, frame.top, frame.bottom)
Create Frame Snapshots from Video¶
One of Dataloop video utilities enables creating a frame snapshot from a video item every X frames (frame_interval).You will need FFmpeg needs to be installed on your system using this official website.
dl.utilities.Videos.video_snapshots_generator(item=item, frame_interval=30)
Play An Item In Video Player¶
Play a video item with its annotations and labels with a video player
from dtlpy.utilities.videos.video_player import VideoPlayer
VideoPlayer(project_name=project_name,
dataset_name=dataset_name,
item_filepath=item_filepath)
Show Annotations in a Specified Frame¶
import matplotlib.pyplot as plt
# Get from platform
annotations = item.annotations.list()
# Plot the annotations in frame 55 of the created annotations
frame_annotation = annotations.get_frame(frame_num=55)
plt.figure()
plt.imshow(frame_annotation.show())
plt.title(frame_annotation.label)
# Play video with the Dataloop video player
annotations.video_player()
Recipe and Ontology¶
Tutorials for managing ontologies, labels, and recipes
Recipe and Ontology Concepts¶
The Dataloop Recipe & Ontology concepts are detailed in our documentation. In short:
Ontology - an entity that contains labels and attributes. An attribute is linked to a label
Recipe - An entity that ties an ontology with labeling instructions
Linked with an ontology
Labeling tools (e.g. box, polygon etc)
Optional PDF instructions
And more…
In this chapter we will create an ontology and populate it with labels
Preparing - Entities setup¶
import dtlpy as dl
if dl.token_expired():
dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Get recipe from list
recipe = dataset.recipes.list()[0]
# Or get specific recipe:
recipe = dataset.recipes.get(recipe_id='id')
# Get ontology from list or create it using the "Create Ontology" script
ontology = recipe.ontologies.list()[0]
# Or get specific ontology:
ontology = recipe.ontologies.get(ontology_id='id')
# Print entities:
recipe.print()
ontology.print()
Create an Ontology¶
project = dl.projects.get(project_name='project_name')
ontology = project.ontologies.create(title="your_created_ontology_title",
labels=[dl.Label(tag="Chameleon", color=(255, 0, 0))])
Labels¶
Ontology uses the ‘Labels’ entity, which is a python list object, and as such you can use python list methods such as sort(). Be sure to use ontology.update() after each python list action.
ontology.add_labels(label_list=['Shark', 'Whale', 'Animal.Donkey'], update_ontology=True)
Labels can be added with branched hierarchy to facilitate sub-labels at up-to 5 levels.Labels hierarchy is created by adding ‘.’ between parent and child labels.In the above example, this script will get the Donkey Label:
child_label = ontology.labels[-1].children[0]
print(child_label.tag, child_label.rgb)
Attributes¶
An attribute describes a label, without having to add more labels. For example “Car” is a label, but its color is an attribute. You can add multiple attributes to the ontology, and map it to labels. For example create the “color” attribute once, but have multiple labels use it.Attributes can be multiple-selection (e.g checkbox), single selection (radio button), value over slider, a yes/no question and free-text.An attribute can be set as a mandatory one, so annotators have to answer it before they can complete the item.
Add attributes to the ontology¶
The following example adds 1 attribute of every type, all as a mandatory attribute:
Multiple-choice attribute
Single-choice attributes
Slider attribute
Yes/no question attribute
Free text attribute
# checkbox attribute
ontology.update_attributes(key='color',
title='Choose a color',
attribute_type=dl.AttributesTypes.CHECKBOX,
values=['red', 'blue', 'green'],
scope=['<label1>', '<label2>'])
# radio button attribute
ontology.update_attributes(key='occluded',
title='Level of occlusion',
attribute_type=dl.AttributesTypes.RADIO_BUTTON,
values=['no', 'mid', 'high'],
scope=['*'])
# slider attribute
ontology.update_attributes(key='height',
title='Persons height[cm]',
attribute_type=dl.AttributesTypes.SLIDER,
attribute_range=dl.AttributesRange(0, 200, 10),
scope=['*'])
# yes/no attribute
ontology.update_attributes(key='female',
title='Is mosquito female?',
attribute_type=dl.AttributesTypes.YES_NO,
scope=['*'])
# free text attribute
ontology.update_attributes(key='age',
title='How old is the person',
attribute_type=dl.AttributesTypes.FREE_TEXT,
scope=['*'])
Read Ontology Attributes¶
Read & print the all the ontology attributes:
print(ontology.metadata['attributes'])
keys = [att['key'] for att in ontology.metadata['attributes']]
Getting all labels is (including children):¶
print(ontology.labels_flat_dict)
Since a recipe is linked with an ontology, it allows for making changes with labels and attributes. When the recipe is set as the default one for a dataset, the same applies for the dataset entity - it can be used for making changes with the labels and attributes which are ultimately linked to it through the recipe and its ontology.
Working With Recipes¶
# Get recipe from a list
recipe = dataset.recipes.list()[0]
# Get recipe by ID - ID can be retrieved from the page URL when opening the recipe in the platform
recipe = dataset.recipes.get(recipe_id='your-recipe-id')
# Delete recipe - applies only for deleted datasets
dataset.recipes.get(recipe_id='your-recipe-id').delete()
Cloning Recipes¶
When you want to create a new recipe that’s only slightly different from an existing recipe, it can be easier to start by cloning the original recipe and then making changes on its clone.shallow: If True, link to existing ontology,If false clone all ontologies that are links to the recipe as well.
dataset = project.datasets.get(dataset_name="myDataSet")
recipe = dataset.recipes.get(recipe_id="recipe_id")
recipe2 = recipe.clone(shallow=False)
View Dataset Labels¶
# as objects
labels = dataset.labels
# as instance map
labels = dataset.instance_map
Add Labels by Dataset¶
Working with dataset labels can be done one-by-one or as a list.The Dataset entity documentation details all label options - read here.
# Add multiple labels
dataset.add_labels(label_list=['person', 'animal', 'object'])
# Add single label with specific color and attributes
dataset.add_label(label_name='person', color=(34, 6, 231))
# Add single label with a thumbnail/icon
dataset.add_label(label_name='person', icon_path='/home/project/images/icon.jpg')
Add Labels Using Label Object¶
# Create Labels list using Label object
labels = [
dl.Label(tag='Donkey', color=(255, 100, 0)),
dl.Label(tag='Mammoth', color=(34, 56, 7)),
dl.Label(tag='Bird', color=(100, 14, 150))
]
# Add Labels to Dataset
dataset.add_labels(label_list=labels)
# or you can also create a recipe from the label list
recipe = dataset.recipes.create(recipe_name='My-Recipe-name', labels=labels)
Add a Label and Sub-Labels¶
label = dl.Label(tag='Fish',
color=(34, 6, 231),
children=[dl.Label(tag='Shark',
color=(34, 6, 231)),
dl.Label(tag='Salmon',
color=(34, 6, 231))]
)
dataset.add_labels(label_list=label)
# or you can also create a recipe from the label list
recipe = dataset.recipes.create(recipe_name='My-Recipe-name', labels=labels)
Add Hierarchy Labels with Nested¶
Different options for hierarchy label creation.
# Option A
# add father label
labels = dataset.add_label(label_name="animal", color=(123, 134, 64))
# add child label
labels = dataset.add_label(label_name="animal.Dog", color=(45, 34, 164))
# add grandchild label
labels = dataset.add_label(label_name="animal.Dog.poodle")
# Option B: only if you dont have attributes
# parent and grandparent (animal and dog) will be generated automatically
labels = dataset.add_label(label_name="animal.Dog.poodle")
# Option C: with the Big Dict
nested_labels = [
{'label_name': 'animal.Dog',
'color': '#220605',
'children': [{'label_name': 'poodle',
'color': '#298345'},
{'label_name': 'labrador',
'color': '#298651'}]},
{'label_name': 'animal.cat',
'color': '#287605',
'children': [{'label_name': 'Persian',
'color': '#298345'},
{'label_name': 'Balinese',
'color': '#298651'}]}
]
# Add Labels to the dataset:
labels = dataset.add_labels(label_list=nested_labels)
Delete Labels by Dataset¶
dataset.delete_labels(label_names=['Cat', 'Dog'])
Update Label Features¶
# update existing label , if not exist fails
dataset.update_label(label_name='Cat', color="#000080")
# update label, if not exist add it
dataset.update_label(label_name='Cat', color="#fcba03", upsert=True)
Model Management¶
Tutorials for creating and managing model and snapshots
Model Management¶
Quick Overview¶
Dataloop’s model management allows machine learning engineers to manage their research and production processes in one centralized place.
Models are run using a combination of Packagess, Datasets, and Artifacts.
Model architectures are pushed to the cloud via Packages. Packages are bundles of code that contain the codebase required for the model to run. Datasets will include the images being used for training or inference, and they also indicate which images are included within a dataset subset (e.g. train/validation/test, or dividing your datset in other ways to achieve specific model training objectives).
Models can come from ready-to-go packages of open source algorithms (e.g. ResNet, Yolo). Models can also be created from pre-trained models for fine-tuning or transfer learning.
You can also upload your own models and compare model performance by viewing training metrics.
All models can be integrated into the Dataloop platform, connected to the UI via buttons or slots, or added to pipelines.
Introduction¶
In this tutorial we will cover the required Dataloop entities to create, compare, restore, manage, and deploy model training sessions and trained models.
Components of a Model
Package and Model Entities¶
We will use the Package entity to save the architecture of the model (e.g Yolov5, Inception, SVM, etc.) and the model algorithm code.
In “online” mode (see “Model Comparison” below), Packages should include a Model Adapter to create the Dataloop API
Model algorithms that come as-is can be found in the AI Library. All public packages listed in the AI Library are pretrained and include the model algorithm code and default configurations. Users can download the codebase of any packages pushed to the cloud.
Using the Package (code), Dataset and Ontology (data and labels) and configuration (a dictionary) we can now create a Model.
The Model contains the weights and any other artifacts needed to load the trained model and inference.
A model can also be cloned to be a starting point for a new model (for fine-tuning or transfer learning).
Additional Package components¶
Some users may want to further customize their models, such as uploading their own model weights or creating their own custom model. This can be achieved with Artifacts and a Model Adapter.
Artifacts are any additional files necessary for a given model to run on the cloud. For example, if a user wanted to upload their own weights to create a pre-trained model, the weights file would be included as an Artifact. These artifacts can be uploaded via one of the following:
local directory or path
dl.Item
Git repository
other link
The Model Adapter is a python class that creates a single API between Dataloop’s platform and your model. The Model Adapter class contains standardized methods that make it possible to integrate models into other parts of the Dataloop platform. Model Adapters allow the following model functions:
train
predict
load/save model weights
annotation conversion (if needed)
Model comparison¶
All models can be viewed in one place, and different model versions can be compared and evaluated with user-selected metrics.
Model management can be used in two modes: offline (for local model training) or online (for integration into the Dataloop platform).
In “offline” mode, users can run and train models on their local machine using local data, and can compare model configurations and metrics on the platform. “Offline” requires minimal platform integration, and can be used after dl.Package and dl.Model entities are created. This mode allows only visualizing metrics exported from the (local) model training session.
In “offline” mode, code and weights are not saved anywhere in the Dataloop platform. Only model metrics are saved and viewable at a later time.
An example of model metrics
In “online” mode, models can be trained to be deployed anywhere on the platform. For example, you can easily create a button interface to use your model to inference on a new data item and view it on the platform.
To do this, you need to create a ModelAdapter class and implement the required functions to build the Dataloop API.
“Online” mode also includes all the platform features mentioned above in “offline” mode.
Create your own Package and Model¶
You can use your own model on the platform by creating Package and Model entities, and then use a model adapter to create an API with Dataloop.
The first thing a model adapter does is create a model adapter class. The example here inherits from dl.BaseModelAdapter, which contains all the Dataloop methods required to interact with the Package and Model. You must implement these methods in the model adapter class in order for them to work: load, save, train, predict.
In this example, the adapter is defined in a script called “adapter_script.py” and is separate from the rest of the code on this page. This script will load a model from a saved model weights file in the root directory called ‘model.pth’.
import dtlpy as dl
import torch
import os
@dl.Package.decorators.module(name='model-adapter',
description='Model Adapter for my model',
init_inputs={'model_entity': dl.Model})
class SimpleModelAdapter(dl.BaseModelAdapter):
def load(self, local_path, **kwargs):
print('loading a model')
self.model = torch.load(os.path.join(local_path, 'model.pth'))
def save(self, local_path, **kwargs):
print('saving a model to {}'.format(local_path))
torch.save(self.model, os.path.join(local_path, 'model.pth'))
def train(self, data_path, output_path, **kwargs):
print('running a training session')
def predict(self, batch, **kwargs):
print('predicting batch of size: {}'.format(len(batch)))
preds = self.model(batch)
return preds
NOTE: The code above is an example for a torch model adapter. This example will NOT run if copied as-is. For working examples please refer to the examples in the Dataloop Github.
To create our Package entity, we first need to pack our package code to a dl.ItemCodebase, retrieve the metadata, and indicate where the entry point to the package is. If you’re creating a Package with code from Git, change the codebase type to be dl.GitCodebase.
import dtlpy as dl
from adapter_script import SimpleModelAdapter
project = dl.projects.get(project_name='<project_name>')
dataset = project.datasets.get(dataset_name='<dataset_name')
codebase = project.codebases.pack(directory='<path to local dir>')
# codebase: dl.GitCodebase = dl.GitCodebase(git_url='github.com/mygit', git_tag='v25.6.93')
metadata = dl.Package.get_ml_metadata(cls=SimpleModelAdapter,
default_configuration={'weights_filename': 'model.pth',
'input_size': 256},
output_type=dl.AnnotationType.CLASSIFICATION
)
module = dl.PackageModule.from_entry_point(entry_point='adapter_script.py')
Then we can push the package and all its parts to the cloud. To change the computing configurations, see the Dataloop docs for the Instance Catalog.
package = project.packages.push(package_name='My-Package',
src_path=os.getcwd(),
package_type='ml',
codebase=codebase,
modules=[module],
is_global=False,
service_config={
'runtime': dl.KubernetesRuntime(pod_type=dl.INSTANCE_CATALOG_GPU_K80_S,
autoscaler=dl.KubernetesRabbitmqAutoscaler(
min_replicas=0,
max_replicas=1),
concurrency=1).to_json()},
metadata=metadata)
Now you can create a model and upload pretrained model weights with an Artifact Item. Here, the Artfiact item is where the saved model weights are. You can upload any weights file here and name it according to the ‘weights_filename’ in the configuration.
artifact = dl.LocalArtifact(local_path='<path to weights>')
model = package.models.create(model_name='tutorial-model',
description='first model we are uploading',
tags=['pretrained', 'tutorial'],
dataset_id=None,
configuration={'weights_filename': 'model.pth'
},
project_id=package.project.id,
model_artifacts=[artifact],
labels=['car', 'fish', 'pizza']
)
Finally, build the model adapter and call one of the adapter’s methods to see that your custom model works. If you’ve entered a dataset_id when creating the model, you can also train the model on that dataset.
adapter = package.build()
adapter.load_from_model(model_entity=model)
Dataloop Dataloader¶
A dl.Dataset image and annotation generator for training and for items visualization
We can visualize the data with augmentation for debugging and exploration.After that, we will use the Data Generator as an input to the training functions.
from dtlpy.utilities import DatasetGenerator
import dtlpy as dl
dataset = dl.datasets.get(dataset_id='611b86e647fe2f865323007a')
datagen = DatasetGenerator(data_path='train',
dataset_entity=dataset,
annotation_type=dl.AnnotationType.BOX)
Object Detection Examples¶
We can visualize a random item from the dataset:
for i in range(5):
datagen.visualize()
Or get the same item using its index:
for i in range(5):
datagen.visualize(10)
Adding augmentations using imgaug repository:
from imgaug import augmenters as iaa
import numpy as np
augmentation = iaa.Sequential([
iaa.Resize({"height": 256, "width": 256}),
# iaa.Superpixels(p_replace=(0, 0.5), n_segments=(10, 50)),
iaa.flip.Fliplr(p=0.5),
iaa.flip.Flipud(p=0.5),
iaa.GaussianBlur(sigma=(0.0, 0.8)),
])
tfs = [
augmentation,
np.copy,
# transforms.ToTensor()
]
datagen = DatasetGenerator(data_path='train',
dataset_entity=dataset,
annotation_type=dl.AnnotationType.BOX,
transforms=tfs)
datagen.visualize()
datagen.visualize(10)
All of the Data Generator options (from the function docstring):
:param dataset_entity: dl.Dataset entity:param annotation_type: dl.AnnotationType - type of annotation to load from the annotated dataset:param filters: dl.Filters - filtering entity to filter the dataset items:param data_path: Path to Dataloop annotations (root to “item” and “json”).:param overwrite::param label_to_id_map: dict - {label_string: id} dictionary:param transforms: Optional transform to be applied on a sample. list or torchvision.Transform:param num_workers::param shuffle: Whether to shuffle the data (default: True) If set to False, sorts the data in alphanumeric order.:param seed: Optional random seed for shuffling and transformations.:param to_categorical: convert label id to categorical format:param class_balancing: if True - performing random over-sample with class ids as the target to balance training data:param return_originals: bool - If True, return ALSO images and annotations before transformations (for debug):param ignore_empty: bool - If True, generator will NOT collect items without annotations
The output of a single element is a dictionary holding all the relevant information.the keys for the DataGen above are: [’image_filepath’, ‘item_id’, ‘box’, ‘class’, ‘labels’, ‘annotation_filepath’, ‘image’, ‘annotations’, ‘orig_image’, ‘orig_annotations’]
print(list(datagen[0].keys()))
We’ll add the flag to return the origin items to understand better how the augmentations look like.Let’s set the flag and we can plot:
import matplotlib.pyplot as plt
datagen = DatasetGenerator(data_path='train',
dataset_entity=dataset,
annotation_type=dl.AnnotationType.BOX,
return_originals=True,
shuffle=False,
transforms=tfs)
fig, ax = plt.subplots(2, 2)
for i in range(2):
item_element = datagen[np.random.randint(len(datagen))]
ax[i, 0].imshow(item_element['image'])
ax[i, 0].set_title('After Augmentations')
ax[i, 1].imshow(item_element['orig_image'])
ax[i, 1].set_title('Before Augmentations')
Segmentation Examples¶
First we’ll load a semantic dataset and view some images and the output structure
dataset = dl.datasets.get(dataset_id='6197985a104eb81cb728e4ac')
datagen = DatasetGenerator(data_path='semantic',
dataset_entity=dataset,
transforms=tfs,
return_originals=True,
annotation_type=dl.AnnotationType.SEGMENTATION)
for i in range(5):
datagen.visualize()
Visualize original vs augmented image and annotations mask:
fig, ax = plt.subplots(2, 4)
for i in range(2):
item_element = datagen[np.random.randint(len(datagen))]
ax[i, 0].imshow(item_element['orig_image'])
ax[i, 0].set_title('Original Image')
ax[i, 1].imshow(item_element['orig_annotations'])
ax[i, 1].set_title('Original Annotations')
ax[i, 2].imshow(item_element['image'])
ax[i, 2].set_title('Augmented Image')
ax[i, 3].imshow(item_element['annotations'])
ax[i, 3].set_title('Augmented Annotations')
Converting to 3d one-hot encoding to visualize the binary mask per label. We will plot only 8 labels (there might be more on the item):
item_element = datagen[np.random.randint(len(datagen))]
annotations = item_element['annotations']
unique_labels = np.unique(annotations)
one_hot_annotations = np.arange(len(datagen.id_to_label_map)) == annotations[..., None]
print('unique label indices in the item: {}'.format(unique_labels))
print('unique labels in the item: {}'.format([datagen.id_to_label_map[i] for i in unique_labels]))
plt.figure()
plt.imshow(item_element['image'])
fig = plt.figure()
for i_label_ind, label_ind in enumerate(unique_labels[:8]):
ax = fig.add_subplot(2, 4, i_label_ind + 1)
ax.imshow(one_hot_annotations[:, :, label_ind])
ax.set_title(datagen.id_to_label_map[label_ind])
Setting a Label Map¶
One of the inputs to the DatasetGenerator is ‘label_to_id_map’. This variable can be used to change the label mapping for the annotationsand allow using the dataset ontology in a greater variety of cases.For example, you can map multiple labels so a single id or add a default value for all the unlabeled pixels in segmentation annotations.This is what the annotation looks like without any mapping:
# project = dl.projects.get(project_name='Semantic')
# dataset = project.datasets.get(dataset_name='Hamster')
# dataset.items.upload(local_path='assets/images/hamster.jpg',
# local_annotations_path='assets/images/hamster.json')
dataset = dl.datasets.get(dataset_id='621ddc855c2a3d151451ec58')
datagen = DatasetGenerator(data_path='semantic',
dataset_entity=dataset,
return_originals=True,
overwrite=True,
annotation_type=dl.AnnotationType.SEGMENTATION)
datagen.visualize()
data_item = datagen[0]
plt.imshow(data_item['annotations'])
print('BG value: {}'.format(data_item['annotations'][0, 0]))
Now, we’ll map both the ‘eye’ label and the background to 2 and the ‘fur’ to 1:
dataset = dl.datasets.get(dataset_id='6197985a104eb81cb728e4ac')
label_to_id_map = {'cat': 1,
'dog': 1,
'$default': 0}
dataloader = DatasetGenerator(data_path='semantic',
dataset_entity=dataset,
transforms=tfs,
return_originals=True,
label_to_id_map=label_to_id_map,
annotation_type=dl.AnnotationType.SEGMENTATION)
for i in range(5):
dataloader.visualize()
Batch size and batch_size and collate_fn¶
If batch_size is not None, the returned structure will be a list with batch_size data items.Setting a collate function will convert the returned structure to a tensor of any kind.The default collate will convert everything to ndarrays. We also have tensorflow and torch collate to convert to the corresponding tensors.
dataset = dl.datasets.get(dataset_id='611b86e647fe2f865323007a')
datagen = DatasetGenerator(data_path='train',
dataset_entity=dataset,
batch_size=10,
annotation_type=dl.AnnotationType.BOX)
batch = datagen[0]
print('type: {}, len: {}'.format(type(batch), len(batch)))
print('single element in the list: {}'.format(batch[0]['image']))
# with collate
from dtlpy.utilities.dataset_generators import collate_default
datagen = DatasetGenerator(data_path='train',
dataset_entity=dataset,
collate_fn=collate_default,
batch_size=10,
annotation_type=dl.AnnotationType.BOX)
batch = datagen[0]
print('type: {}, len: {}, shape: {}'.format(type(batch['images']), len(batch['images']), batch['images'].shape))