8. The Dataloop Query Language - DQL

Using the Dataloop Query Language (DQL), you can navigate through massive amounts of data.

You can filter, sort, and update your metadata with it.

8.1. Filters

Using filters, you can filter items and get a generator of the filtered items. The filters entity is used to build such filters.

8.1.1. Filters - Field & Value

Filter your items or annotations using the fields of the JSON that represents them within our system. Access an item's or annotation's JSON using to_json().

8.1.1.1. Field

Field refers to the attributes you filter by.

For example, “dir” would be used if you wish to filter items by their folder/directory.

8.1.1.2. Value

Value refers to the input by which you want to filter. For example, “/new_folder” can be the directory/folder name where the items you wish to filter are located.
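
To make the field/value idea concrete, here is a minimal local sketch that does not call the SDK. The `items` list and the `filter_by` helper are hypothetical stand-ins for item JSON and for what a filter conceptually does; they only illustrate the concept.

```python
# Hypothetical item JSON records, trimmed to the fields used here.
items = [
    {"name": "a.jpg", "dir": "/new_folder"},
    {"name": "b.jpg", "dir": "/other_folder"},
    {"name": "c.jpg", "dir": "/new_folder"},
]


def filter_by(records, field, value):
    """Keep the records whose `field` equals `value` (illustration only)."""
    return [record for record in records if record.get(field) == value]


# field = the attribute to filter by, value = the input to match.
matched = filter_by(items, field="dir", value="/new_folder")
print([record["name"] for record in matched])
```

With the SDK, the same field and value would be passed to filters.add instead of a local helper.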

8.1.2. Sort - Field & Value

8.1.2.1. Field

Field refers to the attribute by which you sort your item/annotation list. For example, if you sort by filename, you will get the item list sorted in alphabetical order by filename. See the full list of the available fields here.

8.1.2.2. Value

Value refers to the list order direction. Either ascending or descending.
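
As a rough local analogy (not an SDK call), the sort field plays the role of a sort key and the sort value plays the role of the direction. The `items` records below are hypothetical and trimmed to a single field.

```python
# Hypothetical item records; only the field used for sorting is shown.
items = [
    {"filename": "/b.jpg"},
    {"filename": "/c.jpg"},
    {"filename": "/a.jpg"},
]

# Field = the key to sort by; value = the direction (ascending or descending).
ascending = sorted(items, key=lambda item: item["filename"])
descending = sorted(items, key=lambda item: item["filename"], reverse=True)

print([item["filename"] for item in ascending])
print([item["filename"] for item in descending])
```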

8.2. Filter Items

Filter items by the item’s JSON fields. In this example, you will get all annotated items in a dataset sorted by the filename.

Note
See all of the items iterator options on the Iterator of Items page.
import dtlpy as dl
# Get project and dataset
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
# Create filters instance
filters = dl.Filters()
# Filter only annotated items
filters.add(field='annotated', values=True)
# optional - return results sorted by ascending file name 
filters.sort_by(field="filename")
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.3. Filter Items by the Items’ Annotations

add_join - filter items by the items’ annotations JSON fields. For example, filter only items with ‘box’ annotations.

Note
See all of the items iterator options on the Iterator of Items page.
filters = dl.Filters()
# Filter all approved items
filters.add(field='metadata.system.annotationStatus', values="approved")
# AND filter items by their annotation - only items with 'box' annotations
# Meaning you will get approved items with 'box' annotations
filters.add_join(field='type', values='box')
# optional - return results sorted by descending creation date 
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.4. Filters Method - “Or” and “And”

Filters Operators
For more advanced filters operators visit the Advanced SDK Filters page.

8.4.1. And

If you wish to combine filters with the “and” logical operator, you can do so by specifying which filters will be checked with “and”.

AND is the default method, so it can be used without specifying the method.
In this example, you will get a list of annotated items whose user metadata field "is_automated" has the value True.
filters = dl.Filters()
# filters with and
filters.add(field='annotated', values=True, method=dl.FiltersMethod.AND)
filters.add(field='metadata.user.is_automated', values=True,
            method=dl.FiltersMethod.AND)
# optional - return results sorted by ascending file name
filters.sort_by(field='name')
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.4.2. Or

If you wish to combine filters with the “or” logical operator, you can do so by specifying which filters will be checked with “or”. In this example, you will get a list of items that are in either the “/folderName1” or the “/folderName2” directory.

filters = dl.Filters()
# filters with or
filters.add(field='dir', values='/folderName1', method=dl.FiltersMethod.OR)
filters.add(field='dir', values='/folderName2',
            method=dl.FiltersMethod.OR)
# optional - return results sorted by descending directory name
filters.sort_by(field='dir', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered items list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))
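
The difference between the two methods can be sketched locally, without the SDK. In this illustration, the `items` records and the `match_all` / `match_any` helpers are hypothetical: "and" keeps an item only if every condition holds, while "or" keeps it if at least one condition holds.

```python
# Hypothetical item records, trimmed to the fields used in the conditions.
items = [
    {"name": "1.jpg", "dir": "/folderName1", "annotated": True},
    {"name": "2.jpg", "dir": "/folderName2", "annotated": False},
    {"name": "3.jpg", "dir": "/folderName3", "annotated": True},
]


def match_all(item, conditions):
    """'and': every (field, value) pair must match."""
    return all(item.get(field) == value for field, value in conditions)


def match_any(item, conditions):
    """'or': at least one (field, value) pair must match."""
    return any(item.get(field) == value for field, value in conditions)


and_conditions = [("annotated", True), ("dir", "/folderName1")]
or_conditions = [("dir", "/folderName1"), ("dir", "/folderName2")]

and_result = [item["name"] for item in items if match_all(item, and_conditions)]
or_result = [item["name"] for item in items if match_any(item, or_conditions)]
print(and_result, or_result)
```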

8.5. Update User Metadata of Filtered Items

Update Filtered Items - The ‘update_values’ argument must be a dictionary. The dictionary will only update user metadata. Learn more about user metadata at https://github.com/dataloop-ai/dtlpy-documentation/blob/main/tutorials/data_management/working_with_metadata/chapter.md

In this example, you will update/add user metadata (with the field “BlackDogs” and value True) to items in a specific folder ‘dogs’ that have an annotation with the attribute ‘black’.

filters = dl.Filters()
# For example -  filter only items in a specific folder - like 'dogs'
filters.add(field='dir', values='/dogs')
# For example - filter items by their annotation - only items with 'black' attribute
filters.add_join(field='attributes', values='black')
# Add the field BlackDogs with the value True to all filtered items
# This field will be added to the user metadata
# Create the update order
update_values = {'BlackDogs': True}
# update
pages = dataset.items.update(filters=filters, update_values=update_values)

8.6. Delete Filtered Items

In this example, you will delete items that were created on 30/8/2020 at 8:17 AM.

filters = dl.Filters()
# For example - filter only items created at a specific timestamp
filters.add(field='createdAt', values="2020-08-30T08:17:08.000Z")
dataset.items.delete(filters=filters)
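
The createdAt value above is a UTC timestamp string with a millisecond part. A minimal sketch for building a string in that shape with the standard library is shown below; it assumes the timestamp is expressed in UTC and that the milliseconds are zero, as in the example.

```python
from datetime import datetime, timezone

# 30/8/2020 at 08:17:08, expressed in UTC (assumption for this sketch).
created_at = datetime(2020, 8, 30, 8, 17, 8, tzinfo=timezone.utc)

# Format to match the shape of the createdAt field in the item JSON.
created_at_str = created_at.strftime("%Y-%m-%dT%H:%M:%S.000Z")
print(created_at_str)
```

The resulting string can then be passed as the value of the createdAt filter.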

8.7. Item Filtering Fields

8.7.1. More Filter Options

Use dot notation to access parameters nested within curly brackets. For example, use field='metadata.system.originalname' to filter by the item's original name.
{
    "id": "5f4b60848ced1d50c3df114a",
    "datasetId": "5f4b603d9825b9f191bbd3b3",
    "createdAt": "2020-08-30T08:17:08.000Z",
    "dir": "/new_folder",
    "filename": "/new_folder/optional.jpg",
    "type": "file",
    "hidden": false,
    "metadata": {
        "system": {
            "originalname": "file",
            "size": 3290035,
            "encoding": "7bit",
            "mimetype": "image/jpeg",
            "annotationStatus": [
                "completed"
            ],
            "refs": [
                {
                    "type": "task",
                    "id": "5f4b61f8f81ab6238c331bd2"
                },
                {
                    "type": "assignment",
                    "id": "5f4b61f8f81ab60508331bd3"
                }
            ],
            "executionLogs": {
                "image-metadata-extractor": {
                    "default_module": {
                        "run": {
                            "5f4b60841b892d82eaa2d95b": {
                                "progress": 100,
                                "status": "success"
                            }
                        }
                    }
                }
            },
            "exif": {},
            "height": 2734,
            "width": 4096,
            "statusLog": [
                {
                    "status": "completed",
                    "timestamp": "2020-08-30T14:54:17.014Z",
                    "creator": "user@dataloop.ai",
                    "action": "created"
                }
            ],
            "isBinary": true
        }
    },
    "name": "optional.jpg",
    "url": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a",
    "dataset": "https://gate.dataloop.ai/api/v1/datasets/5f4b603d9825b9f191bbd3b3",
    "annotationsCount": 18,
    "annotated": "discarded",
    "stream": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/stream",
    "thumbnail": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/thumbnail",
    "annotations": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/annotations"
}
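
The dot notation maps onto the nesting of the item JSON. As a local illustration only, the hypothetical `get_by_path` helper below walks a trimmed copy of the JSON above one key at a time; it is not part of the SDK.

```python
from functools import reduce

# Trimmed copy of the item JSON shown above.
item_json = {
    "dir": "/new_folder",
    "metadata": {
        "system": {
            "originalname": "file",
            "mimetype": "image/jpeg",
        }
    },
}


def get_by_path(data, path):
    """Follow a dot-separated path through nested dictionaries."""
    return reduce(lambda node, key: node[key], path.split("."), data)


print(get_by_path(item_json, "metadata.system.originalname"))
print(get_by_path(item_json, "dir"))
```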

8.8. Full Examples

8.8.1. How to filter items by their annotations’ label?

filters = dl.Filters()
filters.add_join(field='label', values='your_label_value')
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))

8.8.2. How to filter items by completed and approved status?

filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values=["completed", "approved"])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.8.3. How to filter items by completed status (including items that are approved as well)?

filters = dl.Filters()
# Filter by 'completed' status (items that are approved as well are included)
filters.add(field='metadata.system.annotationStatus', values="completed")
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.8.4. How to filter items by only completed status?

filters = dl.Filters()
filters.add(field='metadata.system.annotationStatus', values=["completed"])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.8.5. How to filter unassigned items?

filters = dl.Filters()
filters.add(field='metadata.system.refs', values=[])
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.8.6. How to filter items by a specific folder?

filters = dl.Filters()
filters.add(field='dir', values="/folderName")
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of items in dataset: {}'.format(pages.items_count))

8.8.7. Get all items named foo.bar

filters = dl.Filters()
filters.add(field='name', values='foo.bar.*')
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))

8.8.8. Sort files of size 0-5 MB by name, in ascending order

filters = dl.Filters()
# size is stored as a number of bytes; 5 MB = 5 * 1024 * 1024 = 5242880
filters.add(field='metadata.system.size', values=0, operator='gt')
filters.add(field='metadata.system.size', values=5242880, operator='lt')
filters.sort_by(field='filename', value=dl.FILTERS_ORDERBY_DIRECTION_ASCENDING)
# Get filtered item list in a page object
pages = dataset.items.list(filters=filters)
# Count the items
print('Number of filtered items in dataset: {}'.format(pages.items_count))
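
The byte bounds and the effect of the two range conditions can be checked locally. The sketch below uses hypothetical item records and plain Python comparisons; it assumes 'gt' and 'lt' mean strictly greater than and strictly less than, so an item of size 0 is excluded.

```python
MB = 1024 * 1024
upper_bound = 5 * MB  # upper size limit in bytes

# Hypothetical item records with the size nested as in the item JSON.
items = [
    {"filename": "/b.jpg", "metadata": {"system": {"size": 3290035}}},
    {"filename": "/a.jpg", "metadata": {"system": {"size": 1024}}},
    {"filename": "/c.jpg", "metadata": {"system": {"size": 9000000}}},
    {"filename": "/d.jpg", "metadata": {"system": {"size": 0}}},
]

# Both conditions must hold: size > 0 and size < 5 MB.
in_range = [
    item for item in items
    if 0 < item["metadata"]["system"]["size"] < upper_bound
]
# Sort the remaining items by filename in ascending order.
in_range.sort(key=lambda item: item["filename"])
names = [item["filename"] for item in in_range]
print(upper_bound, names)
```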

8.8.9. Sort with multiple fields: Sort annotations by label ascending and createdAt descending

filters = dl.Filters()
# set annotation resource
filters.resource = dl.FiltersResource.ANNOTATION
# return results sorted by ascending label
filters.sort_by(field='label', value=dl.FILTERS_ORDERBY_DIRECTION_ASCENDING)
# then by descending creation date
filters.sort_by(field='createdAt', value=dl.FILTERS_ORDERBY_DIRECTION_DESCENDING)
# Get filtered annotation list in a page object
# (annotation filters are listed through the annotations repository)
pages = dataset.annotations.list(filters=filters)
# Count the annotations
print('Number of filtered annotations in dataset: {}'.format(pages.items_count))

8.9. Advanced Filtering Operators

Explore advanced filtering operators on the Advanced SDK Filters page.

8.10. Response to DQL Query

A typical response to a DQL query will look like the following:

{
    "totalItemsCount": number,
    "items": Array,
    "totalPagesCount": number,
    "hasNextPage": boolean
}

A possible result:

{
    "totalItemsCount": 2,
    "totalPagesCount": 1,
    "hasNextPage": false,
    "items": [
        {
            "id": "5d0783852dbc15306a59ef6c",
            "createdAt": "2019-06-18T23:29:15.775Z",
            "filename": "/5546670769_8df950c6b6.jpg",
            "type": "file"
            // ...
        },
        {
            "id": "5d0783852dbc15306a59ef6d",
            "createdAt": "2019-06-19T23:29:15.775Z",
            "filename": "/5551018983_3ce908ac98.jpg",
            "type": "file"
            // ...
        }
    ]
}
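
A response of this shape can be read with the standard json module. The sketch below parses a trimmed copy of the example result above (the elided fields are simply left out) and reads the counters and the item list; it does not call the platform.

```python
import json

# Trimmed copy of the example response shown above.
response_text = """
{
    "totalItemsCount": 2,
    "totalPagesCount": 1,
    "hasNextPage": false,
    "items": [
        {"id": "5d0783852dbc15306a59ef6c",
         "filename": "/5546670769_8df950c6b6.jpg",
         "type": "file"},
        {"id": "5d0783852dbc15306a59ef6d",
         "filename": "/5551018983_3ce908ac98.jpg",
         "type": "file"}
    ]
}
"""

response = json.loads(response_text)
filenames = [item["filename"] for item in response["items"]]
print(response["totalItemsCount"], filenames)

# hasNextPage tells the caller whether another page should be requested.
if response["hasNextPage"]:
    print("More pages are available")
```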