Glossary

Organisation

An organisation is intended to represent a commercial business, academic institution or non-profit entity on the Subworkflow.ai platform.

Workspace

A workspace is a conceptual working boundary within an organisation which scopes (1) datasets and (2) team members. Each organisation must have at least one workspace.

Dataset

Any file uploaded to a Subworkflow.ai workspace becomes a "dataset" and only loosely retains its original context as a "document". A dataset is a collection of data parts, referred to as "Dataset Items", each identified by its format (or "row") and part number (or "col") and arranged to be optimised for retrieval only.

Dataset Item

A dataset item belongs to a dataset and is a data part of the original dataset file. The language is generic here because a Subworkflow dataset can represent a document, an audio file or a video file:

  • For documents, a dataset item is equivalent to a page
  • For audio files, a dataset item is equivalent to a 1-minute transcript
  • For video files, a dataset item is equivalent to a 1-minute clip

A dataset item can also be identified by its "row" (i.e. file format) and "col" (i.e. sequential item index within the original file) values. Again, the available values for these properties depend on the original file type. For example:

  • For documents, "row" can be "pdf" or "jpg" and the "col" will represent the page number.
  • For audio and video files, these are to be determined at a later date.
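The "row"/"col" addressing described above can be pictured as a small lookup over a dataset's items. The following is only an illustrative sketch, not the Subworkflow.ai data model: the `DatasetItem` class, the `dataset_id` field, the `find_item` helper and the sample values are all hypothetical, and it assumes page numbers start at 1.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DatasetItem:
    """Hypothetical local model of a dataset item (not an official schema)."""
    dataset_id: str  # hypothetical field name, for illustration only
    row: str         # file format, e.g. "pdf" or "jpg" for documents
    col: int         # sequential index in the original file (page number for documents)


def find_item(items: Iterable[DatasetItem], row: str, col: int) -> Optional[DatasetItem]:
    """Return the item matching the given row (format) and col (index), if any."""
    for item in items:
        if item.row == row and item.col == col:
            return item
    return None


# A three-page document available in two formats yields six items.
items = [
    DatasetItem("ds_example", fmt, page)
    for fmt in ("pdf", "jpg")
    for page in (1, 2, 3)  # assumes 1-based page numbers
]

page_two_as_jpg = find_item(items, row="jpg", col=2)
```

In this picture, the same page appears once per available format, so a ("row", "col") pair is what pins down a single item within a dataset.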

Share links are how you get the dataset binary data out of Subworkflow.ai. For security, binary assets can only be fetched via the /v1/share endpoint, which requires a token. Each token is self-expiring, and a new one is generated every time you request the dataset. Please note that generating a new token does not revoke older ones: each token remains active until its own expiry. The default duration of a token is 10 minutes.

You can override the expiry duration by specifying the expiresInSeconds query parameter. The maximum duration you can set is determined by your subscription plan. Note that share links are disabled when the set expiration is reached or when the dataset itself expires or is deleted, whichever comes first.

  • Starter: max expiration duration for share links is 7 days
  • Standard: max expiration duration for share links is 30 days
  • Enterprise: max expiration duration for share links is 90 days
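Because expiresInSeconds is expressed in seconds while the plan limits above are expressed in days, it can help to convert and check the value before sending a request. The following is a minimal client-side sketch under stated assumptions: the `expiry_query` helper and the lowercase plan keys are hypothetical, and how the API itself responds to an over-limit value is not specified here, so the sketch simply refuses to build the query.

```python
from typing import Optional
from urllib.parse import urlencode

SECONDS_PER_DAY = 24 * 60 * 60
DEFAULT_EXPIRY_SECONDS = 10 * 60  # documented default: 10 minutes

# Plan limits from the list above, converted to seconds.
# The lowercase keys are a convention of this sketch, not API values.
MAX_EXPIRY_SECONDS = {
    "starter": 7 * SECONDS_PER_DAY,
    "standard": 30 * SECONDS_PER_DAY,
    "enterprise": 90 * SECONDS_PER_DAY,
}


def expiry_query(plan: str, requested_seconds: Optional[int] = None) -> str:
    """Build the expiresInSeconds query string, or "" to keep the default.

    Raises ValueError if the requested duration is not positive or exceeds
    the plan's documented maximum (a client-side check only).
    """
    if requested_seconds is None:
        return ""  # omit the parameter and rely on the 10-minute default
    if requested_seconds <= 0:
        raise ValueError("expiry must be a positive number of seconds")
    limit = MAX_EXPIRY_SECONDS[plan.lower()]
    if requested_seconds > limit:
        raise ValueError(
            f"{requested_seconds}s exceeds the {plan} limit of {limit}s"
        )
    return urlencode({"expiresInSeconds": requested_seconds})


one_hour = expiry_query("Starter", 60 * 60)
```

Here `one_hour` holds a query string that can be appended to the dataset request URL; the request path itself is deliberately left out because it is not given above.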

If you need the dataset binary asset to remain permanently shareable, it's best to make a copy of the asset and host it elsewhere, somewhere more appropriate for long-term storage.