Introduction

Welcome to the Subworkflow.ai Documentation.
Here you'll learn how to integrate the Datasets API and get the most out of the service.

Subworkflow.ai's core offering is a file processing service that AI developers use to delegate document splitting, indexing and vectorization tasks. It is designed for AI workflows where Retrieval Augmented Generation (RAG) and Visual Document Retrieval (VDR) play a key part, and where documents are large (e.g. 500 to 5000 pages), arrive at high frequency (e.g. incoming PDFs from across an organisation), or both. This service is referred to internally as the Datasets API.

The Datasets API architecture was originally inspired by challenges in real-world AI automation projects, where the combination of large document workloads and spiky traffic often led to instability, out-of-memory errors and business disruption. Subworkflow.ai has since put considerable effort into refining this process to make it production-ready and able to support a wider range of AI scenarios, and now offers it as a utility service to developers building their own AI projects.

Features

  • Simple API schema: Designed for speed, the Datasets API is light and minimal to make integration easier and faster.
  • Distributed Document Processing: Under the hood, a powerful processing engine ensures durability and reliability for your projects.
  • Document Conversion: Supports PDF, DOCX, PPTX and more, converting pages into images for visual language models.
  • Efficient Document Page Retrieval: After processing a document, retrieve only the relevant pages rather than all of them to reduce memory load.
  • Automated Search Indexing: Search APIs over document contents, backed by state-of-the-art (SOTA) image embeddings, save you time.
  • Self-Expiring Asset Links: Protect against asset links being leaked when passing pages to an LLM provider, and help with compliance.
  • Web Portal: Manage your organisation, team, workspaces and generate API keys to use the Datasets API.
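
To make the page retrieval and search indexing features above more concrete, here is a minimal sketch of the general idea behind embedding-based page retrieval: each page is represented by a vector, and only the pages closest to the query vector are returned. This is not the Datasets API itself. The function names, the toy three-dimensional vectors and the page numbers are all assumptions made for illustration, and the service's actual embeddings and search endpoints are not described on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_pages(query_vec, page_vecs, k=2):
    """Return the k page numbers whose vectors are most similar to the query."""
    scored = [
        (page, cosine_similarity(query_vec, vec))
        for page, vec in page_vecs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [page for page, _ in scored[:k]]


# Hypothetical page embeddings (real image embeddings have far more dimensions).
page_vecs = {
    1: [0.9, 0.1, 0.0],
    2: [0.1, 0.8, 0.1],
    3: [0.0, 0.2, 0.9],
    4: [0.7, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

print(top_k_pages(query, page_vecs, k=2))  # prints [1, 4]
```

Only the selected pages would then need to be loaded or passed to an LLM, which is where the memory saving comes from.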

Use Cases

Though Subworkflow.ai is targeted towards AI projects, file processing isn't uniquely an AI problem. There are other applications of the capability, such as simple archival tasks, so if you work with documents of any kind, please give us a try.

Use cases by Role

You are a Developer
Upload your documents and the Datasets API extracts and converts each page, ready to be used as input for LLMs or rendered as part of your app. The vector search feature creates image embeddings for each page, which enables you to offer your customers document RAG and "grounding" features. The Datasets API is intended for server-side use to help enhance your AI workflows and projects.
You are a Project Manager
Your team estimates you'll need several more sprints, servers and devs to handle your client's larger documents and files in your AI project. The scope is vague and the costs even more so, which makes approval hard to get and increases the risk to budget and timelines. Subworkflow.ai can serve as both an interim and a permanent solution, allowing your team to focus on the core product proposition while continuing to serve you as you scale.

Use cases by Industry

  • Real Estate & Property: Process hundreds of contracts, surveys and proposals without bottlenecks, allowing you to serve more customers.
  • Construction & Building: Streamline RFPs, questionnaires and site plans to get answers in seconds.
  • Legal & Compliance: Be free of arbitrary page limits and easily create vector search indexes over documents from 1 page to 5000 pages or more.
  • Finance & Accounting: Empower your structured outputs workflow when reports are heavy with charts and graphs and typically run to more than 100 pages.
  • Academia & Research: Simplify research paper automation without additional infrastructure.

Company

Subworkflow.ai is owned and operated by Subworkflow AI Limited (16781125), which is registered in England & Wales, United Kingdom. Our registered address is 71-75 Shelton Street, Covent Garden, London, WC2H 9JQ, United Kingdom. Subworkflow.ai is a trading name of Subworkflow AI Limited. Please read our terms of use, privacy policy and acceptable usage policy before using our service.

2025 © Subworkflow AI Limited. All rights reserved.
