Engineering · Data Plane · March 15, 2026

The Data Plane: databases, object storage, and semantic search — built in

Most AI agent platforms give you a vector store and call it data infrastructure. We ship structured databases, object storage, ingestion pipelines, a SQL console, and semantic search — built into every workspace, zero configuration.

Ask most AI agent platforms about their data story and you get the same answer: a vector store. Embed some documents, run similarity search, pipe the results into a context window. That is the entire infrastructure. It works for Q&A chatbots. It does not work for agents that need to track orders, manage inventories, store pipeline outputs, log results across runs, or do anything that requires actual structured data.

Agents that do real work need real data infrastructure — the same kind of databases, storage, and query capabilities you would give a human employee. Not a retrieval layer. A data plane.

HumanikOS ships one with every workspace. No external services to configure. No credentials to manage. No setup at all. Your agents work with real data from day one.

Namespaces: data isolation that maps to how you work

Data in HumanikOS is organized into namespaces — isolated data domains within a workspace. A namespace for customer data. A namespace for analytics. A namespace for a specific project. Each one gets its own tables, its own data sources, its own object storage, and its own search index. They are isolated at the database level, not just separated by application logic. An agent working in one namespace cannot accidentally read or write data in another.

This matters at scale. When you have dozens of agents working on different projects, namespace isolation is what prevents data bleed between workstreams. It also maps naturally to how organizations actually structure their data — by team, by project, by domain. You do not need to invent a tagging system or build access control on top. The boundaries are structural.

Tables with real schemas

Every namespace supports full structured tables — typed columns, auto-generated IDs, timestamps, indexes, and constraints. These are real relational tables, not document stores or key-value dumps. You define a schema. You insert rows. You query with filters, pagination, and ordering. You add columns later without disrupting live data.

The difference between this and what most agent platforms offer is fundamental. A vector store lets you search for documents that are semantically similar to a query. A structured table lets you run a query like "show me all customers on the enterprise plan who signed up in the last 30 days, sorted by usage." One is fuzzy retrieval. The other is business logic. Agents that manage pipelines, track state, or maintain records need the second one.
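The distinction is easy to see in code. Here is a minimal sketch of that exact query against a structured table, using Python's built-in sqlite3 as a stand-in for the platform's managed tables (the table name, columns, and data are invented for illustration; HumanikOS's actual storage engine and API may differ):

```python
import sqlite3
from datetime import datetime, timedelta

# Illustrative only: sqlite3 stands in for the platform's managed tables.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        email TEXT UNIQUE,
        plan TEXT,
        usage REAL,
        signed_up TEXT
    )
""")

now = datetime(2026, 3, 15)
rows = [
    ("Acme", "ops@acme.test", "enterprise", 412.0, (now - timedelta(days=10)).isoformat()),
    ("Globex", "it@globex.test", "starter", 88.0, (now - timedelta(days=5)).isoformat()),
    ("Initech", "dev@initech.test", "enterprise", 901.5, (now - timedelta(days=90)).isoformat()),
]
conn.executemany(
    "INSERT INTO customers (name, email, plan, usage, signed_up) VALUES (?, ?, ?, ?, ?)",
    rows,
)

# "Enterprise customers who signed up in the last 30 days, sorted by usage."
cutoff = (now - timedelta(days=30)).isoformat()
results = conn.execute(
    "SELECT name, usage FROM customers "
    "WHERE plan = 'enterprise' AND signed_up >= ? "
    "ORDER BY usage DESC",
    (cutoff,),
).fetchall()
print(results)  # only Acme matches both conditions
```

No similarity search produces that answer reliably; the WHERE clause either matches a row or it does not.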

Agents interact with tables through the skill system — creating tables, inserting rows, running queries, modifying schemas — all through conversation. Tell Nova to create a customers table with name, email, plan tier, and signup date, and it builds the table, sets the column types, and confirms. No migration files. No CLI. The agent describes what it needs and the data plane provisions it.

Full SQL, guarded

Agents can write real SQL — JOINs, aggregations, subqueries, window functions. The full language, not a safe subset. But every query goes through a parser before it executes.

The parser does three things. First, it validates the query against a safety whitelist. SELECT, INSERT, UPDATE, CREATE TABLE, ALTER TABLE — all allowed. DROP TABLE, TRUNCATE, and any other statement that could destroy data in the namespace — blocked. Second, it rewrites friendly table names to their actual scoped names in the database. Agents write clean, readable SQL using names they understand. The system translates to the real identifiers underneath. Third, it blocks multi-statement injection. One query at a time. No chaining a SELECT with a DROP.
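Those three checks can be sketched in a few lines. This is a toy version — the real system presumably builds a full AST rather than inspecting keywords, and the scoped-name format here is invented — but it shows the shape of the guard:

```python
import re

ALLOWED = {"SELECT", "INSERT", "UPDATE", "CREATE", "ALTER"}

def guard_query(sql: str, scope_map: dict) -> str:
    """Toy sketch of the three checks: whitelist, name rewrite, no chaining.

    A real parser would build an AST; keyword inspection is for illustration.
    """
    stmt = sql.strip().rstrip(";")
    # Check 3: block multi-statement injection (no chaining SELECT with DROP).
    if ";" in stmt:
        raise ValueError("multi-statement queries are blocked")
    # Check 1: validate the statement type against the whitelist.
    keyword = stmt.split(None, 1)[0].upper()
    if keyword not in ALLOWED:
        raise ValueError(f"statement type not allowed: {keyword}")
    # Check 2: rewrite friendly table names to their scoped identifiers.
    for friendly, scoped in scope_map.items():
        stmt = re.sub(rf"\b{re.escape(friendly)}\b", scoped, stmt)
    return stmt

scope = {"customers": "t_42.ns_crm.customers"}  # hypothetical scoped name
print(guard_query("SELECT * FROM customers", scope))
# guard_query("SELECT 1; DROP TABLE customers", scope)  -> raises ValueError
# guard_query("DROP TABLE customers", scope)            -> raises ValueError
```

The agent writes `customers`; the database only ever sees the tenant-scoped identifier.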

This means agents and humans can use the same SQL console without risk. An agent exploring data writes the same queries a data analyst would. The guardrails are invisible when you are doing normal work and absolute when something dangerous is attempted.

Ingestion without glue code

External data needs to get into the system. The typical approach is to write a connector — a Lambda function, an integration script, a cron job that polls an API. That is glue code, and it is the kind of work that takes longer to maintain than it does to build.

In HumanikOS, every data source gets a unique webhook endpoint provisioned automatically. External services send JSON payloads to your endpoint. The system verifies the signature, applies your key mapping to transform the payload shape into your table columns, and inserts the data. Background processes generate embeddings for search without blocking the response.
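Signature verification is the standard HMAC pattern. This sketch assumes an HMAC-SHA256 hex signature over the raw body — a common webhook scheme, though the platform's actual header names and signing format are not specified here:

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw request body.

    Hypothetical scheme for illustration; real header and format may differ.
    """
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, signature_hex)

secret = b"whsec_demo"  # invented example secret
body = b'{"customer": {"email": "a@example.test"}}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_webhook(secret, body, sig))        # valid signature
print(verify_webhook(secret, body, "0" * 64))   # forged signature rejected
```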

Key mapping handles the shape mismatch between sources. Stripe sends customer.email. Your CRM sends contact_email. Both map to the same column. You define the mapping once and every incoming payload is transformed consistently.
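A key mapping is essentially a dictionary from source paths to column names. The mapping format below (dotted paths for nested JSON) is an assumption for illustration, not the platform's documented syntax:

```python
def apply_mapping(payload: dict, mapping: dict) -> dict:
    """Map dotted source paths in a JSON payload onto flat column names."""
    def dig(obj, path):
        for part in path.split("."):
            obj = obj[part]  # walk nested keys, e.g. "customer.email"
        return obj
    return {column: dig(payload, source) for source, column in mapping.items()}

# Two sources, two shapes, one column.
stripe_mapping = {"customer.email": "email"}
crm_mapping = {"contact_email": "email"}

row_a = apply_mapping({"customer": {"email": "a@example.test"}}, stripe_mapping)
row_b = apply_mapping({"contact_email": "a@example.test"}, crm_mapping)
print(row_a == row_b)  # both payloads land in the same column
```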

There is also a test mode. Before you point a live data source at your table, you can capture test payloads to validate the pipeline — check that the mapping is correct, the types coerce properly, and the data lands where you expect. Then flip to live. No surprises.

For batch data, CSV upload handles file imports with automatic field type inference. Upload a file, the system reads the headers and values, infers the column types, and ingests. Agents can trigger this through conversation. Humans can do it through the dashboard. Same pipeline either way.
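Type inference for a CSV column is simple in principle: pick the narrowest type that fits every value. A minimal sketch (the real system likely handles dates, booleans, and nulls as well):

```python
import csv
import io

def infer_type(values: list) -> str:
    """Pick the narrowest type that fits every value in a column."""
    def fits(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False
    if fits(int):
        return "INTEGER"
    if fits(float):
        return "REAL"
    return "TEXT"

raw = "name,seats,mrr\nAcme,40,1200.50\nGlobex,12,300\n"
rows = list(csv.reader(io.StringIO(raw)))
header, data = rows[0], rows[1:]
columns = {h: infer_type([r[i] for r in data]) for i, h in enumerate(header)}
print(columns)  # {'name': 'TEXT', 'seats': 'INTEGER', 'mrr': 'REAL'}
```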

Search by meaning, not by keyword

Every record inserted into the data plane gets a vector embedding generated automatically. When you search, the system takes your natural language query, expands it with AI to capture related concepts, generates an embedding, and runs similarity matching against your data. The results come back ranked by relevance.
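Under the hood, similarity matching typically means cosine similarity between embedding vectors. The three-dimensional "embeddings" below are toy values chosen to make the point — a real model produces hundreds of dimensions — but the ranking step is the same idea:

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity: the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy 3-dimensional "embeddings"; a real model produces hundreds of dims.
records = {
    "enterprise customer, usage dropping": [0.9, 0.1, 0.4],
    "invoice paid on time":                [0.1, 0.8, 0.2],
    "support tickets unanswered":          [0.7, 0.2, 0.6],
}
query = [0.85, 0.15, 0.5]  # pretend embedding of "high-value customers that are churning"

ranked = sorted(records, key=lambda r: cosine(query, records[r]), reverse=True)
print(ranked[0])  # the churn-like record ranks first despite zero keyword overlap
```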

This works across tables and files. Ask "find high-value customers that are churning" and you get structured results even if no record contains the word "churning." The search understands meaning. It runs inside the database engine — not a separate vector service bolted on the side — which means embeddings live alongside your structured data, under the same isolation and access controls.

For agents, this is transformative. An agent researching a topic does not need to know which table to query or what column to filter on. It describes what it is looking for in plain language and the data plane returns relevant records. The semantic layer sits on top of the structured layer, not instead of it. You can still write precise SQL when you need exact results. You can use semantic search when you need discovery.

Object storage alongside your data

Not everything fits in a table row. Documents, images, exports, artifacts — agents produce and consume files constantly. The data plane includes managed object storage as a first-class feature, not a separate service.

Create buckets within a namespace. Upload files with signed URLs for secure access. Files live under the same isolation boundaries as your tables — same namespace, same permissions, same access controls. And files are automatically indexed with AI-generated descriptions, making them searchable through the same semantic search interface as your structured data.
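A signed URL is an object path plus an expiry and an HMAC over both, so the link can be handed out without handing out credentials. This is a generic pre-signed-URL sketch with an invented secret and URL format, not the platform's exact scheme:

```python
import hashlib
import hmac

SECRET = b"bucket-signing-key"  # hypothetical per-bucket secret

def sign_url(path: str, expires_at: int) -> str:
    """Append an expiry and an HMAC signature to an object path."""
    msg = f"{path}?expires={expires_at}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires_at}&sig={sig}"

def check_url(url: str, now: int) -> bool:
    """Reject the link if it has expired or the signature does not match."""
    base, _, sig = url.rpartition("&sig=")
    _, _, exp = base.partition("?expires=")
    if now > int(exp):
        return False  # link has expired
    expected = hmac.new(SECRET, base.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

url = sign_url("/ns_crm/reports/q1.pdf", expires_at=1_800_000_000)
print(check_url(url, now=1_700_000_000))  # accepted while unexpired
print(check_url(url, now=1_900_000_000))  # rejected after expiry
```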

An agent that generates a report can store it as an object in the same namespace where the data it analyzed lives. Another agent — or a human — can find it later by searching for what it contains, not remembering where it was saved.

Isolation at every layer

Multi-tenant data isolation is not a feature we added — it is how the system is built. Every request is authenticated. Every operation confirms the namespace belongs to your tenant. Role-based access control gates every route. Table names are structurally scoped so cross-tenant queries are impossible at the database level, not just blocked by application logic. Row-level security is enforced on every table as an additional layer of defense. And the SQL parser validates all table references before any query executes.

This is defense-in-depth. Not one wall — six. If any single layer failed, the others would still prevent unauthorized access. For organizations running AI agents that read and write real business data, this level of isolation is not optional. It is the minimum standard.

Built in, not bolted on

The Data Plane is not a product we integrated. It is a layer of the operating system. It shares the same permission model as the rest of HumanikOS. Agents access it through the same skill system they use for everything else. Nova orchestrates data operations the same way it orchestrates office management and cross-agent coordination.

That is the difference between having a database and having a data plane. A database is a service you connect to. A data plane is infrastructure that your entire system — agents, humans, automation — interacts with natively. No connection strings. No credentials. No configuration. Your workspace has data infrastructure the moment it exists, and every agent in it can use that infrastructure from their first conversation.