Feature · Dataset governance

Govern the datasets behind model development and oversight

SentinelAI extends governance beyond model cards by giving teams a dedicated dataset registry with lineage, approval state, taxonomy-backed classification, sensitivity controls, and enterprise catalog integration hooks.

What this area covers

Dataset governance supports teams that need to understand where training, validation, test, and inference data comes from, how it is classified, how it is approved, and whether it is appropriate for production use. The result is stronger traceability across the model lifecycle.

Related product areas

  • Model registry

    Maintain a governed inventory for AI models and use-case context with lifecycle state, ownership, risk posture, and supporting evidence.

  • AI systems

    Track governed runtime systems that combine models, approved use cases, datasets, release state, and readiness into one operational record.

  • Prompt registry

    Govern versioned prompts, retrieval settings, linked AI systems, and evaluation posture from a dedicated prompt operations record.

  • RAG sources

    Register governed retrieval sources with ingestion status, version history, citation context, and AI-system linkage.

  • Semantic governance

    Operate taxonomy, ontology, relationship, and graph-backed governance workflows across models, use cases, datasets, controls, and evidence.

  • Compliance workflows

    Operationalize evidence collection, control tracking, remediation, and framework mapping across AI systems.

Core capabilities

Built to support production governance work

Dedicated dataset registry

Track dataset purpose, type, quality, sensitivity level, and stewardship details in a domain built specifically for AI governance.
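As a rough illustration of what such a registry record might hold, here is a minimal sketch. The field names and example values are hypothetical, not SentinelAI's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical dataset registry record; fields mirror the attributes the
# registry tracks (purpose, type, sensitivity, stewardship), but names
# and values are illustrative only.
@dataclass
class DatasetRecord:
    name: str
    purpose: str                       # e.g. "training", "validation", "inference"
    dataset_type: str                  # e.g. "tabular", "text", "image"
    sensitivity: str                   # e.g. "public", "internal", "restricted"
    steward: str                       # accountable owner
    classifications: list[str] = field(default_factory=list)

record = DatasetRecord(
    name="claims-2024-q1",
    purpose="training",
    dataset_type="tabular",
    sensitivity="restricted",
    steward="data-gov@example.com",
    classifications=["pii", "finance"],
)
print(record.sensitivity)  # "restricted"
```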

Taxonomy-backed governance

Use centrally managed taxonomy terms so teams classify datasets, governance states, and rollout concepts consistently across the platform.
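The idea of a controlled vocabulary can be sketched as follows; the taxonomy terms and the validation helper are hypothetical examples, not the platform's actual term set.

```python
# Illustrative controlled vocabulary: classification terms come from a
# centrally managed taxonomy rather than unmanaged free text.
TAXONOMY = {"pii", "phi", "finance", "public-web", "synthetic"}

def classify(dataset_tags: list[str]) -> list[str]:
    """Accept only terms that exist in the shared taxonomy."""
    unknown = [t for t in dataset_tags if t not in TAXONOMY]
    if unknown:
        raise ValueError(f"terms not in taxonomy: {unknown}")
    return sorted(dataset_tags)

print(classify(["pii", "finance"]))  # ['finance', 'pii']
```

Rejecting unknown terms at write time is what keeps classification drift out of the registry.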

Lineage and traceability

Represent upstream sources, downstream usage, and model-to-dataset relationships to support provenance reviews and impact analysis.
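A lineage store of this kind can be sketched as a small directed graph; asset names and the traversal helper below are illustrative, assuming a simple edge list rather than any particular SentinelAI data model.

```python
from collections import defaultdict

# Hypothetical lineage store: edges run from an upstream asset to its
# downstream consumers (derived datasets or models).
downstream: defaultdict[str, set[str]] = defaultdict(set)

def add_lineage(source: str, target: str) -> None:
    downstream[source].add(target)

def impacted_by(asset: str) -> set[str]:
    """Everything reachable downstream of `asset` (impact analysis)."""
    seen: set[str] = set()
    stack = [asset]
    while stack:
        for nxt in downstream[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

add_lineage("raw/claims", "dataset/claims-clean")
add_lineage("dataset/claims-clean", "model/claims-scorer-v2")
print(impacted_by("raw/claims"))
```

Walking the graph downstream answers "what breaks if this source changes"; walking it upstream answers provenance questions.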

Approval workflows

Move datasets through draft, review, approved, and deprecated states with a stored event trail instead of informal sign-off.
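The four states named above suggest a small state machine with a stored event trail. The transition rules and event shape below are an assumption for illustration; only the state names come from the text.

```python
from datetime import datetime, timezone

# Hypothetical transition rules between the four documented states.
ALLOWED = {
    "draft": {"review"},
    "review": {"approved", "draft"},
    "approved": {"deprecated"},
    "deprecated": set(),
}

class DatasetApproval:
    def __init__(self, dataset: str):
        self.dataset = dataset
        self.state = "draft"
        self.events: list[dict] = []  # stored event trail, not informal sign-off

    def transition(self, new_state: str, actor: str) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state} -> {new_state} not allowed")
        self.events.append({
            "from": self.state,
            "to": new_state,
            "actor": actor,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.state = new_state

approval = DatasetApproval("claims-2024-q1")
approval.transition("review", actor="alice")
approval.transition("approved", actor="bob")
print(approval.state, len(approval.events))  # approved 2
```

Because every transition appends an event with actor and timestamp, the trail itself becomes reviewable evidence.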

Quality and catalog controls

Apply quality gates before linking datasets to higher-stakes model states and connect to enterprise catalogs such as Collibra, Purview, Databricks, Alation, or custom systems.
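A quality gate of this kind might look like the following sketch; the gate names, thresholds, and dataset fields are invented for illustration.

```python
# Hypothetical quality gates: a dataset must clear a minimum bar before it
# may be linked to a higher-stakes model state. Thresholds are examples.
GATES = {
    "production": {"requires_approval": True, "min_completeness": 0.99},
    "staging": {"requires_approval": True, "min_completeness": 0.95},
}

def passes_gate(dataset: dict, model_state: str) -> bool:
    gate = GATES[model_state]
    if gate["requires_approval"] and dataset["approval_state"] != "approved":
        return False
    return dataset["completeness"] >= gate["min_completeness"]

ds = {"approval_state": "approved", "completeness": 0.97}
print(passes_gate(ds, "staging"))     # True: 0.97 meets the 0.95 staging bar
print(passes_gate(ds, "production"))  # False: 0.97 is below the 0.99 bar
```

The same check can run when an enterprise catalog sync (Collibra, Purview, Databricks, Alation, or a custom system) reports updated quality metrics.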

Target users

  • Data scientists and ML teams documenting model inputs and evaluation data
  • Compliance and risk teams reviewing provenance, sensitivity, and approval status
  • Data governance stakeholders coordinating stewardship across enterprise catalogs
  • Auditors and reviewers who need evidence of traceability from data to model use

Governance value

  • Improves visibility into which datasets support which governed AI systems
  • Adds structure to approvals and sensitivity handling for AI-relevant data assets
  • Reduces classification drift by using shared taxonomy-backed terms instead of unmanaged free text
  • Supports better review of lineage, provenance, and data-quality expectations
  • Helps teams connect dataset oversight to model approvals, informing those decisions rather than automating them

How teams use it

A practical operating flow for this feature family

Step 1

Register and classify datasets

Capture dataset type, purpose, stewardship, taxonomy-backed classifications, and sensitivity as soon as the asset enters a governance process.

Step 2

Link lineage and approvals

Record how datasets relate to one another, which models they support, and whether they have cleared the required review steps.

Step 3

Use approvals in model governance

Feed dataset readiness and quality signals into the broader decision-making process around model lifecycle changes.
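The three steps above can be sketched end to end. All helper names and the in-memory stores are hypothetical; the point is only the flow from registration through linkage to a readiness signal.

```python
# Hypothetical end-to-end flow: register and classify (step 1), link and
# approve (step 2), then feed readiness into a model lifecycle check (step 3).
registry: dict[str, dict] = {}
links: list[tuple[str, str]] = []  # (dataset, model) pairs

def register_dataset(name: str, sensitivity: str, classifications: list[str]) -> None:
    # Step 1: capture classification and sensitivity at registration time.
    registry[name] = {
        "sensitivity": sensitivity,
        "classifications": classifications,
        "approval_state": "draft",
    }

def link_and_approve(dataset: str, model: str) -> None:
    # Step 2: record the dataset-to-model relationship and the approval.
    links.append((dataset, model))
    registry[dataset]["approval_state"] = "approved"

def model_ready(model: str) -> bool:
    # Step 3: a model is ready only when all linked datasets are approved.
    deps = [d for d, m in links if m == model]
    return bool(deps) and all(
        registry[d]["approval_state"] == "approved" for d in deps
    )

register_dataset("claims-2024-q1", "restricted", ["pii"])
link_and_approve("claims-2024-q1", "claims-scorer-v2")
print(model_ready("claims-scorer-v2"))  # True once all linked datasets are approved
```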

Continue exploring

Explore how SentinelAI connects adjacent governance workflows