Demystifying AI Governance

AI governance involves providing guardrails for safely using AI systems. This can seem like an obtuse topic, but once we start approaching it systematically, we find that it breaks into concrete subproblems. A lot of these map to well-understood problems in cybersecurity, with a few twists here and there; the rest are novel, but can be framed as clear requirements. This article, the first in our series of demystifying AI governance, gives a high-level overview of the area.

Life of a Model

The best place to start towards demystifying AI Governance is understanding how AI models are used in enterprises. Models can be built from scratch (e.g. simple neural nets), or pre-trained with some later fine-tuning (e.g. ResNet family). They can be on-prem (e.g. HuggingFace models) or hosted at a vendor (e.g. GPTs). They can be used to power customer-facing features and/or internal tools.

Model Introduction: A model starts becoming a concern when it is being used as part of some functionality, especially if it involves sensitive customer/user/employee data. All models have associated risks – security, privacy, ethics, etc., but some models are safer than others, especially in certain situations.

Training: The data used to train models can lead to restrictions on how the model is used. For instance, applications built on models trained on raw user data must include mechanisms to respect data subject rights, such as the right to be forgotten.

Inference: Models can represent an alternate plane of access to the data they were trained on. For instance, a model trained on internal documents and made available to all employees invalidates whatever access control policies were applied on the docs. Model outputs can contain abusive, inaccurate, or inappropriate information, and usually need to be filtered.

Continued Operation: Models can start showing degraded performance over time, a phenomenon known as model drift. This can affect the safety properties of the models.

Vendor-hosted models: With vendor-hosted models, there are additional issues, as vendors are external third parties. Preventing sensitive data from reaching the vendor via model inputs at training/inference time is important. Preventing abusive prompts is also important and usually stipulated as a contractual requirement by the vendor.

Visibility and Control

A common joke in cybersecurity is that every problem can be framed as either monitoring or controlling access to something. Monitoring/ visibility solutions are usually easy to implement but cannot prevent bad outcomes. Nevertheless, they might be sufficient in some cases. Control solutions have to be more robust, and are hence more expensive to build. The following table takes the problems sketched in the previous section and frames them as problems around visibility/control.

Problem	Visibility	Control
Presence: Which models are being used in the enterprise?	Model Discovery	Model Provenance
Risks: What are the security, privacy, & ethical risks of the models in use?	Model Risk Assessment	Model Provenance
Training Data: Which types of data are being used to train the models?	Data Mapping	Data Minimization
Access: Which users &services have access to which models?	Model Access Control
Inputs: What data is being shared with these models?	Model Audit Logs	Data Loss Prevention, Abusive Prompt Filtering
Outputs: Do model outputs have abusive/ inappropriate stuff?	Model Audit Logs	Output Filtering
Drift: Has the model safety degraded since operation started?	Drift Monitoring

Model Discovery is finding the models that are in use within the enterprise. It parallels familiar problems such as Data Mapping and Vendor Discovery. Solutions include manual information gathering from engineering teams or automated methods like Code Scanning.

Model Provenance refers to standardizing the ways in which models can be procured (both on-prem and vendor-provided models). This is a subcategory of Supply Chain Security and solutions follow a similar template.

Model Risk Assessment ensures that model risks remain within acceptable bounds. This is one of the more unique subproblems within AI governance, although it bears some similarities to Black Box testing, pen testing, and other offensive security techniques.

Data Mapping entails creating a map of sensitive data locations within the infrastructure. It is a foundational challenge in privacy and a prerequisite for addressing other issues. In the context of AI governance, the additional complexity lies in tracking which data sources are utilized to train specific models.

Data Minimization covers a broad family of techniques including pseudonymization, anonymization and synthetic data generation. In this context, data minimization serves a specific purpose, to unlink the training data from the underlying users and hence reduce restrictions on the resultant models.

Model Access Control involves controlling which principals (users, services, etc.) get access to which models. This maps to internal or third-party access control, depending on whether the model in question is on-prem or vendor-hosted. The challenge is in avoiding creating a Shadow Access Plane, and instead capture these rules in the existing IAM graph(s) in the org.

Model Audit Logs capture the details of the training and inference operations with the models. These logs are invaluable for meeting regulatory requirements and for incident forensics. Audit Logs in general are a fairly well understood problem. The challenge is around leveraging these existing tools for AI models instead of creating new, incongruent tooling.

Data Loss Prevention is about ensuring that sensitive data belonging to the enterprise does not get inadvertently shared with AI vendors, or otherwise becomes part of the training corpora of internal models. This is also a well understood problem in general, with extensive solutions. The challenge is in adapting these existing tools to AI-specific use cases such as training and inference, and ensuring consistent usage across the organization.

Abusive Prompt Filtering is a novel problem with AI systems, especially LLMs. It involves detecting and preventing abusive/inappropriate prompts reaching the model.

Output Filtering, i.e., ensuring that model outputs adhere to the policies of the organization and larger regulations, is a new problem with AI applications. Our understanding of this problem as well as the solution landscape are both evolving. The best approach here is adopting a viable solution among the available options and watching the landscape for advancements.

Drift Monitoring is another novel AI problem and involves continuously monitoring model safety performance metrics and acting when these metrics drop below acceptable thresholds.

What Obex can do for you

The gist of demystifying AI Governance is that AI Governance can be broken into sub-problems which map to well-understood problems in cybersecurity.

Obex takes this one level further and enables enterprises to extend their solutions for these sub-problems to the AI governance use-cases. In short, Obex acts as a translation layer between AI governance and existing cybersecurity tooling.

For instance, Obex enables controlling access to AI models via standard IAM tooling such as AWS IAM, instead of introducing a shadow access plane involving API keys and shared secrets. Obex gathers AI audit logs and makes them available at log sinks such as AWS CloudWatch or Datadog, and SIEM tools such as Splunk. Obex lets you use existing DLP tools such as Microsoft Purview to cover AI systems as well. For novel problems such as Model Risk Assessment and Abusive Prompt Filtering, Obex lets you seamlessly integrate solutions with a broad range of AI models and providers.

If you are curious about Obex, check out obex.dev, or reach out to us at [email protected].