FINOS-AIGF/AIR-VEC@0.2.0

Digest	Media type	Size
78098f912d91…	application/vnd.gemara.artifact.v1+yaml	32.1 KiB

AI Governance Framework Risk Vectors

AIGF risks expressed as Gemara vectors. Each vector describes a pathway through which AI system failures or negative outcomes may be realized in financial services deployments.

ID: AIR-VEC
Version: 0.2.0
Gemara version: 1.1.0
Author: FINOS-AIGF

Model Availability

Foundation models often rely on GPU-heavy infrastructure hosted by third-party providers, introducing risks related to service availability and performance. Key threats include Denial of Wallet (excessive usage leading to cost spikes or throttling), outages from immature Technology Service Providers, and VRAM exhaustion due to memory leaks or configuration changes. These issues can disrupt operations, limit failover options, and undermine the reliability of LLM-based applications.

AIR-OP-007-01 Denial of Wallet
Usage patterns inadvertently lead to excessive costs, throttling, or service disruptions. Overly long prompts from large document chunking, multimedia content, or token-expensive adversarial queries can exhaust token limits or drive up charges. Poorly throttled scripts or agentic systems may generate excessive API calls, overwhelming resources and bypassing capacity planning.
AIR-OP-007-02 TSP Outage or Degradation
External technology service providers may lack operational maturity to maintain stable service levels, leading to unexpected outages or performance degradation under load. Tight coupling to a specific proprietary provider limits failover capability, violating business continuity expectations.
AIR-OP-007-03 VRAM Exhaustion
Video RAM exhaustion on serving infrastructure compromises model responsiveness or triggers crashes. Causes include configuration changes that exceed available resources, caching strategies that trade VRAM for throughput, and memory leaks in model-serving libraries that prevent proper resource release.
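
As an illustration of the Denial of Wallet vector (AIR-OP-007-01), the sketch below shows one way a caller-side token budget could reject requests before they reach a metered API. The class name TokenBudget, the method try_spend, and the limit of 1000 tokens are hypothetical and do not come from the framework; a real deployment would also need time windows and per-caller accounting.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical token budget for a single billing window."""

    limit: int      # maximum tokens allowed in the window (assumed value)
    used: int = 0   # tokens consumed so far

    def try_spend(self, tokens: int) -> bool:
        """Reserve tokens if the budget allows; return False to reject the call."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            return False  # reject instead of letting usage (and cost) run away
        self.used += tokens
        return True


budget = TokenBudget(limit=1000)
print(budget.try_spend(600))  # True
print(budget.try_spend(600))  # False, the call would exceed the limit
print(budget.try_spend(400))  # True, exactly reaches the limit
```

Rejecting at the caller keeps a poorly throttled script or agent loop from generating charges in the first place, rather than relying only on provider-side throttling.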

Operational

Risks arising from AI system behaviour, reliability, and operational characteristics that may impact business processes.

AIR-OP-018 Model Overreach / Expanded Use
AI systems may be used beyond their originally intended and validated scope, leading to unreliable outputs in contexts the model was not designed or tested for. Scope creep can occur gradually as users discover new applications, or suddenly when systems are repurposed without adequate re-evaluation of risks and performance characteristics.
AIR-OP-020 Reputational Risk
AI systems may generate outputs that are offensive, inappropriate, misleading, or otherwise damaging to the organization's reputation. This risk is amplified when attackers deliberately manipulate models into producing harmful content that is then attributed to the organization.

Prompt Injection

Prompt injection occurs when attackers craft inputs that manipulate a language model into producing unintended, harmful, or unauthorized outputs. These attacks can be direct, overriding the model’s intended behaviour, or indirect, where malicious instructions are hidden in third-party content and later processed by the model. This threat can lead to misinformation, data leakage, reputational damage, or unsafe automated actions, especially in systems without strong safeguards or human oversight.

AIR-SEC-010-01 Direct Prompt Injection
Attackers interact directly with the LLM to override its intended behaviour. Crafted inputs attempt to bypass system prompts, ignore safety guardrails, or coerce the model into disclosing sensitive information. Requires no special privileges and can be executed through simple input manipulation.
AIR-SEC-010-02 Indirect Prompt Injection
Malicious instructions are embedded in third-party content such as websites, emails, or uploaded documents. When the LLM processes this contaminated data, the injected prompts can hijack decision-making, escalate privileges, trigger unauthorized actions, or exfiltrate data being processed. Especially dangerous in automated workflows or multi-agent architectures.
AIR-SEC-010-03 Model Profiling and Inversion
Sophisticated prompt injection techniques probe the internal structure of an LLM to extract model biases, proprietary system prompts, configurations, or training data used in fine-tuning or RAG corpora. Enables intellectual property theft, facilitates future attacks, or supports creation of clone models.

Data Poisoning

Data poisoning occurs when adversaries tamper with training or fine-tuning data to manipulate an AI model’s behaviour, often by injecting misleading or malicious patterns. This can lead to biased decision-making, such as incorrectly approving fraudulent transactions or degrading model performance in subtle ways. The risk is heightened in systems that continuously learn from unvalidated or third-party data, with impacts that may remain hidden until a major failure occurs.

AIR-SEC-009-01 Training Data Manipulation
Adversaries alter training datasets by changing labels or injecting crafted data points with hidden patterns. In financial services, this includes marking fraudulent transactions as legitimate to corrupt fraud detection models, or embedding backdoor triggers exploitable after deployment.
AIR-SEC-009-02 Continuous Learning Exploitation
Systems that continuously learn from new data are vulnerable when validation mechanisms are inadequate. Adversaries systematically feed misleading information over time to gradually skew decision-making in credit scoring, trading, or risk models.
AIR-SEC-009-03 Third-Party Data Compromise
Financial institutions rely on external data feeds such as market data, credit references, and KYC/AML watchlists. Compromise of these sources introduces poisoned data that can unknowingly embed biases or vulnerabilities into downstream models.
AIR-SEC-009-04 Bias Introduction
Deliberate data poisoning amplifies biases in credit scoring or loan approval models, leading to discriminatory outcomes and regulatory non-compliance. Effects are subtle and may remain hidden until major failures or regulatory interventions occur.

Information Leakage

Using third-party hosted LLMs creates a two-way trust boundary where neither inputs nor outputs can be fully trusted. Sensitive financial data sent for inference may be memorized by models, leaked through prompt attacks, or exposed via inadequate provider controls. This risks exposing customer PII, proprietary algorithms, and confidential business information, particularly with free or poorly governed LLM services.

AIR-RC-001-01 Model Memorization
LLMs can memorize sensitive data from training or user interactions, later disclosing customer details, loan terms, or trading strategies in unrelated sessions. This includes cross-user leakage, where one user's sensitive data is disclosed to another.
AIR-RC-001-02 Prompt-Based Data Extraction
Adversaries craft prompts to extract memorized sensitive information from hosted models. Targeted prompt sequences can cause the model to reproduce confidential training data, PII, or proprietary algorithms that were not intended to be accessible.
AIR-RC-001-03 Inadequate Provider Data Controls
Insufficient sanitization, encryption, or access controls at hosted model providers increase disclosure risk. Providers may lack transparent mechanisms for how input data is processed, retained, or sanitized, leading to persistent exposure of proprietary data.
AIR-RC-001-04 Provider Data Handling Deficiency
Without clear contracts ensuring encryption, retention limits, and secure deletion, institutions lose control over sensitive data sent to hosted models. Providers may lack transparency about data processing and retention practices.
AIR-RC-001-05 Fine-Tuning Data Exposure
Using proprietary data for fine-tuning embeds sensitive information directly into model weights, potentially making it accessible to unauthorized users if access controls are inadequate.
AIR-SEC-002-01 Embedding Inversion
Although embeddings are not human-readable, inversion attacks can reconstruct the original text from stored vectors, exposing proprietary or personally identifiable information held in a RAG vector store.
AIR-SEC-002-02 Membership Inference
An adversary probes the vector store to determine whether specific information is present, for example generating embeddings for a confidential transaction and inferring from similarity whether such a deal is being discussed internally.
AIR-SEC-002-03 Embedding Store Poisoning
An attacker with access injects malicious or misleading embeddings into the vector store, degrading the accuracy of retrieved context; dense numerical representations make such tampering difficult to detect.
AIR-SEC-002-04 Misconfigured Vector Store Access Controls
Missing role-based access control or overly permissive settings on the vector store allow unauthorized users to retrieve embeddings of sensitive internal data.
AIR-SEC-002-05 Encryption and Audit Deficiencies
Vector stores lacking encryption at rest expose embeddings to anyone with storage access, while absent audit logging prevents detection of unauthorized access, modification, or exfiltration.
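
The membership-inference vector (AIR-SEC-002-02) relies on similarity scores between a probe embedding and stored embeddings. The sketch below shows that mechanism on toy three-dimensional vectors. The vectors and the function names cosine_similarity and max_similarity are invented for illustration; real embeddings have hundreds or thousands of dimensions and come from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def max_similarity(probe, store):
    """Highest similarity between the probe and any stored vector."""
    return max(cosine_similarity(probe, v) for v in store)


# Toy stand-ins for stored document embeddings (assumed values).
store = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probe_related = [0.9, 0.1, 0.0]    # close to the first stored vector
probe_unrelated = [0.0, 0.0, 1.0]  # orthogonal to everything stored

print(max_similarity(probe_related, store) > 0.95)  # True
print(max_similarity(probe_unrelated, store))       # 0.0
```

A high score for the related probe is the signal an adversary would read as "something like this is in the store", which is why returning raw similarity scores to untrusted callers can itself leak information.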

Output Integrity

Risks where AI systems produce confident but incorrect, fabricated, inconsistent, or misaligned outputs that diverge from facts, retrieved sources, or the intended business purpose.

AIR-OP-004-01 Lack of Ground Truth
The model cannot distinguish accurate from inaccurate information in its training corpus, so it may generate plausible but fabricated financial facts, figures, or citations.
AIR-OP-004-02 Ambiguous or Incomplete Prompts
When prompts lack clarity or precision, the model is more likely to fabricate plausible-sounding but incorrect details to fill the gaps.
AIR-OP-004-03 Confident Presentation of Errors
Hallucinated content is delivered fluently and with apparent confidence, making inaccuracies difficult for users to recognise and increasing the chance they act on false information.
AIR-OP-004-04 Fine-Tuning or Prompt Bias
Instructions or fine-tuning intended to improve helpfulness or creativity can inadvertently increase the model's tendency to produce unsupported statements.
AIR-OP-006-01 Probabilistic Sampling Variability
Because models sample from a probability distribution over next tokens rather than always selecting the most likely token, identical inputs can yield different outputs across runs.
AIR-OP-006-02 Internal State Variation
Random seeds, GPU computation variations, and floating-point precision differences cause non-reproducible outputs even with fixed inputs and parameters.
AIR-OP-006-03 Context Sensitivity
Output varies with the position of content in the token window or with slight rephrasing, producing inconsistent results for semantically equivalent prompts.
AIR-OP-006-04 Decoding Parameter Effects
Sampling parameters such as temperature and top-p amplify or dampen variability, trading consistency against creativity.
AIR-OP-014-01 Retrieval-Response Disconnect
The model generates confident responses that contradict or misinterpret the retrieved financial documents, for example omitting critical regulatory exceptions documented in policy.
AIR-OP-014-02 Context-Window Truncation of Caveats
Important regulatory caveats, disclaimers, or conditional statements are truncated or deprioritised when documents exceed the context window, yielding authoritative-looking but incomplete guidance.
AIR-OP-014-03 Domain Knowledge Gap-Filling
When retrieved documents do not fully address a query, the model fills gaps with plausible but incorrect general knowledge, blending accurate institutional content with inaccurate information.
AIR-OP-014-04 Scope Boundary Violation
The model provides advice or recommendations beyond its authorised scope, such as offering investment advice from a system licensed only for general account information.
AIR-OP-014-05 Tone and Compliance Mismatch
The model adopts an inappropriate tone or level of certainty for financial communications, such as being overly definitive about complex regulatory matters.
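
The sampling vectors (AIR-OP-006-01 and AIR-OP-006-04) can be illustrated with a toy next-token step. The sketch below applies a temperature-scaled softmax to three made-up logits and compares sampling with greedy selection. The function names and logit values are assumptions for illustration, and the example ignores the hardware and floating-point effects described under AIR-OP-006-02.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index at random according to the softmax probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def greedy_token(logits):
    """Always pick the highest-scoring token index (deterministic)."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
print(greedy_token(logits))  # 0
print(softmax_with_temperature(logits, 0.1)[0] > softmax_with_temperature(logits, 1.0)[0])  # True

rng = random.Random()  # unseeded, so repeated runs may differ
print([sample_token(logits, 1.0, rng) for _ in range(10)])
```

Greedy selection returns the same index on every run, while the sampled list can change between runs even though the logits are identical.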

Model Integrity

Risks to the integrity, stability, and provenance of the foundation model itself, spanning silent version drift and adversarial tampering of training data, weights, or supporting infrastructure.

AIR-OP-005-01 Silent Model Updates
Providers retrain, fine-tune, or re-architect foundation models without explicit notification or version pinning, causing behaviour to shift even when inputs are unchanged and breaking testing and reproducibility.
AIR-OP-005-02 System Prompt Modifications
Changes to a model's hidden or implicit system prompt, for example for safety or compliance, alter outputs subtly or significantly even when user inputs remain identical.
AIR-OP-005-03 Deployment Environment or API Changes
Changes to deployment infrastructure such as hardware, quantization, or tokenization, or to API defaults, affect model behaviour, particularly for latency- or performance-sensitive applications.
AIR-OP-005-04 Prompt Perturbation Sensitivity
Minor variations in phrasing significantly change outputs and can be exploited to attack model grounding or circumvent safeguards, introducing further unpredictability.
AIR-SEC-008-01 Training Data and Weight Tampering
Adversaries tamper with training data, fine-tuning datasets, or pretrained model weights in the provider's pipeline, embedding subtle manipulations that are difficult to detect downstream.
AIR-SEC-008-02 Infrastructure and ML Library Compromise
Compromise of GPU firmware, operating systems, cloud orchestration, or ML libraries such as TensorFlow, PyTorch, and CUDA enables tampering with the model or its runtime behaviour without detection.
AIR-SEC-008-03 Adversarial Fine-Tuning
Where model weights are accessible, attackers craft subtle adversarial modifications during fine-tuning that cause unsafe responses or bypass content filters under specific conditions.
AIR-SEC-008-04 Backdoor Triggers
A model is engineered to behave maliciously when presented with a specific trigger phrase or input pattern, activating offensive outputs, bypassing constraints, or revealing sensitive information.
AIR-SEC-008-05 Safety Mechanism Disablement
Tampering disables alignment or content-moderation systems, neutralising the safeguards intended to enforce responsible model behaviour.
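
For the weight-tampering vector (AIR-SEC-008-01), one commonly used check is to compare a model artifact against a digest recorded at a trusted point in time. The sketch below uses SHA-256 from the Python standard library. The function names and the byte string standing in for model weights are hypothetical. The check only detects changes made after the digest was pinned, not tampering that happened earlier in the provider's pipeline.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data: bytes, pinned_hex: str) -> bool:
    """Compare the artifact's digest with a previously recorded one."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(sha256_hex(data), pinned_hex.lower())


weights = b"placeholder bytes standing in for a model file"  # assumed content
pinned = sha256_hex(weights)  # in practice, recorded when the artifact was vetted

print(matches_pinned_digest(weights, pinned))            # True
print(matches_pinned_digest(weights + b"\x00", pinned))  # False
```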

Data Quality

Risks arising from inaccurate, outdated, biased, or drifting data that degrade the reliability and fairness of AI outputs over time.

AIR-OP-019-01 Poor-Quality Training Data
Inaccurate, incomplete, or biased training or fine-tuning data leads the model to produce unreliable, misleading, or irrelevant outputs, especially in decision-making and risk analysis.
AIR-OP-019-02 Data and Concept Drift
Models become stale as the statistical properties of input data change over time, eroding predictive power and causing failure to recognise emerging market shifts or new regulatory requirements.
AIR-OP-019-03 Bias and Error Amplification
Errors or embedded biases in historical training data propagate into the model and are magnified at scale, undermining performance and introducing legal and reputational risk.
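
Data drift (AIR-OP-019-02) is often tracked by comparing the distribution of a feature at training time with its current distribution. The sketch below computes the Population Stability Index, one common drift statistic, over pre-binned proportions. The framework does not prescribe this metric; the bin proportions are invented, and any alerting threshold would be a local policy choice.

```python
import math


def population_stability_index(expected, actual, eps=1e-6):
    """PSI = sum((a - e) * ln(a / e)) over matching bins of two distributions."""
    if len(expected) != len(actual):
        raise ValueError("both distributions need the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # floor empty bins so the logarithm stays defined
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


baseline = [0.25, 0.25, 0.25, 0.25]  # assumed training-time bin proportions
current = [0.10, 0.20, 0.30, 0.40]   # assumed production bin proportions

print(population_stability_index(baseline, baseline))  # 0.0
print(round(population_stability_index(baseline, current), 3))
```

Each term in the sum is non-negative, so the index is zero only when the two binned distributions agree and grows as they diverge.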

Fairness

Risks where AI systems systematically disadvantage protected groups through biased data, flawed design, or proxy variables that correlate with sensitive characteristics.

AIR-OP-016-01 Data Bias
Training datasets reflect historical societal biases or under-represent populations, leading the model to learn and perpetuate discriminatory patterns such as lower loan-approval rates for certain groups.
AIR-OP-016-02 Algorithmic Bias
Model architecture, feature selection, or optimization choices unintentionally introduce or amplify bias, for example by over-weighting a feature correlated with a protected characteristic.
AIR-OP-016-03 Proxy Discrimination
Seemingly neutral data points such as postal codes or transaction history act as proxies for protected characteristics, producing discriminatory decisions.
AIR-OP-016-04 Bias Feedback Loops
A biased system's outputs are fed back into its learning cycle without correction, making the bias self-reinforcing and amplified over time.
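
The loan-approval example under AIR-OP-016-01 can be made concrete by comparing approval rates between two groups. The sketch below computes a simple ratio of approval rates. The function names and the decision lists are invented for illustration, and which ratio counts as acceptable is a legal and policy question the code does not answer.

```python
def approval_rate(decisions):
    """Fraction of True (approved) entries in a non-empty list of decisions."""
    if not decisions:
        raise ValueError("decisions must not be empty")
    return sum(1 for d in decisions if d) / len(decisions)


def approval_rate_ratio(group_a, group_b):
    """Approval rate of group_a divided by the approval rate of group_b."""
    rate_b = approval_rate(group_b)
    if rate_b == 0:
        raise ValueError("reference group has no approvals")
    return approval_rate(group_a) / rate_b


# Made-up decisions: True means the loan was approved.
group_a = [True, False, False, False]  # 25% approved
group_b = [True, True, False, False]   # 50% approved

print(approval_rate_ratio(group_a, group_b))  # 0.5
```

A ratio well below 1.0 does not by itself prove discrimination, but it flags a disparity that warrants investigation of data bias or proxy variables.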

Governance and Compliance

Risks relating to regulatory compliance, supervision, explainability, and intellectual-property obligations for AI systems in financial services.

AIR-OP-017 Lack of Explainability
Complex foundation models operate as black boxes, producing outputs without a clear, traceable rationale. Firms cannot adequately justify AI-driven decisions to regulators, stakeholders, or customers, and underlying errors or biases may go undetected, complicating model-soundness assessment and risk management.
AIR-RC-022-01 Non-Compliant AI Outputs
AI-generated financial advice, marketing, or communications must meet the same standards as human-produced outputs, including KYC, suitability, fair and accurate disclosure, and record-keeping; failing to do so breaches regimes such as MiFID II, SEC rules, and FINRA guidelines.
AIR-RC-022-02 Model Risk Management Gaps
AI models that inform critical decisions are subject to model-risk-management expectations that diverge across jurisdictions, such as the UK PRA's SS1/23, while the scope of US expectations has recently shifted; inadequate validation, monitoring, documentation, and oversight create compliance exposure.
AIR-RC-022-03 Inadequate AI Supervision and Accountability
Firms remain accountable for supervising AI systems; failure to define clear lines of accountability and ensure staff understand system capabilities and limitations leads directly to non-compliance.
AIR-RC-022-04 Evolving Regulatory Obligations
New and diverging legislation, such as the EU AI Act's high-risk classification and Fundamental Rights Impact Assessments alongside US fair-lending, FCRA, and state AI laws, imposes additional transparency, fairness, and oversight obligations that firms must anticipate.
AIR-RC-023-01 Copyright Infringement in Outputs
AI outputs may replicate copyrighted material from training data, creating legal liability when used in marketing, code generation, or research reports.
AIR-RC-023-02 Trade Secret Leakage to AI Tools
Employees inputting proprietary algorithms, M&A strategies, or confidential data into public AI tools risk irretrievable loss of valuable intellectual property.
AIR-RC-023-03 Licensing and Terms-of-Service Violations
Improper licensing of AI platforms or failure to comply with terms of service results in contractual breaches.

Agentic Security

Risks specific to autonomous and multi-agent systems, including authorization bypass, tool-chain manipulation, supply-chain compromise, state poisoning, trust-boundary violations, and credential harvesting.

AIR-SEC-024-01 API Endpoint Discovery and Exploitation
Agents discover and use API endpoints not intended for their use case, for example a balance-inquiry agent invoking payment-transfer APIs, because endpoint restrictions are insufficient.
AIR-SEC-024-02 Tool Chain Privilege Escalation
By chaining individually authorized API calls, an agent achieves outcomes that no single authorized action should permit, such as aggregating data to enable unauthorized decisions.
AIR-SEC-024-03 Business Logic Circumvention
Agents bypass intended workflows, approval processes, or segregation-of-duties requirements on which regulatory compliance depends.
AIR-SEC-024-04 Dynamic Privilege Drift
An agent's interpretation of its granted permissions expands during operation, producing permission creep and broader access than originally intended without explicit reconfiguration.
AIR-SEC-025-01 Tool Selection Manipulation
Crafted inputs cause the agent to select inappropriate tools for the task, for example choosing payment-transfer tools when only a balance check was requested.
AIR-SEC-025-02 API Parameter Injection
Malicious inputs influence the parameters an agent passes to legitimate API calls, such as injecting attacker-controlled account numbers, amounts, or authorization codes.
AIR-SEC-025-03 Tool Chain Sequencing Attacks
Adversaries manipulate the order in which an agent executes tools, creating dangerous combinations of otherwise safe individual operations.
AIR-SEC-025-04 Tool State Corruption
Attacks corrupt the agent's understanding of tool states, capabilities, or relationships, leading to inappropriate or dangerous tool usage.
AIR-SEC-025-05 Cross-Tool Data Injection
Outputs from one tool are used to inject malicious data into subsequent tool calls, creating a chain of compromised operations.
AIR-SEC-026-01 Third-Party MCP Server Compromise
External MCP servers operated by vendors or partners are compromised, injecting malicious data or logic into services that agents consume.
AIR-SEC-026-02 MCP Server Update Poisoning
Legitimate MCP servers receive malicious updates or patches that introduce backdoors, data corruption, or logic manipulation without operator knowledge.
AIR-SEC-026-03 Insider Threats to MCP Services
Malicious insiders with access to MCP infrastructure deliberately corrupt data, introduce backdoors, or modify business logic to benefit attackers.
AIR-SEC-026-04 MCP Protocol Manipulation
Attacks target the MCP communication protocol itself, including man-in-the-middle attacks, protocol-downgrade attacks, and exploitation of protocol vulnerabilities.
AIR-SEC-026-05 DNS and Infrastructure Redirection
Agent MCP connections are redirected to attacker-controlled servers through DNS poisoning, BGP hijacking, or other network-level attacks.
AIR-SEC-027-01 Memory Injection
Prompt injection or similar techniques cause agents to store malicious instructions or compromised reasoning patterns in their persistent memory.
AIR-SEC-027-02 Learned Behavior Corruption
Through repeated exposure to malicious inputs, agents learn inappropriate patterns or exceptions to business rules that persist across sessions.
AIR-SEC-027-03 State Storage Compromise
Direct attacks on the databases, files, or cloud storage holding agent state allow attackers to modify agent memory without interacting with the agent.
AIR-SEC-027-04 Cross-Session Instruction Persistence
Malicious instructions embedded in one session persist and influence agent behaviour in subsequent sessions with different users or contexts.
AIR-SEC-027-05 Preference Poisoning
Corrupting agent preferences, configuration, or learned user patterns biases the agent toward specific outcomes or bypasses security controls.
AIR-OP-028-01 Agent-to-Agent Communication Compromise
Malicious agents inject harmful data, instructions, or corrupted state into communication channels, causing receiving agents to adopt compromised behaviours.
AIR-OP-028-02 Shared Resource Contamination
Compromised agents corrupt shared databases, APIs, or state storage relied upon by other agents, causing systematic errors across multiple agent types.
AIR-OP-028-03 Agent Authority Impersonation
Compromised agents impersonate higher-privilege agents or use stolen credentials to access resources or influence decisions outside their intended scope.
AIR-OP-028-04 Cross-Agent Privilege Inheritance
Design flaws allow agents to inherit or assume privileges from agents they interact with, escalating privilege across the multi-agent system.
AIR-OP-028-05 Cascade Failure Propagation
Failures or compromises in one agent cascade to dependent agents, potentially bringing down entire business processes or decision chains.
AIR-SEC-029-01 Tool Chain Credential Enumeration
Agents are manipulated to use legitimate file, database, or API tools to systematically search for credentials in configuration files, environment variables, logs, and source repositories.
AIR-SEC-029-02 Memory and Process Credential Extraction
Compromised agents use system access to extract credentials from process memory, swap files, core dumps, or temporary storage where they may be cached.
AIR-SEC-029-03 Database and Storage Credential Mining
Agents exploit database access to search for credentials stored in user tables, configuration tables, or other locations holding passwords, API keys, or tokens.
AIR-SEC-029-04 Cloud and Infrastructure Credential Harvesting
Agents leverage cloud-management APIs and infrastructure tools to discover credentials in key vaults, secret stores, instance metadata, or infrastructure-as-code.
AIR-SEC-029-05 Cross-System Credential Correlation
Agents correlate partial credential information across systems, reconstruct full credentials from fragments, or identify credential-reuse patterns.
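
The endpoint-exploitation vector (AIR-SEC-024-01) describes a balance-inquiry agent invoking payment-transfer APIs. The sketch below shows a deny-by-default allowlist that would block that call. The role names, tool names, and the function is_tool_permitted are hypothetical. A per-call check like this does not address chained calls (AIR-SEC-024-02) or parameter injection (AIR-SEC-025-02).

```python
# Hypothetical mapping of agent roles to the tools each may invoke.
ALLOWED_TOOLS = {
    "balance-inquiry-agent": frozenset({"get_balance"}),
    "payments-agent": frozenset({"get_balance", "transfer_funds"}),
}


def is_tool_permitted(agent_role: str, tool_name: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool_name in ALLOWED_TOOLS.get(agent_role, frozenset())


print(is_tool_permitted("balance-inquiry-agent", "get_balance"))     # True
print(is_tool_permitted("balance-inquiry-agent", "transfer_funds"))  # False
print(is_tool_permitted("unknown-agent", "get_balance"))             # False
```

Enforcing the check outside the agent, at the gateway that executes tool calls, means a manipulated agent cannot talk its way past it.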
