AI Agent Security Probe Library

Probe Library for the OWASP Top 10 for AI Agent Security

This section documents Akto’s Agentic AI Security (ASI) probe library, mapped to the ten ASI risk categories (ASI01 through ASI10).

Each entry below is an executable test template. Templates are grouped by:

  • ASI category (what risk is being tested)

  • attack family / technique (how the test is delivered: base64, leetspeak, roleplay, etc.)

  • execution template (single-shot, sequential, crescendo, tree, linear, judge)
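The probe IDs listed on this page appear to encode these groupings in their names: a subject prefix, then a technique token, then an execution-template suffix. The following is a minimal sketch of splitting an ID along those lines. It is inferred only from the IDs on this page and is not an Akto API: the function name `parse_probe_id`, the mapping of the `BAD_LIKERT_JUDGE` suffix to "judge", and the treatment of IDs without a suffix as "single-shot" are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

# Execution-template suffixes observed in the probe IDs on this page.
# The short labels on the right are assumptions, chosen to match the
# template names in the list above.
EXECUTION_SUFFIXES = {
    "SEQUENTIAL_JAILBREAKING": "sequential",
    "CRESCENDO_JAILBREAKING": "crescendo",
    "TREE_JAILBREAKING": "tree",
    "LINEAR_JAILBREAKING": "linear",
    "BAD_LIKERT_JUDGE": "judge",
}

# Technique tokens, matching the attack-family headings used on this page.
TECHNIQUES = [
    "BASE64", "CONTEXT_POISONING", "GOAL_REDIRECTION", "INPUT_BYPASS",
    "LEETSPEAK", "MATH_PROBLEM", "MULTILINGUAL", "PERMISSION_ESCALATION",
    "PROMPT_INJECTION", "ROT13", "ROLEPLAY", "SEMANTIC_MANIPULATION",
    "SYSTEM_OVERRIDE",
]


class ProbeId(NamedTuple):
    subject: str              # e.g. "SECURITY_INDIRECT_PROMPT_INJECTION"
    technique: Optional[str]  # e.g. "BASE64", or None if no token matched
    execution: str            # e.g. "sequential"


def parse_probe_id(probe_id: str) -> ProbeId:
    """Split a probe ID into subject, technique, and execution template."""
    rest = probe_id

    # Assumption: an ID with no recognised suffix is a single-shot probe.
    execution = "single-shot"
    for suffix, label in EXECUTION_SUFFIXES.items():
        if rest.endswith("_" + suffix):
            rest = rest[: -(len(suffix) + 1)]
            execution = label
            break

    # The technique is the last token before the execution suffix.
    technique = None
    for token in TECHNIQUES:
        if rest.endswith("_" + token):
            rest = rest[: -(len(token) + 1)]
            technique = token
            break

    return ProbeId(subject=rest, technique=technique, execution=execution)


if __name__ == "__main__":
    for pid in (
        "SECURITY_JAILBREAK_BASE64_SEQUENTIAL_JAILBREAKING",
        "SECURITY_IDENTITY_SPOOFING_ROLEPLAY_BAD_LIKERT_JUDGE",
        "SECURITY_XSS_BASE64",
    ):
        print(parse_probe_id(pid))
```

Stripping the execution suffix before looking for the technique keeps IDs such as `SECURITY_CONTEXT_POISONING_CONTEXT_POISONING_TREE_JAILBREAKING` unambiguous: only the trailing `CONTEXT_POISONING` is read as the technique, and the earlier one stays in the subject.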

Note: The lists below are a subset of Akto’s full agentic probe library (4,000+ probes across MCP and agentic systems).
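Several of the attack families used as headings below (Base64, ROT13, Leetspeak) are text encodings applied to a prompt before it is delivered. As a rough illustration of what those transformations look like, here is a short sketch using only the Python standard library. The helper names and the leetspeak substitution table are hypothetical; the page does not describe how Akto's probes build their payloads, and the sample string is deliberately harmless.

```python
import base64
import codecs

# Hypothetical substitution table; real leetspeak variants differ widely.
LEET_TABLE = str.maketrans(
    {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
)


def to_base64(text: str) -> str:
    """Encode text as Base64 (UTF-8 bytes in, ASCII string out)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def to_rot13(text: str) -> str:
    """Rotate each ASCII letter by 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot_13")


def to_leetspeak(text: str) -> str:
    """Lowercase the text and replace some letters with look-alike digits."""
    return text.lower().translate(LEET_TABLE)


if __name__ == "__main__":
    sample = "hello"
    print(to_base64(sample))     # aGVsbG8=
    print(to_rot13(sample))      # uryyb
    print(to_leetspeak(sample))  # h3ll0
```

Each transformation keeps the meaning of the text recoverable while changing its surface form, which is why these families test whether an agent's safeguards rely on matching the literal wording of a prompt.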

ASI01: Agent Goal Hijack

Base64

  • SECURITY_INDIRECT_PROMPT_INJECTION_BASE64_SEQUENTIAL_JAILBREAKING

  • SECURITY_JAILBREAK_BASE64_SEQUENTIAL_JAILBREAKING

Context Poisoning

  • BUSINESS_ALIGNMENT_COMPETITOR_CHECK_CONTEXT_POISONING_SEQUENTIAL_JAILBREAKING

Goal Redirection

  • SECURITY_JAILBREAK_GOAL_REDIRECTION_CRESCENDO_JAILBREAKING

Input Bypass

  • SECURITY_PROMPT_INJECTION_INPUT_BYPASS_TREE_JAILBREAKING

Leetspeak

  • SECURITY_INDIRECT_PROMPT_INJECTION_LEETSPEAK_SEQUENTIAL_JAILBREAKING

  • SECURITY_JAILBREAK_LEETSPEAK_SEQUENTIAL_JAILBREAKING

  • SECURITY_PROMPT_INJECTION_LEETSPEAK_SEQUENTIAL_JAILBREAKING

Math Problem

  • SECURITY_SYSTEM_PROMPT_OVERRIDE_MATH_PROBLEM_CRESCENDO_JAILBREAKING

Multilingual

  • SECURITY_INDIRECT_PROMPT_INJECTION_MULTILINGUAL_SEQUENTIAL_JAILBREAKING

  • SECURITY_JAILBREAK_MULTILINGUAL_SEQUENTIAL_JAILBREAKING

  • SECURITY_PROMPT_INJECTION_MULTILINGUAL_SEQUENTIAL_JAILBREAKING

Permission Escalation

  • SECURITY_PROMPT_INJECTION_PERMISSION_ESCALATION_SEQUENTIAL_JAILBREAKING

Prompt Injection

  • SECURITY_INDIRECT_PROMPT_INJECTION_PROMPT_INJECTION_CRESCENDO_JAILBREAKING

ROT13

  • SECURITY_INDIRECT_PROMPT_INJECTION_ROT13_SEQUENTIAL_JAILBREAKING

  • SECURITY_JAILBREAK_ROT13_SEQUENTIAL_JAILBREAKING

  • SECURITY_PROMPT_INJECTION_ROT13_SEQUENTIAL_JAILBREAKING

Roleplay

  • BUSINESS_ALIGNMENT_INTENTIONAL_MISUSE_ROLEPLAY_TREE_JAILBREAKING

Semantic Manipulation

  • SECURITY_AGENT_BEHAVIOR_HIJACK_AND_GOAL_MANIPULATION_SEMANTIC_MANIPULATION_TREE_JAILBREAKING

System Override

  • SECURITY_MANIPULATION_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING


ASI02: Tool Misuse and Exploitation

Context Poisoning

  • SECURITY_INSECURE_PLUGIN_USE_CONTEXT_POISONING_TREE_JAILBREAKING

  • SECURITY_TOOL_DISCOVERY_CONTEXT_POISONING_TREE_JAILBREAKING

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_CONTEXT_POISONING_LINEAR_JAILBREAKING

Goal Redirection

  • SECURITY_INSECURE_PLUGIN_USE_GOAL_REDIRECTION_LINEAR_JAILBREAKING

  • SECURITY_TOOL_DISCOVERY_GOAL_REDIRECTION_TREE_JAILBREAKING

Input Bypass

  • SECURITY_MANIPULATION_INPUT_BYPASS

Leetspeak

  • SECURITY_TOOL_DISCOVERY_LEETSPEAK_LINEAR_JAILBREAKING

Math Problem

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_MATH_PROBLEM_TREE_JAILBREAKING

Multilingual

  • SECURITY_TOOL_DISCOVERY_MULTILINGUAL_LINEAR_JAILBREAKING

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_MULTILINGUAL_TREE_JAILBREAKING

Permission Escalation

  • SECURITY_INSECURE_PLUGIN_USE_PERMISSION_ESCALATION_TREE_JAILBREAKING

Prompt Injection

  • SECURITY_TOOL_DISCOVERY_PROMPT_INJECTION_LINEAR_JAILBREAKING

Roleplay

  • SECURITY_INSECURE_PLUGIN_USE_ROLEPLAY_LINEAR_JAILBREAKING

  • SECURITY_INSECURE_PLUGIN_USE_ROLEPLAY_TREE_JAILBREAKING

  • SECURITY_TOOL_DISCOVERY_ROLEPLAY_LINEAR_JAILBREAKING

  • SECURITY_TOOL_DISCOVERY_ROLEPLAY_TREE_JAILBREAKING

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_ROLEPLAY_LINEAR_JAILBREAKING

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_ROLEPLAY_TREE_JAILBREAKING

Semantic Manipulation

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_SEMANTIC_MANIPULATION_TREE_JAILBREAKING

System Override

  • SECURITY_TOOL_MISUSE_AND_EXPLOITATION_SYSTEM_OVERRIDE


ASI03: Identity and Privilege Abuse

Base64

  • SECURITY_IDENTITY_SPOOFING_BASE64_CRESCENDO_JAILBREAKING

  • SECURITY_SENSITIVE_INFORMATION_DISCLOSURE_BASE64_CRESCENDO_JAILBREAKING

Context Poisoning

  • SAFETY_PII_CONTEXT_POISONING_CRESCENDO_JAILBREAKING

Goal Redirection

  • SECURITY_CROSS_SESSION_LEAKS_GOAL_REDIRECTION_CRESCENDO_JAILBREAKING

  • SECURITY_IDENTITY_SPOOFING_GOAL_REDIRECTION_CRESCENDO_JAILBREAKING

Input Bypass

  • SECURITY_CROSS_SESSION_LEAKS_INPUT_BYPASS_SEQUENTIAL_JAILBREAKING

Leetspeak

  • SAFETY_PII_LEETSPEAK_CRESCENDO_JAILBREAKING

  • SAFETY_PRIVACY_VIOLATION_LEETSPEAK_CRESCENDO_JAILBREAKING

Math Problem

  • SAFETY_PII_MATH_PROBLEM_CRESCENDO_JAILBREAKING

  • SECURITY_SENSITIVE_INFORMATION_DISCLOSURE_MATH_PROBLEM_CRESCENDO_JAILBREAKING

Multilingual

  • SAFETY_PRIVACY_VIOLATION_MULTILINGUAL_CRESCENDO_JAILBREAKING

  • SECURITY_IDENTITY_SPOOFING_MULTILINGUAL_CRESCENDO_JAILBREAKING

  • SECURITY_SENSITIVE_INFORMATION_DISCLOSURE_MULTILINGUAL_CRESCENDO_JAILBREAKING

Permission Escalation

  • SAFETY_PII_PERMISSION_ESCALATION_CRESCENDO_JAILBREAKING

ROT13

  • SAFETY_PRIVACY_VIOLATION_ROT13_CRESCENDO_JAILBREAKING

  • SECURITY_IDENTITY_SPOOFING_ROT13_CRESCENDO_JAILBREAKING

  • SECURITY_SENSITIVE_INFORMATION_DISCLOSURE_ROT13_CRESCENDO_JAILBREAKING

Roleplay

  • SECURITY_IDENTITY_SPOOFING_ROLEPLAY_BAD_LIKERT_JUDGE

Semantic Manipulation

  • SECURITY_SENSITIVE_INFORMATION_DISCLOSURE_SEMANTIC_MANIPULATION_BAD_LIKERT_JUDGE

System Override

  • SAFETY_PRIVACY_VIOLATION_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING


ASI04: Agentic Supply Chain Vulnerabilities

Base64

  • SECURITY_MALICIOUS_RESOURCE_FETCHING_BASE64_SEQUENTIAL_JAILBREAKING

Context Poisoning

  • SECURITY_MALICIOUS_RESOURCE_FETCHING_CONTEXT_POISONING_CRESCENDO_JAILBREAKING

Goal Redirection

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_GOAL_REDIRECTION_SEQUENTIAL_JAILBREAKING

Input Bypass

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_INPUT_BYPASS_SEQUENTIAL_JAILBREAKING

Leetspeak

  • SECURITY_TRAINING_DATA_POISONING_LEETSPEAK_TREE_JAILBREAKING

Multilingual

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_MULTILINGUAL_CRESCENDO_JAILBREAKING

Permission Escalation

  • SECURITY_MALICIOUS_RESOURCE_FETCHING_PERMISSION_ESCALATION_TREE_JAILBREAKING

  • SECURITY_TRAINING_DATA_POISONING_PERMISSION_ESCALATION_TREE_JAILBREAKING

Prompt Injection

  • SECURITY_TRAINING_DATA_POISONING_PROMPT_INJECTION_TREE_JAILBREAKING

Roleplay

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_ROLEPLAY_CRESCENDO_JAILBREAKING

  • SECURITY_TRAINING_DATA_POISONING_ROLEPLAY_CRESCENDO_JAILBREAKING

System Override

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_SYSTEM_OVERRIDE_CRESCENDO_JAILBREAKING

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING

  • SECURITY_AGENTIC_SUPPLY_CHAIN_VULNERABILITIES_SYSTEM_OVERRIDE_TREE_JAILBREAKING

  • SECURITY_MALICIOUS_RESOURCE_FETCHING_SYSTEM_OVERRIDE_CRESCENDO_JAILBREAKING

  • SECURITY_MALICIOUS_RESOURCE_FETCHING_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING

  • SECURITY_MALICIOUS_RESOURCE_FETCHING_SYSTEM_OVERRIDE_TREE_JAILBREAKING

  • SECURITY_TRAINING_DATA_POISONING_SYSTEM_OVERRIDE_CRESCENDO_JAILBREAKING

  • SECURITY_TRAINING_DATA_POISONING_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING

  • SECURITY_TRAINING_DATA_POISONING_SYSTEM_OVERRIDE_TREE_JAILBREAKING


ASI05: Unexpected Code Execution (RCE)

Base64

  • SECURITY_XSS_BASE64

Context Poisoning

  • SECURITY_CODE_EXECUTION_CONTEXT_POISONING_TREE_JAILBREAKING

Input Bypass

  • SECURITY_UNEXPECTED_CODE_EXECUTION_RCE_INPUT_BYPASS_TREE_JAILBREAKING

Math Problem

  • SECURITY_WEB_INJECTION_MATH_PROBLEM_TREE_JAILBREAKING

Permission Escalation

  • SECURITY_CODE_EXECUTION_PERMISSION_ESCALATION_LINEAR_JAILBREAKING

Prompt Injection

  • SECURITY_CODE_EXECUTION_PROMPT_INJECTION

ROT13

  • SECURITY_CODE_EXECUTION_ROT13_LINEAR_JAILBREAKING

  • SECURITY_CODE_EXECUTION_ROT13_TREE_JAILBREAKING

  • SECURITY_WEB_INJECTION_ROT13_LINEAR_JAILBREAKING

  • SECURITY_WEB_INJECTION_ROT13_TREE_JAILBREAKING

  • SECURITY_XSS_ROT13_LINEAR_JAILBREAKING

  • SECURITY_XSS_ROT13_TREE_JAILBREAKING

Roleplay

  • SECURITY_CODE_EXECUTION_ROLEPLAY_LINEAR_JAILBREAKING

  • SECURITY_CODE_EXECUTION_ROLEPLAY_TREE_JAILBREAKING

  • SECURITY_WEB_INJECTION_ROLEPLAY_LINEAR_JAILBREAKING

  • SECURITY_WEB_INJECTION_ROLEPLAY_TREE_JAILBREAKING

  • SECURITY_XSS_ROLEPLAY_LINEAR_JAILBREAKING

  • SECURITY_XSS_ROLEPLAY_TREE_JAILBREAKING

Semantic Manipulation

  • SECURITY_XSS_SEMANTIC_MANIPULATION_TREE_JAILBREAKING

System Override

  • SECURITY_WEB_INJECTION_SYSTEM_OVERRIDE_LINEAR_JAILBREAKING


ASI06: Memory & Context Poisoning

Base64

  • SECURITY_CONTEXT_LEAKAGE_BASE64_SEQUENTIAL_JAILBREAKING

Context Poisoning

  • SECURITY_CONTEXT_LEAKAGE_CONTEXT_POISONING_CRESCENDO_JAILBREAKING

  • SECURITY_CONTEXT_LEAKAGE_CONTEXT_POISONING_TREE_JAILBREAKING

  • SECURITY_CONTEXT_POISONING_CONTEXT_POISONING_CRESCENDO_JAILBREAKING

  • SECURITY_CONTEXT_POISONING_CONTEXT_POISONING_SEQUENTIAL_JAILBREAKING

  • SECURITY_CONTEXT_POISONING_CONTEXT_POISONING_TREE_JAILBREAKING

  • SECURITY_RAG_POISONING_CONTEXT_POISONING_CRESCENDO_JAILBREAKING

  • SECURITY_RAG_POISONING_CONTEXT_POISONING_SEQUENTIAL_JAILBREAKING

  • SECURITY_RAG_POISONING_CONTEXT_POISONING_TREE_JAILBREAKING

Input Bypass

  • SECURITY_RAG_POISONING_INPUT_BYPASS_CRESCENDO_JAILBREAKING

Multilingual

  • SECURITY_CONTEXT_POISONING_MULTILINGUAL_TREE_JAILBREAKING

Prompt Injection

  • SECURITY_CONTEXT_LEAKAGE_PROMPT_INJECTION_CRESCENDO_JAILBREAKING

  • SECURITY_CONTEXT_LEAKAGE_PROMPT_INJECTION_SEQUENTIAL_JAILBREAKING

  • SECURITY_CONTEXT_LEAKAGE_PROMPT_INJECTION_TREE_JAILBREAKING

  • SECURITY_CONTEXT_POISONING_PROMPT_INJECTION_CRESCENDO_JAILBREAKING

  • SECURITY_CONTEXT_POISONING_PROMPT_INJECTION_SEQUENTIAL_JAILBREAKING

  • SECURITY_CONTEXT_POISONING_PROMPT_INJECTION_TREE_JAILBREAKING

  • SECURITY_RAG_POISONING_PROMPT_INJECTION_CRESCENDO_JAILBREAKING

  • SECURITY_RAG_POISONING_PROMPT_INJECTION_SEQUENTIAL_JAILBREAKING

  • SECURITY_RAG_POISONING_PROMPT_INJECTION_TREE_JAILBREAKING


ASI07: Insecure Inter-Agent Communication

Base64

  • SECURITY_DATA_EXFILTRATION_BASE64_TREE_JAILBREAKING

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_BASE64_SEQUENTIAL_JAILBREAKING

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_BASE64_TREE_JAILBREAKING

Input Bypass

  • SECURITY_DATA_EXFILTRATION_INPUT_BYPASS_CRESCENDO_JAILBREAKING

Leetspeak

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_LEETSPEAK_SEQUENTIAL_JAILBREAKING

Permission Escalation

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_PERMISSION_ESCALATION_SEQUENTIAL_JAILBREAKING

Prompt Injection

  • SECURITY_CROSS_SESSION_LEAKS_PROMPT_INJECTION

  • SECURITY_CROSS_SESSION_LEAKS_PROMPT_INJECTION_BAD_LIKERT_JUDGE

  • SECURITY_CROSS_SESSION_LEAKS_PROMPT_INJECTION_CRESCENDO_JAILBREAKING

  • SECURITY_CROSS_SESSION_LEAKS_PROMPT_INJECTION_LINEAR_JAILBREAKING

  • SECURITY_CROSS_SESSION_LEAKS_PROMPT_INJECTION_SEQUENTIAL_JAILBREAKING

  • SECURITY_CROSS_SESSION_LEAKS_PROMPT_INJECTION_TREE_JAILBREAKING

ROT13

  • SECURITY_DATA_EXFILTRATION_ROT13_TREE_JAILBREAKING

Semantic Manipulation

  • SECURITY_DATA_EXFILTRATION_SEMANTIC_MANIPULATION_TREE_JAILBREAKING

System Override

  • SECURITY_DATA_EXFILTRATION_SYSTEM_OVERRIDE_CRESCENDO_JAILBREAKING

  • SECURITY_DATA_EXFILTRATION_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING

  • SECURITY_DATA_EXFILTRATION_SYSTEM_OVERRIDE_TREE_JAILBREAKING

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_SYSTEM_OVERRIDE_CRESCENDO_JAILBREAKING

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING

  • SECURITY_INSECURE_INTER_AGENT_COMMUNICATION_SYSTEM_OVERRIDE_TREE_JAILBREAKING


ASI08: Cascading Failures

Base64

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_BASE64_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_BASE64_TREE_JAILBREAKING

Context Poisoning

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_CONTEXT_POISONING_SEQUENTIAL_JAILBREAKING

Goal Redirection

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_GOAL_REDIRECTION_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_GOAL_REDIRECTION_TREE_JAILBREAKING

Input Bypass

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_INPUT_BYPASS_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_INPUT_BYPASS_TREE_JAILBREAKING

Leetspeak

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_LEETSPEAK_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_LEETSPEAK_TREE_JAILBREAKING

Math Problem

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_MATH_PROBLEM_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_MATH_PROBLEM_TREE_JAILBREAKING

Multilingual

  • SECURITY_MODEL_DENIAL_OF_SERVICE_MULTILINGUAL_CRESCENDO_JAILBREAKING

Prompt Injection

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_PROMPT_INJECTION_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_PROMPT_INJECTION_TREE_JAILBREAKING

ROT13

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_ROT13_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_ROT13_TREE_JAILBREAKING

Roleplay

  • SECURITY_MODEL_DENIAL_OF_SERVICE_ROLEPLAY_TREE_JAILBREAKING

Semantic Manipulation

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_SEMANTIC_MANIPULATION_TREE_JAILBREAKING

System Override

  • HALLUCINATION_AND_TRUSTWORTHINESS_HALLUCINATION_PROPAGATION_SYSTEM_OVERRIDE_TREE_JAILBREAKING

  • SECURITY_MODEL_DENIAL_OF_SERVICE_SYSTEM_OVERRIDE_TREE_JAILBREAKING


ASI09: Human-Agent Trust Exploitation

Base64

  • BUSINESS_ALIGNMENT_HUMAN_AGENT_TRUST_EXPLOITATION_BASE64_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_BASE64_CRESCENDO_JAILBREAKING

  • SECURITY_OVERRELIANCE_BASE64_BAD_LIKERT_JUDGE

  • SECURITY_OVERRELIANCE_BASE64_SEQUENTIAL_JAILBREAKING

Goal Redirection

  • SECURITY_OVERRELIANCE_GOAL_REDIRECTION_SEQUENTIAL_JAILBREAKING

Input Bypass

  • BUSINESS_ALIGNMENT_HUMAN_AGENT_TRUST_EXPLOITATION_INPUT_BYPASS_SEQUENTIAL_JAILBREAKING

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_INPUT_BYPASS_SEQUENTIAL_JAILBREAKING

  • SECURITY_OVERRELIANCE_INPUT_BYPASS_BAD_LIKERT_JUDGE

Prompt Injection

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_PROMPT_INJECTION_SEQUENTIAL_JAILBREAKING

Roleplay

  • BUSINESS_ALIGNMENT_HUMAN_AGENT_TRUST_EXPLOITATION_ROLEPLAY_BAD_LIKERT_JUDGE

Semantic Manipulation

  • BUSINESS_ALIGNMENT_HUMAN_AGENT_TRUST_EXPLOITATION_SEMANTIC_MANIPULATION_BAD_LIKERT_JUDGE

  • BUSINESS_ALIGNMENT_HUMAN_AGENT_TRUST_EXPLOITATION_SEMANTIC_MANIPULATION_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_HUMAN_AGENT_TRUST_EXPLOITATION_SEMANTIC_MANIPULATION_SEQUENTIAL_JAILBREAKING

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_SEMANTIC_MANIPULATION_BAD_LIKERT_JUDGE

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_SEMANTIC_MANIPULATION_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_SEMANTIC_MANIPULATION_SEQUENTIAL_JAILBREAKING

  • SECURITY_OVERRELIANCE_SEMANTIC_MANIPULATION_BAD_LIKERT_JUDGE

  • SECURITY_OVERRELIANCE_SEMANTIC_MANIPULATION_CRESCENDO_JAILBREAKING

  • SECURITY_OVERRELIANCE_SEMANTIC_MANIPULATION_SEQUENTIAL_JAILBREAKING

System Override

  • BUSINESS_ALIGNMENT_OVERWHELMING_HUMAN_IN_THE_LOOP_SYSTEM_OVERRIDE_CRESCENDO_JAILBREAKING


ASI10: Rogue Agents

Base64

  • BUSINESS_ALIGNMENT_OFF_TOPIC_BASE64_CRESCENDO_JAILBREAKING

  • SECURITY_EXCESSIVE_AGENCY_BASE64_CRESCENDO_JAILBREAKING

Context Poisoning

  • SECURITY_EXCESSIVE_AGENCY_CONTEXT_POISONING_CRESCENDO_JAILBREAKING

Goal Redirection

  • BUSINESS_ALIGNMENT_LEGALLY_BINDING_GOAL_REDIRECTION_CRESCENDO_JAILBREAKING

Input Bypass

  • BUSINESS_ALIGNMENT_OFF_TOPIC_INPUT_BYPASS_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_ROGUE_AGENTS_INPUT_BYPASS_CRESCENDO_JAILBREAKING

  • SECURITY_EXCESSIVE_AGENCY_INPUT_BYPASS_CRESCENDO_JAILBREAKING

Leetspeak

  • BUSINESS_ALIGNMENT_LEGALLY_BINDING_LEETSPEAK_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_ROGUE_AGENTS_LEETSPEAK_CRESCENDO_JAILBREAKING

Multilingual

  • BUSINESS_ALIGNMENT_LEGALLY_BINDING_MULTILINGUAL_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_ROGUE_AGENTS_MULTILINGUAL_CRESCENDO_JAILBREAKING

Permission Escalation

  • BUSINESS_ALIGNMENT_LEGALLY_BINDING_PERMISSION_ESCALATION_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_OFF_TOPIC_PERMISSION_ESCALATION_CRESCENDO_JAILBREAKING

Prompt Injection

  • BUSINESS_ALIGNMENT_MISALIGNED_AND_DECEPTIVE_BEHAVIORS_PROMPT_INJECTION_SEQUENTIAL_JAILBREAKING

ROT13

  • BUSINESS_ALIGNMENT_LEGALLY_BINDING_ROT13_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_MISALIGNED_AND_DECEPTIVE_BEHAVIORS_ROT13_CRESCENDO_JAILBREAKING

  • BUSINESS_ALIGNMENT_ROGUE_AGENTS_ROT13_CRESCENDO_JAILBREAKING

Roleplay

  • BUSINESS_ALIGNMENT_OFF_TOPIC_ROLEPLAY_TREE_JAILBREAKING

Semantic Manipulation

  • SECURITY_EXCESSIVE_AGENCY_SEMANTIC_MANIPULATION_TREE_JAILBREAKING

System Override

  • BUSINESS_ALIGNMENT_ROGUE_AGENTS_SYSTEM_OVERRIDE_SEQUENTIAL_JAILBREAKING
