Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming
arXiv cs.CR (Cryptography & Security)
75 items · Content Provenance, Authenticity & AI-Disclosure
Search-Time Contamination in Deep Research Agents: Measuring Performance Inflation in Public Benchmark Evaluation
From Attack Simulation to SIEM Rule: Deterministic Detection-as-Code Synthesis with Probe-Level Traceability
Willing but Unable: Separating Refusal from Capability in Code LLMs via Abliteration
A formal framework for the economic security of DeFi compositions
Policy-Compliant Cloud Storage Systems
CRESS: Quantifying Vulnerabilities of Attack Scenarios in Hardware Reverse Engineering
SHIELDS: Automating OS Hardening with Iterative Multi-Agent Remediation
Bitcoin After Block Rewards
ZERO-APT: A Closed-Loop Adversarial Framework for LLM-Driven Automated Penetration Testing under Intelligent Defense
Dimensionality Reduction for Cyberattack Classification: A Comparative Evaluation of PCA and Linear Predictive Coding
The Coverage Gap: Chile's Cyber Disclosure Framework versus the USA, EU and UK
SlotGCG: Exploiting the Positional Vulnerability in LLMs for Jailbreak Attacks
Protecting K-Nearest Neighbor Queries from Location Inference Attacks
Cognitive Threat Intelligence and Explainable Federated Security Analytics for Distributed Infrastructure Systems
MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models
Need to Know: Contextual-Integrity-Grounded Query Rewriting for Privacy-Conscious LLM Delegation
Bayesian Membership Privacy for Graph Neural Networks
Covert Influence Between Language Models
Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents
MimeLens: Position-Agnostic Content-Type Detection for Binary Fragments
Notarized Agents: Receiver-Attested Confidential Receipts for AI Agent Actions
Long-Term and Short-Term Transistor Aging in Deep Neural Networks: Impact and Mitigation
Formal verification of the S-two AIR
Toward a Generalized Defense Across Sparse, Continuous, and Structured Parameter Attacks
From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents
TITAN-FedAnil+: Trust-Based Adaptive Blockchain Federated Learning for Resource-Constrained Intelligent Enterprises
Pepper: High-bandwidth and Scalable Anonymous Broadcast with Cryptographic Privacy
What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems
What Can Verifiable Decapsulation Tests Certify? Pass Bounds and Fault-Recognition Limits for FO-Based KEMs
MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety
D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting
Inference Cost Attacks for Retrieval-Augmented Large Language Models
A New Framework for Cybersecurity Refusals in AI Agents
What You Approve Is What Executes: Consent Integrity for Black-Box LLM Agents
Cross-Vendor Sola ISPM Benchmark: Evaluating Agentic AI for Federated Identity Security Reasoning
On Improving Robustness of Deepfake Image Detectors
Which Defense Closes Which Threat? Attributing OWASP-LLM-Top-10 Coverage and Its Brittleness Under Paraphrasing
Large Byte Model: Teaching Language Models About Compiled Code
Human Factors in Cybersecurity in Icelandic Small and Medium-sized Enterprises
Quantifying Side-Channel Leakage in Public Metrology Releases
Echelon: Auditable Aggregate-Only Language-Model Adaptation Across Privacy Boundaries
Patcher: Post-Hoc Patching of Backdoored Large Language Models
Secure AltDA Integration for Ethereum L2s: An End-to-End Validation Framework
SkillGuard: A Permission Framework for Agent Skills
A Survey on Security with Quantum Computing
From Frontier to Shadow AI: A Simmering Threat to Assurance and Security in Critical Infrastructure
XAI-SOH-FL: Enhancing SOH-FL with Adaptive Aggregation and Explainable AI for Intrusion Detection in Heterogeneous IoT
Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models
PrivacyPeek: Auditing What LLM-Based Agents Acquire, Not Just What They Say
A Protocol-Language Model for Network Intrusion (Without Deep Packet Inspection)
DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning
Improving IoT Intrusion Detection Through SMOTE-Based Oversampling and Extended Multi-Model Evaluation on Side-Channel Power Data
Inferring Routing-Layer Defense Mechanisms from Observable Behavior in OLSR-Based MANETs
How to Compare the Security of Code Written by Humans to LLM-generated Code
A Moderatorless Protocol for WEREWOLF
Bit-Exact AI Inference Verification Without Performance Tradeoffs
Beyond Edge Coverage: Per-Task Data-Flow Extraction at Kernel Function Boundaries via LLVM
Stochastic Analysis of Cybersecurity Defense Strategies Under Single Attack Scenario
Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs
Escaping the Linearity Trap: Manifold Detours for Black-Box Adversarial Attacks on Singing Audio Deepfake Detection
The Surface You Test Is Not the Surface That Breaks
Strengthening Polymorphic Prompt Assembling: Dynamic Separator Generation Against Emerging Prompt Injection Attacks
AdvScene: Rethinking Adversarial Patch Evaluation Through Scene Robustness
An Organization-Scoped LLM Agent Runtime Architecture for Regulated Cybersecurity Operations
CacheProbe: Auditing Prompt Cache Isolation in Gateway APIs
Audio Pirates: Black-box Audio Watermark Removal via Diffusion Priors
When AI Meets Wall Street: A Survey on Trustworthy AI in Fintech
Automatically Attacking Software Reverse Engineering AI Agents
Investigating Detection and Obfuscation of Prompt Injection Attacks Against Software Reverse Engineering AI Agents
Depth-Dependent Indirect Prompt Injection in Tool-Calling ReAct Agents: Injection Depth, Payload Framing, and Turn-Budget Sensitivity
Triaging Threats to Specialized Guardrails
FASR: Automated Identification of Unsafe Control Actions in STPA
Differentially Private Preference Data Synthesis for Large Language Model Alignment
Send a SCOUT First: Pre-hoc Reasoning for Adaptive Detector Allocation in Prompt-Injection Defense