Paper • arXiv 2026

Persistent Memory for Agentic Workflows

We introduce a hierarchical memory architecture that achieves 94% retention accuracy across heterogeneous agent sessions. Our approach combines episodic memory buffers with compressed working memory representations, enabling agents to recall and apply knowledge from prior sessions without catastrophic forgetting. We evaluate the approach on 10k+ agent trajectories across code engineering tasks.

K. Chen, A. Patel, M. Johansson • Jun 2026 Read full paper →

Paper • ICML 2026

Tool-Augmented Reasoning at Scale

We show that structured tool-use during chain-of-thought reasoning improves accuracy by 37% on complex software engineering benchmarks. Our method interleaves natural language reasoning with API calls to compilers, linters, test runners, and documentation retrievers within a unified thought loop.

L. Wang, R. Kumar, S. Torres • May 2026 Read full paper →
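
As a rough illustration of what interleaving reasoning with tool calls in a single loop can look like, here is a minimal sketch. It is not the paper's implementation: the `thought_loop` function, the scripted policy, and the toy `linter` tool are all hypothetical names invented for this example, and a real system would replace the scripted policy with a model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A step is either ("think", text) or (tool_name, argument). Hypothetical format.
Step = Tuple[str, str]
Policy = Callable[[List[str]], Optional[Step]]


def thought_loop(policy: Policy, tools: Dict[str, Callable[[str], str]],
                 max_steps: int = 8) -> List[str]:
    """Alternate between reasoning steps and tool calls in one transcript."""
    transcript: List[str] = []
    for _ in range(max_steps):
        step = policy(transcript)  # the policy sees everything so far
        if step is None:           # the policy signals it is finished
            break
        kind, payload = step
        if kind == "think":
            transcript.append(f"thought: {payload}")
        else:
            # Tool output is appended so later steps can condition on it.
            result = tools[kind](payload)
            transcript.append(f"{kind}({payload}) -> {result}")
    return transcript


def scripted_policy(script: List[Step]) -> Policy:
    """Stand-in for a model: replays a fixed list of steps."""
    steps = iter(script)
    return lambda transcript: next(steps, None)


# Toy tool (assumption): flags tab characters, otherwise reports "ok".
tools = {"linter": lambda src: "tab found" if "\t" in src else "ok"}
script = [("think", "check style first"),
          ("linter", "x = 1"),
          ("think", "style is clean")]
transcript = thought_loop(scripted_policy(script), tools)
```

The point of the sketch is only that thoughts and tool results share one transcript, so each new step can depend on earlier tool output.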

Model • 2026

Nexus-1: Agent Foundation Model

Our flagship foundation model achieves state-of-the-art results on SWE-Bench, Tool-Use, and Multi-Step Reasoning benchmarks. Nexus-1 is trained on 3M+ trajectories of tool-mediated problem solving using a novel two-stage curriculum: supervised fine-tuning on expert demonstrations followed by RL from tool-use feedback.

Nexus Research Team • Apr 2026 Model card →

Whitepaper • 2025

The Case for Agent-Native Infrastructure

Why the next generation of AI requires a fundamentally new runtime. We analyze the limitations of chat-based interfaces layered on legacy systems and propose a set of architectural principles for building infrastructure designed for autonomous agents from the ground up.

Technical report • 48 pages Read whitepaper →

Blog • 2026

Evaluating Agent Reliability: A Practical Framework

A systematic methodology for measuring and improving the reliability of autonomous AI agents in production. We introduce coverage metrics, failure mode taxonomies, and a continuous evaluation pipeline that runs against every deployment.

Nexus Engineering Blog • Mar 2026 Read framework →

Blog • 2026

Building Agents That Remember: Lessons from Production

Engineering lessons from deploying persistent memory across 10k+ agent sessions in production. We cover memory compaction strategies, retrieval latency, conflict resolution, and the surprising failure modes of long-lived agents.

Nexus Engineering Blog • Feb 2026 Read lessons →

Workshop • 2026

Multi-Agent Coordination via Shared Memory Graphs

We extend persistent memory to multi-agent settings, showing that shared memory graphs enable teams of agents to coordinate, delegate, and resolve conflicts without centralized orchestration.

NeurIPS 2026 Workshop on Agentic AI Read paper →

Preprint • 2026

Safety-Critical Agent Behavior via Constrained Decoding

A method for enforcing operational constraints during agentic decoding, guaranteeing that generated actions satisfy pre-defined safety policies without requiring post-hoc filtering.

Nexus Safety Team • Jan 2026 Read preprint →
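
One common way to enforce constraints at decoding time rather than after it is to mask out disallowed candidates before selection. The sketch below shows that general idea only; it is not the preprint's method, and `constrained_argmax`, the `no_delete` policy, and the example scores are hypothetical.

```python
import math
from typing import Callable, Dict


def constrained_argmax(scores: Dict[str, float],
                       is_allowed: Callable[[str], bool]) -> str:
    """Return the highest-scoring action among those the policy allows."""
    # Disallowed actions get a score of -inf, so they can never be selected.
    masked = {action: (score if is_allowed(action) else -math.inf)
              for action, score in scores.items()}
    if not masked:
        raise ValueError("no candidate actions")
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no action satisfies the policy")
    return best


# Hypothetical safety policy: never run a delete command.
def no_delete(action: str) -> bool:
    return not action.startswith("rm ")


scores = {"rm -rf build": 2.5, "make clean": 1.7, "ls": 0.3}
choice = constrained_argmax(scores, no_delete)
```

Because the mask is applied before the choice is made, the selected action satisfies the policy by construction, so no separate filtering pass over the output is needed in this toy setting.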