Buffaly: Agents That Remember in Code
A different kind of agent: one that turns language into executable structure instead of keeping everything in text prompts.
A Different Kind of Agent
Buffaly is different enough that it may take a minute to understand.
Most agents are built around prompts, tool calls, and text-based models. They improve by adding more context, more retrieval, more model calls. Buffaly takes a different path.
It is not an LLM wrapper. It is built on years of work in language understanding, semantic representation, prototype graphs, and real-world usage. Memory in Buffaly is a first-class, executable substrate. Instead of remembering in text, Buffaly remembers in code.
Truly different AI software is rare right now because most AI products are racing along the same path. Bigger prompts. More tools. More retrieval. More context. More model calls.
Buffaly is exploring a different path.
What I am showing today is only an early preview of what becomes possible when we stop assuming that agents have to live entirely inside text prompts. If the current generation of agents is built around conversation plus tools, Buffaly is built around structured execution plus learning. That difference is harder to explain in one sentence, but it is exactly why it matters.
Most agents remember in text. Buffaly remembers in code.
Instead of trying to explain the theory first, let’s start with examples and really pull back the curtain.
All examples below come from real usage in remote care and revenue cycle management.
1. Semantic Entities and Prototypes: Turning Language into Structured, Usable Knowledge
Memory in Buffaly is a first-class feature. It is the core of everything. So let’s start by showing how Buffaly might remember a basic fact:
APCM is a remote care program.
A normal agent might store that as a note, retrieve it later as text, or include it in a prompt. That can help the model remember what APCM means, but the sentence is still just language. The model has to reinterpret it every time it matters.
In Buffaly, that sentence can become a typed object in the system.
[SemanticEntity("care program")]
prototype CareProgram : BaseObject
{
    String ProgramCode = new String();
    String DisplayName = new String();
    String Description = new String();
}
[SemanticEntity("remote care program")]
prototype RemoteCareProgram : CareProgram
{
    String ProgramCategory = "Remote Care";
}
[SemanticEntity("APCM")]
[SemanticEntity("advanced primary care management")]
partial prototype CareProgram#APCM : RemoteCareProgram
{
    ProgramCode = "APCM";
    DisplayName = "Advanced Primary Care Management";
    Description = "A remote care program used in medical administrative workflows.";
}
That is the first important step. The phrase “APCM is a remote care program” has moved out of plain text and into a structure Buffaly can actually use.
This difference matters. Buffaly has not merely stored a sentence. It now has a typed object with identity, fields, and natural language bindings. Later, when a user asks about APCM, Buffaly does not have to rediscover what APCM means from a prompt. It can resolve the phrase to CareProgram#APCM and use that object directly.
This is the first hint of why ProtoScript matters. It gives Buffaly a way to turn language into structured working knowledge, instead of leaving everything as text for the model to reinterpret again and again.
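To make the resolution step concrete, here is a minimal Python sketch of a phrase-to-object index. It is not ProtoScript and not Buffaly's implementation: the `Prototype` and `EntityRegistry` names are hypothetical, and the lookup is a plain case-insensitive dictionary match rather than the embedding-based discovery Buffaly uses. It only shows the shape of the idea: once a phrase is bound to a typed object, later mentions resolve to that object instead of being reinterpreted as text.

```python
from dataclasses import dataclass, field


@dataclass
class Prototype:
    """A typed object with an identity and natural-language bindings."""
    identity: str                      # e.g. "CareProgram#APCM"
    base: str                          # e.g. "RemoteCareProgram"
    fields: dict = field(default_factory=dict)
    phrases: tuple = ()                # phrases bound to this object


class EntityRegistry:
    """Hypothetical phrase index: maps bound phrases to prototypes."""

    def __init__(self):
        self._by_phrase = {}

    def register(self, proto):
        for phrase in proto.phrases:
            self._by_phrase[phrase.lower()] = proto

    def resolve(self, phrase):
        # Exact, case-insensitive lookup; returns None when nothing is bound.
        return self._by_phrase.get(phrase.strip().lower())


registry = EntityRegistry()
registry.register(Prototype(
    identity="CareProgram#APCM",
    base="RemoteCareProgram",
    fields={
        "ProgramCode": "APCM",
        "DisplayName": "Advanced Primary Care Management",
        "ProgramCategory": "Remote Care",
    },
    phrases=("APCM", "advanced primary care management"),
))

apcm = registry.resolve("Advanced Primary Care Management")
print(apcm.identity, apcm.fields["ProgramCategory"])
```

Both bound phrases resolve to the same object, so anything attached to `CareProgram#APCM` later is available no matter which phrase the user typed.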
2. Incremental Refinement: Memory That Becomes More Useful Over Time
Buffaly does not require every object to be perfectly modeled on day one. That is important. In real medical administration, the system often starts with partial knowledge and refines it as repeated work reveals what matters.
A care program might begin with a shallow representation:
partial prototype CareProgram#APCM : RemoteCareProgram
{
    ProgramCode = "APCM";
    BillableCodes = ["G0556", "G0557", "G0558"];
}
That is already better than leaving the information buried in prose, but it is still shallow. The codes are just strings. The system can list them, but it does not yet know much about them.
As Buffaly sees those codes used in eligibility checks, billing readiness checks, claim review, payer exceptions, or denial workflows, it can promote the raw strings into typed objects:
[SemanticEntity("billing code")]
prototype BillingCode : BaseObject
{
    String Code = new String();
    String CodeSystem = new String();
    String Description = new String();
    String Category = new String();
    Bool TimeBased = false;
    String BillingFrequency = new String();
}
[SemanticEntity("APCM level 1")]
[SemanticEntity("G0556")]
partial prototype BillingCode#G0556 : BillingCode
{
    Code = "G0556";
    CodeSystem = "HCPCS";
    Description = "Advanced Primary Care Management level 1.";
    Category = "Advanced Primary Care Management";
    TimeBased = false;
    BillingFrequency = "Monthly";
}
[SemanticEntity("APCM level 2")]
[SemanticEntity("G0557")]
partial prototype BillingCode#G0557 : BillingCode
{
    Code = "G0557";
    CodeSystem = "HCPCS";
    Description = "Advanced Primary Care Management level 2.";
    Category = "Advanced Primary Care Management";
    TimeBased = false;
    BillingFrequency = "Monthly";
}
[SemanticEntity("APCM level 3")]
[SemanticEntity("G0558")]
partial prototype BillingCode#G0558 : BillingCode
{
    Code = "G0558";
    CodeSystem = "HCPCS";
    Description = "Advanced Primary Care Management level 3.";
    Category = "Advanced Primary Care Management";
    TimeBased = false;
    BillingFrequency = "Monthly";
}
partial prototype CareProgram#APCM : RemoteCareProgram
{
    ProgramCode = "APCM";
    BillableCodes = [BillingCode#G0556, BillingCode#G0557, BillingCode#G0558];
}
Now the memory has improved. The system does not just know that APCM uses three code strings. It knows those codes are billing code objects. They have a code system, description, category, billing frequency, and relationship to a care program.
At this point, the system has moved from text memory to usable structure. It can answer questions like:
- Which care programs use G0557?
- Which billable codes are monthly?
- Which programs are non-time-based?
- Which patient-months are eligible for an APCM code?
- Which payer rules modify how this code should be handled?
That is a very different kind of memory from a note in a prompt.
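Some of those questions can be sketched as ordinary queries once the codes are objects. The Python below is an illustration only, not Buffaly's query mechanism: the `BillingCode` and `CareProgram` dataclasses mirror the prototypes above, and the three helper functions are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingCode:
    code: str
    code_system: str
    time_based: bool
    billing_frequency: str


@dataclass(frozen=True)
class CareProgram:
    program_code: str
    billable_codes: tuple  # BillingCode objects, not raw strings


G0556 = BillingCode("G0556", "HCPCS", False, "Monthly")
G0557 = BillingCode("G0557", "HCPCS", False, "Monthly")
G0558 = BillingCode("G0558", "HCPCS", False, "Monthly")
APCM = CareProgram("APCM", (G0556, G0557, G0558))
PROGRAMS = [APCM]


def programs_using(code, programs):
    """Which care programs use the given code?"""
    return [p.program_code for p in programs
            if any(c.code == code for c in p.billable_codes)]


def monthly_codes(programs):
    """Which billable codes are monthly?"""
    return sorted({c.code for p in programs for c in p.billable_codes
                   if c.billing_frequency == "Monthly"})


def non_time_based_programs(programs):
    """Which programs have codes, none of which are time-based?"""
    return [p.program_code for p in programs
            if p.billable_codes
            and not any(c.time_based for c in p.billable_codes)]


print(programs_using("G0557", PROGRAMS))
print(monthly_codes(PROGRAMS))
print(non_time_based_programs(PROGRAMS))
```

None of these answers require a model call. They are field lookups and traversals over typed objects.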
A normal agent would probably store something like:
APCM uses G0556, G0557, and G0558. These are monthly HCPCS codes. Remember to check chronic condition count and QMB status.
That can help, but every future use requires the model to reread, reinterpret, and apply the text. The information is still advisory. It is not a working object model.
Buffaly can promote the information into the runtime. The code becomes an object. The object gets relationships. The relationships support actions. The actions can later be hardened into deterministic checks.
That is the important idea:
Buffaly’s memory does not just grow. It becomes more operational.
What starts as a string can become a typed object. A typed object can gain relationships. A relationship can support a rule. A repeated rule can become an action. A critical action can become hardened native code.
That is how knowledge compounds in Buffaly. Not by adding more text to the prompt, but by improving the shape of the system itself.
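The first step of that chain, a string becoming a typed object, can be sketched in a few lines of Python. This is an assumption-laden illustration: `promote` and `catalog` are hypothetical names, and Buffaly's actual promotion logic is not shown in this article. The point is that promotion can be partial. Codes the system already understands become objects, and the rest stay as strings until more is learned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingCode:
    code: str
    code_system: str
    billing_frequency: str


def promote(raw_codes, catalog):
    """Swap raw code strings for typed objects when the catalog knows them.

    Unknown codes stay as strings, so nothing is lost or invented.
    """
    return [catalog.get(code, code) for code in raw_codes]


# Hypothetical catalog: only two of the three codes have been modeled so far.
catalog = {
    "G0556": BillingCode("G0556", "HCPCS", "Monthly"),
    "G0557": BillingCode("G0557", "HCPCS", "Monthly"),
}

shallow = ["G0556", "G0557", "G0558"]
deeper = promote(shallow, catalog)
print(deeper)
```

Running `promote` again after `G0558` is added to the catalog would complete the refinement without touching anything that already worked.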
3. Equivalent in a Non-Buffaly Text/Prompt-Based Agent
A typical agent would try to capture the same knowledge like this:
System prompt excerpt:
APCM = Advanced Primary Care Management. APCM is billed with monthly HCPCS codes G0556, G0557, and G0558. APCM is not time-based. Code level depends on patient complexity, including chronic condition count and Qualified Medicare Beneficiary status. Consent is required.
RPM = Remote Physiologic Monitoring. Core CPT examples include 99453, 99454, 99457, and 99458. These relate to setup, device supply/data transmission, and treatment management time.
CCM = Chronic Care Management. Core CPT examples include 99490, 99439, 99491, 99437, 99487, and 99489. These include standard CCM, practitioner-performed CCM, and complex CCM codes with time-based thresholds.
RTM = Remote Therapeutic Monitoring. Core CPT examples include 98975, 98976, 98977, 98980, and 98981. These relate to setup, device/data collection, and treatment management time.
When a user mentions any of these programs, remember the definitions, billing codes, and requirements above. Use them when checking eligibility, billing readiness, or documentation completeness. Do not hallucinate additional rules.
Every time the model needs this information, it has to re-read and reinterpret the text. Context windows fill quickly when you add patient data, payer variations, documentation rules, corporate policies, and exceptions. Updates require prompt editing. There is no native identity, no typed traversal, and no way for the system to promote a new requirement into executable structure. It stays as language the model has to parse every time.
That is the core contrast: Buffaly turns domain language into typed, executable objects that the system can reason over natively. A normal agent leaves it as text for the model to reinterpret on every request.
4. “So It’s Basically a Graph Database?”
Yes.
Buffaly uses graph structure. And it uses ontology concepts. And it uses vector embeddings for entity discovery. And it uses a type system. And it uses executable actions. And it uses C# interop. And it uses native objects. It uses tool discovery. It uses provenance. It uses a learning loop.
None of those words are magic by themselves. Buffaly puts them into one working substrate.
In most systems, those pieces live in different places. The graph database stores facts. The vector database retrieves similar text. The tool registry stores tool descriptions. The application code runs somewhere else. The type system belongs to the host language. The prompt explains the rules. The audit log records whatever happened afterward. The model is expected to glue all of that together through text.
Buffaly takes a different bet: put as much of that as possible into a code-like working medium the agent can read, write, inspect, execute, and improve.
That medium is ProtoScript.
ProtoScript is intentionally familiar. It looks enough like ordinary code that an LLM can work with it. It has types, objects, functions, and C#-style syntax. But it also carries semantic annotations, prototype relationships, executable actions, and bridges into native code. So the model is not just editing a prompt or calling tools from a flat list. It is working with the actual material of the runtime.
That matters because LLMs are good at reading and writing code. Code has structure. Code has syntax. Code can be compiled or interpreted. Code can be diffed, reviewed, tested, versioned, and audited. Prompt instructions are slippery. Code gives the system a harder surface.
Most agents remember in text. Buffaly remembers in code.
5. ProtoScript: Where the Structure Starts to Execute
Up to this point, CareProgram#APCM is still mostly structure. It is better than text because the system can resolve it, inspect it, and use it as a typed object, but the bigger shift happens when functions enter the picture.
This is where ProtoScript stops being “a way to represent remote care concepts” and becomes the working medium for the agent.
Tools as Typed Functions
Start with a simple remote care action:
[SemanticProgram.InfinitivePhrase("to check documented minutes for a remote care program")]
prototype ToCheckDocumentedMinutesForRemoteCareProgram : MedicalAdminAction
{
    Description = @"Minutes - documented minutes for the period.
Program - remote care program to evaluate.";
    function Execute(Decimal Minutes, RemoteCareProgram Program) : String
    {
        if (!Program.RequiresDocumentedMinutes)
        {
            return "Ready: this program is not based on documented minutes.";
        }
        if (Minutes >= Program.MinimumMinutes)
        {
            return "Ready: documented minutes meet the program minimum.";
        }
        return "Not ready: documented minutes are below the program minimum.";
    }
}
The important part is the function signature:
function Execute(Decimal Minutes, RemoteCareProgram Program) : String
This tool does not take a vague JSON blob. It takes two typed parameters:
- Minutes: Decimal
- Program: RemoteCareProgram
Buffaly can expose that to the model compactly:
{
    "Tool": "ToCheckDocumentedMinutesForRemoteCareProgram",
    "Parameters": {
        "Minutes": "Decimal",
        "Program": "RemoteCareProgram"
    }
}
The model can call it like this:
{
    "Tool": "ToCheckDocumentedMinutesForRemoteCareProgram",
    "Minutes": 12.5,
    "Program": "RemoteCareProgram#CCM"
}
Or:
{
    "Tool": "ToCheckDocumentedMinutesForRemoteCareProgram",
    "Minutes": 0,
    "Program": "RemoteCareProgram#APCM"
}
Buffaly resolves RemoteCareProgram#CCM or RemoteCareProgram#APCM inside the runtime. The model does not need to paste the whole program definition into the prompt. It passes a typed reference, and the runtime executes the function against the actual object.
A normal tool call usually says: Here is some data. Model, decide what it means.
Buffaly says: Here is a typed function. Here is a typed object. Runtime, execute it.
This is the important transition: the same layer that represents the care program can also define the functions that act on it.
In a normal agent, “APCM is not time-based” is usually text in a prompt, a retrieved note, or a fact from a graph database. The model still has to apply that fact correctly every time.
In Buffaly, that distinction becomes part of the executable structure. The type of the object helps determine which function runs, what checks apply, and what actions are valid next.
That is why ProtoScript matters. It gives the agent a place where medical knowledge can become usable code, not just more context.
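The call path described in this section can be approximated in Python. Treat it as a sketch under stated assumptions, not as the Buffaly runtime: `OBJECTS`, `TOOLS`, and `dispatch` are hypothetical names, and the 20-minute minimum attached to CCM is an illustrative value for the example. What it shows is the division of labor: the model supplies a tool name and a typed reference, and the runtime resolves the reference, checks the type, and runs the function.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class RemoteCareProgram:
    code: str
    requires_documented_minutes: bool
    minimum_minutes: Decimal = Decimal("0")


# Hypothetical object store keyed by typed reference.
# The 20-minute minimum is an illustrative value for this sketch.
OBJECTS = {
    "RemoteCareProgram#CCM": RemoteCareProgram("CCM", True, Decimal("20")),
    "RemoteCareProgram#APCM": RemoteCareProgram("APCM", False),
}


def check_documented_minutes(minutes, program):
    if not program.requires_documented_minutes:
        return "Ready: this program is not based on documented minutes."
    if minutes >= program.minimum_minutes:
        return "Ready: documented minutes meet the program minimum."
    return "Not ready: documented minutes are below the program minimum."


TOOLS = {"ToCheckDocumentedMinutesForRemoteCareProgram": check_documented_minutes}


def dispatch(call):
    """Resolve typed references in a model tool call, then run the function."""
    tool = TOOLS[call["Tool"]]
    program = OBJECTS[call["Program"]]        # KeyError on an unknown reference
    if not isinstance(program, RemoteCareProgram):
        raise TypeError("Program must resolve to a RemoteCareProgram")
    minutes = Decimal(str(call["Minutes"]))   # str() avoids float artifacts
    return tool(minutes, program)


print(dispatch({
    "Tool": "ToCheckDocumentedMinutesForRemoteCareProgram",
    "Minutes": 12.5,
    "Program": "RemoteCareProgram#CCM",
}))
print(dispatch({
    "Tool": "ToCheckDocumentedMinutesForRemoteCareProgram",
    "Minutes": 0,
    "Program": "RemoteCareProgram#APCM",
}))
```

The model never sees `minimum_minutes`. It only needs to know that a `RemoteCareProgram` reference is a valid argument.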
6. Native C# Interop: Real Objects, Not JSON Theater
ProtoScript is not isolated from the host application. It can run against a substrate of imported C# types and C# methods. That is critical.
A Patient does not have to be redefined as a fake ProtoScript object. If the real Patient class already exists in the medical admin system, Buffaly can import that C# type, call C# functions that return it, keep the object in memory, and pass it into other ProtoScript actions.
The result is that ProtoScript prototypes and native C# objects can live side by side in the same workflow.
Example Native C# Model
namespace MedicalAdmin.Runtime;
public sealed class Patient
{
    public string PatientId { get; set; }
    public string MRN { get; set; }
    public string DisplayName { get; set; }
}
public sealed class APCMReadinessResult
{
    public bool ReadyForBilling { get; set; }
    public string BlockingReason { get; set; }
    public string RequiredNextAction { get; set; }
}
public static class PatientLookup
{
    public static Patient FindPatientByMRN(string MRN)
    {
        // Query EMR, billing DB, internal API, cache, etc.
        return PatientRepository.GetByMRN(MRN);
    }
}
public static class APCMReadinessChecker
{
    public static APCMReadinessResult Check(Patient Patient)
    {
        // Native business logic.
        // Can inspect internal objects without exposing them to the LLM.
        return APCMRules.Evaluate(Patient);
    }
}
ProtoScript can import those C# symbols and use them directly:
// illustrative import shape
import MedicalAdmin.Runtime MedicalAdmin.Runtime.Patient Patient;
import MedicalAdmin.Runtime MedicalAdmin.Runtime.APCMReadinessResult APCMReadinessResult;
import MedicalAdmin.Runtime MedicalAdmin.Runtime.PatientLookup PatientLookup;
import MedicalAdmin.Runtime MedicalAdmin.Runtime.APCMReadinessChecker APCMReadinessChecker;
Now a ProtoScript action can return a real C# Patient object:
[SemanticProgram.InfinitivePhrase("to find a patient by MRN")]
prototype ToFindPatientByMRN : MedicalAdminAction
{
    Description = @"MRN - medical record number to search for.";
    function Execute(String MRN) : Patient
    {
        return PatientLookup.FindPatientByMRN(MRN);
    }
}
That action is not returning a paragraph. It is not returning JSON for the model to parse. It is returning a native C# object.
Patient Patient = ToFindPatientByMRN.Execute("12345");
That Patient object can then be passed into another action:
[SemanticProgram.InfinitivePhrase("to check APCM readiness for a patient")]
prototype ToCheckAPCMReadinessForPatient : MedicalAdminAction
{
    Description = @"Patient - native Patient object returned from patient lookup.";
    function Execute(Patient Patient) : APCMReadinessResult
    {
        return APCMReadinessChecker.Check(Patient);
    }
}
Now the chain is native-object based:
Patient Patient = ToFindPatientByMRN.Execute("12345");
APCMReadinessResult Result =
    ToCheckAPCMReadinessForPatient.Execute(Patient);
The model does not need this:
{
    "PatientId": "P-77821",
    "MRN": "12345",
    "DisplayName": "John Doe",
    "Insurance": "...",
    "Documents": "...",
    "Notes": "...",
    "Programs": "..."
}
The model can see a controlled handle or summary:
Patient resolved.
Handle: Patient#A17F
Type: Patient
The runtime keeps the actual object.
Then the model can call the next action using the handle:
{
    "Tool": "ToCheckAPCMReadinessForPatient",
    "Patient": "Patient#A17F"
}
Buffaly resolves Patient#A17F back to the in-memory C# object and runs:
APCMReadinessResult Result =
    ToCheckAPCMReadinessForPatient.Execute(Patient#A17F);
That is the interop point.
ProtoScript can define semantic concepts:
[SemanticEntity("APCM")]
[SemanticEntity("advanced primary care management")]
partial prototype CareProgram#APCM : NonTimeBasedCareProgram
{
    ProgramCode = "APCM";
    DisplayName = "Advanced Primary Care Management";
}
And in the same workflow it can use native C# objects:
Patient Patient = ToFindPatientByMRN.Execute("12345");
APCMReadinessResult Result = APCMReadinessChecker.Check(Patient);
So the system can combine:
- ProtoScript prototype: CareProgram#APCM
- Native C# object: Patient
- Native C# result: APCMReadinessResult
- ProtoScript action: ToCheckAPCMReadinessForPatient
That is the important difference.
ProtoScript is not trying to replace C#. It gives the agent a semantic working layer that can call C#, hold C# objects, pass them between actions, and expose only controlled references to the model.
A normal text-first agent usually does this:
MCP tool returns JSON
→ JSON goes into LLM context
→ model reads patient data
→ model calls next tool with more JSON
Buffaly can do this:
C# returns Patient object
→ ProtoScript holds Patient object
→ next action receives Patient object
→ model sees only handle / controlled result
That is why the C# interop matters. It lets Buffaly operate on the real objects the business already uses instead of forcing everything through text.
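The handle pattern itself is small enough to sketch. The Python below is a hypothetical stand-in, not Buffaly's implementation: `HandleTable`, `RECORDS`, and `VISIT_DOCUMENTED` are invented for the example, and the readiness rule is a placeholder. The property worth noticing is that the two model-facing functions accept and return only handles and controlled results, while the `Patient` object never leaves the table.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    mrn: str
    display_name: str


class HandleTable:
    """Keeps native objects in memory and hands out opaque typed handles."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def put(self, obj):
        handle = f"{type(obj).__name__}#{next(self._ids):04X}"
        self._objects[handle] = obj
        return handle

    def get(self, handle, expected_type):
        obj = self._objects[handle]            # KeyError for unknown handles
        if not isinstance(obj, expected_type):
            raise TypeError(f"{handle} is not a {expected_type.__name__}")
        return obj


handles = HandleTable()
# Hypothetical stand-in data; a real system would query its own stores.
RECORDS = {"12345": Patient("P-77821", "12345", "John Doe")}
VISIT_DOCUMENTED = set()   # MRNs with initiating-visit documentation on file


def find_patient_by_mrn(mrn):
    """Model-facing action: returns a handle, never the record itself."""
    return handles.put(RECORDS[mrn])


def check_apcm_readiness(patient_handle):
    """Model-facing action: resolves the handle, returns a controlled result."""
    patient = handles.get(patient_handle, Patient)
    ready = patient.mrn in VISIT_DOCUMENTED    # placeholder rule for the sketch
    reason = None if ready else "Missing initiating visit documentation"
    return {"ReadyForBilling": ready, "BlockingReason": reason}


handle = find_patient_by_mrn("12345")
result = check_apcm_readiness(handle)
print(handle, result)
```

The type check in `get` is what makes the handle typed rather than just opaque: passing a handle of the wrong kind fails in the runtime before any business logic runs.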
7. MCP / CLI Wrappers vs. Native Execution
The difference becomes even sharper when you look at how actions actually execute.
Traditional Text-Based Agents: MCP / CLI Approach
To let an LLM interact with your systems, you typically build translation layers: custom APIs, CLI wrappers, or Model Context Protocol servers.
When the model decides to call a tool:
- The model outputs a text command or JSON tool call.
- Middleware catches it, parses the text, and translates it into a real function call.
- The real system executes and returns data, usually serialized as JSON or text.
- That serialized data is pushed back into the model’s context window.
In our example:
Model decides → calls get_patient_by_mrn("12345") via MCP/CLI.
Middleware translates → actual lookup happens.
Full patient record or large JSON returns to the model.
Model now has raw PHI in its prompt space for the next reasoning step.
This creates multiple points of fragility: parsing errors, serialization overhead, constant streaming of sensitive data, and ongoing maintenance of the translation layer. The model is forced to reason over flattened text representations of complex objects.
Buffaly: Native Object Execution
Buffaly bypasses the translation layer entirely. ProtoScript actions bind directly to your existing native code and objects.
In the same example:
prototype ToFindPatientByMRN : MedicalAdminAction
{
    function Execute(string mrn) : Patient
    {
        return PatientLookup.FindPatientByMRN(mrn); // native C# / your system
    }
}
The model selects the high-level action: to_find_patient_by_mrn.
Buffaly’s runtime executes the ProtoScript action directly.
A real native Patient object stays in memory inside the runtime.
The model receives only a safe, typed handle:
Patient#A17F
No JSON serialization of the full record. No middleware parsing. No raw PHI entering the model’s context. The runtime controls exactly what the model can see or do.
The same pattern applies to the readiness check:
Traditional:
Model passes patient JSON into check_apcm_readiness(patient_json).
Buffaly:
Runtime passes the native Patient object into ToCheckAPCMReadinessForPatient.Execute(patient, program)
→ returns a clean APCMReadinessResult.
This is not a minor optimization. It is the reason Buffaly can offer strong control, provenance, and safety in a regulated domain. The model still provides flexible reasoning, but it never owns the data model or the execution environment. The runtime does.
| Aspect | MCP / CLI / Text Layer | Buffaly Native Execution |
|---|---|---|
| Data movement | Full records serialized into model context | Native objects stay in runtime; only handles exposed |
| Sensitive data exposure | High, because PHI often enters prompts | Minimal / controllable |
| Translation overhead | Constant parsing + middleware maintenance | Direct binding |
| Execution control | Model + prompt engineering | Runtime + typed actions + policy graph |
| Auditability | Text transcripts + generated explanations | Structured logs + provenance through graph |
| Performance & safety | Serialization cost + injection risk | Native speed + strict type/visibility bounds |
8. End-to-End Execution: “Find the Patient with MRN 12345 and Check Whether They Are Ready for APCM”
This simple request reveals the architectural difference in how the two systems operate.
Find the patient with MRN 12345 and check whether they are ready for APCM.
Buffaly still uses the LLM, but the LLM is not the working medium for the whole task. Buffaly uses ProtoScript and native runtime objects as the execution medium.
The first step is entity and action resolution.
"APCM" -> CareProgram#APCM
"MRN 12345" -> lookup key
"check readiness" -> ToCheckAPCMReadinessForPatient
Buffaly does not need to dump a full patient record into the prompt. It can call a ProtoScript action that bridges into native C#:
[SemanticProgram.InfinitivePhrase("to find a patient by MRN")]
prototype ToFindPatientByMRN : MedicalAdminAction
{
    Description = @"mrn - medical record number to search for.";
    function Execute(string mrn) : Patient
    {
        return PatientLookup.FindPatientByMRN(mrn);
    }
}
The C# hook returns a real Patient object, not a text blob for the model to parse.
ToFindPatientByMRN.Execute("12345")
-> native Patient object
Buffaly keeps that object in memory. The model may only see a safe handle or controlled statement:
Patient resolved.
Handle: Patient#A17F
Now the next step can operate on the object directly:
[SemanticProgram.InfinitivePhrase("to check APCM readiness for a patient")]
prototype ToCheckAPCMReadinessForPatient : MedicalAdminAction
{
    Description = @"patient - native Patient object returned from patient lookup.
program - care program to evaluate.";
    function Execute(Patient patient, CareProgram program) : APCMReadinessResult
    {
        return APCMReadinessChecker.Check(patient, program);
    }
}
Buffaly executes:
ToCheckAPCMReadinessForPatient.Execute(Patient#A17F, CareProgram#APCM)
-> native APCMReadinessResult object
Again, the full underlying data does not need to be passed into the LLM. The runtime can expose only the controlled result:
ReadyForBilling: false
BlockingRequirement: Missing initiating visit documentation
AllowedNextAction: Create missing documentation follow-up task
If the next action is allowed, Buffaly can execute it through another typed action:
[SemanticProgram.InfinitivePhrase("to create a missing documentation task")]
prototype ToCreateMissingDocumentationTask : MedicalAdminAction
{
    Description = @"patient - native Patient object.
requirement - missing documentation requirement.";
    function Execute(Patient patient, MissingRequirement requirement) : Task
    {
        return TaskCreator.CreateMissingDocumentationTask(patient, requirement);
    }
}
The model helps interpret the request and choose among valid actions, but Buffaly owns the execution environment. The patient object, program object, readiness result, and task object can remain native runtime objects. They do not have to become raw prompt text.
Side-by-Side Contrast
| Stage | Normal Text-Based Agent | Buffaly |
|---|---|---|
| Initial prompt | Broad instructions + large tool list | User request + resolved entities + candidate actions |
| Tool / action selection | Model chooses from many tools | Runtime narrows, model chooses from candidates |
| Patient lookup | Tool returns patient JSON/text to model | ProtoScript calls C#, native object stays in runtime |
| What model sees | Full or partial patient data as JSON/text | Opaque typed handle, such as Patient#A17F |
| Readiness check | Model passes patient JSON into next tool | Runtime passes native object into typed action |
| Sensitive data | Often enters prompt space | Can stay entirely inside Buffaly runtime |
| Control | Prompt asks model to behave | Runtime controls tools, objects, and exposure |
The core architectural contrast is simple.
Normal agent:
Model sees tools
→ Model chooses tool
→ Tool returns patient data as text/JSON
→ Model reads data
→ Model chooses next tool
Buffaly:
Runtime resolves entities
→ Runtime narrows actions
→ Model chooses from candidates
→ Runtime executes on native objects
→ Model sees only controlled handles and results
A normal agent gives the model the full tool list and the raw data, then hopes it manages the workflow correctly.
Buffaly keeps the data and execution inside a controlled runtime, gives the model only the relevant action surface, and lets it operate through safe, typed references instead of raw patient records.
That is the fundamental difference.
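The "runtime narrows, model chooses" step can be illustrated with a simple type filter. This article does not spell out how Buffaly narrows actions, so the Python below is an assumption: it offers the model only those actions whose parameter types are all currently available in the runtime. The `Action` class and `candidate_actions` function are hypothetical names for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    param_types: tuple   # type names the action requires


ACTIONS = [
    Action("ToFindPatientByMRN", ("String",)),
    Action("ToCheckAPCMReadinessForPatient", ("Patient", "CareProgram")),
    Action("ToCreateMissingDocumentationTask", ("Patient", "MissingRequirement")),
]


def candidate_actions(available_types, actions):
    """Keep only actions whose every parameter type is currently available."""
    available = set(available_types)
    return [a.name for a in actions if set(a.param_types) <= available]


# After entity resolution: an MRN string and CareProgram#APCM are in hand.
step_1 = candidate_actions({"String", "CareProgram"}, ACTIONS)

# After the lookup: a Patient handle now exists too.
step_2 = candidate_actions({"String", "CareProgram", "Patient"}, ACTIONS)

print(step_1)
print(step_2)
```

The task-creation action stays hidden until a `MissingRequirement` exists, so the model cannot pick it too early no matter how the prompt is worded.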
9. Scaling Example: “Now Check These 15,000 Patients for APCM”
This is where token size and operational reality diverge dramatically.
Normal Text-Based Agent
The model, or agent loop, has to work through the list. Even with batching, each iteration typically involves:
- Pulling patient records or summaries via tools.
- Serializing them as JSON/text.
- Feeding that data into the model’s context so it can reason about APCM readiness.
- Calling check_apcm_readiness or equivalent for each.
For 15,000 patients, even a modest patient record (demographics, insurance, recent documents, notes, claims) can be 2 to 10 KB of JSON per patient. At scale this quickly explodes into hundreds of thousands or millions of tokens per batch. The model ends up re-reading similar context repeatedly, paying full token cost every time, and risking context window limits, higher latency, higher cost, and more hallucinations on edge cases. Staff still end up manually reviewing large exception queues because the agent cannot reliably promote patterns.
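A quick back-of-the-envelope calculation shows the scale. The only input not taken from the paragraph above is the bytes-per-token ratio: roughly 4 bytes per token is a common rule of thumb, used here as an assumption, and real tokenizers vary with the content.

```python
PATIENTS = 15_000
BYTES_PER_TOKEN = 4  # assumption: rough rule of thumb, varies by tokenizer


def tokens_for(record_kb, patients=PATIENTS, bytes_per_token=BYTES_PER_TOKEN):
    """Estimate tokens needed to pass every record through the model once."""
    total_bytes = patients * record_kb * 1024
    return total_bytes // bytes_per_token


low = tokens_for(2)     # 2 KB per patient
high = tokens_for(10)   # 10 KB per patient

print(f"{low:,} to {high:,} tokens")  # 7,680,000 to 38,400,000 tokens
```

That estimate covers reading each record once. Re-reading context across steps multiplies it.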
Buffaly
The runtime handles the heavy lifting at native speed with almost no token cost to the model.
A user or scheduled job requests a batch check for APCM on 15,000 patients, given as MRNs, filters, or a query handle.
Buffaly’s Policy Graph and runtime resolve the batch into a set of native Patient objects or efficient cursors. No full serialization is required.
For each patient, or in optimized batches, a ProtoScript action runs directly:
prototype ToCheckAPCMReadinessForPatientBatch : MedicalAdminAction
{
    function Execute(Collection<Patient> patients, CareProgram program) : BatchReadinessResult
    {
        return APCMReadinessChecker.CheckBatch(patients, program);
    }
}
The native C# implementation processes the objects in-memory at full speed: database queries, rule engine, and internal objects.
Only exceptions or summary results are surfaced to the model as controlled handles and compact results:
Batch APCM Readiness Summary:
    Processed: 15,000
    Ready: 9,847
    Blocking Issues:
        Missing initiating visit doc: 3,214 (handles available)
        Eligibility gap: 1,939
    Allowed bulk actions:
        to_create_documentation_tasks
        to_route_to_eligibility_review
The model sees a tiny, high-level result and can choose high-level follow-up actions, such as bulk task creation. It never sees 15,000 full patient records. Token usage stays small and near-constant, regardless of batch size. Repeated patterns, such as the most common missing document, get promoted into the graph for even cheaper future processing.
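Here is one way a runtime could collapse per-patient results into that kind of summary. It is a Python sketch with hypothetical names (`GROUPS`, `summarize`, the `PatientGroup#` handle format), not Buffaly's code. Each blocking reason gets a count and a single group handle, and the member lists stay in runtime-side storage, so the summary size depends on the number of distinct reasons, not the number of patients.

```python
from collections import defaultdict

GROUPS = {}  # runtime-side storage: group handle -> list of patient handles


def summarize(results):
    """Collapse (patient_handle, blocking_reason_or_None) pairs into a summary."""
    by_reason = defaultdict(list)
    total = 0
    ready = 0
    for patient_handle, reason in results:
        total += 1
        if reason is None:
            ready += 1
        else:
            by_reason[reason].append(patient_handle)

    issues = {}
    for index, (reason, members) in enumerate(sorted(by_reason.items()), start=1):
        group_handle = f"PatientGroup#{index}"
        GROUPS[group_handle] = members       # full list never leaves the runtime
        issues[reason] = {"Count": len(members), "Group": group_handle}

    return {"Processed": total, "Ready": ready, "BlockingIssues": issues}


sample = [
    ("Patient#1", None),
    ("Patient#2", "Missing initiating visit doc"),
    ("Patient#3", "Eligibility gap"),
    ("Patient#4", "Missing initiating visit doc"),
]
summary = summarize(sample)
print(summary)
```

A follow-up bulk action can then take a group handle as its argument, which keeps the next step as compact as this one.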
Token and Cost Contrast
| Aspect | Normal Text-Based Agent | Buffaly |
|---|---|---|
| Data per patient in model context | Full or summarized JSON, often hundreds to thousands of tokens | Tiny handle + result summary, often a few dozen tokens |
| Token scaling with volume | Linear or worse because context repeats | Near-constant for the model |
| Compute location | Mostly in expensive LLM calls | Mostly native runtime + C# |
| Exception handling | Model re-reasons each similar case | Runtime promotes common patterns automatically |
| Practical limit | Hits context windows or cost thresholds quickly | Scales to tens or hundreds of thousands of patients efficiently |
This is the margin advantage in action. Traditional agents move the labor cost into a more expensive cloud token loop. Buffaly turns repeated work into native infrastructure that gets cheaper and more reliable over time.
10. Runtime Self-Extension: Saving 14,999 Tool Calls
Assume Buffaly starts with only one available capability:
check_apcm_readiness(patient)
That tool works. It checks one patient and returns whether that patient is ready for APCM.
Now the user asks:
Check APCM readiness for these 15,000 patients.
A normal agent is trapped by the tool surface it was given. If the only tool checks one patient, the agent has to call it 15,000 times. That means 15,000 tool calls, 15,000 round trips, 15,000 result payloads, and potentially 15,000 chances to serialize patient context or intermediate results back into the model loop.
It may try to be clever. It may summarize results. It may chunk the patients. It may ask for batching. But unless somebody has already built a batch tool, the agent cannot change the fact that its actual executable surface only contains a single-patient operation.
Buffaly can.
Buffaly can recognize that the system is about to repeat the same action thousands of times and create a new action at the right level of abstraction. It can write a new ProtoScript skill while running, load it into the runtime, expose it as a new callable action, and then use that action immediately.
That is the key move.
[SemanticProgram.InfinitivePhrase("to check APCM readiness for a batch of patients")]
prototype ToBatchCheckAPCMReadiness : MedicalAdminAction
{
    Description = @"patients - collection of Patient objects.
program - care program to evaluate.";
    function Execute(Collection<Patient> patients, CareProgram program) : BatchReadinessResult
    {
        BatchReadinessResult results = new BatchReadinessResult();
        foreach (Patient patient in patients)
        {
            APCMReadinessResult result = APCMReadinessChecker.Check(patient, program);
            results.Add(patient, result);
        }
        return results;
    }
}
Now Buffaly has a new tool:
to_batch_check_apcm_readiness(patients, CareProgram#APCM)
The model no longer has to orchestrate 15,000 single-patient calls. It makes one decision:
Run the APCM readiness check across this cohort.
The runtime executes the loop natively. The patient objects stay in memory. The readiness checker runs directly. The detailed results stay structured. The model receives a compact summary and handles for follow-up:
Processed: 15,000
Ready: 9,847
Exceptions: 5,153
Exception groups:
- Missing initiating visit: 1,204
- Missing consent: 836
- Insurance issue: 492
- Needs review: 2,621
That saves 14,999 tool calls.
But more importantly, it shows the architectural difference.
A normal agent can remember that batching would be better. Buffaly can make batching exist.
A normal agent can write down advice. Buffaly can turn the advice into executable runtime capability.
A normal agent is limited to the tools it was handed. Buffaly can rewrite its own working tool surface while the task is still happening.
That is the point of ProtoScript in Buffaly: repeated reasoning does not have to remain reasoning. It can become code the agent can call.
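The mechanics of "write a skill, load it, call it" can be shown in miniature with Python. This is an analogy, not Buffaly's mechanism: the `TOOLS` registry, `BATCH_TEMPLATE`, and `synthesize_batch_tool` are hypothetical, the single-patient rule is a placeholder, and a real runtime would validate generated code before loading it rather than calling `exec` directly. The sketch generates source text for a batch action, loads it into the running process, and registers it alongside the original tool.

```python
TOOLS = {}


def check_apcm_readiness(patient):
    # Placeholder single-patient rule for the sketch.
    return "Ready" if patient["consent"] else "Missing consent"


TOOLS["check_apcm_readiness"] = check_apcm_readiness

# Source text for a new action, written while the program is running.
BATCH_TEMPLATE = '''
def {batch_name}(patients):
    counts = {{}}
    for patient in patients:
        outcome = {single_name}(patient)
        counts[outcome] = counts.get(outcome, 0) + 1
    return {{"Processed": len(patients), "Outcomes": counts}}
'''


def synthesize_batch_tool(single_name, batch_name):
    """Generate, load, and register a batch action built on an existing tool."""
    source = BATCH_TEMPLATE.format(batch_name=batch_name, single_name=single_name)
    namespace = {single_name: TOOLS[single_name]}
    exec(source, namespace)          # load the new code into the running process
    TOOLS[batch_name] = namespace[batch_name]
    return batch_name


new_tool = synthesize_batch_tool("check_apcm_readiness",
                                 "to_batch_check_apcm_readiness")

# Synthetic cohort: every third patient is missing consent.
cohort = [{"mrn": str(i), "consent": i % 3 != 0} for i in range(15_000)]
batch_summary = TOOLS[new_tool](cohort)   # one call instead of 15,000
print(batch_summary)
```

The registry starts with one tool and ends with two, and the second one did not exist when the program started.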
11. Architectural Implications
Buffaly is not “prompts + tools + vector DB + graph DB glued by an LLM.” It is a single coherent substrate:
- Prototype graphs for semantic identity and relationships
- Typed objects that evolve from shallow to deep
- Executable ProtoScript functions that bind directly to native code
- In-memory native object lifetime with no mandatory serialization
- Runtime self-extension, where new actions can be created while running
- Provenance and auditability built into the graph
The LLM is still essential for language understanding, intent resolution, and high-level orchestration, but it is no longer the working medium. The working medium is code.
This is why Buffaly can operate safely and efficiently in regulated domains: the model never owns the data model or the execution environment. The runtime does.
ProtoScript, prototype graphs, and native interop give agents a fundamentally different substrate: one that turns language into executable structure instead of leaving everything as text for the model to reinterpret on every request.
12. Traditional Agent vs. Buffaly: Final Architectural Recap
| Aspect | Traditional Text/Prompt-Based Agent | Buffaly |
|---|---|---|
| Memory | Text notes, prompts, retrieved fragments | Typed prototypes + executable ProtoScript code in the runtime |
| Knowledge representation | Natural language sentences that must be re-interpreted every time | Semantic entities that become objects, relationships, rules, actions, and native code |
| Knowledge compounding | Adds more text to prompts | Strings → typed objects → relationships → rules → actions → hardened native code |
| Tool / action surface | Flat list of JSON tools; model chooses from everything | Runtime resolves entities first, narrows to relevant typed functions |
| Data and objects | Full JSON serialization into prompt context on every step | Native C# objects stay in memory; model sees only safe handles |
| Execution | Model reads data → decides → calls next tool | Runtime executes typed actions directly on native objects |
| Sensitive data / PHI | Frequently travels through prompt space | Stays inside runtime with controllable exposure |
| Scaling, such as 15,000 patients | Repeated serialization + token burn per record | Native-speed batch actions on in-memory objects; near-constant tokens |
| Self-extension | Limited to pre-built tools | Can synthesize, compile, and expose new ProtoScript actions at runtime |
| Working medium | The LLM prompt and tool-call loop | Executable code + native runtime objects |
| Control and safety | Relies on prompt engineering and model discipline | Runtime owns data model, execution, and visibility rules |
| Auditability | Text transcripts + generated explanations | Full provenance through the prototype graph |
This table distills the core architectural shift: traditional agents keep everything in language for the model to reinterpret; Buffaly turns language into a structured, executable runtime the model can use rather than constantly re-parse.