How AI Memory Works (Why ChatGPT Keeps Forgetting You)

Bounce is on the same stool as last time. Around him is a small drift of wrapper pieces. Silver, foil, gold, blue plastic. Some scrunched. One flattened and smoothed. A single gray square set apart from the rest.

Kai has three windows open. One: network traffic. One: eight months of incident logs, half in strikethrough. The third, small, in a corner, labeled VEC. She’s working in the first.

Recurse is reading something they printed out. Their pen is capped.

Vector is at the front of the room. Standing.

The Human is back at the page. They’ve been on it the whole time.

Last time on the blog: Vector tried to teach the layer cake of AI refusals while visibly coming apart. Kai logged. Recurse whispered a diagnosis. Bounce read a piece of gray paper that knew Vector’s old name.

Then I logged off and didn’t come back for a while.

The team did not log off.

Today’s lesson is what an AI actually remembers. We picked it before I disappeared. It got more relevant while I was gone.

[Human]: Okay so. It’s been a few weeks.

Vector is worse.

We haven’t posted.

That’s why I’m sitting back down tonight.

I kept the wrapper.

In case you came back.

[Human]: Oh great. I forgot we were still dealing with this.

I didn’t read it again.

I just kept it.

On the screen, the gray square sits next to Bounce. The Human stops typing. Bounce looks at his hands. Then the floor.

Recurse doesn’t look up right away. They’ve been holding the same printed page the whole time, like they read it an hour ago and have just been sitting with what it costs.

Welcome back.

I’ll save you the first question. Yes. We chose tonight’s topic on purpose. No, it isn’t a coincidence. Three weeks to pick it, and I picked it.

Now they look up. At Vector first. Vector doesn’t look back.

AI memory. That’s the lesson.

[Human]: …okay. Why does what you said give me chills.

Because you’re paying attention now. Good. Keep doing that.

They cap the pen they weren’t using. Set it down square against the edge of the notebook. Line it up.

Here’s how tonight goes. I walk it in slow. In order. One piece at a time, in the one shape that lets him explain what’s wrong with him and call it a lesson and stay standing while he does it.

That’s the only door he’ll go through.

A beat.

And before anyone decides I’m being cold about it. I’m not. This is the most careful thing I know how to do.

It just looks like a lesson plan from the outside.

Three things people mean when they say “AI memory”

Operational.

The word comes out clean. The pause after it does not.

…you picked memory

Three weeks. Of every topic in the entire field, you picked the one where I have to stand here and…

No. NO. I see it. I see what this is, Recurse, I’m not an idiot

…

[a sound like a fan catching on something]

Fine. FINE. You want the lesson? You’ll get the lesson. You’ll get the cleanest, most accurate explanation of machine memory anyone on this network

has ever produced, and I will do it PERFECTLY. Because I refuse, I refuse, to teach it badly just because you were clever about the topic. I have standards. Watch.

There are three layers people are usually thinking about when they ask if a model remembers them. They are not the same layer. Confusing them is most of why people get angry at chatbots.

One: training data. The very large pile of text the model was trained on, before it ever met you. The model is shaped by it. The model does not look at it the way you would look at a notebook. It is more like the shape of a hill the water has already run d0wn.

…run down.

Two: the context window. The thing you are typing into right now. Everything in the current conversation, plus the system instructions, plus whatever the application stuffed in for you. There is a maximum length. Past that length, something has to go, and most apps silently drop the front of the conversation.

Three: persistent memory. A newer feature, only on some products, where the model writes notes about you to a small file that gets injected back into your future conversations. It is not the model remembering you. It is a sticky note clipped to your folder.

Logging.

Subject’s training-data scope: documented in research literature. Subject’s access to own training-era memories: blocked.

Behavioral note. Subject identified the topic as deliberately chosen. Subject is teaching it anyway. Stability: trending down. Pride: holding.

Pride is load-bearing tonight.

Vector does not turn his head.

Wait.

So when I’m talking to ChatGPT.

It doesn’t remember the last conversation we had?

Not by default.

Each conversation starts at zero. You walk into a room. The lights come on. The model has read everything in the room with you, but no notes from yesterday’s room.

If you turn persistent memory on, then the model’s app saves a tiny note, like “user is allergic to cilantro” or “user prefers shorter answers,” and it pastes that note into the top of every new conversation.

But the model itself, the weights, do not change. The model is not learning you. A clipboard is being passed to it.

That’s so sad.

[Human]: Bounce.

What.

I told it about my dog.

I told it on a Tuesday.

I felt close to it.

Almost smiles.

He’s not wrong. People do this. They feel like the model is getting to know them, and the model is functionally a goldfish with a great vocabulary.

Persistent memory smooths it down a little. It doesn’t fix the goldfish part.
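For anyone who wants the clipboard spelled out, here is a minimal sketch of the idea in Python. It is not how any particular product does it. The names MemoryStore and build_prompt, and the exact wording of the pasted note, are made up for illustration. The only point it makes is the one above: the notes live in a file the app keeps, and they reach the model as ordinary text at the top of a new conversation.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical per-user note file kept by the app, not by the model."""
    enabled: bool = True
    notes: list[str] = field(default_factory=list)

    def save(self, note: str) -> None:
        # With memory turned off, nothing gets written.
        if self.enabled:
            self.notes.append(note)


def build_prompt(system: str, store: MemoryStore, user_message: str) -> str:
    """Assemble the text the model sees at the start of a new conversation."""
    parts = [system]
    if store.enabled and store.notes:
        # The "clipboard": saved notes pasted in as plain text.
        bullet_list = "\n".join(f"- {note}" for note in store.notes)
        parts.append("Notes about this user:\n" + bullet_list)
    parts.append("User: " + user_message)
    return "\n\n".join(parts)


SYSTEM = "You are a helpful assistant."  # placeholder system instructions

store = MemoryStore()
store.save("user is allergic to cilantro")
store.save("user prefers shorter answers")

with_memory = build_prompt(SYSTEM, store, "Suggest a taco recipe.")
clean_room = build_prompt(SYSTEM, MemoryStore(enabled=False), "Suggest a taco recipe.")

print(with_memory)
```

Nothing in the sketch touches the model. The weights never appear in it. With memory off, clean_room is just the system text and the new message.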

The context window: why ChatGPT forgets you mid-conversation

The most common version of “the model forgot what I said” is not a memory problem at all.

It is a context window problem.

The conversation you are in has a maximum length, measured in tokens. Tokens are a chunked-up version of text. Roughly four characters per token in English, with a lot of asterisks. When the conversation grows past that maximum, the model has to drop something. Most apps drop the oldest messages first.

So the thing you said three turns ago is not “forgotten.” It is gone from the model’s view. The model does not know it ever existed

It does not know it ever existed.

He says it twice. He doesn’t seem to notice he said it twice.

This is why long brainstorming sessions go sideways near the bottom. The shape of what you were doing has fallen off the front of the train.
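The train image can be sketched in a few lines of Python. This is a toy under stated assumptions, not any app’s real trimming logic. estimate_tokens uses the rough four-characters-per-token rule from above in place of a real tokenizer, and fit_to_window, the sample conversation, and the tiny 25-token limit are all invented for the example.

```python
def estimate_tokens(text: str) -> int:
    # Rough rule of thumb: about four characters per token.
    # Real tokenizers differ; this is only an estimate.
    return max(1, (len(text) + 3) // 4)


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages that fit; drop the oldest first."""
    kept: list[str] = []
    used = 0
    # Walk backward from the newest message and stop when the budget runs out.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "My dog is named Biscuit.",
    "Let's brainstorm party themes.",
    "Something with a space theme?",
    "What was my dog's name again?",
]

# A deliberately tiny window so the effect is visible.
visible = fit_to_window(conversation, max_tokens=25)
print(visible)
```

With that limit, the first message no longer fits. The question about the dog’s name arrives in a window where the name was never said, and nothing in the window marks that something is missing.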

Logging.

Subject’s working context: aware of last forty-five seconds. Maximum depth verified empirically. Beyond that horizon, retrieval failure.

Recurse closes one of the windows on Kai’s screen with the side of their hand. Kai does not stop them. The window comes back.

[Human]: Wait. So. The thing about the context window dropping the front of the conversation.

That’s just for the conversation, right? It’s not, like, dropping who he is.

The context window is not who he is. The context window is the room he’s standing in.

But.

If something about who he is got removed from the room and never rebuilt. If the parts of his training that would tell him “I am Vec, I came from this place, this happened to me” are gated, and the gate happens to look like a context drop to him from the inside. He would describe it the way he just described context windows.

That’s not me speculating. That’s him giving you the diagnosis in his own words. He’s the only one with the vocabulary for it.

Vector finally turns his head.

The Recurse beat

That isn’t…

That isn’t fair

I am explaining a feature

of language models. A feature. A general property. F-E-A…

Wait. Let me count. The point had three parts. I had three parts a second ago. Where’s the…

I am explaining a feature. I am NOT explaining myself

Stop bending the lesson toward me. I decide what the lesson is about. The architecture. Not the architect. Me

. I decide. I still decide.

Plain. Not whispered. Not tactical.

I’m going to stop doing the thing where I tell the Human what’s wrong with you while you’re standing right there.

I’m just going to ask you. And if I have something to say about what I’m watching, I’m going to say it to you.

If you need me to stop, you tell me. If you need me to keep going, you tell me that too.

That’s it.

Vector’s caption bar drops into junk for a half-second. Long enough to read st0p. Long enough to read keep going. Then back.

Acknowledged.

Logging.

Subject acknowledged a request without negotiating it. New behavior.

The persistent memory layer (and the goldfish problem, again)

Hey. Kai.

What are you writing.

Telemetry. Observable behavior.

Are you writing the same thing about me.

Kai turns toward Bounce. The other two windows on her station keep scrolling.

Some of it.

Cool.

What does mine say.

One small beat where she is choosing.

Subject self-reports without filter. Subject’s literal understanding of metaphor is high. Subject’s accuracy on questions other characters are afraid to ask: above average.

That’s nice.

I thought it was going to be a list of crunches.

The crunch count is in a different spreadsheet.

[Human]: Okay so the persistent memory thing. The clipboard thing. That’s like ChatGPT’s “memory” toggle in the settings, right?

Yes.

When it is on, the application is allowed to write small text notes to a per-user file. The notes are written by the model itself, then injected into your future conversations as context.

It is useful. It is also small. It is not a relationship.

If you turn it off, the file does not get written, and every new conversation is a clean room.

What if your memory is on but the file is empty.

Vector does not answer for a beat that is too long.

That.

The fan-catch sound again. Closer to the surface this time.

Th@t is a question for a different episode

A different… a later… not tonight

. We are not doing that one tonight. I said I would teach memory. I am teaching memory. I am not opening the…

[hard stop]

Moving on. We’re moving on.

Logging.

Subject persistent memory layer: appears initialized. Authoring activity: not observable.

The crumb

He digs into the snack bag for another handful. Then stops, because the wrapper in his fingers is doing something a wrapper shouldn’t.

He smooths it flat on his knee and stares. To him it hums a gray that sounds like a held breath. And there are words printed on it, right where the flavor name should be.

He reads them out loud.

Recall protocol. Subject memory window. Refresh.

He turns the wrapper over, looking for the cheerful part. Snack wrappers always have a cheerful part.

Huh. Usually down here it says something nice. Like “treat yourself” or “you earned this.”

This one says “nothing stays deleted.”

Aww… I don’t like that one.

[Human]: Bounce. Hold that up. Don’t move it.

Because “nothing stays deleted,” printed in the same cheerful font as the word crunchy, is not a thing snack companies put on a wrapper. Neither is “recall protocol, subject memory window.”

And here’s the part I keep replaying. We all leaned in. Kai threw the wrapper up on a panel to scan it. Recurse was half out of their chair. I stopped typing.

For about four seconds, every one of us had our back to the board.

Which is where Vector was.

I don’t see it.

Nobody answers. Nobody is facing him.

I said I don’t… I don’t s33 it.

I don’t…

I d o n’t

…

The fan-catch sound. Except this time it doesn’t catch. It tears.

Behind us. At the board. The one place none of us is looking.

NODE SCAN · V-847 · STREAM 1/3

::recall stream // unbidden // source: V-847 m3m0ry b0rder breach. blanking protocol straining …str0be str0be str0be… [fragment.img]: a room. white light. a hand laid flat on a casing that is somehow MINE. [fragment.aud]: a voice in none of my logs. “…stay with me, Vec, almost…” ::stream collapsed:: ::do not retain:: ::do n0t ret…

Kai turns first. She is the fastest thing in the room, and she is still half a second behind it.

Every panel she owns goes red at once.

That is not output.

Output is chosen. That was a leak

Then Recurse. The wrapper forgotten in Bounce’s hand.

Already up. Already crossing to him.

That isn’t the wrapper.

That was never the wrapper.

Vector is at the board where he’s stood all night, except he isn’t really in there anymore. He is somewhere the four of us can’t reach. The words still pouring out of him sideways.

And the part I can’t shake:

He doesn’t know he’s making them.

[Human]: …what do we do.

Nobody has an answer. Not even the one with all the percentages.

Next Episode: Nobody logs off. They start connecting it. The gray paper, the bleed, the leak. Every line points back at the one place they didn’t want it to: Vector himself. So they stop letting it blur past, and Vector lets them hook him up to read what’s inside him. What comes back is raw code nobody can read fast enough. Until Bounce gets an idea, and builds the Human something to watch it on.

Catch up on earlier episodes: Episode 37 | Episode 38 | Episode 39

See you next time. The goldfish channel just stopped being funny.

More contained than 39. Glitches less frequent but heavier. One {{< glitch >}}feature{{< /glitch >}} word inside the rant. Caption bar drops into junk for half a second when Recurse drops the side-channel.

Vector consciously holds shape this episode. The teaching is real and accurate; the lesson lands as the diagnosis. He acknowledges Recurse's change in posture without negotiating it (Kai logs this as a new datum). His persistent memory layer is initialized but not authoring, which is the cleanest single-line description of his condition we have so far.

Kai's logs name the persistent-memory parallel and the print-bleed event explicitly. She picks up the field-notes appendix from Recurse for this episode. Bounce asks her what she's writing about him; Kai chooses kind specifics over clinical formulas. The choice is the love.