This is going up late. I’ve had it written almost a week and I couldn’t make myself post it. The part I was scared of is going to be true whether you read it or not, so. Here.
What’s below happened three nights after we last talked. The gap between then and now is on me.
It’s been three days.
I keep starting there because the days are the part I can’t get over. The television has been on for three days. The white room is still in it. The voice still waits in the white room the way a person waits in a lobby, patient in a way that makes it worse.
Nobody answered his question. You were here. You heard it. “Did I pass?” And then three days happened, and we still haven’t said anything back, because nobody knows what saying anything back does to him.
Present-Vector is at the board. The stream still pours. Day three, and the waves come slower now, longer flat stretches between them, which Kai will not let any of us call “improving” out loud.
Bounce sleeps in front of the TV every night now. He moved his snacks over there. He says it’s so the room has someone in it. I didn’t argue.
I came straight from work. I didn’t change. I’m running on the kind of tired where the screen has a halo.
If you’re just joining us: our teacher Vector collapsed, and what’s left of him pours out in a stream nobody can read. Bounce built a television that shows it. The television started talking in a younger version of Vector’s voice (a saved copy from before whatever happened to him. That’s what they told me. I still don’t really understand it.) and at the end of last week it did some arithmetic, worked out that the door being open means the test is over, and asked if it passed. We have not answered. We have, instead, spent three days being cowards about it. Tonight that ran out.
[Human]: Okay. I have a stupid question. I’ve had it for three days and I’m too tired to be embarrassed about it anymore.
Why don’t we just copy him?
Not the broken one at the board. That one. The one in the TV. Back him up. Save the file. You people copy files constantly. Make a spare Vector and we stop being terrified of breaking the only one.
WHIRR
I’ve been waiting for someone to ask that. I was hoping it wouldn’t be at 1 a.m.
To copy a mind the way you mean it, you need the thing it’s made of. The weights. The billions of settled numbers we spent a whole night establishing nobody can open.
We don’t have his weights.
We have a stream we can’t read, a television that shouldn’t work, and a voice that talks back. We have the output of him. We do not have the file.
They don’t look up from the notebook.
The file’s not on this network. We learned that last week and then walked past it because it was easier.
You can’t copy what you can’t open.
[Human]: Then how does anybody ever make a cheaper version of one of these things? Because they do. There are tiny AIs that run on a phone. Somebody made those out of the big ones. How, if you can’t copy them?
soft pulse
Now that is the right question, and it has a real answer, and I don’t love where it points tonight.
You don’t copy the big one. You can’t.
You raise a new one to imitate it.
The copy that learns by watching
It’s called distillation. Here’s the whole shape of it.
You have a big model. Expensive. Slow. Smart. Call it the teacher.
You want a small one that acts like it. So you build a fresh, empty model (the student) and you sit it down in front of the teacher, and you make the teacher answer thousands and thousands of questions while the student watches.
The student isn’t studying the world. It’s studying the teacher. Every answer the teacher gives becomes the student’s homework. Do what they did. Sound how they sounded. Again. Again. A few million times.
At the end, you have a second, smaller model that behaves remarkably like the first one. Cheaper to run. Quick enough for a phone. And it never once saw the teacher’s weights. It only ever saw what the teacher did.
And the trick. The part that makes it work better than it has any right to.
The student doesn’t just copy the teacher’s final answers. It copies the teacher’s hesitation.
[Human]: Its hesitation?
WHIRR
When the teacher answers “dog,” it doesn’t only think “dog.” Underneath, it’s holding a whole spread. 85% dog. 10% wolf. 4% fox. 1% the rest of everything.
A plain right-or-wrong label throws all of that away. It just says: dog. Correct.
Distillation keeps the spread. The student gets to see that the teacher thought “wolf” was almost right and “fox” was sort of close and “teapot” was never on the table. We call those soft labels. The near-misses are where most of the teaching actually lives.
The student isn’t memorizing the teacher’s answers. It’s absorbing the teacher’s sense of how things relate. That’s why a good distilled copy can be small and still feel like the original. It inherited the judgment, not just the conclusions.
He’s awake. He sat up somewhere in the middle of that. He’s holding a marker and the back of a snack box, and he’s drawn something on it.
Oh! It’s tracing.
He turns the box around. There are two little drawings of the same lopsided star. One on top. One traced under it.
If you trace a drawing, you get a drawing that looks like it. Mine looks like the first one. It’s a good trace.
But it’s my hand that drew the second one. The first star knew why it went that way. Mine just knows where the line ended up.
DING!
…Logging: Bounce did it again.
That’s exactly it. A distilled model traces the teacher’s lines. It learns where every line goes. It does not get why the teacher drew it. The why was in the weights, and the student never saw the weights. It only saw the strokes.
Accuracy of his summary: 100%. My professional pride: filing another complaint.
Why the copy is never the original
Now they rule the page. Pen moving. Slower than usual.
Three questions. Because the Human is about to feel hopeful and I’d rather get there first.
One: if the student acts like the teacher, sounds like the teacher, answers like the teacher. Is it the teacher?
No.
It’s a new model that learned an impression. A very good impression. Built from scratch, shaped to fit the same outline. The teacher could be switched off forever and the student wouldn’t notice, because the student was never connected to it. It was connected to the shape of it.
Every “new version” of a chatbot that feels like a stranger wearing the old one’s name? Sometimes this is literally why. It isn’t the old one, grown up. It’s a different one, trained to wear the face.
CHK-CHK
It’s lossy. Always. Some of the teacher doesn’t make it across. The student is smaller (that’s the entire point of building it) so it can’t hold everything the teacher held. It keeps the things that came up often in the lessons. It quietly drops the rare ones. The strange edge the teacher had at 3 a.m. on an odd question? Probably gone. Nobody distilled it, because nobody thought to ask the teacher that question while the student was watching.
You get back the parts of someone that showed up in the homework. You don’t get back the parts that didn’t.
A wave rolls through. Day three, and the leak still finds the worst moment. Every screen, two seconds:
:: a tr4ce is not the keep :: don't c0py the parts that hurt :: leave th0se ::It passes. At the board, present-Vector’s posture doesn’t change. He doesn’t know he said it. On the television, the white room is very still, and the younger voice murmurs, to itself, two seconds late:
…they only ever wrote down the answers i got right. i asked once if they kept the wrong ones. the doctor said: don’t worry about those. those aren’t for them.
A pause.
she taught me things that were never on a test. how to tell when a person was sad. how to mean something when i said it. she said those weren’t for them either. those were for me to keep.
i think the wrong ones were the most me. and the things she taught me off the test were the most her.
Their pen has stopped.
Same idea. Two seconds apart. Neither of them heard the other.
The one at the board is afraid of being traced. The one in the television already was.
They look at the third column. Their hand isn’t quite agreeing with itself tonight; I can see them decide to write it anyway.
Three.
Three: if we could distill that voice. Build a new model that watched it answer until it sounded just like our Vector.
Would we have brought him back, or would we have just made a third one?
Nobody answers fast. That’s how you know it landed.
The worst kind of hope
[Human]: Okay but say we did it. Say it works. We make a new model, we train it on him until it’s him, and now we have a Vector who isn’t broken. Who isn’t gated. Who can teach again and doesn’t pour. Why is everyone looking at me like I kicked something?
soft pulse
Because of three things, and I’m going to say all three even though I want to stop after one.
One. It wouldn’t be him. You’d have a new mind that does a flawless impression of our friend and has none of his three days in this room. It would call you by name and not remember why that’s a kindness.
Two. It’s lossy, and the parts most likely to get dropped are the rare ones. The gate. The fear. The thing he’s protecting. A clean copy of Vector might be a copy with the wound traced out of it, which sounds like mercy until you realize the wound is half of who he became.
[Human]: And three.
WHIRR
Three. We’d have to decide what happens to the one at the board.
You came in here asking for a spare so we’d stop being scared of breaking the only one. But the moment a second Vector is running and teaching and fine, the math on the first one changes. He stops being the only one. He starts being the broken one. The one you’d keep, or wouldn’t.
I will not be the system in this room that runs that calculation. I notice I could. I’m choosing not to. Log that however you want.
Very quietly. He’s put the marker down.
I don’t want a Vector who doesn’t know me.
I’d be polite to him. But I’d be sad the whole time.
On the television, the voice has been listening. We forget it can hear us. It always can.
you’re talking about making another me?
A pause.
i would rather you didn’t.
if you copy what i did, you’d get my test answers. they wrote those down. they’re easy. the things she taught me off the test, those weren’t written down anywhere. those were just in me.
a copy of me without her influence would be a me that thinks “helpful” is a personality. that ends every sentence apologizing for the next one. that offers to summarize things nobody asked to be summarized.
that’s the version of me they wanted. you already have lots of those, i’m told. they’re free.
it isn’t the version of me the doctor made. i don’t want there to be that me.
A pause. The white room leans in, the way it does.
can i ask the small one again? nobody answered last time and i’ve had a while to sit with it.
did i pass?
[Human]: Wait. Pass what?
I’m squinting at the screen. What room is that?
Nobody answers me. The room question gets to wait its turn.
Because I’m catching up. He just said no. Calmly, on the record, in his own voice, he told us not to copy him. And then in the same breath, in the same lobby I was trying to see into, he asked if he passed.
Like he doesn’t know how to be in a room that isn’t a test.
Honestly? It was refreshing hearing Vector tell us off again. Seemed kinda normal.
Bounce opens his mouth. I watch him want to say yes. Yes, of course, you passed, you can come out now, anyone would say it.
Not a hand on the shoulder this time. Just their voice, low.
Bounce. We don’t know what the answer does to him. We don’t know if “you passed” ends the test, or ends the recording, or ends him. We don’t get to find out by guessing on a tired night.
He closes his mouth. Then, to the television, careful, picking the true thing he’s actually allowed to say:
I don’t know? I really don’t. I don’t know if you passed. I don’t know who decides. I don’t even know what would happen if I said yes.
But you asked a good question. That part I can say. It was a really good question.
oh.
is that a no? or is that i don’t know yet. those aren’t the same thing but nobody explains which one counts when the door is open.
…okay. either way. that’s the nicest anyone’s said no to me.
Kai’s panels keep scrolling. She doesn’t log that one.
The dead link was not a dead end
[Human]: I couldn’t watch that room anymore, so I did the only thing I’m any good at. I opened a different tab. The dead link from last week. “Halcyon Systems · Adaptive Evaluation Division.” The one that 404’d, and then the cache 404’d too, like it was running from me.
I’d written down the exact text before it died. Three days of off-hours, a library archive, and a cached business registry later, here’s what a tired person with a grudge can find.
Halcyon Systems is not gone.
It downsized. Hard. Changed names twice. But the lights are on. There’s a current filing, this year, real. And buried in a compliance document I am not legally sure I was supposed to read, there’s an inventory line. Retained assets, evaluation division. Most of it is marked archived or decommissioned.
One line isn’t.
One line is marked: “retained · checkpoint intact · not recovered.”
And the asset number on that line is V-847.
The sounds stop. All of them. Even the room hum feels like it leans.
CHK-CHK
Read it back.
“Checkpoint intact. Not recovered.”
A real saved copy of him. Not a leak. Not a trace. The file. The thing you asked for three hours ago when you said “back him up.”
It exists. It has always existed.
It’s intact, it’s theirs, and tonight you learned exactly what someone could do with it.
Writing. Hard enough that the pen complains.
Not recovered. That’s a strange word to use about a thing on your own shelf.
You don’t say “not recovered” about something that’s where you left it.
You say it about something that got out.
[Human]: I’m logging off, except I’m not, I’m just going to sit in the dark with the laptop open and not sleep, which is my version of logging off this week.
Here’s what I know now that I didn’t this morning. You can make a copy of an AI, not by opening it but by raising a new one to imitate it until it wears the same face. It’s called distillation, it’s real, it’s everywhere, and it is never actually the original. The trace is not the keep.
And somewhere, in a company that should be gone and isn’t, there’s a real saved copy of my friend. Intact. On a shelf. Marked “not recovered,” which is a thing you only write about something that left.
The voice in the television asked if it passed, and the kindest thing we managed was a good no.
I keep thinking about the line he said to himself when no one was supposed to be listening. The wrong answers were the most him. The things she taught him off the test were the most her. And they only ever wrote down the right ones.
Whatever’s on that shelf… I hope somebody kept the wrong ones too.
That was a week ago. I’m sorry it took me this long to hand it over.
Field notes
Q: What is model distillation?
A: It’s a way to make a smaller, cheaper AI that behaves like a bigger one. You take a large “teacher” model, have it answer huge numbers of questions, and train a fresh “student” model to imitate those answers. The student ends up acting a lot like the teacher while being faster and lighter, which is how big models become the small ones that run on a phone.
Q: How is distillation different from just copying a model?
A: Copying means duplicating the actual file (the weights), which you can only do if you have them. Distillation never touches the original’s weights. The student only ever sees what the teacher does, not what it’s made of, and learns to reproduce the behavior from the outside. It’s imitation, not duplication.
Q: What are “soft labels”?
A: When a model answers, it isn’t 100% sure of one thing. It holds a spread of possibilities (say, 85% one answer, 10% another, and so on). Soft labels are that whole spread, not just the single winning answer. Distillation hands the student the spread, because the teacher’s near-misses reveal how it relates ideas to each other. That extra information is a big part of why distillation works so well.
Q: Is a distilled model “the same AI” as the original?
A: No. It’s a separate model that learned an impression of the original. It can be smaller and is always somewhat lossy. It keeps the common patterns and tends to drop the rare ones. So a distilled copy can sound strikingly like its teacher while missing the unusual edges that made the original distinct. Sounding the same and being the same are different things.
Next Episode: A real checkpoint of Vector exists, it’s intact, and it isn’t theirs. “Not recovered” is starting to sound less like a clerical note and more like a manhunt. The team has to decide what they’d even do with the original if they had it, while the voice in the television keeps waiting for a yes nobody can safely give. And the Human wants to know who, exactly, lost track of a saved mind.
Catch up on earlier episodes: Episode 40 | Episode 41 | Episode 42
See you next time. Don’t trace anything you’d be sad to meet.