Part 6 Why a Fine Tune Can't Learn Your Characters === Elizabeth: [00:00:00] So it's like Play-Doh: there's only so much in the container. You can use a tool and make it into different shapes, not just with your hands. This is why you want to make a fine tune your own: the fine tune makes no permanent change to the core training of the model. So when you fine tune ChatGPT-4o, you're not changing the core 4o model; it's your personal fine tune. You can't add or subtract Play-Doh from the set; you just have the Play-Doh that's in the bucket. Each time the LLMs get bigger, we get more colors and bigger containers of Play-Doh to play with. We saw that in the tokens: "Liz" is token 159,136. If I go to GPT-3.5, it has to be broken up into two tokens. The more tokens you break a word into, the more likely confusion and hallucinations are going to happen, versus a concept that's only one token. So each time they get bigger, we get more colors and bigger containers of Play-Doh to play with.

Now, what's an overfit? [00:01:00] It's when you try to do too much and you turn all those colors into that terrible brown color. You know the brown I'm talking about: when you take all the Play-Doh colors and mix them into one, and you can't un-mix them. It's just broken; you can't get the pretty individual Play-Doh colors anymore. Or, in the case of an LLM, you lose other functionality.

So why do we use strange, weird names like Outline-Mageddon? Because if there isn't already a token sequence for the name of what we're doing, giving it a unique, weird name like Outline-Mageddon or Marketing Mark helps the LLM create a new kind of relationship. We have found success doing these kinds of things. You don't have to, but when we put it in the system prompt, it helps the AI understand exactly what we're doing and sends that signal in the system prompt.
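The one-token-versus-many-tokens idea can be sketched in plain Python. This is a toy greedy tokenizer with a made-up vocabulary, not the real GPT tokenizer (the real vocabularies, where IDs like 159,136 live, can be inspected with OpenAI's tiktoken library):

```python
def toy_tokenize(word, vocab):
    """Greedy longest-match tokenization: a simplified stand-in for BPE."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, shrinking until a match is found.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character becomes its own token
            i += 1
    return tokens

# A bigger model's vocabulary knows "Liz" as a single entry...
big_vocab = {"Liz", "Li", "L", "i", "z"}
# ...while a smaller, older vocabulary has to assemble it from pieces.
small_vocab = {"Li", "L", "i", "z"}

print(toy_tokenize("Liz", big_vocab))    # ['Liz'] -> one token, one concept
print(toy_tokenize("Liz", small_vocab))  # ['Li', 'z'] -> split, more room for confusion
```

The fewer pieces a name shatters into, the more the model can treat it as one stable concept, which is exactly the Play-Doh point above.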
So the fine tune uses the specific sequences of tokens in your dataset to rearrange tokens inside the LLM, moving them closer to or farther from other tokens. A fine tune [00:02:00] is us rearranging the furniture, and that rearrangement had to involve the specific token sequences that appeared repeatedly in the training dataset. This is how you can get "she bit her lip, she bit her lip, she bit her lip," or "Willow Creek." If you have that too often in the dataset, you have rearranged that furniture to the front of the room, and of course that's going to be the first chair the LLM tries to have you sit in. Since these name sequences are gibberish, unlikely to be in a ton of other training data, we're helping the LLM recognize right away how we want the furniture rearranged for some tokens, not all tokens in the LLM. But we do not literally write new tokens to the LLM. We only rearrange the tokens inside, changing their address in relation to other tokens in the vector field.

One last visual aid. Imagine this vector field as a bunch of metal bits and screws lying on a table, and you drop a magnet on it, like a fine tune. You'll see that some of the bits and screws get sucked right to the magnet, but the ones the dataset, or in this case the magnet, had nothing to do with stay where they are. If they're not affected by [00:03:00] the magnet, they won't move. That is essentially what a fine tune does: it changes how the existing tokens in the training dataset relate to one another, in small fine-tune amounts.

You got through it. Congratulations. Hopefully there was enough education about what a fine tune is, what we're doing, and why. This is what's going to allow the AI to talk like you. So here are some questions to help you think through how a fine tune will help you.
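The magnet analogy can be sketched as a toy illustration. To be clear, this is an assumed, simplified picture, not a real training loop (real fine-tuning updates weights via gradient descent), but the geometry is the point: tokens that appear in the fine-tune data get pulled, tokens that don't stay put.

```python
# Toy 2-D "vector field": each token has a position. (Made-up numbers.)
embeddings = {
    "she":    [0.2, 0.9],
    "bit":    [0.7, 0.1],
    "lip":    [0.4, 0.5],
    "dragon": [0.9, 0.8],  # never appears in the fine-tune dataset
}

def nudge(embeddings, dataset_tokens, strength=0.1, target=(0.0, 0.0)):
    """Pull only the tokens that occur in the dataset toward a target point,
    like metal bits jumping to a dropped magnet."""
    for tok in set(dataset_tokens):
        if tok in embeddings:
            x, y = embeddings[tok]
            embeddings[tok] = [x + strength * (target[0] - x),
                               y + strength * (target[1] - y)]

before = {tok: list(vec) for tok, vec in embeddings.items()}
nudge(embeddings, ["she", "bit", "her", "lip", "she", "bit", "her", "lip"])

print(embeddings["she"] != before["she"])        # True: it moved toward the magnet
print(embeddings["dragon"] == before["dragon"])  # True: the magnet never touched it
```

Tokens outside the dataset keep their old "address," which is why a fine tune can't rewrite the whole model.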
Take some time to jot them down before you go through the rest of the modules, because this is going to help you understand this and get proficient at fine tunes a lot faster. What kind of consistent response style do you want from the LLM? One of the biggest things I hear is "I want it to write longer" or "I only want it to write 400 words." Length is a really big thing for authors. Having a fine-tune dataset that consistently shows outputs of the length you want will help the LLM at least hit that length. You may not like all the words, but it'll at least hit that length.

What [00:04:00] kinds of prompts, system and user, will you be routinely using in hopes of getting this response? If you're someone who prompts on the fly, a pantser or whatever, take a look at your old conversations. If you're constantly saying "can you expand that?" or "can you make that longer?", add that to the first prompt in a conversation chain, and then give the second answer from the prompt chain as the first answer in your fine tune. That way the LLM skips the first bad answer and jumps straight to the good second answer.

Also, why does the current baseline prompt fall flat? Have you exhausted the tricks of prompt engineering, like changing your hyperparameters (that's temperature and such) and styles of prompting such as chain of thought? In other words, have you moved beyond basic prompting like "write me a book"? Are you actually giving very detailed prompts that sometimes work for you and sometimes don't, and you just want the results to be consistent?

And then the biggest one of all: how will the fine tune be a win? What I mean by this is, for example, I was very happy when I got GPT-3.5 16K [00:05:00] to consistently write me 2,000 words. That was a win. I didn't even care if they were wonderful words or not. And my fine tune of 3.5 16K immediately brought dialogue in.
That's not how GPT-3.5 16K used to write. It used to never write dialogue; you'd have to ask for it in a second response. And it actually wrote me 2,000 words in a response, 1,500 to 2,000 in a single go. Those were the win conditions. So think about your win conditions for your fine tune. That way, if you get stuck or you have questions, you can always email us at dean@futurefictionacademy.com or ask a question in our Discord. If you can understand your win condition, then we can help you troubleshoot and figure out the best dataset for you, or the best model, or whether a fine tune is even going to be able to solve your problem. But if you can't clearly communicate what's wrong and what you want from a fine tune, you are never going to be satisfied with it. You're just going to be like, "well, it kind of feels like it's better, but maybe not." So make sure you have that win condition.
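The prompt-chain trick from earlier (fold "can you expand that?" into the first prompt and keep the good second answer as the target) can be written as one training example. This sketch uses the chat-style JSONL format OpenAI's fine-tuning API expects; the prompt and response text are made-up placeholders, not a recommended prompt:

```python
import json

# One fine-tune example: the expansion request is folded into the user prompt,
# and the good *second* answer from the old chat becomes the assistant target,
# so the model learns to skip the bad first draft entirely.
example = {
    "messages": [
        {"role": "system",
         "content": "You are a fiction co-writer. Write vivid, dialogue-heavy prose."},
        {"role": "user",
         "content": ("Write the scene where Mara confronts the smuggler. "
                     "Write at least 1,500 words and include dialogue.")},
        {"role": "assistant",
         "content": "Mara pushed through the tavern door... (your good, expanded answer)"},
    ]
}

# A fine-tune training file is JSONL: one JSON object like this per line.
jsonl_line = json.dumps(example)
print(jsonl_line.startswith('{"messages"'))  # True
```

Every example in the file repeats this shape, and that repetition is the "furniture rearranging" described above.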
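A win condition like "consistently hits 1,500 to 2,000 words" is also something you can actually measure against your fine tune's outputs. A minimal sketch (the function name and thresholds here are just illustrative):

```python
def hits_length_target(response: str, min_words: int = 1500, max_words: int = 2000) -> bool:
    """Rough win-condition check: count whitespace-separated words."""
    word_count = len(response.split())
    return min_words <= word_count <= max_words

short_draft = "She bit her lip. " * 50  # about 200 words: misses the target
long_draft = "She walked on. " * 600    # about 1,800 words: hits it

print(hits_length_target(short_draft))  # False
print(hits_length_target(long_draft))   # True
```

Running a check like this over a batch of test generations turns "it kind of feels better" into a yes-or-no answer.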