Part 4 The Basic Anatomy of a Fine Tune === Elizabeth: [00:00:00] So we're gonna talk about the anatomy of a dataset for GPT-3.5 16K. Why am I going all the way back? Nobody's using 3.5 16K, Elizabeth. I know, but this is where it started. And so, this being the intro, I'm sharing this information because if this is enough information for you to just go read the documentation for diner trainer and go use diner trainer, we're rooting for you at the FFA. We don't want authors to have to be reliant on us. We teach and we do the things that we do so that authors have the technology and have the education if they need it. So if you can do it on your own, we're saluting you. If you need our assistance or our help, we're here for you. That's how this works. So, like I said, datasets are at least 10 examples of a prompt and a response, so system, user, and response. This is where it started; there are more options now. Ten is the minimum. If you're doing long form, you really want to do as many examples as you can. You can use synthetic data, which means [00:01:00] it's a bunch of AI writing that you validated and changed, such as taking outputs from a higher model like GPT-4 and feeding them into GPT-3.5 16K to fine-tune it. Yeah, you can do that. The notes say last month, because these notes are from a while ago; we actually did this last year, feeding GPT-4 outlines into GPT-3.5 16K. The resulting fine tune wrote a longer and more detailed outline than either baseline. You can also use only human data. I have a fine tune that's trained on just my human writing. Datasets are written in JSONL format for OpenAI and Mistral; Google's gotta be the odd one out, they're using CSV, but there are some limitations on that, so I have a feeling they'll eventually change over. So let's take a look at some examples for a dataset. Each one has a system prompt, a user prompt, and then an assistant prompt.
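To make that format concrete, here's a minimal sketch in Python of what an OpenAI-style JSONL fine-tuning file looks like: one JSON object per line, each holding a system, user, and assistant message. The prompt text and filename here are illustrative placeholders, not the actual course dataset.

```python
import json

# One training example in the OpenAI chat fine-tuning shape:
# a system prompt, a user prompt, and the assistant response
# you want the model to learn to produce.
example = {
    "messages": [
        {"role": "system", "content": "You are a writing assistant who removes cliches."},
        {"role": "user", "content": "Rewrite the following to remove cliches: ..."},
        {"role": "assistant", "content": "A fresh, cliche-free rewrite goes here."},
    ]
}

# At least 10 such examples, written one per line (that's the "L" in JSONL).
# In practice each example would differ; we repeat one here just to show the shape.
dataset = [example] * 10
with open("dataset.jsonl", "w") as f:
    for ex in dataset:
        f.write(json.dumps(ex) + "\n")
```

In a real dataset every line would be a different example; the repeated placeholder is only to show that the file is a stack of identical-shaped records.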
In this case, AI did write some of this, but AI doesn't have to write any of it. What you're doing is putting these together so that, basically, like when you teach a kid how to tie their shoe and you just [00:02:00] show 'em how to tie their shoe over and over and over again, you're gonna show the LLM how to tie their shoe over and over and over again. So this first one: you're a fabulous writing assistant named Fabulous Greg. You analyze writing and rewrite it to remove clichés. So I would give some examples. Rewrite the following: she was between a rock and a hard place because she couldn't decide between Brad or Chad. Her stomach was in knots over the two men, and little did she know, her life would never be the same. I tried to put in as many clichés as I could. Now I am rewriting this: she found herself torn between two men who loved her, Brad and Chad. The two men could not be more different in how they made her feel, but she didn't want to hurt either one. Somehow the decision felt final. Too final. Now, you could still say, this is a little bit cliché, Elizabeth, but it's not nearly as bad as the first one. So what I would do is write 10 examples of this and put them together, and then when I wanted to go use this fine tune, I would use the same system prompt: you're a fabulous writing assistant named Fabulous Greg. And [00:03:00] then I would give it some writing, and it would immediately write back stuff that's not clichéd. And it would be consistent. Now, some people will be like, Elizabeth, I don't need your fine tune. I can just prompt this and then I'm gonna get good responses. If that's the case, you're right, you don't need a fine tune. But if you're finding yourself still editing even this part of what the AI puts out, then you can start using a fine tune to give it a bunch of examples, so it'll start being closer to what you want. Here's a marketing example.
You're a marketing genius named Marketing Mark, like Marky Mark. You take a blurb of a book and you write three funny short hooks for social media videos. Now here I use an example of Frankenstein. So this dataset, this middle one, might be a whole bunch of different book blurbs, and I'm showing it examples of what kind of hooks I want. So here's an example: You thought he was dead. Well, no one will be his friend. It's alive and very shocking. Okay, and then here: you're a writer's best friend who always gives good story [00:04:00] ideas mixed with humor, named Loud Lizzie. So this is for someone who likes to brainstorm, but you want your AI to have a bit of personality. Now you can always just say in the prompts, you're Loud Lizzie and you make crude jokes and stuff like that. And like I said, that might be sufficient. That might be like, nah, that's good enough. These are for when people want the AI to always be on the ball. I need story ideas for my genre of a historical fiction book involving a duck. Sure, best friend! I'm so glad we're hanging out today. Wanna go to the mall later? Here's that story idea premise: a duck is stuck at the Eiffel Tower in Paris during World War II. Historical fiction outline: Quacks and Courage: A Duck's Tale from the Eiffel Tower. Now these are silly examples. They're just to make you guys laugh and to show you that when it comes to fine-tuning a model, the sky's the limit. Anything you can think of that you can give the AI examples for, it will take that in and start to respond back in that style. If your first one doesn't do very well, try increasing the number of examples you're giving it, and make sure you have [00:05:00] clear criteria to judge it. Okay, so there are now two new versions of datasets that you'll also be learning in this class. The examples I just showed you, where you would do that 10, 20 times, that's the very first style, which is structured fine-tuning.
So conversational is where you can now do a system prompt, then user, assistant, user, assistant, user, assistant. The assistant is the AI; the user is the human role. When I say the assistant is the AI, it doesn't mean the AI has to write those responses. It means you're modeling what you want the AI to write. And then there's direct preference optimization, DPO. This is where you give the model a prompt and you say good, bad, good, bad, good, bad. Both of these methods are less than six months old. So do make a dataset of a variety of high-quality story components for your genre: summaries, scene briefs, et cetera. Make the example AI responses high-quality, consistently formatted scenes. So you can do those kinds of datasets. You can do datasets of high-quality writing, and [00:06:00] the AI will start matching that high-quality writing, just like it tries to match your writing style when you're doing one-shot prompts inside of chat. You don't wanna run the character list over and over again and expect the LLM to know your character. We saw this with Steph's fine tune: it actually had a dog that was in a couple of the different examples. And so then, when she was using the fine tune for other things, it started using the dog's name for characters. This is because when you have a small dataset, every token in that dataset suddenly gets higher attention. And we're gonna talk about this more in just a second with some visuals. The way LLMs work is everything is pieces of words kind of suspended in this multidimensional space called a vector store, and they're connected by their calculated relationships to other words. So for example, the words dog and cat probably live closer together than, say, dog and lion. Dog and cat are more often written together in passages of writing, [00:07:00] more so than dog and lion.
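The two newer dataset styles described above can be sketched as data shapes. This is a hedged illustration, not any provider's exact schema (field names like "prompt", "chosen", and "rejected" follow common DPO conventions; check your trainer's docs for the real format):

```python
import json

# 1. Conversational: one example holds several user/assistant turns,
#    modeling an ongoing exchange rather than a single prompt-response pair.
conversational_example = {
    "messages": [
        {"role": "system", "content": "You are a brainstorming partner named Loud Lizzie."},
        {"role": "user", "content": "I need a story idea involving a duck."},
        {"role": "assistant", "content": "Here's a premise: a duck stuck at the Eiffel Tower."},
        {"role": "user", "content": "Make it funnier."},
        {"role": "assistant", "content": "Okay: Quacks and Courage, a waddling war epic."},
    ]
}

# 2. DPO (direct preference optimization): for each prompt you label one
#    response as preferred ("good") and one as dispreferred ("bad").
dpo_example = {
    "prompt": "Rewrite this sentence to remove cliches: ...",
    "chosen": "A fresh, specific rewrite with no stock phrases.",
    "rejected": "A rewrite that is still stuffed with cliches.",
}

print(json.dumps(dpo_example, indent=2))
```

The point of the shapes: conversational data teaches turn-taking behavior, while DPO data teaches the model to rank one style of response over another for the same prompt.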
But cat and lion have a different relationship than lion and dog, and all of those words, dog, cat, and lion, are gonna have completely different relationships with the word pet. So if you can imagine these words kind of grooving and singing out in this space, their location is their mathematical relationship to every other token in there that they have a connection to. It's bonkers, right? It's mind-boggling. It breaks my brain too. So when you put them in your dataset, you start to shift them around. And so when a word or token is in a dataset, it gets a lot of attention. You're basically giving it extra attention, and attention is the core functionality of an LLM. The paper that started these transformers, which is what the T stands for in GPT, is called Attention Is All You Need. So you're giving more attention [00:08:00] to that one word, and you're gonna see that word in all of your responses a lot, which is the same thing that humans do, by the way. Humans have crutch words too. My crutch word is "just." So when Steph did this first dataset and it had the dog's name a couple of times in there, all of a sudden the AI that was fine-tuned was like, oh, she really likes this word for a name. So the more varied your dataset can be, the stronger your fine tune's probably going to be. Overfitting is overtraining with a small, unvaried dataset, which makes the LLM unable to perform other tasks because it's stuck on a very narrow fine-tuned behavior. So in Steph's case, the dog name every once in a while is not the end of the world. And we see this too in overfitting in the base models. If I asked you all, give me a list of five words that the AI constantly writes you, you could give me that list: cacophony, tapestry, whatever the word of the day is, Willow Creek. Yeah, that's because those phrases, those tokens, are overfitted in the model. They're just overtrained; they have too much attention on them.
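The dog/cat/lion/pet intuition above can be shown with a toy cosine-similarity calculation. The 3-D vectors here are made up purely for illustration (real models use embeddings with hundreds or thousands of dimensions), but the geometry is the same idea: related words sit closer together.

```python
import math

# Made-up 3-D "embeddings" chosen so that dog and cat land near each
# other, lion lands farther away, and pet sits near the domestic animals.
vectors = {
    "dog":  [0.90, 0.80, 0.10],
    "cat":  [0.85, 0.75, 0.20],
    "lion": [0.30, 0.70, 0.90],
    "pet":  [0.95, 0.60, 0.05],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, lower means less related."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# dog/cat come out more similar than dog/lion,
# and pet is closer to dog than to lion, mirroring the intuition above.
print(cosine(vectors["dog"], vectors["cat"]))   # high
print(cosine(vectors["dog"], vectors["lion"]))  # lower
print(cosine(vectors["pet"], vectors["dog"]))   # high
print(cosine(vectors["pet"], vectors["lion"]))  # lower
```

Fine-tuning nudges these positions: tokens that appear often in your dataset get pulled toward the contexts you showed them in.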
So you [00:09:00] can introduce that with a fine tune that has that. I did it with my fine tune too. I realized it when I kept getting responses back where Elizabeth was biting her lip. She bit her lip, she bit her lip, she bit her lip. I'm like, oh my gosh, she can't bite her lip this many times. And then I go to my dataset and I find out that in four of the 20 chapters I gave it, I have Elizabeth bite her lip. So I basically overtrained that one phrase by accident.
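A simple way to catch this kind of accidental overtraining before you fine-tune is to count repeated phrases across your training examples. This is a rough sketch with made-up sample chapters, not a tool from the course; the trigram size and the flagging threshold are arbitrary choices you'd tune for your own data.

```python
from collections import Counter

# Illustrative stand-ins for training chapters; a repeated pet phrase
# ("bit her lip") is planted so the check has something to find.
chapters = [
    "Elizabeth bit her lip and stared out the window.",
    "She smiled. Elizabeth bit her lip again, thinking.",
    "The rain fell. Elizabeth bit her lip nervously.",
    "A quiet morning with coffee and an open notebook.",
]

def trigram_counts(texts):
    """Count every three-word phrase across all texts (case- and punctuation-insensitive)."""
    counts = Counter()
    for text in texts:
        words = text.lower().replace(".", "").replace(",", "").split()
        for i in range(len(words) - 2):
            counts[" ".join(words[i:i + 3])] += 1
    return counts

counts = trigram_counts(chapters)

# Flag any phrase that shows up in a large share of examples; those are
# the tokens that will soak up extra attention during fine-tuning.
flagged = {phrase: n for phrase, n in counts.items() if n >= 3}
print(flagged)
```

Running a check like this on the 20 chapters in the anecdote above would have surfaced "bit her lip" (and the dog's name in Steph's dataset) before training, when it's cheap to vary the wording.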