How to master prompt engineering & the future of generative AI
Join Dr Keith Grimes for an engaging session on "Mastering Prompt Engineering & The Future of Generative AI." Prompt engineering is the process of crafting and refining detailed instructions for generative artificial intelligence (generative AI) software to achieve high-quality, relevant outputs.
This session will begin with a hands-on demonstration designed to teach or improve upon your prompt engineering skills. Together we will discover techniques to craft precise and effective prompts that optimize AI performance.
Following this, we will briefly explore the future of Generative AI. Explore emerging trends, potential innovations, and ethical considerations shaping the next wave of AI advancements. This session will equip you with practical skills and forward-thinking insights essential for navigating the evolving AI landscape.
Expert: Keith Grimes, Digital Health & Innovation Consultant, Curistica
Transcription:
Good afternoon to everybody joining us and welcome to today's webinar. Kieran Walsh is my name. I'm clinical director at BMJ. It's my pleasure to be the moderator for today's session. The topic for today's webinar is how to master prompt engineering and the future of generative AI. And we're delighted to be joined once again by our expert speaker, Keith Grimes.
BMJ Future Health is a new community that we've created that consists of webinars, podcasts, and a live event in November. We believe that innovation through digital solutions is essential to creating thriving healthcare systems that are stable and financially sustainable, with a supported workforce, and that deliver both better patient outcomes and healthier populations.
At the end of this presentation, we'll have a few minutes of Q&A where you can submit your questions. Please do add your questions to the Q&A box. Just a reminder that this session is being recorded and will be available to watch again. So it's my pleasure to introduce Keith Grimes, Digital Health and Innovation Consultant at Curistica.
Keith will be talking about prompt engineering, playing safe, RAG, and reuse, and will also briefly explore the future of generative AI. So Keith, warm welcome to BMJ Future Health. Over to you, really interested and looking forward to this talk. Fantastic. Hello, everyone. Now, first of all, can you see the screen and hear me speaking?
Kieran, just a thumbs up will do it. Yes. We're ready to go. Okay. Fantastic. Oh, I'm totally delighted to be speaking to you all today and talking about prompt engineering and the future of generative AI, an area that I find endlessly fascinating and practically very useful in my work as a digital health doctor, but one that is also, now and increasingly over time, going to be very useful to almost everyone.
If you understand the strengths and weaknesses of the technology and how to maybe get the most out of it. That is my mission today. So we're going to get started. So this is me. I won't go through this in too much detail. You've heard all you really need to know. But why am I teaching you about this?
Okay. So this gentleman here, Ethan Mollick, is a professor in the US, and he wrote a book called Co-Intelligence. Generative AI and large language models are very capable tools. They can do a lot of different things; they're good at some things and not so good at other things, and they're changing all the time.
So how do you use a bag of tools like that? One way is to use it alongside what you do every day and work it out yourself. And a study that they did at the Boston Consulting Group gave consultants some tasks that were very similar to the work they would do every day, and just let them use something like ChatGPT to see if it could help them. What would happen? They found that for very many tasks, they got better. They were able to complete more work. They were faster. The quality of the work improved. And those that started weakest actually gained the most.
But for some tasks, they weren't so good. It actually made things a little bit worse. And Ethan Mollick describes this as the jagged frontier. This is the sort of boundary inside which AI performs very well and you get better, outside which you get worse. Now, why is that useful to know? Number one, that wall is always moving because AI is always changing.
And the second thing is that people's jobs are all different. So how do you work out how to use AI in that situation and the answer is you have to use it. You have to work this out yourself and share how you're getting on. And this is why I talk about generative AI and prompt engineering because that's the key way that you can do this safely and effectively.
So we're going to very briefly recap LLMs. Super brief on this one considering the strengths and weaknesses and risks. And then we're going to go straight into prompt engineering, which I covered quite quickly last time. I'm going to take a bit more time this time. I'm going to tell you about playing safe, about the things you need to know, and tell you a little bit about something called RAG and how you reuse these models.
We're going to look at the future of generative AI. And then at the end, we can have some questions and answers. And if you put them in the chat, then we can go through them afterwards. And I'm always available on LinkedIn if there are questions that we don't answer. Blitz recap. Okay.
Deep learning is a type of machine learning that uses neural networks with multiple layers, hence "deep", and identifies patterns in data. Generative AI is a form of deep learning, and what that does is take that learning and generate new data off the back of it, across all modalities. Text, speech, video, you name it.
But a large language model works specifically on text. Something called a transformer model, trained on a very large amount of data, taking the text in, predicting the next word until it says it should stop. That is a large language model. It is strong in being very broadly capable. It can summarize, expand, reduce, translate and even reason in a chat interface, so it's very convenient to use.
But it is quite weak in some areas. It can't search the internet without specific tools. It's not very good at maths or references. It hasn't got a very good memory unless it's provided with the means by which to hold on to these thoughts. It's not very good at explaining itself in the sense that it's not explainable, although it can give reasons and it can actually lose focus a little bit.
And then that means there's some risks. It can hallucinate. It can create very convincing information that's incorrect because it's trained on real world data. It can present biases that are in that data or how it's being used. Data that is private can sometimes be surfaced from within this as well.
And because it's trained on data from a period of time, its performance can drift: it can get less good over time, or in certain populations. If you're using internet data, copyrighted data can be incorporated. It can be quite costly in terms of energy, and even water to cool the data centers.
And then when it comes to clinical use, there are issues of safety and regulatory compliance. And again, to date there is not, to my knowledge, a large language model in a regulated medical device, although a lot of work is going into it. Okay. That's the recap done. Hopefully some of you aren't coming to this fresh, but you remember: oh yes, this is what they can and can't do.
And this is very important because we're about to try and push it a little bit to go beyond where you are right now to somewhere else. Prompt engineering. Why prompt and why engineering? Prompt is this kind of instruction. If you want to get a large language model to do something, you have to tell it to do something.
Or you can train your own model, phenomenally expensive. Or you can even fine tune a model, get your own data and do that too. But that's quite hard. The thing that's available to you most of the time is providing information in the form of a prompt. So it's an instruction, natural language, like a task or a question or a series of instructions that allows the model to respond.
Now, how that prompt is phrased has an enormous significance in terms of how the model then performs. And prompt engineering is just a term that has evolved about how people can then understand how this works and craft their instructions in a way that gets them better output. That's pretty much it.
Okay, basic approach. You always have to think about what you want to do. First, spend a moment thinking about exactly what it is you want to do, and then get set up to do it safely. I'll talk you through that. And then the basic thing is you'll start just by asking, and then increasingly add new things: structure, context, step-by-step thinking, further instruction. And then I'll show you how you can wrap it up and reuse it in the form of a custom GPT.
So Here we go. How do you get set up? There's a number of different chatbots out there that you can use. The first thing you need to do is say do I need to use AI at all? AI is great fun and quite capable, but it's not always the best tool or the safest tool for the task. There are many different ways that you can approach this that don't use such complex technology, but it does start with thinking about what is it I want to do?
You're thinking, what is this problem? That reflects nicely on the theme of everything we're doing. What's the problem that you're trying to solve? What's this task that you're trying to do? And then think what's the kind of key information that I need to provide? What's this context that I need to provide?
Think of this as a very capable intern walking through the door. You're describing a task, what is this problem? What do you need to know to solve this task? Are there any constraints? Are there any things that are very specific about what you want to do? And then when it produces this output, is there any kind of format?
Is there any sort of like template that you want to use? Now, these models are very capable, but you might want to break it into smaller chunks. And then make sure that you have a step to review and evaluate the output. It can help to think about what would success look like in this, particularly for more complex or repeating tasks?
Think about how I will judge whether this is good or not, particularly if I'm changing the prompt: how will I know it's getting better? And then plan for any follow-up questions or any clarifications that you have. So that's the general structure for thinking about the problem. And now you're ready to start using the tool.
You have to choose the right tool. Now, there are a few tools out there and this constantly changes. Still, I'd say that the paid version of OpenAI's ChatGPT, or something through Microsoft Copilot, is a really good one to use. And why do I say that? GPT-4o is still a very well performing model.
Although there is a lot of competition now, when you're using it in the paid environment or in an enterprise environment, it's secure. And the text that you put in is not used to train the model afterwards. So you want to be able to assure yourself of that. You have much higher chat limits.
If you use the free version, you'll run out of chats, so you will only be able to do so much. And you can upload documents, and we'll come back to that. It can also use tools: web search, creating images, and even things like code assistance and data analysis, which I won't go into in this talk. And then the lovely thing at the end of it all is that if you build something that works, you can put it into something called a custom GPT, which you can reuse.
You can share it. And that's a very powerful way of taking the work that you do and sharing it with others. There are other models though. In the middle you've got Copilot, and Copilot is provided within the Microsoft Office suite, and you can get paid versions as well. A lot of the models are the same ones that OpenAI uses.
But in addition to that, when it's integrated in Microsoft Office, it can integrate with your drive and other documents like email and everything. So that might be valuable to you. On the left, you have Claude by a company called Anthropic. I very much like Claude. It doesn't have all the tools that ChatGPT has.
For example, it can't search the web, can't create images, but in all other respects it's actually one of my go-tos. I recently asked on LinkedIn what people use, and about 75 percent of people use ChatGPT, and about 17 percent use Claude. It's worth trying to see: you can have a free go on it, and you can have a paid version as well.
It doesn't have some of the tools, but the job that it does with the tools that it has is very good, including something called Artifacts, which is shared documents that you can work on. So yeah, I do love that. And then there are other ones as well. Perplexity is a common one. Perplexity really focuses on leveraging things like web search and different models too.
And you can access lots of different models. Again, there's a free version and a paid version, and there are many others that we'll mention throughout this, but for the purposes of the demonstration today, I'm going to use ChatGPT. So what do you do? Okay, the first thing that you do is just give it a go, because these models have been trained on an enormous amount of data and information.
That sounds great. Actually, maybe just asking it to do what I want it to do in a very simple way will do the trick. And you don't need to do very complex prompting to do this in the first instance. You probably will have to, but you might as well give it a go because you might have a small task that you want to do.
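The "just give it a go" step in the demo happens in the ChatGPT web interface, but the same vanilla prompt could be sketched in code. A minimal sketch follows, assuming the OpenAI Python SDK; the model name and the helper function are illustrative choices, not the speaker's exact setup, and actually sending the request would need an API key.

```python
# Package a single plain-language instruction as a chat request.
# The "gpt-4o" model name and this helper are illustrative assumptions.

def build_vanilla_request(task_text: str, model: str = "gpt-4o") -> dict:
    """Return the payload a chat-completions-style API expects: one user
    message containing the whole task, with no added structure."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": task_text}],
    }

request = build_vanilla_request("Respond to this complaint letter: <COMPLAINT>")

# To actually send it (requires the `openai` package and an API key):
#   from openai import OpenAI
#   client = OpenAI()
#   reply = client.chat.completions.create(**request)
#   print(reply.choices[0].message.content)
```

The point of the sketch is that a vanilla prompt is just one unstructured user message; everything that follows in the talk is about enriching that message.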
Okay, what we're going to do today is we're going to talk about a use case. Now, remember I mentioned earlier on about the risks of using large language models, and it is true that if you're using it in a medical situation, that you are going to be taking on a lot of risk with this. These aren't medical devices.
They've not been designed for that purpose, and the regulatory status is still unclear as well, so you need to make sure that it's safe and everything like that. So I'm going to steer clear of that and look at a more common area, which is the kind of middle office or back office in clinical settings.
What are the sort of things that aren't immediately patient facing, but will be a good demonstration. So I'm going to talk about answering a complaint. Now, again, if you were doing this in the wild, you'd want to be careful. You don't want to put any patient identifiable information in there. You want to be careful about checking it and so on, but I'm going to use it to illustrate how prompt engineering can work.
So if you can accept it in that way, I think we're good to go. All right, so let's think of a scenario. So I'm a GP and I've had complaints in my time. I'm sure any doctor, anyone here has had complaints about practice. It's part of normal practice. So we have to make sure that we respond to them. So maybe that's a task that I would want to do.
So let's throw up an example complaint. I've just made up a complaint here right now. Here we go. Okay. So dear Dr. Grimes, when you saw me last week, you told me I had viral tonsillitis and I didn't need antibiotics. It didn't get better. So the next day I saw another doctor and they gave me antibiotics.
And within hours of the first dose, I started to feel better. I think you're a bad doctor and I wanted to complain. Okay. So hopefully your complaints are going to be maybe not quite as direct as this, but you get the idea that we're going to be dealing with a complaint in this situation. Okay. So I said, let's try and do this in a straightforward way.
Let's create a vanilla prompt. Let's create a straightforward one. Okay: "Respond to this complaint letter." And what I've done here is a little bit of redaction: I took my name and the patient's name out. Okay. So we're going to work with this right now. For the purposes of this, I'm going to use ChatGPT live, and I've got some backups in case the tech gods don't favor me today. I'm just going to drop out of this and go into ChatGPT, and hopefully you can see that in front of you.
So speak up if you can't, but I'm assuming that you can, and I am going to drop that prompt directly into here. Oh, here's a useful tip for you as well, if you want to format new lines: if you press enter right now, it would just submit. But if you hold down shift and enter, you'll get a new line.
So that's actually a very handy little hint there. Anyway. Okay. There's the prompt. Let's see how we do. All right. "Dear patient, thanks for sharing the concerns. Sorry things didn't improve as expected. I understand how frustrating it is." There's a bit of empathy in there. "Based on the symptoms and examination findings, your condition was consistent with viral tonsillitis." I didn't tell it that, but it's right. "If it doesn't resolve..." It recognizes infections can develop and change. Yeah. "Feedback is important." Okay. All right. It's a starter for ten, but I think you'll probably agree that we can do a bit better than this. And there are some problems in here as well.
I've given it so little information that it's had to infer some of this as well. So I'm going to go back to my talk
and think: how do we improve this? The first thing that you do is add structure. Now, there are lots of different frameworks out there, and before now I was using one called CO-STAR, but I've created my own because, as a doctor, I like mnemonics. So this is my own CREATE framework, and it does much the same thing.
It's maybe a little bit simpler. Think about the task that I want to do just now and break it into these settings. So Context: what is the context? I'll provide some background information and material. The Role: this is really important. This LLM can do loads of different things, so let's hone in on what role I want it to play. Be nice and clear. The Expectation: what do I want this to do? What are the instructions here as well? The Audience is going to define the sentiment and the emotion, and the Tone that's being used at the end of this all. And then the Examples: what's the output format and the structure for this?
The other thing that you want to do is maybe use a bit of formatting and syntax, because this can help the model attend to the right section. So I will generally use caps for things that are really important. I can use delimiters, symbols to indicate where I'm changing sense or section. And if you do any coding, you can use XML and JSON, but if you don't know these terms, don't worry about it.
It's just helpful in there. Now, you want to be consistent when you're referring to things as well. So if I talk about giving patient notes and I call them notes, refer to them that way. And as I'm iterating, every time I do this, I try and restart the prompt. If I don't, what happens is that the model considers everything that was said before in that chat.
So I might actually get a response that I want, but it's not responding to the prompt I gave it. It's responding to everything that was said before. So with that in mind, how are we going to change this? All right. Okay. So I'm using the CREATE prompt here. Here we go. So context: I'm a UK-based GP who's received the complaint letter.
The role: you are an expert clinical complaints AI. And it is helpful to say that it's expert or very good; we discussed this in the last talk. The expectation: I will provide it with a letter, and you will review this letter and write a response. The audience is a patient who doesn't have any medical training, and the tone is professional and polite.
And the examples: in this case, I'm not giving it any examples of complaints and answers from the past, although I could do that. In this case, I'm just going to say, look, generate the response in the form of an email. And then the complaint I would insert into this section. Okay. So let us come out of this, go across to ChatGPT, start a new chat, and I am going to put in a structured prompt and paste it in.
I've got a little side document here, and I paste it in, and you can see here, there's my prompt and then there's the complaint at the end. I've tagged it "complaint". All right. How about it? Let's see what it does. All right, here we go. So "Dear [patient's name]". Okay. And it's nice: it's already taken the form of an email here.
"Thank you for your letter regarding the consultation. Sorry to hear it was a difficult experience." Yes, nice. It's still making some assumptions here as well. It's got much the same information: the condition could have evolved, can present as viral initially, and we aim to provide the best care. "I regret you're dissatisfied."
My God, that's us talking here. It's assuming my role in the response: that I don't want people to feel unhappy about this, and arranging a follow-up appointment. So I think that's improved. We can definitely get it better, but that's definitely improved there as well. So you can see how structure allows me to control this task and instruction, but also, if I want to make any tweaks,
I can just change one section as well. Now, I won't do it inside here; I'd do it in the separate prompt, but I could change it. So, for example, I might say something about a reading age in here: "patient without medical training, target at a reading age of whatever", or a particular language, let's say.
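The structured prompt walked through above can be sketched as a reusable template. This is an illustrative sketch of the CREATE mnemonic, not the speaker's exact prompt text; the caps section labels and `###` delimiters are one possible syntax, following the formatting tips given earlier.

```python
# A reusable template for the CREATE structure (Context, Role, Expectation,
# Audience, Tone, Examples). The labels and delimiters are assumptions; the
# model only ever sees the final assembled string.

CREATE_TEMPLATE = """\
CONTEXT: {context}
ROLE: You are {role}.
EXPECTATION: {expectation}
AUDIENCE: {audience}
TONE: {tone}
EXAMPLES: {examples}

### COMPLAINT ###
{complaint}
### END COMPLAINT ###
"""

def build_create_prompt(**sections: str) -> str:
    """Fill every CREATE section; a missing key raises KeyError early
    rather than silently producing a half-built prompt."""
    return CREATE_TEMPLATE.format(**sections)

prompt = build_create_prompt(
    context="I am a UK-based GP who has received a complaint letter.",
    role="an expert clinical complaints assistant",
    expectation="Review the complaint letter and draft a response.",
    audience="A patient without medical training.",
    tone="Professional and polite.",
    examples="Generate the response in the form of an email.",
    complaint="<REDACTED COMPLAINT TEXT>",
)
```

Keeping the sections as named slots mirrors the tweak the speaker describes: changing one section (say, adding a reading age to the audience) without touching the rest.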
Okay. So you can hopefully see. You add some structure, you get a better response. And you can reuse that quite easily, and you can adapt it as you wish. So the next thing you want to do is provide a little bit of context. So remember, you've got this intern walking into the room and you just dump a task on top of them.
You're going to want to give them a little bit of background as well, and context is very important. Large language models are constrained by what they're trained on. They don't know what they don't know. They know a lot, but they don't know what they don't know. And actually, you may want to provide some information that isn't in that.
You maybe have some very specific medical information, like a guideline or something like that, or maybe a specific bit of local guidance that you want to put in my complaint response. So one thing that you can do is you can just dump it straight into the context window. You can just put it into the prompt.
And time was that the context window was small, but now they can be very big. GPT-4 has 128,000 tokens; that's about 100,000 words. Do you want to paste the entire content of a NICE guideline in there? Probably not. But helpfully, ChatGPT allows you to upload files. So I think we may give that a go next time.
Now, bear in mind long context is fantastic, but longer context does increase the risk of confusing it a bit. So it's not a perfect answer, but it's definitely enough for us to be going on here. So I'm going to change my prompt this time. Okay. I'm highlighting in bold the things that I'm changing here.
So UK-based GP, I've had a complaint letter. Yeah. Blah, blah, blah. Okay. Here we go. I'll provide you the letter and the notes. Yes, of course, I'm going to refer to the notes that I took at the time, and provide you the NICE guideline "Sore throat (acute): antimicrobial prescribing". This is the exact name of the file that I'm going to upload in a minute.
And then in this, I've also added a little section for the notes, and I'm going to put them in here. Okay. All right. Back to ChatGPT. Let's start a new chat and add this new prompt. So I'm going to add it in here and take you through it. So there's the context. There's the role. There's the expectation, which I've updated.
I've added the notes and also the NICE guidelines. You review the letter and write a response. No change to audience, tone or examples, but here are my notes. And of course I've made up some notes here, and I've redacted them. Patient, no details. Two-day history of pain on swallowing. Otherwise well, no other medical problems, no meds.
On examination: afebrile, throat appears normal, no nodes. Impression: viral URTI. Plan: self-care advice, review as required, explain viral nature of illness and natural course. Of course I could write better notes than this, so please don't criticize me on that. And then the complaint's in there.
Okay. One other thing I said I was going to do is upload a document. So I'm going to click and upload from the computer. And if I go in here, I downloaded this a little bit earlier on, and that is just the PDF from NICE. You could try and put a link in there or ask it to look for it, and if it has web tools it can find it.
But the less latitude you give the model, the more confident you are that it will actually use this. Okay, so we're all set to go.
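The demo adds context by pasting notes into the prompt and uploading the guideline PDF. In code, the same idea, stitching notes and a guideline excerpt into the prompt with a rough size check against the context window, might look like this sketch. The 4-characters-per-token figure is a common rule of thumb, not an exact tokenizer, and the section delimiters are illustrative.

```python
# Sketch: add context (clinical notes plus a guideline excerpt) to a base
# prompt, with a crude token estimate so a huge document does not silently
# exceed the context window. All names and limits here are illustrative.

MAX_TOKENS = 128_000  # approximate GPT-4-class context window

def add_context(base_prompt: str, notes: str, guideline_excerpt: str) -> str:
    prompt = (
        f"{base_prompt}\n\n"
        f"### NOTES ###\n{notes}\n\n"
        f"### GUIDELINE ###\n{guideline_excerpt}\n"
    )
    approx_tokens = len(prompt) // 4  # rough heuristic, not a real tokenizer
    if approx_tokens > MAX_TOKENS:
        raise ValueError(
            f"~{approx_tokens} tokens: consider uploading the document "
            "instead of pasting it all into the prompt"
        )
    return prompt
```

This mirrors the trade-off in the talk: a long context window lets you paste a lot in, but past a point it is better to upload the file, and very long context can also make the model lose focus.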
Draft a personal, polite response. Here we go. "Dear patient." Starts the same. Little short intro. And fantastic, it's starting to specifically reference my notes. Okay. "You mentioned a two-day history of pain on swallowing, otherwise well. During the examination your throat appeared normal, with no other concerning signs."
Based on these findings and following national guidelines for managing acute sore throat, I advise self care because it appeared your condition was likely viral, which typically resolves on its own without antibiotics. Viral infections, as you may know, are not treated with antibiotics. And then a little bit more about how viral illnesses can change and why things might have developed.
"Regret you don't feel better. Hope this explains why. Any further questions, come back to me." And there you go, designed to be respectful and informative and so on. Okay, that's again looking better. I may go through it and note that I didn't specifically reference the NICE guidelines in my notes.
So maybe some of the wording you'd want to tweak on this. But again, I hope you can see that adding a little bit more context here has improved the quality of the output. All right, so now we're getting somewhere.
What next? Taking things step by step. Okay, what do I mean by that? Remember when you were at school and they asked you to show your working? The same trick works with large language models. Why? A lot of theories, no one's exactly sure, but essentially just asking the model to think things through step by step
allows it to think it through step by step or move through it step by step, first generating a plan of action, and then executing this. And you can even ask it to show you that as well. This improves the quality of the output. So what you're going to want to do is use that language, or if you've got a complex task, you can break each task into steps and say, first do this, then do that.
Then do the next thing. And you either do that as a series of prompts, and that can work quite well because you control it, or you can put it all into one prompt and get it to execute this. So it's a little bit like writing a computer program, except you're using natural language. Let's see what that might look like.
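The step-by-step pattern described here, a magic phrase plus an ordered list of instructions, can also be sketched programmatically. This helper is illustrative rather than part of the demo; the prefix wording mirrors the talk, and the numbering scheme is an assumption.

```python
# Sketch of the "think step by step" pattern: one prompt that states the
# magic phrase and then enumerates the steps in a fixed order, so the model
# is not left to choose the order itself.

STEP_PREFIX = "Let's think things through step by step."

def compose_stepwise_prompt(steps: list[str]) -> str:
    """Join an ordered list of instructions into a single numbered prompt."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))
    return f"{STEP_PREFIX}\n{numbered}"

prompt = compose_stepwise_prompt([
    "Read the notes and consider the diagnosis and treatment plan "
    "against the NICE guidelines.",
    "Read the complaint and consider it against the notes and best practice.",
    "Write a response, referencing the notes and guidelines where appropriate.",
])
```

As the talk notes, the alternative is to send each step as its own prompt in sequence, which gives you more control at the cost of more back-and-forth.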
Okay. The rest of the prompt is the same. The prompt is starting to get a bit longer now, but I'm changing the expectation. Let's see. So first of all, "let's think things through step by step", the important magic phrase, at the top. I'll provide you with the notes, the complaint and the NICE guideline on sore throat.
You'll first read the notes and consider the diagnosis and treatment plan against the NICE guidelines. You'll then read the complaint and consider it against the notes and best practice. You'll then write a response referencing the notes and NICE guidelines where appropriate. Okay, a bit more detailed, but I'm forcing it to consider things in a certain order.
Otherwise, you're just leaving it to chance whether the machine does it in this particular order. Let's have a look at this. We're going to start again. Just for the purposes of this, I'm going to recreate the steps, so I have to do a couple of things.
Let me just see. I'm in the wrong section. Here we are.
I am pasting it in here. So there's the prompt, and I just quickly need to re add that guideline again.
At this point, this is where I start hoping that things work exactly as planned. Let's see. All right, step one. Oh, here we go. Breaking it down into steps. All right, here we go. Reviewing my notes. Nice bullet points there, breaking it all out. And then the NICE guidelines. There you go. It's referencing the thing that I put in there.
That's good. All right. FeverPAIN and Centor criteria: these will be familiar to some of you. Now, of course I didn't write those scores down in the notes, but it is at least calling this out and comparing back. "Strongly advises..." Oh, there you go. Great, references. Okay.
Considering the complaint: saw another doctor, and they felt... okay, that's all good. Here's the response. Okay. "Thank you for taking the time to share." It's getting a little bit more formal here. It's recapped the history again, "according to the guidelines", and then it specifically references NICE, spelling out what NICE is and what it recommends.
That's all pretty good. So it's justifying this. "I understand this happened." Some explanation as to why this might have happened. Again, I haven't said this; this has come from the inherent training within the model. And then actually it's encouraged the patient to return and seek further advice.
Hope explains it. Yeah, okay. At this point, I'm actually getting close to what I might take away and work up a little bit further. That's pretty good, I think. But of course we can do better than that. So I'm going to try one more time. And let's tweak this. Let's give it a little bit of a tweak.
What other things can we do? The first thing to say is, I'm showing you how to do it with a single prompt and get it done, but of course you're not at all limited to that. I could ask other questions. I could give it further instructions. So for example, I might not have liked the language.
I'd like maybe to simplify it or translate it, and I could ask it to do that now. If I wanted to do it in a single prompt, I might update the prompt, but I can use it either way. I might try a few variations of the prompt and see how it might get better. And remember what I said about evaluating? You want to think at the start what would be a good success measure as well.
If it's not working, like it's not doing what you want, go back, have a look, and check the order of things; make sure the ordering is clear. Models will often pay greater attention to things at the start as opposed to the end. And if you're really struggling, you can try shouting, putting block caps in the areas that are very important.
And these last two little bits of seasoning, you can offer financial incentives. You can tip it, pay it, encourage it. Again, this does actually make a difference, but I don't really rely on it that often because it feels a little bit outside what I'm trying to do, but you might try this. Okay, so how am I going to tweak this?
Since I'm speaking with the BMJ today, why not look at what the BMA says? The BMA has got some great guidance here about dealing with complaints. Okay. And you can visit the site and find out more. What I did here is that I went to the site and copied and pasted some of the information into the prompt.
And you might do something similar, or you may have some guidelines that you have yourself. In your own practice or your own trust or whatever, or even your own personal approaches here as well. So why don't we try that? So I'm going to go back. We're going to go for what I'm calling the tweaked prompt.
Again, we're going to construct it the same way as we did last time. I've got quite a long prompt now because I've got some of the BMA guidance, which you can see at the bottom here. I tagged it "guidance". You can see it says the following is taken from the BMA website. And just to make clear, I also referenced it.
Up here, I changed the expectations, so I provided the notes, the complaint, the guidance, and the NICE guidelines. So I'm referring to every single part of what I've got in here, and I updated how it would read the complaint. Okay, very good. Alright, and I just need to add the document. Again, if I don't add the document, it may try and source it from its training knowledge, and that may be out of date.
All right, here we go. So doing what we did last time, you'll recognize this first section and the second section. This is not exactly the same; it will be slightly different. This comes down to something called temperature. That's for another time, perhaps. Okay. It's going to have some slight variations.
So you do want to check this. There's the complaint. And then this is a scenario talking about it and the draft response. Okay.
It's recognizing the frustration of the patient, my focus on their health during the visits, reflecting back on this, going back to the NICE guidelines, going again into what the doctor did the next day, a bit of information about how long antibiotics take, there we go, taking feedback seriously, and there we are. And at this point it has been influenced, not as clearly, but it has been influenced by the information I put in from the BMA guidance as well.
And you may have more specific guidance or even sort of framework, you might have a template that you want to use as well. That could be put into the example section. All right. I think we've moved to the point where things look as though they're getting about the point where I'm thinking, that's as much as I need to do.
I'm going to go away and work on this and check it, and then using it will have saved me some time. Of course, I spent a bunch of time going through the iterations, and you're thinking, if I have to do that every single time, what's the point? Well, you're going to get better at wording things the first time around.
You're going to have a better hit rate. But also, why don't we package it up? Why don't we find a way to reuse this? Now, one thing you can do is just take that prompt, keep it in a document, and cut and paste it. There you go. You can do that, but no, we're here to make you masters of prompt engineering.
So we're going to try some different ways. And the easy way is using something called a custom GPT. Remember I said with the paid version of ChatGPT, you get access to the ability to create custom GPTs. A custom GPT is essentially a guided way to create a sort of custom agent which allows you to control the system prompt.
These are the hard instructions inside the model. You can upload documents and they will stay there. So remember I had to keep uploading the document? You wouldn't have to do that. You can give it access to some tools as well. And if you use it in the easy way, it will suggest a name for this.
It'll give you an icon. It's a lovely easy way of sharing and saving structured prompts. You can control how it uses tools and you can share it and control access for other people. So for example, at Curistica with the team that I work with, we have a team version of this and we have custom GPTs that we use internally that are only used internally within the organization.
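Under the hood, a custom GPT is largely a saved system prompt plus any attached files. If you were doing the equivalent yourself through an API, it might look something like this sketch. The model name, prompt text, and OpenAI-style message shape are my assumptions for illustration, not from the demo, and no network call is made here; it just assembles the request:

```python
# Sketch: the API-side equivalent of a custom GPT is a fixed
# system prompt reused across every conversation. This only
# builds the request payload; sending it is a separate step.
SYSTEM_PROMPT = (
    "You are an assistant that helps clinicians draft responses to "
    "patient complaints, following the attached guidance."
)

def make_request(user_message, history=None):
    # The system prompt is fixed, like a custom GPT's instructions;
    # history carries earlier turns of the conversation.
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += history or []
    messages.append({"role": "user", "content": user_message})
    return {"model": "gpt-4o", "messages": messages, "temperature": 0.3}

payload = make_request("Here is the complaint: ...")
```

The custom GPT interface saves you writing any of this; the point is simply that "controlling the system prompt" means controlling that first, fixed message.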
So that's a very helpful thing. I want to say also that Microsoft offers this in Copilot, something called declarative copilots. And Claude uses something called Projects, which is not quite the same and not quite as flexible as what ChatGPT does, but it's still worthwhile looking at. So the easy way is you go into ChatGPT, you create a custom GPT, and you see where it says here, Create.
You just talk to it about what you want to do. Large language models can create their own prompts, people. And being taken through it this way is actually an interesting way of doing it. If you're particularly early on and you're learning about this, it might be a nice way to do it. Now, I don't do it this way because I have a clearer idea of what I want to do.
But it might get you off to the races. So what you do is you just talk through, you can preview it, you can test it, and when you're ready, click publish, ready for you to use, either personally or share. So fantastic, that's a nice, straightforward way to use that. But you've got a prompt, or I've got a prompt, and I want to reuse that.
And so you do it the harder way. It's not that hard, and I'll talk you through it in a moment. Instead of Create, you go to Configure, and then everything I just described is in your control. You can give it a name, you can describe it, you can add a little picture, you can get DALL-E to create it.
You can upload knowledge and it will stay there every single time. You want to make sure you reference it correctly in the prompt. If you have a file name, make sure you use the same file name. And then you can give it capabilities like web browsing, DALL-E, and the code interpreter and data analysis.
Like I said, this is more advanced use, so you might want to experiment with it, but I turn most of it off. Then you click Create, and you're off to the races. Oh, and the other thing you can do is put conversation starters in to help people understand how to use this product, or you can put a description in place.
Okay, because I'm doing it this way, and because I'm wanting to reuse it, I'm going to change the prompt slightly. The difference is that instead of just "thinking through step by step", I'm giving it step-by-step instructions. Think about this: the person using it hasn't put anything in yet, and you want it to be reusable in different situations.
When you use it: first ask for the complaint, and wait for the user to provide the complaint before moving on. Then ask for the notes, and wait for the user to provide the notes before moving on. Then ask for the guidelines and wait for them to either write them or upload them, and then read the notes, and so on.
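That "ask, then wait" pattern is essentially a small state machine. Here's a toy sketch of the flow, with my own wording for the questions (purely illustrative, not the actual prompt from the demo):

```python
# Toy sketch of the step-by-step flow: ask for the complaint,
# wait, ask for the notes, wait, then the guidelines, and only
# then draft. A Python generator models "wait for the user
# before moving on".
def complaint_flow():
    complaint = yield "Please paste the complaint."
    notes = yield "Thanks. Now please paste the (anonymised) notes."
    guidance = yield "Finally, paste or upload any guidelines."
    yield (
        f"Drafting a response from {len(complaint.split())} words of "
        f"complaint, {len(notes.split())} words of notes, and "
        f"{len(guidance.split())} words of guidance..."
    )

flow = complaint_flow()
print(next(flow))                                        # first question
print(flow.send("The patient was unhappy about the wait."))
print(flow.send("Seen on Tuesday; safety-netting advice given."))
final = flow.send("BMA complaints guidance text.")
print(final)
```

The custom GPT does the same thing in plain language: each instruction gates the next, so the tool works even when the user starts from a blank chat.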
So this is what we're going to try right now in ChatGPT. So I'm going to go to New Chat. Now you're going to see a new button here in the sidebar. You can see some of the other ones that I've done, and we're going to essentially explore this. And this is like a marketplace where you can see lots and lots of other ones, which might turn up in a wee second.
Hopefully, if not, I'm just going to, there you go. You can see all the different ones, the ones that we use and other bits and pieces. So I'm going to click create. All right, here's the create one. Remember I said, just have a conversation, but instead I'm going to go to configure. So we're going to call this complaint demo, helps answer complaints.
Okay. The instructions at this point: I am going to put the prompt in, and this is very similar to the one that we had before, with that little tweak. Pop it in here. Conversation starters: I'm just going to put "start", because I'm also starting it running. And then at the bottom here, let's turn off web browsing.
Let's turn off DALL-E because I don't really need that. Let's upload some files. Actually, no, I'm not going to upload a file. I don't need to upload a file here because we're going to get people to upload files. All right. So I'm going to create this
and then I'm going to view it. Okay. Here we go. Okay. Let's give it a try. So this is a prompt. I'm going to say let's get going. Start. Details of the complaint. Okay. So I did this before. So I'm going to put the complaints in here. I'm going to run the same complaint, but you could run any complaints here.
Next step. What are the notes? Okay. Here we go. Now I might want to put some other instruction here. Make sure you don't include patient identifiable information, potentially even have the model check that I've not put anything in there as well. All these things are possible.
Do I have any guidelines? Why, yes, I do.
There we go. Presses the button.
Oh, very nice. Now, at this point, it hasn't actually gone through all the steps the way it did last time. If it's important to me, I might want to change the prompt slightly. But to be perfectly honest, I just want this to do what I want it to do. And here you go. Let's have a look.
I'll go through it in more detail, but there you go. And that is now available for me to reuse whenever I want. In fact, I could just type complaint here with an at sign and it should bring it up. There you go. So I can make it easy again. And I can share this with other people as well. So coming back to our talk.
Oh yes, Claude. Claude is very nice. You might want to use it by comparison. Why Claude? Claude has two features. One is called Artifacts. And artifacts appear on the side of the screen when you're talking to the model. And that means if you had that complaint up, I could then say, oh, I don't like the tone.
Can you change it? Can you add this? Can you add that? And instead of it being in the chat, it would just appear on the side. Or I can go back to previous versions. So it's a really convenient way, a really nice bit of product work to add that to the place. And then projects is a little bit like custom GPTs, where you can have a custom system prompt with a little bit of context about who you are, gives you much more context, gives you document uploads, and you can use it to do repeated work in that area as well.
So maybe give it a try. But everything I've shown you today has been on ChatGPT. Although we're going to move on in a second, maybe straight into the future of generative AI to leave time for questions at the end. But if you want to learn more about all of this, because you're all dying to go out there and try things:
A few things I was going to say, number one, do remember the risks I mentioned earlier on as well. Do remember that anything you put in here and use is going to be on you and you want to be careful about where you use it, which is why you might want to practice on things that are well away from the clinical frontline.
Initially, you want to actually understand how these models work. So a good example I always say to people is to try things like complaint letters. Not receiving them, but generating them, when you've had bad service. I'll give you an example of one that I've had recently: I got bad service from Amazon on delivery of something.
It can actually help write a complaint letter quite easily, and I can send it off. Or I did another one recently about creating party invitations, or planning for something. All these things use the same principles. And then you can start bringing it into that kind of back office and mid office administrative use and see if it can help.
Or you might want to use it for educational purposes, for your own professional development. Not to do your work for you, but to give you more insights into the things that you might want to be able to do. And I can talk separately another time about how you might use it for appraisal and professional development.
And then you can start inching towards things that might actually have some real-world impact. But again, even if it's a middle office tool, you might be impacting on clinical safety. So proceed with care and attention. And I'm very pleased to say that people like Dave Triska have posted online on LinkedIn, and I'll share these links afterwards, guides on the sort of things that you want to bear in mind. But if you want to learn some more, what do you do? There are lots of links, and I will share these again online. The companies will all provide guides to prompt engineering and prompting.
There are some good online courses from DeepLearning.AI and Coursera. And I would be remiss not to mention that Curistica offers this as well, and I offer this too. So you can get in touch, and we have a newsletter, which I'll share later on, that you can sign up to. And this, along with a whole bunch of other resources, will come your way.
In summary, think about what you want to do. Do this safely. Start by just asking. Actually, start by thinking about whether you need to use AI at all, but start by just asking. Structure, context, step by step, instructions, examples can be helpful too. Wrap it up in a model and reuse it. And if it doesn't work, keep tweaking it.
Now, before I move on to the future of generative AI, I did mention RAG. RAG is a deeper topic and something we can cover elsewhere, and I'll try and post some information on the video about this too. RAG stands for Retrieval Augmented Generation. I was uploading single documents at a time here, but what if you've got a hundred documents?
Or what if you've got an enormous amount of data that you want to draw on intelligently? RAG uses technology whereby, when you put your question in, it converts the question into what's called an embedding, then looks at the entire data stack, pulls back relevant information, puts it into the context of the instruction or prompt, and then runs it.
And that is an excellent way of getting much more control over the data that the model is going to draw from. Like I said, it is a much deeper subject, but it's something that you very much want to look at. And when we're talking about clinical use cases, it's one of the key ways in which we on the manufacturing side of things are looking to improve things for you.
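Here's a minimal sketch of that retrieval loop, with crude word-overlap vectors standing in for real learned embeddings. The documents, scoring, and prompt wording are mine, purely to show the shape of RAG:

```python
# Toy RAG: turn the question and each document into a vector,
# score similarity, and paste the best match into the prompt.
# Real systems use learned embedding models; simple word counts
# stand in here to show the mechanics.
from collections import Counter
import math

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Antibiotics for sore throat are rarely indicated per guidelines.",
    "Complaints should be acknowledged within three working days.",
    "MRI scanners use strong magnetic fields and radio waves.",
]

def retrieve(question, docs, k=1):
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

context = retrieve("How quickly should a complaint be acknowledged?", documents)
rag_prompt = f"Using this context:\n{context[0]}\n\nAnswer the question."
print(rag_prompt)
```

The production version swaps the word counts for a proper embedding model and a vector database, but the pipeline of embed, retrieve, and stuff-into-context is exactly this shape.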
All right. Okay. A few minutes on the future of generative AI, and then we'll go to questions. So what else can generative AI do? This could be an enormously long part of the talk, but it won't be. And it's also changing all the time. So this is just a view as of the 11th of September, 2024.
While I was talking about text, AI can also see, read, and draw. What do I mean by that? The process by which large language models handle text also works for images and any other modality too. So it allows you to submit not just words and text like we did, but images. So imagine the patient history with the X-rays or the ECG being put in place.
This is exactly what large multimodal models can do. And when you combine all these different modalities, you get more precise and more holistic output. And let's face it, medicine is full of multimodal information as well. So multimodal models include GPT-4o. I didn't show you, but you could have uploaded pictures there as well.
You can use Claude for this. You can use Gemini, any of the models. In fact, most models will offer this right now. But there's also the reverse thing, about creation, and that's using something called a diffusion model, not a transformer. Things like DALL-E, which you can see inside ChatGPT. All the pictures inside this presentation, of course, I made that way.
And then there are many others: Midjourney, Stable Diffusion, Flux, Ideogram, all out there, all with different strengths and weaknesses, to allow you to create pictures. But AI can also listen, it can speak, it can sing, and it can make music. In what way? Voice interfaces. If you download the ChatGPT app, you can use a voice interface and speak with it, and it will speak back.
And some of the demos earlier this year, and a model that's going to come out soon, are extremely fluent and make it easy to have a to-and-fro conversation. So ChatGPT can do this, and there's something called Cerebras, which I'm going to talk about in a moment, that can do it as well. But you can also get it to speak in a voice that you like.
And ElevenLabs allows you to go from text to speech, and in different voices, but you can also clone your own voice as well. And I've done this a couple of times and it's okay, but it's always improving. And you want to think carefully when you're using this, because there are some ethical concerns about using someone's voice.
Is there consent for it? Is there transparency about who's speaking, and so on? And we can start diving into the world of deepfakes at this point. So all of these interesting use cases do come with a darker side as well. And with music, you can create whole songs. If you want to go check it out, Suno does a good job with that as well.
I can also make videos. And here on the right is a video created by OpenAI's product. And what it does here: if you see that prompt at the bottom, that is the instruction that was given to the model that then produced this entire video. It's about a minute long. There's no sound, so you're not missing anything.
But if you take a moment to look at it, that's pretty detailed. That's pretty impressive based on a short prompt. Now, there are other things in there that I haven't described, and so on. But you can see how there's an awful lot of consistency with the world model inside there too.
This is quite incredible and moving very fast. If you look at Runway, if you look at Synthesia, they allow you to create video. Synthesia also allows you to create avatar presentations. So you can create a presentation with your choice of avatar, your choice of voice, and all you do is provide it with text.
Now, that's the kind of flashy front-end stuff. What's going on behind the scenes? Foundation models are commonly made by closed-source companies like OpenAI and Anthropic, but there are open-source models. And what does open source mean? Open-source models make the models and the weightings, the sort of detail of how they work, available for people to download or put onto a cloud server and then use for other things or develop on top of.
Now, these licenses are more permissive in terms of actually being able to fine-tune a model. You can take a model from someone like Meta, who have a product called Llama 3.1, or Mistral, or Falcon 2 from Abu Dhabi. Around the world, there are different culturally specific large language models that are published open source and allow people to build extra fine-tuned models.
And these models are very close to leading-edge models, in some examples exceeding leading-edge models in the closed-source community. Why is this good? It accelerates research and innovation. There's more transparency, as is common with the open-source movement. It allows you to democratize access to technology and reach more marginalized groups.
And it also allows you to have a sort of stepping point to be able to do specialized, fine-tuned models. And the community, of course, contributes towards it as well. So there are lots of positives. But there are risks, of course. If you put these powerful tools out there, they could be misused for harmful purposes.
You can create biased models as well, or intentionally bias things. There's the increased compute and the environmental concerns, IP challenges, and also maintaining quality control. To quickly finish off, there are smaller models. Big models do lots of things. Smaller models do fewer things, but increasingly they do them very fast and very well.
Models like the smaller Llama 3 8B and so on. And they can run locally on your phone. So if you get a new Apple iPhone, the next version of the operating system will run a model locally on your phone. Memory can be an issue. Some platforms allow you to have memory, to remember things between different conversations, and Perplexity is a good example, but there are a few others out there. And AI is also getting faster.
You saw how fast this was working. I'm just going to give you a quick demo here. We've got OpenAI's mini model, Groq (not Elon Musk's Grok, a different thing), and Cerebras. And if I just click them all at the same time and ask them, how do MRI machines work? Just watch how fast they spit out the answer.
While ChatGPT is still putting on its shoes, we have Groq at 1,200 tokens per second and Cerebras at 1,800. Very fast.
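Those throughput figures translate directly into waiting time. A quick back-of-envelope, assuming a 500-token answer (the answer length is my assumption; the speeds are the ones quoted in the demo):

```python
# Back-of-envelope latency: time to generate a 500-token answer
# at the token-per-second rates quoted in the demo.
answer_tokens = 500
for name, tokens_per_sec in [("Groq", 1200), ("Cerebras", 1800)]:
    seconds = answer_tokens / tokens_per_sec
    print(f"{name}: {seconds:.2f} s for {answer_tokens} tokens")
```

At those rates a full answer arrives in well under half a second, which is why this kind of speed changes what voice and interactive interfaces feel like.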
And we talked about that. Copilots. Eventually we're going to have these models operating on their own, under your control or instruction. And these are going to be integrated inside office suites. They can give you more tools; they get much more capable as well. And so I think even if you're using Microsoft 365 right now or Google Workspace, you'll have the option to integrate this, bringing in context from your email, your stored documents, your spreadsheets.
Imagine that, securely and safely, within an electronic health record. And over the next few years, I think that's the thing that's going to make the biggest difference. I think people are going to find that useful. AI as a product, a tool that you can use alongside you, I think is going to be the most important thing.
We've still got some problems to resolve, though. Copyright issues remain a big bugbear, and the regulatory side stands importantly in front of us; we have to resolve that. But at this point, Kieran, all done, over to you, sir. Fantastic, thanks very much, Keith. Again, we have some questions in the Q&A. Thanks to everybody who entered questions.
Keith, you may be able to see them yourself. I'll bring them up in the background and then we'll see where we are, yeah? Clive Flashman said, "I wonder if, taking into account the amount of time that's taken, could you have written a decent response yourself?" And you can read the rest. Yeah.
Yeah. So Clive, absolutely. You're right. And this comes down to what I said at the start: is AI the best use of your time? If this is a one-off thing that needs an awful lot of your input, and remember I said you have to check it afterwards, then you might actually say, no, I'm not going to use it for this.
But remember, what I'm doing is I'm experimenting with this. This is playful experimentation, but I'm also, as I understand this, identifying things that I have to do again and again. So as you learn more about the models, you understand what they can and can't do and what you can and can't do, and you bring them in appropriately.
So I think what you're saying there about iteratively improving the response is another way in. Yes: I've already written the complaint response, and then I create a prompt which says, look at my complaint response, can I improve this? That's another way in. These are not the definitive ways in. What I'm more interested in is that people understand how to use these tools and find out for yourselves.
Sure, got it. Thank you. Sue Lacey Bryant, thank you for asking a question. Keith, the environmental impact. Yeah. Let's have a look. The scope and size of energy use varies according to what you're talking about, a customized version and so on. So what are your concerns about the environmental impact? Yes, we all have a responsibility there. There's the cost in money, but then there's the cost in terms of carbon, and the water use, and so on.
And I previously described using GPT-4 to do something like write a poem as like driving a Rolls-Royce a mile down the road to get a pint of milk. It's wasteful. What we want to do is get to the point where we're using the right size model for the right size task.
And you'll see with these smaller models, that's exactly what's happening. That we're getting a model that does what you want to do, but energy efficiently. And I think that is the right way to go. I think we're also going to have to increasingly account for how we use the technology that we use.
So if you're a company building with this, you don't want to build it on top of the most expensive model out there. It just costs too much, for a start. You're going to try and find the right size as well. So I think that's very important. And as you're doing this, you'll also be thinking that there is a point at which using an AI model to do the work is actually, from a carbon perspective, more efficient than using a person to do it. So if you take a whole-system approach on this and consider every aspect, which itself can be complicated, I think we're going to start having more interesting discussions about how we can drive this down holistically.
And that's before you start talking about AI working on climate issues as well. But yeah, you want to use that approach. Thanks, Keith. Got it. Sorry, Keith, I'm going to hurry you on. Yeah, okay. Lisa Drake: can multiple documents be added in different formats?
Yes. Okay, lovely. Thank you. Karen Wallace, thank you. Do open-source models move us further away from black-box thinking? In some ways, yes. In some ways, no. Open models often let you know an awful lot more about what training data has gone into them. But the fundamental issue with the black box is that it's built on a deep, multi-layer neural network.
And that is very hard to unpick, because when you've got billions of parameters all operating across many layers, understanding exactly how you got to a particular answer, if that's where you're coming from, can be very difficult. That's not to say people aren't trying. And actually, Anthropic have done some really interesting work on this that you might want to look into.
But open-source models help open it up a little bit more, absolutely. And open research. Okay. Thank you. There are some questions coming through in the chat, actually. Oh, I'll put them in the chat as well. Yeah. What do you say? Yeah. If you had to recommend one LLM to get the paid version of, what would it be?
And is Llama 3 good enough? I haven't used Llama 3 enough to be able to answer that question formally. Essentially, because I'm using it in a kind of business setting, I'm going to have to use it in a way that's enterprise-grade for security's sake as well. So that leaves me choosing between ChatGPT and Anthropic, basically, or Microsoft 365 and Gemini.
I tend to use Google Workspace at the moment, and so I have the option of using Gemini, but I'm not so comfortable with how it's being implemented right now. So I choose both, essentially. But if you had to choose one, I'd still probably say ChatGPT, or Copilot if you're paying for it, because it actually has some of these additional tools that Anthropic doesn't have. But you really have to watch this space, because Anthropic are moving very fast on this.
And in truth, there are some tasks, particularly long-form writing and code writing, that I prefer Anthropic's Claude for. Okay, Keith, thank you very much. I don't think we're going to have time for more questions and answers. Paul made an important point, actually, rather than a question. He says his medical defence organisation is stating not to use LLMs for complaint replies, and he makes a couple more points about it.
But anyways, Keith, I'm not going to speak against that. If that's what they've said, follow that. I was just using it to demonstrate this. Yeah, I know. Understood. Understood. Absolutely. We should say to follow what these organizations advise. So thank you very much, Keith for a fantastic presentation and discussion.
On behalf of BMJ Future Health, we're excited to announce that registration for our face to face event on 19th and 20th of November is now open. Please scan the QR code on the screen and if somebody could pop the QR code on the screen, that would be great, or if not visit the BMJ Future Health website for further information.
Also, please do stay connected with us by joining our Future Health LinkedIn group. If you scan the QR code, you'll go there as well. Meanwhile, our next webinar will be on Friday the 27th of September, 2 to 3 p.m., with Jessica Morley and Zhou Zhang on the subject of How to Approach Digital Transformation Across Global Markets.
So finally, once again, thank you all for participating. Thank you, Keith, in particular, for another fantastic presentation. We hope you all found it valuable. Thank you.