LLMs: what are they and what opportunities do they offer?
What are the available tools and techniques for effective prompt engineering? What are the strengths and weaknesses of ChatGPT today? How can we help it improve? Anywhere Club Product Manager Leonid Ardaev answers these timely questions and shares useful resources for learning prompt engineering.
In this article
What is an LLM or Large Language Model?
— LLM stands for Large Language Model: a machine learning model with a vast number of parameters, trained on extensive data and built on the transformer architecture. You're probably familiar with models such as GPT-3, GPT-3.5, and, of course, GPT-4, which powers OpenAI's chatbots. OpenAI does not currently disclose the volume of training data or the number of parameters in GPT-4, but there is reason to believe it is one of the largest models available.
There are other, less popular models as well, though they lag behind OpenAI's in many benchmarks. AI21 Labs, an Israeli company, offers the Jurassic-1 and Jurassic-2 models. Additionally, Google Bard is a chatbot based on the PaLM 2 model, which recently became widely accessible.
There are also several interesting developments from NVIDIA (NeMo, Picasso, and BioNeMo) designed for a wide range of applications, from text and video generation to scientific research. Their next versions will likely have even more parameters, allowing them to consider more detail when generating content.
The variety of language models is great, and each model has its own characteristics, advantages, and specific areas of application.
— GPT (Generative Pre-trained Transformer) is a large language model, and ChatGPT is a chatbot built on GPT that is designed for natural-language conversations with humans. ChatGPT can maintain a dialogue by remembering previous statements and providing responses that resemble coherent, human-like conversation. The model is trained on large volumes of text and uses the transformer architecture to generate responses.
Strengths of ChatGPT
- Information retrieval: ChatGPT excels at finding relevant information quickly in most cases. Search engines, built by teams of thousands of developers, can still struggle to surface highly important or specific information; ChatGPT handles such queries well because it considers the sequence of requests and processes each query as a whole. This makes it a useful tool for information-retrieval projects. An LLM can also handle complex queries whose keywords never appear in the source text.
- Text creation and summarization: ChatGPT also performs exceptionally well with text: it can generate and summarize large volumes of it, taking structured or unstructured input and producing summaries or synopses according to the query.
- Translation skills: ChatGPT demonstrates excellent translation skills. Even for complex languages, including those written right to left, its translations into Cyrillic- or Latin-script languages can be more accurate and of higher quality than those from traditional tools like Google Translate. ChatGPT correctly recognizes and reproduces language-specific characters and captures nuanced details thanks to its knowledge base, which draws on an immense amount of information from the internet.
- Code generation: ChatGPT can easily generate code, including JSON structures, from the data provided, and can automatically determine where to insert the corresponding values. For example, if you give it a resume (CV) and ask for conversion to JSON without providing extensive context, ChatGPT can generate a structured output with section divisions and correct field names. It also performs well on repetitive tasks if the sequence of requests is set up appropriately.
- Role-playing: ChatGPT can be utilized as an English language tutor or for other specific subject areas. For instance, if you ask it to impersonate someone and engage in a conversation in natural language, it excels at playing the role and can interact effectively with the context. This allows for the creation of interesting and realistic situations for educational purposes or training in negotiation skills.
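The CV-to-JSON conversion mentioned above can be sketched as a prompt-construction step. This is a minimal illustration using the OpenAI chat message format; the field list and helper name are assumptions for the example, not a fixed schema.

```python
# Sketch of the CV-to-JSON request described above. The message format
# follows the OpenAI chat API; the JSON keys are illustrative only.

def build_cv_to_json_messages(cv_text: str) -> list[dict]:
    """Build a chat prompt asking the model to convert a resume to JSON."""
    system = (
        "You are a data-extraction assistant. "
        "Return only valid JSON, with no commentary."
    )
    user = (
        "Convert the following resume into JSON with the keys "
        '"name", "contact", "experience", and "skills":\n\n' + cv_text
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_cv_to_json_messages("Jane Doe\nPython developer, 5 years...")
```

Sending `messages` to a chat completion endpoint would then yield the structured output; the point is that the desired format is stated explicitly rather than left for the model to guess.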
Weaknesses of ChatGPT
- Inability to provide evaluative judgments: ChatGPT lacks the capability to give subjective evaluations such as "good" or "bad" or to exhibit an emotional tone in its responses. OpenAI has implemented mechanisms to try to limit the generation of answers that might be deemed inappropriate or incorrect from an ethical standpoint.
- “Hallucinations”: Since ChatGPT is trained on publicly available information, which may contain errors or inaccuracies, there is a risk that the model may generate incorrect or inaccurate responses. If the model receives incorrect data, it may produce a combination in response that seems legitimate but is actually incorrect. This is particularly noticeable in logical questions or tasks requiring mathematical calculations. It is important to approach the model's responses with caution and to evaluate them before accepting them as truth.
- Possibility of factual errors in text generation: While the model can generate high-quality text, it is not always a source of accurate information.
- Bias tendencies: Since the model is trained on a large body of public data, it may exhibit preferences or biases toward more probable answers. This can result in the model persistently adhering to a certain viewpoint on a particular question, even if it is not accurate. This is because the model generates responses based on the available data. It is important to fact-check and verify the model's responses.
- Limited attention to detail, especially with long prompts: The free GPT-3.5-turbo model, which powers ChatGPT, can hold around 4,000 tokens, or approximately 3,000 words, in its short-term memory. The paid GPT-4 model can process up to 32,000 tokens. There are already models on the market, such as Claude by Anthropic, that can "remember" up to 100,000 tokens, roughly the volume of a short novel. If a prompt is too long, the model may lose some details and make errors in its response. Paying attention to detail and controlling the prompt size helps mitigate such issues.
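A quick way to guard against overlong prompts is to estimate their token count before sending them. Exact counts require the model's own tokenizer (e.g. the tiktoken library); the sketch below uses the common rule of thumb of roughly four characters per token for English text, which is only an approximation.

```python
# Rough context-window check. Real token counts require the model's
# tokenizer; ~4 characters per token is an approximation for English.

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about one token per 4 characters."""
    return max(1, len(text) // 4)

def fits_context(prompt: str, limit: int = 4000) -> bool:
    """Check whether a prompt likely fits a model's context window."""
    return estimate_tokens(prompt) <= limit

print(fits_context("Summarize this paragraph."))  # a short prompt fits
print(fits_context("x" * 40000))                  # ~10,000 tokens: too long
```

If the check fails, the prompt can be shortened or split before it is sent, instead of letting the model silently drop the overflow.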
Of course, ChatGPT has many more restrictions at this point, and removing them is only a matter of time. For now, you need to keep the limitations in mind and work around them.
Basic rules for compiling prompts
— Prompt engineering is an important aspect of interacting with the ChatGPT model. To get an accurate and correct answer, you need to compose the prompt carefully. A prompt for a model cannot be the same as one for a human: a person can interpret non-verbal signals and grasp implicit intentions, but the model cannot. It needs sufficient context and a clear statement of the task. Without that information, it will still return a response, but most likely one that does not match the user's expectations and request.
- The key rule is to be sufficiently specific when formulating prompts. Provide concrete details and context to obtain accurate responses from the model. If the prompt lacks sufficient information or contains unverified data, the model may give unsupported answers or suggestions based on whatever is available. The more precise the information we provide, the more reasoned and informed the model's responses can be.
- Note that models have configurable settings when you work with them through platforms like the Playground or other cloud environments that use the OpenAI engine. One example is the "temperature" parameter, which determines the variability of the model's responses. A high temperature lets the model generate diverse and creative responses, weighted by how well they match the query. ChatGPT is reported to operate at a fixed temperature of 0.7, which is why the bot answers the same question differently each time. Decreasing the temperature makes the model more predictable and stable, but limits response variability. Consider the temperature setting carefully, especially when you need a fixed, precise answer.
- When working with ChatGPT, it is beneficial to break down large prompts into small parts, especially for comparative analysis or complex tasks. This avoids losing context and yields more accurate results. You can divide the task into several subtasks, process them separately, and then compare the results from each. Keep in mind the prompt size limit for each model: if it is exceeded, the part of the prompt beyond the limit will be "forgotten" by the model.
- Encourage "thinking": ChatGPT can learn through interaction. Providing the model with a small amount of context and a clear understanding of what is right or wrong helps it remember that information and increases the likelihood of generating responses in the expected format. This approach is suitable for tackling more complex problems and provides greater confidence in the results obtained. Providing examples also helps train the model.
Tools and techniques for effective prompt engineering
- Dyno IDE and Wale IDE: These tools enable testing and controlling the results of queries through the OpenAI API, as well as using small datasets to incorporate variables in a loop of queries. Using these IDEs, you can generate multiple responses with different settings, such as temperature and response length. By comparing and analyzing the results, you can choose the most suitable answer. These environments are useful for testing and fine-tuning optimal prompts.
- Built-in OpenAI Playground: The OpenAI Playground provides the ability to customize queries and use additional features to achieve consistent results.
- DocsGPT: A model based on OpenAI, DocsGPT allows you to use files as context for generating responses. DocsGPT can answer questions using information from the uploaded files. It is a convenient tool for building reference chatbots based on LLM.
- Additionally, there is a browser extension that lets ChatGPT fetch data from the internet. The free version of ChatGPT still relies on knowledge through September 2021; this limitation has been lifted in the paid versions of popular chatbots. With this extension, ChatGPT can query search engines and answer based on information obtained from the internet.
Using tools helps you test and tweak your prompts to find the best options and obtain the results you want.
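The settings exposed in the Playground and similar tools ultimately map to parameters of the API request. The sketch below only assembles such a payload for the OpenAI chat completions endpoint without sending it; the model name and values are illustrative.

```python
# Sketch of an OpenAI chat completions request payload with explicit
# sampling settings. Built but not sent; no API key is needed here.

def build_request(prompt: str, temperature: float = 0.0,
                  max_tokens: int = 256) -> dict:
    """Assemble a chat completion request with explicit sampling settings."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
        # 0.0 = most deterministic; higher values increase variability
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

deterministic = build_request("List three uses of JSON.", temperature=0.0)
creative = build_request("Write a product slogan.", temperature=1.0)
```

Running the same prompt at different temperatures and comparing the outputs, as the IDEs above automate, is a quick way to find settings that give consistent results.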
Useful resources for learning prompt engineering
— There are already many additional resources you can use to learn more about prompt engineering and apply it in practice:
1. HRGPT and prompt engineering courses can be found on the LinkedIn Learning platform; they provide a professional and more in-depth explanation of these concepts.
2. The Learn Prompting resource provides useful resources for learning more about the prompting process.
3. Various IDEs (Integrated Development Environments) offer rich features and are constantly being updated, providing useful context and tools for the efficient creation of prompts. They also integrate with various programming languages and frameworks to solve specific problems.