Context Engineering: How to Use AI Effectively
AI is Changing Faster and Faster
If there is one change we have felt most strongly between last year and this year, it is AI. Only a year ago we were amazed that AI could answer our questions conversationally, like a human. Since then it has advanced remarkably fast: it not only understands and summarizes complex content, but also creates new things and reviews them.
Last year, bookstores were filled with books on the theme of ‘how to become a prompt engineer.’ Most of them claimed that ‘engineers who can ask good questions’ would be the people most in demand in the coming AI era.
Less than a year later, however, we are meeting an AI that understands us even when we express ourselves clumsily. Even when our questions are muddled, when we may not even know exactly what we are curious about, it gives answers that show a visible effort to understand us, coming close to full comprehension.
As we will see, today’s AI can accurately understand even highly complex content, provided it is given a well-designed context, that is, rich and accurate information. When it tries to generate something new from that understanding, however, it still shows weaknesses, such as losing consistency or repeating content, especially as the output grows more complex.
A survey published in July 2025, ‘A Survey of Context Engineering for Large Language Models’, introduces context engineering. The paper compares prompt engineering with context engineering and explains how users can apply context engineering to get more out of AI.
Prompt Engineering and Context Engineering
Prompt engineering focuses on creating the ‘best question to generate the best answer.’ It’s similar to ‘a skilled investigator carefully choosing every word and nuance to elicit the most accurate statement from a witness.’
Context engineering, by contrast, is about doing everything possible to help the AI solve the problem and give a good answer. It is like handing the skilled investigator a ‘case file’ containing every relevant document gathered so far. The file can include clear instructions, external reference materials, a list of available tools, and records of past interactions.
| Dimension | Prompt Engineering | Context Engineering |
|---|---|---|
| Scope | Optimization of a single text input (prompt) | Design of the entire information environment composed of multiple information sources |
| Model | Static text string ($C = \text{prompt}$) | Assembly of dynamic, structured information elements ($C = \mathcal{A}(c_1, c_2, \dots, c_n)$) |
| Goal | Generating the best response for a specific prompt | Optimizing the information system to maximize the reward (quality) for a given task |
| Complexity | Manual, iterative search for the optimal phrasing | System-level optimization composed of multiple functions such as retrieval, selection, and assembly |
| Information Flow | Information is fixed within the prompt | Dynamically fetching and using information from external databases, tools, memory, etc. |
| State Management | Mostly stateless (a new conversation each time) | Explicitly manages the state of previous interactions through memory components |
| Scalability | Becomes vulnerable as the prompt gets longer and more complex | Systematically manages complexity through modular components |
| Error Analysis | Manual inspection and iterative prompt modification | Systematic evaluation and debugging for each information component (function) |
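To make the ‘assembly’ row of the table concrete, here is a minimal sketch of $C = \mathcal{A}(c_1, \dots, c_n)$ in Python. The component names (instructions, reference materials, tools, memory) follow the case-file analogy above; the assemble_context function and its ordering are illustrative assumptions, not an interface from the paper.

```python
# Minimal sketch of context assembly: C = A(c1, ..., cn).
# The component names and the ordering below are illustrative assumptions.

def assemble_context(instructions: str,
                     retrieved_docs: list[str],
                     tools: list[str],
                     memory: list[str],
                     question: str) -> str:
    """Assemble separate information sources into one context string."""
    parts = [
        "## Instructions\n" + instructions,
        "## Reference material\n" + "\n".join(f"- {d}" for d in retrieved_docs),
        "## Available tools\n" + "\n".join(f"- {t}" for t in tools),
        "## Conversation memory\n" + "\n".join(f"- {m}" for m in memory),
        "## Question\n" + question,
    ]
    return "\n\n".join(parts)


if __name__ == "__main__":
    context = assemble_context(
        instructions="Answer concisely and cite the reference material.",
        retrieved_docs=["Refund policy: purchases can be refunded within 14 days."],
        tools=["search_orders(order_id)"],
        memory=["The user bought a keyboard on May 3rd."],
        question="Can I still get a refund for my keyboard?",
    )
    print(context)  # This assembled string is what the model would receive.
```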
The Stages of Context Engineering
The paper organizes context engineering into three major foundational elements. If we compare this to a supply chain for manufacturing a product, it can be expressed as follows:
- Context Retrieval and Generation: procuring the raw materials.
- Context Processing: processing the raw materials in the factory.
- Context Management: storing the processed products in the warehouse and delivering them efficiently.
1. Context Retrieval and Generation
The Importance of a Good Prompt
In context engineering, prompt engineering, that is, asking a good question, naturally remains important. We can ask the AI to ‘show the process by which it arrived at an answer’ or to ‘explore several possibilities in parallel instead of settling on a single answer.’ This helps the AI reason through multiple possibilities and expand its thinking in the most effective direction, rather than jumping to a hasty conclusion.
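As a minimal illustration, the two prompt templates below ask for the reasoning process and for several alternatives, in the spirit of Chain-of-Thought and Tree-of-Thoughts prompting respectively; the wording is an invented example, not taken from the paper.

```python
# Illustrative prompt templates only; the wording is an assumption, not from the paper.

question = "Our service's sign-up rate dropped 20% last month. What could be the cause?"

# Ask the model to show its reasoning process step by step.
show_process_prompt = (
    f"{question}\n"
    "Before giving a final answer, explain step by step how you arrive at it."
)

# Ask the model to explore several possibilities instead of one answer.
explore_prompt = (
    f"{question}\n"
    "List three distinct hypotheses, briefly evaluate the evidence for each, "
    "and only then pick the most likely one."
)

print(show_process_prompt)
print(explore_prompt)
```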
Utilization of External Knowledge
AI has the limitation of only being able to reason about what it has learned. Even with ChatGPT, how up to date its knowledge is depends on the selected model.
Recently, however, Retrieval-Augmented Generation (RAG) has made it possible to supply the AI with external knowledge, such as web content or internal documents, for it to use when generating answers. RAG is one of the most important techniques in context engineering, and it can significantly reduce hallucination, the generation of false information that is one of AI’s major problems.
While early RAG was limited to simple fact retrieval, recent RAG systems are broadly divided into two types that process information in a more sophisticated way (a rough sketch follows the list):
- Modular RAG: The user freely combines modules as needed to build an optimal RAG pipeline, such as a ‘query rewriting’ module that refines the question, a ‘multi-source retrieval’ module that fetches information from several databases at once, and a ‘reranking’ module that selects the most useful items among the retrieved results.
- Agentic RAG: The AI goes beyond simply searching for information and conducts an ‘investigation’ of its own. Given a complex question, it breaks it down into several sub-questions, searches for information on each, and then synthesizes the results to draw a final conclusion.
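Here is the rough sketch promised above: a toy modular RAG pipeline that wires together a query-rewriting step, a retrieval step, and a reranking step over a tiny in-memory document list. The keyword-overlap scoring and the call_llm stub are stand-ins for real components; all names are assumptions made for illustration.

```python
# Minimal sketch of a modular RAG pipeline: rewrite -> retrieve -> rerank -> answer.
# `call_llm` is a placeholder standing in for a real language model API.

DOCUMENTS = [
    "The 2024 annual report shows revenue grew 12% year over year.",
    "Office hours are 9 to 6 on weekdays.",
    "The 2023 annual report shows revenue grew 8% year over year.",
]

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"[model answer based on]\n{prompt}"

def rewrite_query(query: str) -> str:
    """Query-rewriting module: here just a trivial normalization."""
    return query.lower().rstrip("?")

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieval module: naive keyword-overlap scoring over the document list."""
    words = set(query.split())
    scored = sorted(DOCUMENTS,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def rerank(query: str, docs: list[str]) -> list[str]:
    """Reranking module: a no-op here; a real system would use a stronger scorer."""
    return docs

def answer(query: str) -> str:
    q = rewrite_query(query)
    docs = rerank(q, retrieve(q))
    context = "\n".join(f"- {d}" for d in docs)
    return call_llm(f"Reference material:\n{context}\n\nQuestion: {query}")

print(answer("How much did revenue grow in 2024?"))
```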
There are also approaches that connect externally retrieved knowledge in the form of a graph, like a mind map. With information connected this way, the AI can follow related content across several steps and perform multi-hop reasoning much more accurately.
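As a toy illustration of why graph-shaped knowledge helps, the sketch below stores facts as edges and answers a two-hop question by walking them; the facts themselves are invented for the example.

```python
# Toy knowledge graph: facts stored as (subject -> relation -> object) edges.
# The facts are invented purely for illustration.

graph = {
    "Library API v2": {"replaces": "Library API v1"},
    "Library API v1": {"used_by": "billing-service"},
}

def two_hop(start: str, first_rel: str, second_rel: str) -> str | None:
    """Follow two relations in sequence to answer a multi-hop question."""
    mid = graph.get(start, {}).get(first_rel)
    if mid is None:
        return None
    return graph.get(mid, {}).get(second_rel)

# Multi-hop question: "Which service is affected by what API v2 replaces?"
print(two_hop("Library API v2", "replaces", "used_by"))  # -> billing-service
```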
2. Context Processing
The Limit of Context Size
AI can process only a limited amount of information at one time. Just as we can study only so much material at once, AI has memory limitations. Because of this, providing too much information can cause the Lost-in-the-Middle phenomenon: information buried in the middle of the context is effectively overlooked, so the AI may give a wrong answer or the consistency of the generated result may break down. Various lines of research (FlashAttention, Mamba, and others) aim to push back these limits.
Seen from the perspective that computers, too, have a finite amount of memory, this is a natural outcome. But whereas humans ‘use various means to remember the past,’ the AI’s methods and capacity limits are comparatively rigid.
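A minimal sketch of working within that limit, assuming snippets already carry relevance scores: keep the highest-scored ones that fit a fixed budget. The word-count ‘tokenizer’ and the numbers are simplifications for illustration.

```python
# Minimal sketch: fit the most relevant snippets into a fixed context budget.
# Token counting here is plain word count; a real system would use the
# model's tokenizer. All scores and snippets are illustrative.

def fit_to_budget(snippets: list[tuple[float, str]], budget: int) -> list[str]:
    """Keep the highest-scored snippets that fit within `budget` tokens."""
    chosen, used = [], 0
    for score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = len(text.split())
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen

snippets = [
    (0.9, "The outage started after the 14:00 deploy."),
    (0.4, "Lunch menu for Friday is posted on the wiki."),
    (0.8, "Error rates spiked on the payments endpoint."),
]
print(fit_to_budget(snippets, budget=15))  # keeps the two relevant snippets
```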
Self-Improvement (Self-Refinement)
AI is capable of self-refinement using a given context. It creates a draft, critiques it, and revises the draft, repeating this process to produce the best possible result. Seen from another angle, the story of ‘AI growing on its own,’ which we used to see only in science fiction, is becoming a reality. Behind the practical advantage of obtaining progressively better results, we may be facing a true artificial intelligence whose future development we cannot predict.
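A minimal sketch of that draft, critique, revise loop, with call_llm as a placeholder for a real model call; the loop structure is the point, not the stub’s output.

```python
# Minimal sketch of a self-refinement loop: draft -> critique -> revise.
# `call_llm` is a placeholder standing in for a real LLM API call.

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call."""
    return f"[model output for]\n{prompt[:60]}..."

def self_refine(task: str, rounds: int = 2) -> str:
    draft = call_llm(f"Write a first draft for: {task}")
    for _ in range(rounds):
        critique = call_llm(f"Point out weaknesses in this draft:\n{draft}")
        draft = call_llm("Revise the draft to address the critique.\n"
                         f"Draft:\n{draft}\nCritique:\n{critique}")
    return draft

print(self_refine("a release note for version 2.1"))
```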
3. Context Management
By default, AI forgets everything when a conversation ends. To address this, AI systems are given a memory: a space that is small but quick to recall from. Like many other systems, AI keeps frequently used information in this fast memory and fetches the rest from long-term storage as needed. MemGPT provides a paging mechanism to manage information efficiently in this way.
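Below is a loose sketch of the ‘fast memory plus long-term store’ idea, in the spirit of MemGPT-style paging but not its actual API: a small LRU-style working memory backed by a larger store.

```python
# Loose sketch of fast working memory backed by a long-term store.
# The class and its policy are illustrative assumptions, not MemGPT's API.

from collections import OrderedDict

class Memory:
    def __init__(self, capacity: int = 3):
        self.fast = OrderedDict()            # small working memory, quick to include in context
        self.long_term: dict[str, str] = {}  # larger, slower store
        self.capacity = capacity

    def remember(self, key: str, value: str) -> None:
        self.long_term[key] = value

    def recall(self, key: str) -> str | None:
        if key in self.fast:                    # hit: already in working memory
            self.fast.move_to_end(key)
            return self.fast[key]
        value = self.long_term.get(key)
        if value is not None:                   # miss: page it in from long-term storage
            self.fast[key] = value
            if len(self.fast) > self.capacity:  # evict the least recently used item
                self.fast.popitem(last=False)
        return value

memory = Memory()
memory.remember("user_name", "Alex")
memory.remember("preferred_style", "short bullet points")
print(memory.recall("preferred_style"))  # paged in from long-term storage
```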
The AI’s workspace, the context window, also has limited space. Since it cannot hold every piece of information verbatim, the information is compressed before use. Through context compression, the AI keeps ‘summary notes’ that capture only the core points, so that more information fits efficiently into memory.
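A minimal sketch of context compression, assuming we keep the last few turns verbatim and replace everything older with a summary note; the summarize stub stands in for asking the model itself to write the summary.

```python
# Minimal sketch of context compression: older turns are replaced by a summary
# note so the recent turns fit in the window. `summarize` is a stub.

def summarize(turns: list[str]) -> str:
    """Placeholder summarizer: keeps only the start of each older turn."""
    return "Summary of earlier conversation: " + "; ".join(t[:30] for t in turns)

def compress_history(turns: list[str], keep_recent: int = 2) -> list[str]:
    if len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    return [summarize(older)] + recent

history = [
    "User: I want to plan a trip to Jeju in October.",
    "AI: October is a good season; do you prefer the coast or the mountains?",
    "User: The coast. Budget is about 500,000 won.",
    "AI: Then a 3-day itinerary along the east coast fits the budget.",
]
print(compress_history(history))
```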
How Should We Use AI?
Based on what the paper presents, we can identify ways to get the best possible answers from AI.
Clarify the purpose of the task and minimize the scope.
This starts with clearly understanding what you want to request. As in everyday life, a good question, in other words prompt engineering, matters most for AI as well. You must also be able to break the request down into the smallest possible units of questions. The larger the question, the more likely the Lost-in-the-Middle phenomenon becomes. The paper notes that models ‘show the best performance when important information is located at the beginning or end of the context.’
The same applies when requesting tasks. It is better to modularize tasks and request them step-by-step, rather than requesting many tasks at once.
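A small sketch of both ideas, splitting one large request into ordered subtasks and repeating the key instruction at the start and end of each prompt; the task list, wording, and call_llm stub are illustrative assumptions.

```python
# Sketch of decomposing one large request into small, ordered subtasks, with
# the key instruction placed at the start and end of each prompt (the positions
# the model attends to best). All content here is illustrative.

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call."""
    return f"[answer to] {prompt.splitlines()[1]}"

key_instruction = "Answer in Korean, in three bullet points."
subtasks = [
    "Summarize last week's sales figures.",
    "List the three products with the biggest drop.",
    "Suggest one follow-up action for each of those products.",
]

results = []
for task in subtasks:
    prompt = f"{key_instruction}\n{task}\n(Reminder: {key_instruction})"
    results.append(call_llm(prompt))

print("\n".join(results))
```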
Provide as much information as possible.
The biggest reason users turn to AI is to ‘get the best result quickly.’ Admittedly, preparing information for the AI can be a task in itself. Still, it is only natural that the more relevant information you provide, the higher the quality of the answer the AI gives back.
Beyond simply giving a lot of information, you should process it into a form the AI can use well and provide it in an assembled manner. Supply specialized knowledge that is accurate, and make effective work possible by letting the AI call external tools (APIs) for tasks it cannot do on its own.
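To close, a minimal sketch of the tool-calling idea: a stubbed model ‘decides’ which registered function to call and the result is fed back into the context. The tool name, the JSON format, and the dispatch logic are assumptions for illustration, not any specific vendor’s function-calling API.

```python
# Minimal sketch of giving the AI external tools: the stubbed model requests a
# tool call, the tool runs locally, and the result rejoins the context.

import json

def get_exchange_rate(currency: str) -> float:
    """Illustrative tool with hard-coded values; a real tool would call an API."""
    return {"USD": 1390.0, "EUR": 1510.0}.get(currency, 0.0)

TOOLS = {"get_exchange_rate": get_exchange_rate}

def call_llm(prompt: str) -> str:
    """Placeholder model: pretends to request a tool call in JSON."""
    return json.dumps({"tool": "get_exchange_rate", "args": {"currency": "USD"}})

def run(question: str) -> str:
    decision = json.loads(call_llm(question))
    result = TOOLS[decision["tool"]](**decision["args"])
    # The tool result goes back into the context for the final answer.
    return f"{question} -> tool {decision['tool']} returned {result}"

print(run("What is today's USD exchange rate?"))
```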