AI Core Concepts Encyclopedia · Chapter 3/5 · 13 min read

Prompting and Interaction Techniques: Prompt, RAG, and Function Calling

Master the core technical concepts and methodologies for interacting with AI.

Key Learning Points in This Chapter

1. Understand what a Prompt is and the difference between a System Prompt and a User Prompt

2. Master the core prompting techniques: Zero-shot, Few-shot, and Chain of Thought (CoT)

3. Understand how RAG (Retrieval-Augmented Generation) works and when to choose it over fine-tuning

4. Understand how Function Calling lets LLMs call external tools and APIs

5. Know what Multimodal capability means and how to recognize and reduce Hallucinations

Knowing how AI works is not enough; the key is mastering how to interact with it efficiently. This chapter will systematically explain core interaction technologies such as Prompt Engineering, RAG, Chain of Thought, Function Calling, and Multimodal capabilities.

Prompt

**What is a Prompt?** Everything you send as input to an AI model is called a Prompt—including questions, instructions, background information, examples, etc. The quality of the Prompt directly determines the quality of the AI's output.

**System Prompt**: Instructions that set the model's role, behavioral rules, and output format, typically not visible to the user. For example: 'You are a professional legal advisor. Answer questions in plain language, keeping responses under 200 words.'

**User Prompt**: The content directly input by the user, i.e., the specific question or request.
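The System Prompt / User Prompt split is usually expressed as a list of role-tagged messages. A minimal sketch in the widely used OpenAI-style chat schema (the model call itself is omitted; the landlord question is an illustrative example):

```python
# Build a chat request in the common role-tagged message format.
# The exact API shape varies by provider; this mirrors the
# OpenAI-style schema many vendors have adopted.

def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Assemble system and user prompts into a chat message list."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a professional legal advisor. Answer questions in plain "
    "language, keeping responses under 200 words.",
    "Can my landlord keep my deposit if I move out early?",
)
```

The system message is sent on every turn but normally hidden from the end user; only the user message changes from question to question.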

**Prompt Engineering**: The techniques and methodology of carefully designing Prompts to obtain better AI outputs. This is currently the AI skill with the lowest barrier to entry and the highest ROI.

Core Prompt Techniques

Zero-shot vs Few-shot

**Zero-shot**: Give the AI a task directly without providing examples. E.g., 'Translate this passage into English.' Suitable for simple, clear tasks.

**Few-shot**: Provide 1-5 input-output examples in the Prompt to help the AI understand your desired format and style. Usually performs much better than Zero-shot.

**Example**: 'Please rewrite the product description into selling points—Example input: "This phone has a 6.7-inch screen" → Example output: "A massive 6.7-inch screen for an immersive experience perfect for streaming and gaming." Now please rewrite: "This laptop weighs only 1.2kg"'
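A Few-shot prompt is typically assembled by concatenating the instruction, the examples, and the new input in a consistent format. A minimal sketch using the product-description example above (the helper name and format are illustrative, not any provider's API):

```python
# Assemble a few-shot prompt: the examples teach the model the
# desired input -> output format before the real task is given.

def few_shot_prompt(instruction: str,
                    examples: list[tuple[str, str]],
                    query: str) -> str:
    """Join instruction, worked examples, and the new input."""
    parts = [instruction]
    for inp, out in examples:
        parts.append(f'Input: "{inp}"\nOutput: "{out}"')
    parts.append(f'Input: "{query}"\nOutput:')
    return "\n\n".join(parts)

prompt = few_shot_prompt(
    "Rewrite the product description into selling points.",
    [("This phone has a 6.7-inch screen",
      "A massive 6.7-inch screen for an immersive experience")],
    "This laptop weighs only 1.2kg",
)
```

Ending the prompt with a dangling `Output:` nudges the model to continue in the same pattern as the examples.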

Chain of Thought (CoT)

**What is CoT?** Making the AI show its reasoning process before giving the final answer. Simply adding 'Please think step by step' at the end of a Prompt can significantly improve accuracy on complex reasoning tasks.

**Why does it work?** LLM output is generated token by token. Making it generate reasoning steps first gives it more 'thinking space,' reducing errors from skipping steps.

**Application Scenarios**: Math problems, logical reasoning, complex analysis, multi-step decision-making. For simple tasks (like translation, rewriting), CoT can add unnecessary token consumption.

Practical Tip

Chain of Thought isn't limited to the phrase 'think step by step.' You can also use prompts like 'First analyze the reasons, then give the conclusion' or 'Please list your reasoning process' to guide the AI to show its steps.
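In code, CoT often amounts to appending one of these reasoning cues to the task prompt. A trivial sketch (the cue phrasings are the ones mentioned above; the math question is an illustrative example):

```python
# Append a chain-of-thought cue to a task prompt. Several phrasings
# work; pick one that fits the task.

COT_CUES = [
    "Please think step by step.",
    "First analyze the reasons, then give the conclusion.",
    "Please list your reasoning process.",
]

def with_cot(task: str, cue: str = COT_CUES[0]) -> str:
    """Attach a reasoning cue so the model shows its steps."""
    return f"{task}\n\n{cue}"

prompt = with_cot(
    "A train travels 120 km in 1.5 hours. What is its average speed?"
)
```

For simple tasks (translation, rewriting), skip the cue: the extra reasoning tokens cost money without improving the result.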

RAG: Retrieval-Augmented Generation

**RAG (Retrieval-Augmented Generation)** is one of the most important AI application architectures today. Core idea: first retrieve relevant information from an external knowledge base, then provide the retrieved results as context to the LLM to generate an answer.

**Why is RAG needed?** LLMs have two major limitations: 1) Knowledge cutoff—they don't know events after their training data; 2) Hallucination—they may confidently fabricate non-existent information. RAG addresses these by introducing external knowledge sources.

**RAG Workflow**: User asks a question → Question is converted to an embedding → Similar documents are searched in a vector database → Retrieved documents are passed as context to the LLM → LLM generates an answer based on the context.
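The workflow above can be sketched end to end in a few lines. This toy version substitutes a bag-of-words vector and a linear scan for the neural embedding model and vector database a real system would use; the sample documents are invented for illustration:

```python
import math
from collections import Counter

# Minimal RAG sketch: embed -> retrieve -> build context prompt.
# Bag-of-words cosine similarity stands in for a real embedding
# model, and a linear scan stands in for a vector database.

def embed(text: str) -> Counter:
    """Toy embedding: word-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_rag_prompt(question: str, docs: list[str]) -> str:
    """Pack retrieved documents into the LLM's context."""
    context = "\n".join(retrieve(question, docs))
    return (f"Answer based only on the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")

docs = [
    "Employees accrue 15 vacation days per year.",
    "The office cafeteria serves lunch from 11:30 to 13:30.",
    "Unused vacation days expire at the end of March.",
]
prompt = build_rag_prompt("How many vacation days do employees get?", docs)
```

Note the instruction "based only on the context below": constraining the model to the retrieved documents is what curbs hallucination.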

**RAG Application Scenarios**: Enterprise knowledge base Q&A, customer service bots, document analysis, legal statute queries, medical knowledge Q&A—any scenario requiring answers based on specific documents.

Important Reminder

RAG vs. Fine-tuning: RAG is suitable for scenarios where knowledge is frequently updated (just update the documents). Fine-tuning is suitable for scenarios requiring changes to model behavior or style. Often, RAG is a more economical and flexible solution than fine-tuning.

Function Calling

**What is Function Calling?** The ability for an LLM to call external tools and APIs. LLMs themselves can only generate text, but through Function Calling, they can query weather, search databases, send emails, execute code, etc.

**Workflow**: User asks 'What's the weather like in Beijing today?' → LLM determines it needs to call a weather API → Outputs a function call request (function name + parameters) → Application layer executes the API call → Returns the result to the LLM → LLM summarizes the answer in natural language.
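The dispatch step in the middle of that loop, where the application layer executes the model's structured request, can be sketched as follows. The model's output is faked here as a JSON string; in a real system it arrives as a tool-call object from the LLM API, and `get_weather` is a hypothetical stand-in for a real weather API:

```python
import json

# Simulated function-calling round trip: the LLM emits a structured
# request (function name + arguments); the application executes it.

def get_weather(city: str) -> dict:
    """Stand-in for a real weather API call."""
    return {"city": city, "condition": "sunny", "temp_c": 22}

# Registry of tools the model is allowed to call.
TOOLS = {"get_weather": get_weather}

def execute_tool_call(call_json: str) -> dict:
    """Parse the model's function-call request and dispatch it."""
    call = json.loads(call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

# What the LLM might emit for "What's the weather like in Beijing today?"
model_output = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'
result = execute_tool_call(model_output)
# `result` is then sent back to the LLM, which phrases the final
# natural-language answer for the user.
```

The LLM never executes anything itself: it only produces the structured request, and the application layer decides whether and how to run it.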

**Core Value**: Transforms LLMs from 'just chatting' to 'getting things done.' This is the technical foundation for AI Agents—systems that can autonomously call multiple tools to complete complex tasks.

**Models Supporting Function Calling**: Mainstream models like GPT-4o, Claude 3.5, Gemini 1.5 Pro, Qwen all support it.

Multimodal

**What is Multimodal?** Refers to AI models that can process and generate multiple types of data—not just text, but also images, audio, video, etc.

**Multimodal Input**: GPT-4o can accept text + image input simultaneously (e.g., upload a photo and ask 'What plant is this?'); Gemini can directly process video content.

**Multimodal Output**: DALL-E 3 generates images, Sora generates videos, TTS models generate speech. Currently, the hottest direction is unified multimodal models—a single model that simultaneously understands and generates text, images, and audio.

**Multimodal Application Scenarios**: Intelligent customer service (voice + text + images), document understanding (OCR + text analysis), content creation (text-to-image generation), medical image analysis (images + diagnostic text).

Hallucination

**What is Hallucination?** When an LLM confidently generates incorrect, non-existent, or factually inaccurate content. For example, fabricating non-existent paper citations, inventing legal statutes, or making up statistics.

**Why does hallucination occur?** Because LLMs essentially predict the next token based on probability, not retrieve facts from a knowledge base. When the model lacks relevant knowledge, it continues to 'make things up' based on patterns.

**How to reduce hallucinations?** Use RAG to introduce external knowledge sources; include in the Prompt 'If you're unsure, please say you don't know'; lower the Temperature to increase determinism; manually verify critical information.
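The prompt-level and parameter-level mitigations above can be combined in a single request. A sketch in the common OpenAI-style schema (parameter names vary slightly between providers):

```python
# Combine two hallucination mitigations in one request: an
# instruction that licenses "I don't know", plus a low temperature
# for more deterministic output.

def cautious_request(question: str) -> dict:
    """Build request parameters that discourage confident guessing."""
    return {
        "messages": [
            {"role": "system",
             "content": "If you are not sure of an answer, say you "
                        "don't know rather than guessing."},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,  # lower = less random sampling
    }

req = cautious_request(
    "Which regulation governs data retention periods in this case?"
)
```

These measures reduce but do not eliminate hallucination; for critical facts, pair them with RAG and human verification.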

Caution

In critical fields like law, medicine, and finance, AI outputs must be verified by professionals. Do not use unverified AI outputs directly for decision-making—the hallucination problem is not yet fully solved.

Chapter Terminology Quick Reference

**Prompt**: Input content sent to the AI.
**System Prompt**: Instructions that set the model's role and rules.
**Zero-shot**: Prompting method without providing examples.
**Few-shot**: Prompting method that provides examples.
**CoT (Chain of Thought)**: Technique to guide the AI to show its reasoning steps.
**RAG**: Retrieval-Augmented Generation, retrieving information from external knowledge bases to assist generation.
**Function Calling**: Enables LLMs to call external tools and APIs.
**Multimodal**: Processing and generating multiple data types.
**Hallucination**: AI confidently generating incorrect content.

RAG Workflow

User Question → Embedding Conversion → Vector Database Retrieval → Relevant Documents → LLM Generates Answer

Prompt Technique Hierarchy

Zero-shot (Direct Question) → Few-shot (Provide Examples) → CoT (Guide Reasoning) → Agent (Autonomous Planning & Execution)


In the next chapter, we will explore AI Agents and the tool ecosystem—key concepts for AI evolving from a 'tool' to an 'assistant.'
