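LlamaIndex Integration Tutorial

What is LlamaIndex?

LlamaIndex (formerly GPT Index) is a data framework for building RAG (retrieval-augmented generation) applications. It connects an LLM to your own data and lets the model query it.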

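Data Connection

Supports 100+ data sources: PDF, databases, APIs, Notion, and more.

Smart Indexing

Multiple index types, including vector, keyword, and graph indexes.

Precise Querying

Semantic search, hybrid retrieval, reranking, and more.

High Performance

An optimized query pipeline with support for streaming output.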

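💡 LangChain vs LlamaIndex:
• LangChain - a general-purpose AI application framework, suited to building many kinds of AI applications
• LlamaIndex - a framework focused on data and RAG, better suited to knowledge bases and document Q&A
• The two can be used together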

Installing LlamaIndex

bash

# Install the core package
pip install llama-index

# Install support for OpenAI-compatible embedding models
pip install llama-index-embeddings-openai

# Install a vector store integration (optional)
pip install llama-index-vector-stores-chroma

# Install document readers (as needed)
# pip install llama-index-readers-file      # file readers
# pip install llama-index-readers-web       # web page readers
# pip install llama-index-readers-database  # database readers

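Configuring XiDao Api

Environment Variables

bash

# Linux / macOS
export OPENAI_API_KEY="sk-your-api-key"
export OPENAI_BASE_URL="https://api.xidao.online/v1"   # read by the official openai SDK
export OPENAI_API_BASE="https://api.xidao.online/v1"   # read by LlamaIndex's OpenAI integrations

# Windows (PowerShell)
$env:OPENAI_API_KEY="sk-your-api-key"
$env:OPENAI_BASE_URL="https://api.xidao.online/v1"
$env:OPENAI_API_BASE="https://api.xidao.online/v1"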
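
Before constructing any client, it can help to confirm that the variables are actually set. The following is a minimal sketch using only the standard library. The helper name `load_api_config` is hypothetical and is not part of LlamaIndex or XiDao Api; it only reads `OPENAI_API_KEY` and `OPENAI_BASE_URL` as exported above.

```python
import os
from typing import Mapping, Optional


def load_api_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the API key and base URL from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    base_url = env.get("OPENAI_BASE_URL", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    if not base_url:
        raise RuntimeError("OPENAI_BASE_URL is not set")
    # Strip a trailing slash so later URL joins do not produce "//".
    return {"api_key": api_key, "base_url": base_url.rstrip("/")}
```

Failing early with a clear message is usually easier to debug than an authentication error raised deep inside a request.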

Configuring in Code

Python

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure the LLM (LlamaIndex's OpenAI classes take the endpoint as `api_base`)
Settings.llm = OpenAI(
    model="gpt-4o",
    api_key="sk-your-api-key",
    api_base="https://api.xidao.online/v1"
)

# Configure the embedding model
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key="sk-your-api-key",
    api_base="https://api.xidao.online/v1"
)

Basic RAG Example

The Simplest RAG Application

Python

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

# Configure the API
Settings.llm = OpenAI(
    model="gpt-4o",
    api_key="sk-your-api-key",
    api_base="https://api.xidao.online/v1"
)
Settings.embed_model = OpenAIEmbedding(
    api_key="sk-your-api-key",
    api_base="https://api.xidao.online/v1"
)

# 1. Load the documents
documents = SimpleDirectoryReader("./data").load_data()

# 2. Build the index
index = VectorStoreIndex.from_documents(documents)

# 3. Create a query engine
query_engine = index.as_query_engine()

# 4. Run a query
response = query_engine.query("What are the main features of this project?")
print(response)

📁 Preparing test data:
Create a ./data folder in the current directory and put a few .txt or .md files in it to start testing.
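
If no documents are at hand, the ./data folder can be filled with a script. This is a small sketch using only the standard library; the function name `create_sample_data`, the file names, and the file contents are placeholders invented for illustration.

```python
from pathlib import Path


def create_sample_data(data_dir: str = "./data") -> list:
    """Write two small placeholder files so the RAG example has some input."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Placeholder content; replace with real documents for meaningful answers.
    samples = {
        "intro.txt": "This project is a demo knowledge base for testing a RAG pipeline.\n",
        "features.md": "# Features\n\n- Document question answering\n- Semantic search\n",
    }
    written = []
    for name, text in samples.items():
        path = target / name
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Calling `create_sample_data()` once before running the example above creates ./data/intro.txt and ./data/features.md.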

Data Connectors

LlamaIndex can load data from many kinds of sources.

Loading Local Files

Python

from llama_index.core import SimpleDirectoryReader

# Load every matching file in the directory, including subdirectories
reader = SimpleDirectoryReader(
    input_dir="./docs",
    required_exts=[".pdf", ".md", ".txt", ".docx"],
    recursive=True
)
documents = reader.load_data()

print(f"Loaded {len(documents)} documents")

Loading Web Pages

Python

# Requires: pip install llama-index-readers-web
from llama_index.readers.web import SimpleWebPageReader

urls = [
    "https://example.com/article1",
    "https://example.com/article2"
]

# html_to_text=True converts the fetched HTML to plain text
reader = SimpleWebPageReader(html_to_text=True)
documents = reader.load_data(urls)

print(f"Loaded {len(documents)} documents from {len(urls)} web pages")

Creating Documents Manually

Python

from llama_index.core import Document, VectorStoreIndex

# Create documents by hand
documents = [
    Document(
        text="XiDao Api provides a stable AI API relay service...",
        metadata={"source": "internal", "category": "intro"}
    ),
    Document(
        text="Supported models include Claude, GPT, Gemini, and others...",
        metadata={"source": "internal", "category": "models"}
    ),
]

# Build an index directly from these documents
index = VectorStoreIndex.from_documents(documents)

Advanced Indexing Strategies

Custom Chunking

Python

from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Custom splitter (sizes are measured in tokens, not characters)
splitter = SentenceSplitter(
    chunk_size=1024,      # maximum tokens per chunk
    chunk_overlap=200,    # tokens shared between neighboring chunks
)

# Load the documents and split them into nodes
documents = SimpleDirectoryReader("./data").load_data()
nodes = splitter.get_nodes_from_documents(documents)

# Build the index from the nodes
index = VectorStoreIndex(nodes)

print(f"Created {len(nodes)} nodes")
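
To see how a chunk size and a chunk overlap interact, here is a simplified sliding-window sketch. It is not how `SentenceSplitter` is implemented: it treats whitespace-separated words as stand-ins for tokens and ignores sentence boundaries. The function name `chunk_words` is hypothetical.

```python
def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into overlapping windows of whitespace-separated words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g h i j", chunk_size=4, chunk_overlap=1))
# ['a b c d', 'd e f g', 'g h i j']
```

Each chunk repeats the last word of the previous one, which is the role the overlap plays: content that straddles a boundary stays retrievable from either side.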

Using an External Vector Database

Python

import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding

# Load the documents to index
documents = SimpleDirectoryReader("./data").load_data()

# Initialize ChromaDB with on-disk persistence
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("xidao_docs")

# Create the vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Configure the embedding model
embed_model = OpenAIEmbedding(
    api_key="sk-your-api-key",
    api_base="https://api.xidao.online/v1"
)

# Build the index from the documents (vectors are persisted to ChromaDB)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model
)

print("Index saved to ChromaDB")

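Query Engines

LlamaIndex offers several ways to query, each suited to a different scenario. The snippets below assume an `index` built as in the earlier examples.

Basic Query

Python

# Basic query engine
query_engine = index.as_query_engine()

response = query_engine.query("What are the advantages of XiDao Api?")
print(f"Answer: {response}")

# Inspect the source nodes the answer was based on
for node in response.source_nodes:
    print(f"- Score: {node.score:.3f}")
    print(f"  Content: {node.text[:100]}...")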

Streaming Query

Python

# Enable streaming output
streaming_query_engine = index.as_query_engine(streaming=True)

response = streaming_query_engine.query("Describe the product features in detail")

# Print the answer as it is generated
for text in response.response_gen:
    print(text, end="", flush=True)
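
A token generator can only be consumed once, so if the full answer is needed afterwards (for logging or caching), it has to be collected while printing. The sketch below shows that pattern with a fake generator standing in for `response.response_gen`, so it runs offline. The names `print_and_collect` and `fake_stream` are hypothetical.

```python
from typing import Iterable, Iterator


def print_and_collect(token_stream: Iterable[str]) -> str:
    """Print tokens as they arrive and return the concatenated text."""
    parts = []
    for token in token_stream:
        print(token, end="", flush=True)
        parts.append(token)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Stand-in for response.response_gen so the sketch runs without an API call."""
    yield from ["Streaming ", "output ", "arrives ", "in pieces."]


full_text = print_and_collect(fake_stream())
```

With a real streaming response, `print_and_collect(response.response_gen)` would be used in place of the loop above.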

Similarity Search

Python

# Retrieve similar nodes without generating an answer
retriever = index.as_retriever(similarity_top_k=5)

nodes = retriever.retrieve("pricing and billing")

for i, node in enumerate(nodes):
    print(f"\n--- Result {i+1} (score: {node.score:.3f}) ---")
    print(node.text[:200])
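
Conceptually, top-k retrieval scores every stored vector against the query vector and keeps the best k. The following is a toy sketch of that idea, assuming cosine similarity as the metric and tiny hand-written vectors in place of real embeddings. The function names are hypothetical, and a real vector store will typically use a more efficient search than this linear scan.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors: document 0 points the same way as the query, document 1 is orthogonal.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k([1.0, 0.0], docs, k=2)
```

Here `results` ranks document 0 first and document 2 second; the orthogonal document 1 is dropped.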

Chat Engine

A chat engine supports multi-turn conversation and keeps track of the conversation history.

Python

# Create a chat engine
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",  # condense the history into a question, then retrieve
    similarity_top_k=3,
    verbose=True
)

# Multi-turn conversation
response1 = chat_engine.chat("Which models do you support?")
print(f"AI: {response1.response}")

response2 = chat_engine.chat("How much do the Claude models cost?")
print(f"AI: {response2.response}")

# The second question is answered with the earlier turn as context
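
Conversation history cannot grow without limit, because it has to fit in the model's context window. One common approach is to keep only the most recent messages that fit a budget. The sketch below illustrates that idea in plain Python; it is a simplified illustration and not LlamaIndex's own memory implementation. The class name `BoundedChatHistory` is hypothetical, and word counts stand in for token counts.

```python
from collections import deque
from typing import Deque, List, Tuple


class BoundedChatHistory:
    """Keep the most recent messages whose total word count fits a budget."""

    def __init__(self, max_words: int) -> None:
        if max_words <= 0:
            raise ValueError("max_words must be positive")
        self.max_words = max_words
        self._messages: Deque[Tuple[str, str]] = deque()

    def add(self, role: str, content: str) -> None:
        self._messages.append((role, content))
        # Drop the oldest messages until the budget is met, always keeping the newest.
        while len(self._messages) > 1 and self._word_count() > self.max_words:
            self._messages.popleft()

    def _word_count(self) -> int:
        return sum(len(content.split()) for _, content in self._messages)

    def messages(self) -> List[Tuple[str, str]]:
        return list(self._messages)


history = BoundedChatHistory(max_words=8)
history.add("user", "which models are supported")
history.add("assistant", "Claude GPT Gemini")
history.add("user", "what does Claude cost")
```

After the third message the total would exceed the budget, so the oldest message is dropped and the two most recent ones remain.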

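Streaming Chat

Python

# Chat engine used for streaming
streaming_chat = index.as_chat_engine(
    chat_mode="condense_plus_context"
)

# stream_chat() returns a streaming response; chat() returns only the finished answer
response = streaming_chat.stream_chat("Summarize the product's key features")

# Print tokens as they arrive
for token in response.response_gen:
    print(token, end="", flush=True)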

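⚠️ Recommendations for production:
• Use a persistent vector database (such as ChromaDB or Pinecone)
• For large document collections, consider more advanced indexing strategies
• Monitor token usage to keep costs under control
• Update the index regularly so the data stays fresh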
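
For the last point, re-embedding every file on each update wastes tokens. One way to limit the work is to re-index only files whose content changed. The following is a minimal sketch of change detection with content hashes, using only the standard library. The function names and the in-memory `manifest` dictionary are assumptions for illustration; in practice the manifest would be saved to disk between runs, and deleted files would need separate handling.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes, used as a content fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_changed_files(data_dir: str, manifest: Dict[str, str]) -> List[str]:
    """Return new or modified files relative to the manifest, updating it in place."""
    changed = []
    for path in sorted(Path(data_dir).rglob("*")):
        if not path.is_file():
            continue
        key = str(path.relative_to(data_dir))
        digest = file_digest(path)
        if manifest.get(key) != digest:
            changed.append(key)
            manifest[key] = digest
    return changed
```

On the first run every file is reported as changed; later runs report only files that were added or edited, and only those need to be loaded and indexed again.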
