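Appendix B: Programming Practice Guide

This appendix offers practical programming guidance for building applications that combine knowledge graphs with AI. It covers Python data processing and visualization, graph database query languages, and REST API design and implementation, with the aim of helping readers become productive quickly and write maintainable code.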

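B.1 Python Data Processing and Visualization

Python is one of the main languages for building knowledge graph and AI applications, and it has a rich ecosystem of data processing and visualization libraries. This section introduces the most common libraries and how they are typically used.

B.1.1 Common Data Processing Libraries

B.1.1.1 pandas

pandas is the most widely used data processing library in Python. Its DataFrame structure is designed for structured (tabular) data.

Core features:

Reading and writing data (CSV, Excel, SQL, and other formats)
Data cleaning (missing values, duplicates, outliers)
Data transformation (type conversion, format conversion, feature engineering)
Grouping and aggregation
Merging and joining

Code example: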

import pandas as pd
import numpy as np

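# 1. Load data
# Read from a CSV file (assumes data.csv exists in the working directory)
df = pd.read_csv('data.csv')

# Read from an Excel file (pandas needs an Excel engine such as openpyxl)
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')

# Or build a DataFrame in memory; the remaining steps use this one
data = {
    'name': ['张三', '李四', '王五', '赵六'],
    'age': [25, 30, 35, 40],
    'city': ['北京', '上海', '广州', '深圳'],
    'salary': [8000, 12000, 10000, 15000]
}
df = pd.DataFrame(data)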

# 2. Explore the data
print("Basic information:")
df.info()  # info() prints its report itself and returns None

print("\nDescriptive statistics:")
print(df.describe())

print("\nFirst 5 rows:")
print(df.head())

print("\nLast 5 rows:")
print(df.tail())

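# 3. Clean the data
# Missing values: the two lines below are alternatives, so pick one.
# Once fillna has run, dropna has nothing left to remove.
df = df.fillna(0)  # option A: replace missing values with 0
df = df.dropna()   # option B: drop rows that contain missing values

# Remove duplicate rows
df = df.drop_duplicates()

# Filter out implausible ages
df = df[(df['age'] > 0) & (df['age'] < 100)]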

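# 4. Transform the data
# Type conversion
df['age'] = df['age'].astype(int)
df['salary'] = df['salary'].astype(float)

# Derived columns
df['bonus'] = df['salary'] * 0.1
df['total_income'] = df['salary'] + df['bonus']

# Map each city to a code. Series.map yields NaN for a city that is
# missing from the dict, whereas indexing the dict inside apply() raises KeyError.
city_codes = {'北京': 'BJ', '上海': 'SH', '广州': 'GZ', '深圳': 'SZ'}
df['city_code'] = df['city'].map(city_codes)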

# 5. Group and aggregate
print("\nAverage salary by city:")
city_salary = df.groupby('city')['salary'].mean()
print(city_salary)

# Aggregate several columns at once
print("\nSalary and bonus statistics by city:")
city_stats = df.groupby('city').agg({
    'salary': ['mean', 'sum', 'count'],
    'bonus': 'sum'
})
print(city_stats)

# 6. Merge data
# Build a second DataFrame that shares the 'name' column
department_data = {
    'name': ['张三', '李四', '王五', '赵六'],
    'department': ['技术部', '市场部', '销售部', '技术部']
}
department_df = pd.DataFrame(department_data)

# Inner join on 'name'
df = pd.merge(df, department_df, on='name', how='inner')
print("\nMerged data:")
print(df)

# 7. Write the result
df.to_csv('processed_data.csv', index=False)
df.to_excel('processed_data.xlsx', index=False, sheet_name='Processed')

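B.1.1.2 numpy

numpy is the core library for scientific computing in Python. It provides efficient array operations and mathematical functions.

Core features:

Multidimensional array (ndarray) operations
Mathematical functions (trigonometric, exponential, logarithmic, and so on)
Linear algebra (matrix multiplication, eigenvalues, and so on)
Random number generation

Code example: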

import numpy as np

# 1. Create arrays
# One-dimensional array
arr1d = np.array([1, 2, 3, 4, 5])
print("1-D array:", arr1d)

# Two-dimensional array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2-D array:\n", arr2d)

# Array of zeros
zeros = np.zeros((2, 3))
print("Zeros:\n", zeros)

# Array of ones
ones = np.ones((2, 3))
print("Ones:\n", ones)

# Uniform random values in [0, 1)
random_arr = np.random.rand(2, 3)
print("Random array:\n", random_arr)

# 2. Array attributes
print("\nShape:", arr2d.shape)
print("Number of dimensions:", arr2d.ndim)
print("Number of elements:", arr2d.size)
print("Data type:", arr2d.dtype)

# 3. Array operations
# Indexing and slicing
print("\nFirst row:", arr2d[0])
print("First column:", arr2d[:, 0])
print("First two rows and first two columns:\n", arr2d[:2, :2])

# Element-wise arithmetic
print("\nAdd 1 to every element:\n", arr2d + 1)
print("Multiply every element by 2:\n", arr2d * 2)
print("Square every element:\n", arr2d ** 2)

# Matrix multiplication (matrix1 @ matrix2 is equivalent for 2-D arrays)
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])
matrix_product = np.dot(matrix1, matrix2)
print("\nMatrix product:\n", matrix_product)

# 4. Mathematical functions (applied element-wise)
print("\nSquare root:\n", np.sqrt(arr2d))
print("Exponential:\n", np.exp(arr2d))
print("Sine:\n", np.sin(arr2d))

# 5. Statistical functions
print("\nMean:", np.mean(arr2d))
print("Standard deviation:", np.std(arr2d))
print("Maximum:", np.max(arr2d))
print("Minimum:", np.min(arr2d))
print("Sum:", np.sum(arr2d))

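B.1.2 Common Visualization Libraries

B.1.2.1 matplotlib

matplotlib is the foundational plotting library in Python and offers a wide range of chart types.

Core features:

Basic charts such as line plots, scatter plots, bar charts, and histograms
Custom styling (colors, fonts, labels, and so on)
Multiple subplots in one figure
Three-dimensional plots

Code example: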

import matplotlib.pyplot as plt
import numpy as np

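# Font setup for CJK text. Only needed when labels contain Chinese characters,
# and it assumes the SimHei font is installed (it ships with Windows).
plt.rcParams['font.sans-serif'] = ['SimHei']  # default sans-serif font
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with this font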

# 1. Line plot
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y1, label='sin(x)', color='blue', linewidth=2, linestyle='-')
plt.plot(x, y2, label='cos(x)', color='red', linewidth=2, linestyle='--')
plt.title('Sine and cosine')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)
plt.legend()
plt.show()

# 2. Scatter plot
# Note: x is reassigned here, so it no longer matches y1 and y2 above
x = np.random.rand(100)
y = np.random.rand(100)
sizes = np.random.randint(10, 100, 100)
colors = np.random.rand(100)

plt.figure(figsize=(10, 6))
plt.scatter(x, y, s=sizes, c=colors, alpha=0.7, cmap='viridis')
plt.colorbar(label='Color value')
plt.title('Scatter plot')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)
plt.show()

# 3. Bar chart
categories = ['A', 'B', 'C', 'D', 'E']
values = [23, 45, 56, 78, 90]

plt.figure(figsize=(10, 6))
plt.bar(categories, values, color=['red', 'green', 'blue', 'yellow', 'purple'])
plt.title('Bar chart')
plt.xlabel('Category')
plt.ylabel('Value')
plt.grid(True, axis='y')

# Label each bar with its value
for i, v in enumerate(values):
    plt.text(i, v + 1, str(v), ha='center', va='bottom')

plt.show()

# 4. Histogram
data = np.random.randn(1000)  # 1000 samples from the standard normal distribution

plt.figure(figsize=(10, 6))
plt.hist(data, bins=30, density=True, alpha=0.7, color='skyblue', edgecolor='black')
plt.title('Histogram')
plt.xlabel('Value')
plt.ylabel('Probability density')
plt.grid(True)
plt.show()

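# 5. Multiple subplots
# x was overwritten with random values in the scatter example, so rebuild
# an ordered x axis that matches y1 and y2 before plotting them again.
x_line = np.linspace(0, 10, 100)

plt.figure(figsize=(12, 8))

# Subplot 1: line plot
plt.subplot(2, 2, 1)
plt.plot(x_line, y1, color='blue')
plt.title('Subplot 1: sin(x)')

# Subplot 2: scatter plot of the first 20 cos(x) samples
plt.subplot(2, 2, 2)
plt.scatter(x_line[:20], y2[:20], color='red')
plt.title('Subplot 2: cos(x) scatter')

# Subplot 3: bar chart
plt.subplot(2, 2, 3)
plt.bar(categories, values, color='green')
plt.title('Subplot 3: bar chart')

# Subplot 4: histogram
plt.subplot(2, 2, 4)
plt.hist(data, bins=20, color='purple', alpha=0.7)
plt.title('Subplot 4: histogram')

plt.tight_layout()  # adjust spacing between subplots
plt.show()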

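B.1.2.2 seaborn

seaborn is a high-level visualization library built on matplotlib. It provides more polished default styles and a simpler API for statistical graphics.

Core features:

Statistical plots (box plots, violin plots, heatmaps, and so on)
Categorical data plots
Regression plots
Multivariate plots

Code example: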

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

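# Font setup for CJK text (only needed for Chinese labels; assumes SimHei is installed)
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# 1. Load a sample dataset (load_dataset downloads it on first use, so it needs network access)
df = sns.load_dataset('iris')
print("Dataset information:")
df.info()  # info() prints its report itself and returns None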

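# 2. Box plot
plt.figure(figsize=(10, 6))
sns.boxplot(x='species', y='sepal_length', data=df)
plt.title('Sepal length by iris species (box plot)')
plt.show()

# 3. Violin plot
# seaborn 0.13+ warns when palette is passed without hue; on those versions
# add hue='species', legend=False to keep the per-species colors.
plt.figure(figsize=(10, 6))
sns.violinplot(x='species', y='petal_length', data=df, palette='Set2')
plt.title('Petal length by iris species (violin plot)')
plt.show()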

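# 4. Heatmap
# Correlation matrix of the numeric columns. numeric_only=True skips the
# string column 'species', which pandas 2.x would otherwise reject with an error.
corr_matrix = df.corr(numeric_only=True)

plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt='.2f', square=True)
plt.title('Correlation heatmap of the iris features')
plt.show()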

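# 5. Scatter plot matrix
# pairplot creates its own figure, so calling plt.figure() first would only
# leave an empty extra window; use the height argument to control the size.
sns.pairplot(df, hue='species', palette='Set1', height=2.5)
plt.suptitle('Scatter plot matrix of the iris data', y=1.02)
plt.show()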

# 6. Count plot
plt.figure(figsize=(10, 6))
sns.countplot(x='species', data=df, palette='Set3')
plt.title('Number of samples per iris species')
plt.show()

# 7. Regression plot
plt.figure(figsize=(10, 6))
sns.regplot(x='sepal_length', y='sepal_width', data=df, scatter_kws={'alpha': 0.7})
plt.title('Regression of sepal width on sepal length')
plt.show()

# 8. Grouped bar chart
# Sample data in wide format
category_data = pd.DataFrame({
    'category': ['A', 'B', 'C', 'D'],
    'value1': [23, 45, 56, 78],
    'value2': [34, 56, 67, 89],
    'value3': [45, 67, 78, 90]
})

# Reshape to long format: one row per (category, variable) pair
category_data_long = category_data.melt(id_vars='category', var_name='variable', value_name='value')

plt.figure(figsize=(10, 6))
sns.barplot(x='category', y='value', hue='variable', data=category_data_long, palette='Set2')
plt.title('Grouped bar chart')
plt.legend(title='Variable')
plt.show()

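B.1.3 Best Practices for Data Processing and Visualization

Clean first: before analysis or plotting, check data quality and handle missing values, duplicates, and outliers.
Understand the data: run an exploratory analysis before visualizing, so that you know the distributions and characteristics of the data.
Choose a suitable chart type: match the chart to the data type and the question, for example line or scatter plots for continuous data and bar or pie charts for categorical data.
Keep it simple: avoid decoration and elements that carry no information, and let the main message stand out.
Use color deliberately: favor readability, and avoid colors that are overly bright or hard to tell apart.
Add labels and titles: give every chart a clear title, axis labels, and a legend.
Consider the audience: adjust complexity and level of detail to the readers' background and needs.
Write readable code: clear, well-commented code is easier to maintain and share.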
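
The "clean first" advice above can be made concrete with a small quality report that runs before any plotting. The following is only a sketch under stated assumptions: the function name `summarize_quality`, its `valid_ranges` argument, and the layout of the returned dict are illustrative choices, not part of pandas.

```python
import pandas as pd


def summarize_quality(df: pd.DataFrame, valid_ranges: dict) -> dict:
    """Count missing values, duplicate rows, and out-of-range values.

    valid_ranges maps a column name to an inclusive (low, high) pair.
    Both this argument and the report layout are hypothetical conventions.
    """
    out_of_range = {}
    for column, (low, high) in valid_ranges.items():
        values = df[column].dropna()  # missing values are counted separately
        out_of_range[column] = int(((values < low) | (values > high)).sum())
    return {
        "rows": len(df),
        "missing": {col: int(n) for col, n in df.isna().sum().items()},
        "duplicate_rows": int(df.duplicated().sum()),
        "out_of_range": out_of_range,
    }


# Made-up sample: one duplicated row, one missing salary, one implausible age
sample = pd.DataFrame({
    "name": ["张三", "李四", "李四", "王五"],
    "age": [25, 30, 30, 150],
    "salary": [8000.0, 12000.0, 12000.0, None],
})
report = summarize_quality(sample, {"age": (0, 100)})
print(report)
```

Inspecting such a report first makes it easier to decide whether to fill, drop, or investigate problem rows before they distort a chart.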

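B.2 Graph Database Query Languages

Graph query languages are used to query and modify graph data. Two widely used ones are Cypher (Neo4j) and Gremlin (Apache TinkerPop).

B.2.1 Cypher (Neo4j)

Cypher is the query language of the Neo4j graph database. It is declarative and borrows keywords from SQL, but its core is pattern matching over nodes and relationships.

B.2.1.1 Basic Syntax

Node patterns:

(node): a node
(node:Label): a node with a label
(node:Label {property: value}): a node with a label and properties

Relationship patterns:

(a)-[r]->(b): a directed relationship from node a to node b
(a)-[r:RELATIONSHIP_TYPE]->(b): a relationship with a type
(a)-[r:RELATIONSHIP_TYPE {property: value}]->(b): a relationship with a type and properties

Query clauses:

MATCH: match a graph pattern
WHERE: filter the matches
RETURN: return results
CREATE: create nodes and relationships
SET: set or update properties (Cypher has no UPDATE keyword)
DELETE: delete nodes and relationships
MERGE: match a pattern, or create it if it does not exist

B.2.1.2 Common Query Examples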

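1. Creating nodes and relationships

// Run the whole block as one statement so that p, q, and c stay in scope.
// Cypher comments start with //, not --.

// Create person nodes
CREATE (p:Person {name: '张三', age: 30, city: '北京'})
CREATE (q:Person {name: '李四', age: 28, city: '上海'})

// Create a company node
CREATE (c:Company {name: 'ABC科技', industry: '互联网', location: '北京'})

// Create relationships
CREATE (p)-[:WORKS_AT {position: '工程师', since: 2020}]->(c)
CREATE (p)-[:FRIEND_WITH {since: 2015}]->(q)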

2. Querying nodes

// All person nodes
MATCH (p:Person)
RETURN p

// One person by name
MATCH (p:Person {name: '张三'})
RETURN p.name, p.age, p.city

// People older than 25
MATCH (p:Person)
WHERE p.age > 25
RETURN p.name, p.age

// People who work at a company located in 北京
MATCH (p:Person)-[:WORKS_AT]->(c:Company {location: '北京'})
RETURN p.name, c.name

3. Querying relationships

// All outgoing relationships of 张三 (use -[r]- to include incoming ones)
MATCH (p:Person {name: '张三'})-[r]->(n)
RETURN p.name, type(r), n.name

// Friend relationships
MATCH (p:Person)-[r:FRIEND_WITH]->(q:Person)
RETURN p.name, r.since, q.name

// Indirect relationships (friends of friends)
MATCH (p:Person {name: '张三'})-[:FRIEND_WITH]->(q:Person)-[:FRIEND_WITH]->(r:Person)
RETURN p.name, q.name, r.name

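4. Updating and deleting

// Update node properties
MATCH (p:Person {name: '张三'})
SET p.age = 31, p.salary = 15000
RETURN p

// Delete a relationship
MATCH (p:Person {name: '张三'})-[r:FRIEND_WITH]->(q:Person {name: '李四'})
DELETE r

// Delete a node. A plain DELETE fails while the node still has relationships;
// DETACH DELETE removes the relationships and the node together, and it also
// works when the node has no relationships at all.
MATCH (p:Person {name: '张三'})
DETACH DELETE p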

5. Aggregation

// Count people
MATCH (p:Person)
RETURN count(p) AS total_people

// Count people per city
MATCH (p:Person)
RETURN p.city, count(p) AS people_count
ORDER BY people_count DESC

// Average age
MATCH (p:Person)
RETURN avg(p.age) AS average_age

// Number of employees per company
MATCH (p:Person)-[:WORKS_AT]->(c:Company)
RETURN c.name, count(p) AS employee_count
ORDER BY employee_count DESC

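6. Path queries

// Shortest path between 张三 and 李四 (the path variable is bound with "path =")
MATCH path = shortestPath((p:Person {name: '张三'})-[*]-(q:Person {name: '李四'}))
RETURN path

// Paths of length 1 to 3; the path must be bound before length(path) can be used
MATCH path = (p:Person {name: '张三'})-[*1..3]->(n)
RETURN p.name, n.name, length(path)

// Paths restricted to one relationship type
MATCH (p:Person {name: '张三'})-[:FRIEND_WITH*2..3]->(n:Person)
RETURN p.name, n.name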

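B.2.2 Gremlin (Apache TinkerPop)

Gremlin is the graph traversal language of the Apache TinkerPop framework. It is supported by many graph databases, including JanusGraph and, through a TinkerPop plugin, Neo4j.

B.2.2.1 Basic Syntax

A Gremlin query is a chain of steps, each of which transforms the stream of traversers produced by the previous one:

g.V(): all vertices
g.E(): all edges
hasLabel(): filter by label
has(): filter by property
out()/in()/both(): move to adjacent vertices along outgoing, incoming, or both kinds of edges
outE()/inE()/bothE(): move to the outgoing, incoming, or both kinds of incident edges
outV()/inV()/bothV(): move from an edge to its source vertex, target vertex, or both
values(): property values
valueMap(): properties as a map
count(): count the traversers
path(): the path each traverser has taken
shortestPath(): shortest paths (available in TinkerPop 3.4 and later; provider support varies)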
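
To show how such a chain of steps composes, here is a toy in-memory model written in plain Python. It is not gremlinpython and does not reflect how TinkerPop is implemented; the classes `ToyGraph` and `ToyTraversal` are hypothetical and exist only to illustrate that each step takes the current set of vertices and hands a new set to the next step. The method names mirror the Gremlin steps listed above.

```python
from typing import Any, Dict, List, Optional, Tuple


class ToyGraph:
    """A tiny property graph: vertices by id, edges as (src, label, dst) triples."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Dict[str, Any]] = {}
        self.edges: List[Tuple[int, str, int]] = []

    def add_vertex(self, vid: int, label: str, **props: Any) -> None:
        self.vertices[vid] = {"label": label, **props}

    def add_edge(self, src: int, label: str, dst: int) -> None:
        self.edges.append((src, label, dst))

    def V(self) -> "ToyTraversal":
        # Start a traversal positioned on every vertex
        return ToyTraversal(self, list(self.vertices))


class ToyTraversal:
    """Holds the current vertex ids; every step returns a new traversal."""

    def __init__(self, graph: ToyGraph, ids: List[int]) -> None:
        self.graph = graph
        self.ids = ids

    def hasLabel(self, label: str) -> "ToyTraversal":
        keep = [i for i in self.ids if self.graph.vertices[i]["label"] == label]
        return ToyTraversal(self.graph, keep)

    def has(self, key: str, value: Any) -> "ToyTraversal":
        keep = [i for i in self.ids if self.graph.vertices[i].get(key) == value]
        return ToyTraversal(self.graph, keep)

    def out(self, edge_label: Optional[str] = None) -> "ToyTraversal":
        # Move from each current vertex to the targets of its outgoing edges
        targets = [
            dst
            for i in self.ids
            for (src, label, dst) in self.graph.edges
            if src == i and (edge_label is None or label == edge_label)
        ]
        return ToyTraversal(self.graph, targets)

    def values(self, key: str) -> List[Any]:
        return [self.graph.vertices[i][key] for i in self.ids
                if key in self.graph.vertices[i]]

    def count(self) -> int:
        return len(self.ids)


g = ToyGraph()
g.add_vertex(1, "Person", name="张三", age=30)
g.add_vertex(2, "Person", name="李四", age=28)
g.add_vertex(3, "Company", name="ABC科技")
g.add_edge(1, "FRIEND_WITH", 2)
g.add_edge(1, "WORKS_AT", 3)
g.add_edge(2, "WORKS_AT", 3)

friends = g.V().hasLabel("Person").has("name", "张三").out("FRIEND_WITH").values("name")
employers = g.V().hasLabel("Person").out("WORKS_AT").values("name")
person_count = g.V().hasLabel("Person").count()
print(friends, employers, person_count)
```

Note that `employers` contains company names, not person names: after `out("WORKS_AT")` the traversal is positioned on the company vertices. The same holds in real Gremlin, which is why filtering people by a property of their employer needs a step such as `where()` rather than a plain `out()`.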

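B.2.2.2 Common Query Examples

1. Querying vertices

// Gremlin (Groovy) comments start with //, not --.

// All person vertices
g.V().hasLabel('Person')

// One person by name
g.V().hasLabel('Person').has('name', '张三')

// People older than 25
g.V().hasLabel('Person').has('age', gt(25)).values('name', 'age')

// People who work at a company located in 北京.
// where() filters the people without moving the traversal to the company,
// so values('name') returns person names rather than company names.
g.V().hasLabel('Person').where(out('WORKS_AT').has('location', '北京')).values('name')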

2. Querying edges

// All outgoing edges of 张三
g.V().hasLabel('Person').has('name', '张三').outE()

// All friends of 张三
g.V().hasLabel('Person').has('name', '张三').out('FRIEND_WITH').values('name')

// Friends of friends
g.V().hasLabel('Person').has('name', '张三').out('FRIEND_WITH').out('FRIEND_WITH').values('name')

// Edge properties
g.V().hasLabel('Person').has('name', '张三').outE('WORKS_AT').values('position', 'since')

3. Creating vertices and edges

// Create a person vertex
g.addV('Person').property('name', '王五').property('age', 26).property('city', '广州')

// Create a company vertex
g.addV('Company').property('name', 'XYZ公司').property('industry', '金融').property('location', '上海')

// Create an edge (both endpoint vertices must exist first)
// Option 1: fetch the two vertices, then connect them
person = g.V().hasLabel('Person').has('name', '王五').next()
company = g.V().hasLabel('Company').has('name', 'XYZ公司').next()
g.addE('WORKS_AT').from(person).to(company).property('position', '分析师').property('since', 2021)

// Option 2: a single chained traversal using step labels
g.V().hasLabel('Person').has('name', '王五').as('p')
  .V().hasLabel('Company').has('name', 'XYZ公司').as('c')
  .addE('WORKS_AT').from('p').to('c').property('position', '分析师').property('since', 2021)

4. Updating and deleting

// Update vertex properties
g.V().hasLabel('Person').has('name', '王五').property('age', 27).property('salary', 12000)

// Delete an edge
g.V().hasLabel('Person').has('name', '张三').outE('FRIEND_WITH').where(inV().has('name', '李四')).drop()

// Delete a vertex (its incident edges are removed with it)
g.V().hasLabel('Person').has('name', '王五').drop()

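5. Aggregation

// Count people
g.V().hasLabel('Person').count()

// Count people per city
g.V().hasLabel('Person').group().by('city').by(count())

// Average age
g.V().hasLabel('Person').values('age').mean()

// Number of employees per company: WORKS_AT points from person to company,
// so start from the people and count how often each company name is reached
g.V().hasLabel('Person').out('WORKS_AT').hasLabel('Company').groupCount().by('name')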

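6. Path queries

// Shortest path between 张三 and 李四. ShortestPath.target takes an anonymous
// traversal (__) that filters candidate end vertices, not a new g.V() traversal.
g.V().hasLabel('Person').has('name', '张三').shortestPath().with(ShortestPath.target, __.has('Person', 'name', '李四'))

// Vertices reachable in 1 to 3 hops. times() takes a single count;
// emit() outputs the vertices reached after every iteration.
g.V().hasLabel('Person').has('name', '张三').repeat(out()).emit().times(3).values('name')

// Vertices reachable over 2 to 3 FRIEND_WITH hops: emit only from the second iteration on
g.V().hasLabel('Person').has('name', '张三').repeat(out('FRIEND_WITH')).emit(loops().is(gte(2))).times(3).values('name')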

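B.2.3 Best Practices for Graph Queries

Use indexes: create indexes on properties that are queried often.
- Cypher: CREATE INDEX FOR (p:Person) ON (p.name)
- Gremlin: the mechanism depends on the graph database, for example in JanusGraph: mgmt.buildIndex('personByName', Vertex.class).addKey(mgmt.getPropertyKey('name')).buildCompositeIndex()
Limit the result size: use LIMIT (Cypher) or limit() (Gremlin) so that a query does not return more data than needed.
Bound path queries: path queries can be expensive, so cap the path length and restrict the relationship types.
Use parameterized queries: pass values as parameters instead of concatenating them into the query string. This prevents injection attacks and lets the database reuse cached query plans.
Design labels and relationship types carefully: a clear scheme makes queries easier to write and maintain.
Monitor query performance: use the tools the database provides to find and optimize slow queries.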
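
The parameterized-query advice above can be sketched in Python as follows. The helper `build_person_query` and its limit bounds are hypothetical; the point is only that the query text contains `$city` and `$limit` placeholders while user input travels separately in a parameter map.

```python
from typing import Any, Dict, Tuple


def build_person_query(city: str, limit: int = 10) -> Tuple[str, Dict[str, Any]]:
    """Return Cypher text with placeholders plus a separate parameter map."""
    if not 1 <= limit <= 100:  # illustrative bound, mirroring "limit the result size"
        raise ValueError("limit must be between 1 and 100")
    query = (
        "MATCH (p:Person) "
        "WHERE p.city = $city "
        "RETURN p.name AS name, p.age AS age "
        "ORDER BY p.name "
        "LIMIT $limit"
    )
    return query, {"city": city, "limit": limit}


# Hostile input never becomes part of the query text
hostile = "北京' OR 1=1 //"
query, params = build_person_query(hostile, limit=5)
print(query)
print(params)
```

With the official Neo4j Python driver, such a pair would typically be executed as `session.run(query, params)`; the connection setup is omitted here because it depends on the deployment.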

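B.3 REST API Design and Implementation

REST APIs are an important part of knowledge graph and AI applications: they expose data access and service calls to other systems. This section covers REST API design principles and a Python implementation.

B.3.1 REST API Design Principles

Resource orientation: design the API around resources, and name resources with nouns, such as /users and /products.
HTTP methods: use the method that matches the operation:
- GET: retrieve a resource
- POST: create a resource
- PUT: update a resource (full replacement)
- PATCH: update a resource (partial update)
- DELETE: delete a resource
URL design:
- Use lowercase letters and hyphens (-); avoid underscores and uppercase letters
- Use plural nouns for collections, such as /users rather than /user
- Express relationships between resources through hierarchy, such as /users/{id}/orders
Status codes: use the HTTP status code that matches the outcome:
- 200 OK: the request succeeded
- 201 Created: the resource was created
- 204 No Content: the request succeeded and there is no response body
- 400 Bad Request: the request parameters are invalid
- 401 Unauthorized: authentication is missing or invalid
- 403 Forbidden: the caller is authenticated but not allowed to access the resource
- 404 Not Found: the resource does not exist
- 500 Internal Server Error: the server failed unexpectedly
Response format:
- Use JSON for response bodies
- Keep the response structure consistent, for example with data, message, and status fields
- For collection resources, support pagination, sorting, and filtering
Error handling:
- Return informative error messages
- Use one consistent error format
- Log errors
Versioning:
- Include the version in the URL, such as /v1/users
- Or select the version through a request header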
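
The response-format and pagination principles above can be sketched with a few framework-independent helpers. The function names and the exact field set (status, message, data, total, skip, limit) are assumptions made for illustration; the text only asks that the structure be consistent.

```python
from typing import Any, Dict, List, Sequence


def success_body(data: Any, message: str = "ok") -> Dict[str, Any]:
    """Envelope for successful responses."""
    return {"status": "success", "message": message, "data": data}


def error_body(message: str, errors: Sequence[Any] = ()) -> Dict[str, Any]:
    """Envelope for error responses, using the same top-level keys."""
    return {"status": "error", "message": message, "errors": list(errors)}


def paginate(items: List[Any], skip: int = 0, limit: int = 10) -> Dict[str, Any]:
    """Return one page of a collection together with paging metadata."""
    if skip < 0 or not 1 <= limit <= 100:  # illustrative bounds
        raise ValueError("skip must be >= 0 and limit between 1 and 100")
    body = success_body(items[skip:skip + limit])
    body.update({"total": len(items), "skip": skip, "limit": limit})
    return body


users = [{"id": i, "name": f"user-{i}"} for i in range(1, 6)]
page = paginate(users, skip=2, limit=2)
not_found = error_body("user 42 not found")
print(page)
print(not_found)
```

Keeping the envelope in one place means every endpoint returns the same shape, which is what lets clients handle success and error responses uniformly.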

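B.3.2 Implementing a REST API with FastAPI

FastAPI is a modern, high-performance Python web framework for building RESTful APIs. It relies on Python type hints for request validation and generates interactive API documentation automatically.

B.3.2.1 Installing FastAPI

pip install fastapi uvicorn

B.3.2.2 Basic Examples

1. A simple API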

from fastapi import FastAPI

app = FastAPI(title="Knowledge Graph API", version="1.0.0")

# Root path
@app.get("/")
async def root():
    return {"message": "Welcome to the Knowledge Graph API"}

# List all persons
@app.get("/persons", tags=["Persons"])
async def get_persons():
    # Mock data
    persons = [
        {"id": 1, "name": "张三", "age": 30, "city": "北京"},
        {"id": 2, "name": "李四", "age": 28, "city": "上海"},
        {"id": 3, "name": "王五", "age": 26, "city": "广州"}
    ]
    return {"data": persons, "total": len(persons)}

# Get one person
@app.get("/persons/{person_id}", tags=["Persons"])
async def get_person(person_id: int):
    # Mock data: the same record is returned for every id
    person = {"id": person_id, "name": "张三", "age": 30, "city": "北京"}
    return {"data": person}

# Start the server with: uvicorn main:app --reload

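2. Using Pydantic models

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, ConfigDict
from typing import List, Optional

app = FastAPI(title="Knowledge Graph API", version="1.0.0")

# Data models (Pydantic v2 API)
class PersonBase(BaseModel):
    name: str
    age: int
    city: str

class PersonCreate(PersonBase):
    pass

class PersonUpdate(BaseModel):
    # Every field is optional so that a client can send a partial update
    name: Optional[str] = None
    age: Optional[int] = None
    city: Optional[str] = None

class Person(PersonBase):
    # Allow building the model from ORM objects via attribute access.
    # This replaces the Pydantic v1 "class Config: orm_mode = True".
    model_config = ConfigDict(from_attributes=True)

    id: int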

# In-memory stand-in for a database
persons_db = {
    1: Person(id=1, name="张三", age=30, city="北京"),
    2: Person(id=2, name="李四", age=28, city="上海"),
    3: Person(id=3, name="王五", age=26, city="广州")
}

# List all persons
@app.get("/persons", response_model=List[Person], tags=["Persons"])
async def get_persons():
    return list(persons_db.values())

# Get one person
@app.get("/persons/{person_id}", response_model=Person, tags=["Persons"])
async def get_person(person_id: int):
    if person_id not in persons_db:
        raise HTTPException(status_code=404, detail="Person not found")
    return persons_db[person_id]

# Create a person
@app.post("/persons", response_model=Person, status_code=201, tags=["Persons"])
async def create_person(person: PersonCreate):
    # Generate the next id
    new_id = max(persons_db.keys()) + 1 if persons_db else 1
    # model_dump() is the Pydantic v2 name for the deprecated .dict()
    new_person = Person(id=new_id, **person.model_dump())
    persons_db[new_id] = new_person
    return new_person

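# Partially update a person. PersonUpdate has only optional fields, so PATCH
# is the matching method; PUT is reserved for full replacement.
@app.patch("/persons/{person_id}", response_model=Person, tags=["Persons"])
async def update_person(person_id: int, person: PersonUpdate):
    if person_id not in persons_db:
        raise HTTPException(status_code=404, detail="Person not found")

    db_person = persons_db[person_id]
    # Only touch the fields the client actually sent (Pydantic v2 method names)
    update_data = person.model_dump(exclude_unset=True)
    updated_person = db_person.model_copy(update=update_data)
    persons_db[person_id] = updated_person
    return updated_person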

# Delete a person (204 No Content: the response has no body)
@app.delete("/persons/{person_id}", status_code=204, tags=["Persons"])
async def delete_person(person_id: int):
    if person_id not in persons_db:
        raise HTTPException(status_code=404, detail="Person not found")
    del persons_db[person_id]
    return None

3. Path parameters and query parameters

from typing import Optional

from fastapi import FastAPI, Query

app = FastAPI(title="Knowledge Graph API", version="1.0.0")

# Endpoint with query parameters
@app.get("/persons", tags=["Persons"])
async def get_persons(
    name: Optional[str] = Query(None, description="Filter by name (substring match)"),
    city: Optional[str] = Query(None, description="Filter by city"),
    age_gt: Optional[int] = Query(None, description="Only persons older than this age"),
    skip: int = Query(0, ge=0, description="Number of records to skip"),
    limit: int = Query(10, ge=1, le=100, description="Maximum number of records to return")
):
    # Mock data
    persons = [
        {"id": 1, "name": "张三", "age": 30, "city": "北京"},
        {"id": 2, "name": "李四", "age": 28, "city": "上海"},
        {"id": 3, "name": "王五", "age": 26, "city": "广州"},
        {"id": 4, "name": "赵六", "age": 32, "city": "北京"}
    ]

    # Filtering
    filtered_persons = persons
    if name:
        filtered_persons = [p for p in filtered_persons if name in p["name"]]
    if city:
        filtered_persons = [p for p in filtered_persons if p["city"] == city]
    if age_gt is not None:  # "if age_gt:" would silently ignore age_gt=0
        filtered_persons = [p for p in filtered_persons if p["age"] > age_gt]

    # Pagination
    total = len(filtered_persons)
    paginated_persons = filtered_persons[skip:skip+limit]

    return {
        "data": paginated_persons,
        "total": total,
        "skip": skip,
        "limit": limit
    }

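4. Error handling

from fastapi import FastAPI, HTTPException, Request
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
from starlette.exceptions import HTTPException as StarletteHTTPException

app = FastAPI(title="Knowledge Graph API", version="1.0.0")

# Custom exception
class PersonNotFoundException(HTTPException):
    def __init__(self, person_id: int):
        super().__init__(status_code=404, detail=f"Person {person_id} not found")

# Global exception handlers
# FastAPI answers validation errors with 422 by default; this handler maps them to 400.
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    return JSONResponse(
        status_code=400,
        content={
            "status": "error",
            "message": "Invalid request parameters",
            # jsonable_encoder guards against error details that are not JSON-serializable
            "errors": jsonable_encoder(exc.errors())
        }
    )

# Registering the handler for Starlette's HTTPException also covers errors that
# Starlette raises itself, such as the 404 for an unknown route.
@app.exception_handler(StarletteHTTPException)
async def http_exception_handler(request: Request, exc: StarletteHTTPException):
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "status": "error",
            "message": exc.detail
        }
    )

# Using the custom exception
@app.get("/persons/{person_id}", tags=["Persons"])
async def get_person(person_id: int):
    # Mock database lookup
    if person_id not in {1, 2, 3}:
        raise PersonNotFoundException(person_id)
    return {"id": person_id, "name": f"Person {person_id}", "age": 30}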

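5. Starting the server

Save one of the examples above as main.py, then run:

uvicorn main:app --reload --host 0.0.0.0 --port 8000

The --reload flag is intended for development only, and --host 0.0.0.0 makes the server reachable from other machines; omit it to listen on localhost only.

Open http://localhost:8000/docs to view the automatically generated API documentation.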

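B.3.3 Best Practices for API Design

Use consistent naming: lowercase with hyphens in URL paths (as recommended in B.3.1), and one convention such as snake_case for parameter names and response fields.
Provide thorough documentation: use FastAPI's automatic documentation, or write the documentation against the Swagger/OpenAPI specification.
Implement authentication and authorization: use mechanisms such as OAuth2 and JWT.
Validate requests: use Pydantic models to validate request parameters and bodies.
Apply rate limiting: protect the API against abuse.
Add logging: record requests and responses for debugging and monitoring.
Collect metrics: track indicators such as request counts and response times.
Consider caching: cache frequently accessed data to improve performance.
Configure CORS: allow cross-origin requests from the front-end origins that need to call the API.
Write unit tests: cover the API with tests to verify that it behaves correctly.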
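
As one illustration of the rate-limiting item above, here is a minimal sliding-window limiter using only the standard library. It is a sketch, not a production component: the class name `SlidingWindowLimiter` is hypothetical, state lives in a single process, and a real deployment would more likely rely on an API gateway or a shared store. The clock is injectable so that the behavior can be checked without waiting.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True


# A fake clock makes the example deterministic
fake_now = [0.0]
limiter = SlidingWindowLimiter(2, 60.0, clock=lambda: fake_now[0])
results = [limiter.allow("client-a") for _ in range(3)]
fake_now[0] = 61.0  # the earlier requests have left the window
results.append(limiter.allow("client-a"))
print(results)  # [True, True, False, True]
```

In an API, a request for which `allow` returns False would typically be answered with HTTP 429 Too Many Requests.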

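B.4 Summary

This appendix presented practical programming guidance for building knowledge graph and AI applications:

Python data processing and visualization: the data processing libraries pandas and numpy and the visualization libraries matplotlib and seaborn, each with code examples.
Graph database query languages: Cypher (Neo4j) and Gremlin (Apache TinkerPop), including basic syntax and common query examples.
REST API design and implementation: REST design principles and an implementation with the FastAPI framework, including basic examples, data models, and error handling.

Together, these guides are meant to help readers get started quickly, write maintainable code, and speed up the development of knowledge graph and AI applications.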