Episode 100: Comprehensive Application of the Standard Library

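Learning Objectives

Master techniques for using Python standard library modules together
Learn to combine multiple standard library modules to solve practical problems
Understand why the standard library matters in project development
Be able to design and implement standard-library-based applications independently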

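1. Overview of Comprehensive Standard Library Use

The Python standard library contains a rich set of modules, each with a specific purpose. In real projects we rarely use a single module in isolation; instead, we combine several modules into a complete solution.

1.1 Advantages of the Standard Library

High stability: long-term testing and wide use back its reliability
No installation required: the modules ship with Python, so there are no extra dependencies to install
Good performance: many modules are implemented in C
Thorough documentation: detailed official documentation with plenty of examples

1.2 Common Module Combinations

File operations + data processing: os + shutil + json + csv
Network programming + data parsing: socket + urllib + re + json
Time handling + random number generation: datetime + time + random
System interaction + process management: os + sys + subprocess + multiprocessing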
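
As a small illustration of how such combinations fit together, the sketch below parses CSV text with csv, stamps the result with datetime, and serializes it with json. The function name csv_to_json and the sample data are hypothetical and chosen only for this example.

```python
import csv
import io
import json
from datetime import datetime, timezone


def csv_to_json(csv_text):
    """Parse CSV text and return a JSON string with an export timestamp."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    payload = {
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "row_count": len(rows),
        "rows": rows,
    }
    return json.dumps(payload, ensure_ascii=False, indent=2)


# Hypothetical sample data: two files with their sizes
sample = "name,size\nreport.txt,120\nphoto.png,2048\n"
result = json.loads(csv_to_json(sample))
print(result["row_count"], result["rows"][0]["name"])
```

Note that csv.DictReader returns every value as a string, so numeric columns such as size need an explicit conversion before any arithmetic.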

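2. Case Study 1: A Batch File Processing Tool

2.1 Requirements

Build a batch file processing tool with the following features:

Scan all files under a given directory
Classify files by type
Process each file type differently (for example, compress images or format text)
Generate a processing report

2.2 Technology Choices

os/os.path: directory traversal and file path handling
shutil: copying, moving, and deleting files
time/datetime: timestamps
json: reading and writing configuration and report files
re: file type detection
random: generating random suffixes for output file names

2.3 Core Implementation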

import os
import shutil
import time
import datetime
import json
import re
import random

def scan_directory(directory):
    """Scan every file under the directory, recursively."""
    files = []
    for root, _, filenames in os.walk(directory):
        for filename in filenames:
            file_path = os.path.join(root, filename)
            # Collect basic file metadata
            file_info = {
                'path': file_path,
                'name': filename,
                'size': os.path.getsize(file_path),
                'modified_time': os.path.getmtime(file_path),
                'extension': os.path.splitext(filename)[1].lower()
            }
            files.append(file_info)
    return files

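def classify_files(files):
    """Classify files by type."""
    categories = {
        'images': [],    # Image files
        'documents': [], # Document files
        'videos': [],    # Video files
        'audio': [],     # Audio files
        'code': [],      # Source code files
        'others': []     # Everything else
    }

    # Regular expressions for each file type, matched against the extension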
    image_pattern = r'\.(jpg|jpeg|png|gif|bmp|tiff)$'
    document_pattern = r'\.(txt|doc|docx|pdf|xls|xlsx|ppt|pptx|md)$'
    video_pattern = r'\.(mp4|avi|mov|wmv|flv)$'
    audio_pattern = r'\.(mp3|wav|ogg|flac)$'
    code_pattern = r'\.(py|java|c|cpp|h|js|html|css)$'
    
    for file in files:
        extension = file['extension']
        if re.match(image_pattern, extension, re.IGNORECASE):
            categories['images'].append(file)
        elif re.match(document_pattern, extension, re.IGNORECASE):
            categories['documents'].append(file)
        elif re.match(video_pattern, extension, re.IGNORECASE):
            categories['videos'].append(file)
        elif re.match(audio_pattern, extension, re.IGNORECASE):
            categories['audio'].append(file)
        elif re.match(code_pattern, extension, re.IGNORECASE):
            categories['code'].append(file)
        else:
            categories['others'].append(file)
    
    return categories

def process_files(categories, output_dir):
    """Process each category of files."""
    results = []

    for category, files in categories.items():
        if not files:
            continue

        # Create the category directory
        category_dir = os.path.join(output_dir, category)
        os.makedirs(category_dir, exist_ok=True)

        for file in files:
            try:
                # Build the target path; the timestamp and random suffix avoid name collisions
                timestamp = int(time.time())
                random_str = ''.join(random.choices('abcdefghijklmnopqrstuvwxyz0123456789', k=8))
                new_filename = f"{os.path.splitext(file['name'])[0]}_{timestamp}_{random_str}{file['extension']}"
                target_path = os.path.join(category_dir, new_filename)

                # Handle each file type differently
                if category == 'images':
                    # Images: plain copy (a real project could add compression here)
                    shutil.copy2(file['path'], target_path)
                elif category == 'documents':
                    # Documents: copy and record a content summary
                    shutil.copy2(file['path'], target_path)
                    # For plain-text files, extract a short summary
                    if file['extension'] in ['.txt', '.md']:
                        with open(file['path'], 'r', encoding='utf-8', errors='ignore') as f:
                            # Read one character past the 500-character limit to detect truncation
                            content = f.read(501)
                        results.append({
                            'original_path': file['path'],
                            'target_path': target_path,
                            'category': category,
                            'status': 'success',
                            'summary': content[:500] + '...' if len(content) > 500 else content
                        })
                        continue
                elif category == 'code':
                    # Code files: copy and record the line count
                    shutil.copy2(file['path'], target_path)
                    with open(file['path'], 'r', encoding='utf-8', errors='ignore') as f:
                        # Count lines lazily instead of loading the whole file
                        line_count = sum(1 for _ in f)
                    results.append({
                        'original_path': file['path'],
                        'target_path': target_path,
                        'category': category,
                        'status': 'success',
                        'line_count': line_count
                    })
                    continue
                else:
                    # Other files: copy as-is
                    shutil.copy2(file['path'], target_path)
                
                results.append({
                    'original_path': file['path'],
                    'target_path': target_path,
                    'category': category,
                    'status': 'success'
                })
                
            except Exception as e:
                results.append({
                    'original_path': file['path'],
                    'target_path': '',
                    'category': category,
                    'status': 'failed',
                    'error': str(e)
                })
    
    return results

def generate_report(results, output_dir):
    """Generate the processing report."""
    # Overall statistics
    total_files = len(results)
    success_files = sum(1 for r in results if r['status'] == 'success')
    failed_files = sum(1 for r in results if r['status'] == 'failed')

    # Per-category statistics
    category_stats = {}
    for r in results:
        if r['category'] not in category_stats:
            category_stats[r['category']] = {'success': 0, 'failed': 0}
        if r['status'] == 'success':
            category_stats[r['category']]['success'] += 1
        else:
            category_stats[r['category']]['failed'] += 1

    # Build the report
    report = {
        'report_time': datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'total_files': total_files,
        'success_files': success_files,
        'failed_files': failed_files,
        'category_stats': category_stats,
        'details': results
    }

    # Save as a JSON file
    report_path = os.path.join(output_dir, f"file_process_report_{int(time.time())}.json")
    with open(report_path, 'w', encoding='utf-8') as f:
        json.dump(report, f, ensure_ascii=False, indent=4)
    
    return report_path

def main():
    """Entry point."""
    # Configuration
    config = {
        'source_dir': './source_files',      # Source directory
        'output_dir': './processed_files',   # Output directory
        'log_file': './file_process_log.txt' # Log file
    }

    # Make sure the directories exist
    os.makedirs(config['source_dir'], exist_ok=True)
    os.makedirs(config['output_dir'], exist_ok=True)

    # Record the start time
    start_time = time.time()
    print(f"Processing started at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

    try:
        # 1. Scan the directory
        print(f"Scanning directory: {config['source_dir']}")
        files = scan_directory(config['source_dir'])
        print(f"Found {len(files)} files")

        # 2. Classify the files
        print("Classifying files...")
        categories = classify_files(files)
        for category, file_list in categories.items():
            if file_list:
                print(f"  {category}: {len(file_list)} files")

        # 3. Process the files
        print(f"Processing files, output directory: {config['output_dir']}")
        results = process_files(categories, config['output_dir'])

        # 4. Generate the report
        print("Generating the processing report...")
        report_path = generate_report(results, config['output_dir'])

        # Record the end time
        end_time = time.time()
        elapsed_time = end_time - start_time

        # Print the summary
        print("\n=== Processing complete ===")
        print(f"Total files: {len(results)}")
        print(f"Succeeded: {sum(1 for r in results if r['status'] == 'success')}")
        print(f"Failed: {sum(1 for r in results if r['status'] == 'failed')}")
        print(f"Elapsed: {elapsed_time:.2f} seconds")
        print(f"Report path: {report_path}")

    except Exception as e:
        print(f"An error occurred during processing: {e}")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    main()

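2.4 Code Walkthrough

Directory scanning: os.walk() recursively visits every file under the directory
File classification: regular expressions (re) match file extensions to assign categories
File processing: each file type gets its own handling, such as copying or content extraction
Report generation: the json module saves the results for later analysis
Error handling: exceptions are caught per file, so one failure does not stop the whole run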
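
Since classification here depends only on the extension, a dictionary lookup is a possible alternative to the chain of regular expressions. The sketch below is one way to do it; the CATEGORY_BY_EXTENSION mapping and the categorize function are hypothetical names, and the mapping lists only a few of the extensions used above.

```python
import os

# Hypothetical mapping; extend it to cover the extensions you need.
CATEGORY_BY_EXTENSION = {
    ".jpg": "images", ".jpeg": "images", ".png": "images", ".gif": "images",
    ".txt": "documents", ".md": "documents", ".pdf": "documents",
    ".mp4": "videos", ".avi": "videos",
    ".mp3": "audio", ".wav": "audio",
    ".py": "code", ".js": "code", ".html": "code",
}


def categorize(filename):
    """Return the category for a file name, or 'others' if the extension is unknown."""
    extension = os.path.splitext(filename)[1].lower()
    return CATEGORY_BY_EXTENSION.get(extension, "others")


print(categorize("Holiday.JPG"))
print(categorize("notes.md"))
print(categorize("archive.tar.gz"))
```

A lookup table keeps each extension in exactly one place, which makes it harder to list the same extension under two categories by accident.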

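3. Case Study 2: A Simple Web Crawler

3.1 Requirements

Build a simple web crawler with the following features:

Crawl the content of a given website
Extract links and images from pages
Download images to local storage
Record crawl history
Avoid crawling the same page twice

3.2 Technology Choices

urllib.request: sending HTTP requests
re: extracting content with regular expressions
os/os.path: file and directory operations
time: throttling the crawl rate
json: saving crawl history
random: picking a random user agent

3.3 Core Implementation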

import urllib.request
import urllib.parse
import re
import os
import time
import json
import random

def get_html(url, headers=None):
    """Fetch the content of a web page."""
    if not headers:
        # No headers supplied: pick a random user agent
        user_agents = [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0",
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/605.1.15"
        ]
        headers = {
            'User-Agent': random.choice(user_agents)
        }

    req = urllib.request.Request(url, headers=headers)

    try:
        with urllib.request.urlopen(req, timeout=10) as response:
            html = response.read().decode('utf-8', errors='ignore')
        return html
    except Exception as e:
        print(f"Failed to fetch page: {url}, error: {e}")
        return None

def extract_links(html, base_url):
    """Extract links from a page."""
    links = []
    # Match the href attribute of <a> tags
    link_pattern = r'<a\s+[^>]*href=["\'](.*?)["\'][^>]*>'

    for match in re.finditer(link_pattern, html, re.IGNORECASE):
        href = match.group(1)
        # Resolve relative and root-relative links against the page URL
        full_url = urllib.parse.urljoin(base_url, href)
        # Drop the #fragment so the same page is not queued twice
        full_url, _ = urllib.parse.urldefrag(full_url)
        # Keep only http(s) links (skips mailto:, javascript:, and similar)
        if full_url.startswith(('http://', 'https://')):
            links.append(full_url)

    return list(set(links))  # Deduplicate

def extract_images(html, base_url):
    """Extract image URLs from a page."""
    images = []
    # Match the src attribute of <img> tags
    img_pattern = r'<img\s+[^>]*src=["\'](.*?)["\'][^>]*>'

    for match in re.finditer(img_pattern, html, re.IGNORECASE):
        src = match.group(1)
        # Resolve relative and root-relative links against the page URL
        full_url = urllib.parse.urljoin(base_url, src)
        # Keep only http(s) URLs (skips data: URIs and similar)
        if full_url.startswith(('http://', 'https://')):
            images.append(full_url)

    return list(set(images))  # Deduplicate

def download_image(url, save_dir):
    """Download an image to local storage."""
    try:
        # Make sure the save directory exists
        os.makedirs(save_dir, exist_ok=True)

        # Derive the file name from the URL path, ignoring any query string
        filename = os.path.basename(urllib.parse.urlparse(url).path) or 'image'
        # If the file name has no extension, append .jpg
        if '.' not in filename:
            filename += '.jpg'

        save_path = os.path.join(save_dir, filename)

        # Download the image
        urllib.request.urlretrieve(url, save_path)
        return save_path
    except Exception as e:
        print(f"Failed to download image: {url}, error: {e}")
        return None

def load_crawl_history(history_file):
    """Load the crawl history."""
    if os.path.exists(history_file):
        with open(history_file, 'r', encoding='utf-8') as f:
            return json.load(f)
    return []

def save_crawl_history(history_file, history):
    """Save the crawl history."""
    with open(history_file, 'w', encoding='utf-8') as f:
        json.dump(history, f, ensure_ascii=False, indent=4)

def is_already_crawled(url, history):
    """Check whether the URL has already been crawled."""
    return url in history

def crawl_website(start_url, max_depth=2, save_dir='./crawled_images'):
    """Crawl a website breadth-first, up to max_depth."""
    # Initialization
    history_file = './crawl_history.json'
    history = load_crawl_history(history_file)
    queue = [(start_url, 0)]  # (url, depth)
    crawled_count = 0
    downloaded_images = 0

    print(f"Starting crawl: {start_url}")
    print(f"Maximum depth: {max_depth}")
    print(f"Image directory: {save_dir}")
    print("=" * 60)

    while queue:
        url, depth = queue.pop(0)

        # Skip URLs beyond the maximum depth or already crawled
        if depth > max_depth or is_already_crawled(url, history):
            continue

        print(f"[{depth}/{max_depth}] Crawling: {url}")

        # Fetch the page; omitting headers lets get_html pick a random user agent
        html = get_html(url)
        if html is None:
            continue

        # Record the crawl history
        history.append(url)
        crawled_count += 1

        # Extract and download images
        images = extract_images(html, url)
        for img_url in images:
            save_path = download_image(img_url, save_dir)
            if save_path:
                downloaded_images += 1
                print(f"  Downloaded image: {img_url} -> {save_path}")

        # Below the maximum depth, extract links and keep crawling
        if depth < max_depth:
            links = extract_links(html, url)
            for link in links:
                if not is_already_crawled(link, history):
                    queue.append((link, depth + 1))

        # Throttle requests to avoid putting pressure on the server
        time.sleep(random.uniform(0.5, 2.0))

        # Save the history periodically
        if crawled_count % 10 == 0:
            save_crawl_history(history_file, history)

    # Save the history one last time
    save_crawl_history(history_file, history)

    print("=" * 60)
    print("Crawl complete!")
    print(f"Pages crawled: {crawled_count}")
    print(f"Images downloaded: {downloaded_images}")
    print(f"Crawl history saved to: {history_file}")

def main():
    """Entry point."""
    # Configuration
    start_url = 'https://example.com'  # Start URL
    max_depth = 1  # Maximum crawl depth
    save_dir = './crawled_images'  # Image directory

    try:
        crawl_website(start_url, max_depth, save_dir)
    except KeyboardInterrupt:
        print("\nCrawl interrupted by user")
    except Exception as e:
        print(f"An error occurred during the crawl: {e}")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    main()

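3.4 Code Walkthrough

Page fetching: urllib.request sends the HTTP requests with a randomly chosen browser user agent
Content extraction: regular expressions (re) extract links and images
Link handling: relative and absolute links are resolved into full URLs
Image download: urllib.request.urlretrieve() saves images locally
Crawl control: depth limits, deduplication, and rate limiting keep the load on the target site low
History: crawl history is saved as JSON to avoid crawling pages twice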
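
Regular expressions can miss attributes written in unusual ways, so the standard library's html.parser is a possible alternative for the extraction step. The sketch below is a minimal version under that assumption; the LinkCollector class and the sample page are hypothetical, and urljoin from urllib.parse handles the relative-link resolution.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from <a href> and <img src> attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()
        self.images = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self._add(self.links, attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self._add(self.images, attributes["src"])

    def _add(self, bucket, raw_url):
        # Resolve the URL against the page it came from
        absolute = urljoin(self.base_url, raw_url)
        # Skip mailto:, javascript:, and other non-web schemes
        if urlparse(absolute).scheme in ("http", "https"):
            bucket.add(absolute)


# Hypothetical page fragment used only for demonstration
page = '<a href="/about">About</a> <a href="mailto:x@example.com">Mail</a> <img src="img/logo.png">'
collector = LinkCollector("https://example.com/docs/index.html")
collector.feed(page)
print(sorted(collector.links))
print(sorted(collector.images))
```

Using sets for links and images gives deduplication for free, which matches what the regex-based functions do with list(set(...)).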

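4. Case Study 3: A System Monitoring Tool

4.1 Requirements

Build a system monitoring tool with the following features:

Monitor CPU and memory usage
Monitor disk space and I/O
Monitor network connections and traffic
Record monitoring data
Send an alert when a metric exceeds its threshold

4.2 Technology Choices

os/sys: retrieving system information
psutil: system resource monitoring (note: psutil is a third-party library, mentioned here only as an example; the code below does not use it)
time/datetime: timestamps
json/csv: data storage
smtplib: sending email alerts
logging: log output

4.3 Core Implementation (Simplified)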

import os
import shutil
import time
import datetime
import json
import csv
import logging
import smtplib
from email.mime.text import MIMEText
from email.header import Header

# Configure logging
logging.basicConfig(
    filename='system_monitor.log',
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

class SystemMonitor:
    def __init__(self, config):
        self.config = config
        self.data_dir = config.get('data_dir', './monitor_data')
        self.thresholds = config.get('thresholds', {})
        self.alert_enabled = config.get('alert_enabled', False)

        # Create the data directory
        os.makedirs(self.data_dir, exist_ok=True)

    def get_system_info(self):
        """Collect system metrics (simplified; real projects should use a library such as psutil)."""
        # Note: the CPU and memory readings shell out to the Windows-only `wmic` command
        # and fall back to 0.0 on other platforms or when `wmic` is unavailable.
        try:
            # CPU load: the output is a header followed by the value
            output = os.popen('wmic cpu get loadpercentage').read()
            cpu_usage = float(output.split()[1])
        except Exception:
            cpu_usage = 0.0

        try:
            # Memory: two header names followed by the free and total values
            output = os.popen('wmic os get FreePhysicalMemory,TotalVisibleMemorySize').read()
            free_mem, total_mem = map(int, output.split()[2:4])
            mem_usage = ((total_mem - free_mem) / total_mem) * 100
        except Exception:
            mem_usage = 0.0

        try:
            # Disk space of the current drive, via the cross-platform shutil.disk_usage
            usage = shutil.disk_usage(os.path.abspath(os.sep))
            disk_usage = (usage.used / usage.total) * 100
        except Exception:
            disk_usage = 0.0

        return {
            'timestamp': datetime.datetime.now().isoformat(),
            'cpu_usage': cpu_usage,
            'mem_usage': mem_usage,
            'disk_usage': disk_usage,
            'network_connections': 0,  # Simplified: use psutil in a real project
            'disk_io': 0,  # Simplified: use psutil in a real project
            'network_io': 0   # Simplified: use psutil in a real project
        }
    
    def save_data(self, data):
        """Save one monitoring sample."""
        # Save as JSON, one file per day
        json_file = os.path.join(self.data_dir, f"monitor_data_{datetime.datetime.now().strftime('%Y-%m-%d')}.json")

        # Load the existing data
        existing_data = []
        if os.path.exists(json_file):
            with open(json_file, 'r', encoding='utf-8') as f:
                existing_data = json.load(f)

        # Append the new sample
        existing_data.append(data)

        # Write everything back
        with open(json_file, 'w', encoding='utf-8') as f:
            json.dump(existing_data, f, ensure_ascii=False, indent=4)

        # Also save as CSV for easier analysis
        csv_file = os.path.join(self.data_dir, f"monitor_data_{datetime.datetime.now().strftime('%Y-%m-%d')}.csv")
        fields = ['timestamp', 'cpu_usage', 'mem_usage', 'disk_usage', 'network_connections', 'disk_io', 'network_io']
        
        with open(csv_file, 'w', newline='', encoding='utf-8') as f:
            writer = csv.DictWriter(f, fieldnames=fields)
            writer.writeheader()
            writer.writerows(existing_data)
    
    def check_thresholds(self, data):
        """Check whether any metric exceeds its threshold."""
        alerts = []
        
        for metric, value in data.items():
            if metric in self.thresholds and isinstance(value, (int, float)):
                threshold = self.thresholds[metric]
                if value > threshold:
                    alerts.append({
                        'metric': metric,
                        'value': value,
                        'threshold': threshold,
                        'timestamp': data['timestamp']
                    })
        
        return alerts
    
    def send_alert(self, alerts):
        """Send an alert email."""
        if not self.alert_enabled or not alerts:
            return

        try:
            # Mail settings
            smtp_server = self.config.get('smtp_server', 'smtp.example.com')
            smtp_port = self.config.get('smtp_port', 587)
            sender = self.config.get('sender', 'alert@example.com')
            password = self.config.get('password', '')
            receiver = self.config.get('receiver', 'admin@example.com')

            # Build the message body
            subject = 'System Monitoring Alert'
            content = "System monitoring detected an abnormal condition!\n\n"
            content += f"Time: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"

            for alert in alerts:
                content += f"- {alert['metric']}: {alert['value']:.2f} (threshold: {alert['threshold']})\n"

            # Create the email; only the subject needs header encoding
            message = MIMEText(content, 'plain', 'utf-8')
            message['From'] = sender
            message['To'] = receiver
            message['Subject'] = Header(subject, 'utf-8')

            # Send the email
            with smtplib.SMTP(smtp_server, smtp_port) as server:
                server.starttls()
                server.login(sender, password)
                server.sendmail(sender, receiver, message.as_string())

            logging.info(f"Alert email sent, {len(alerts)} alert(s)")

        except Exception as e:
            logging.error(f"Failed to send alert email: {e}")
    
    def run(self, interval=60):
        """Run the monitoring loop."""
        logging.info("System monitor started")
        logging.info(f"Monitoring interval: {interval} seconds")
        logging.info(f"Thresholds: {self.thresholds}")

        try:
            while True:
                # Collect system metrics
                system_info = self.get_system_info()

                # Save the data
                self.save_data(system_info)

                # Check thresholds
                alerts = self.check_thresholds(system_info)

                # Send alerts
                self.send_alert(alerts)

                # Log the sample
                logging.info(f"Metrics: CPU={system_info['cpu_usage']:.2f}%, memory={system_info['mem_usage']:.2f}%, disk={system_info['disk_usage']:.2f}%")

                if alerts:
                    logging.warning(f"{len(alerts)} alert(s) raised: {[a['metric'] for a in alerts]}")

                # Wait for the next sample
                time.sleep(interval)

        except KeyboardInterrupt:
            logging.info("System monitor interrupted by user")
        except Exception:
            # logging.exception records the message together with the traceback
            logging.exception("System monitor encountered an error")

def main():
    """Entry point."""
    # Configuration
    config = {
        'data_dir': './monitor_data',
        'thresholds': {
            'cpu_usage': 80.0,   # CPU usage threshold
            'mem_usage': 85.0,   # Memory usage threshold
            'disk_usage': 90.0   # Disk usage threshold
        },
        'alert_enabled': False,  # Whether alerts are enabled
        'smtp_server': 'smtp.example.com',
        'smtp_port': 587,
        'sender': 'alert@example.com',
        # Read the password from the environment instead of hardcoding it;
        # SMTP_PASSWORD is a hypothetical variable name
        'password': os.environ.get('SMTP_PASSWORD', ''),
        'receiver': 'admin@example.com'
    }

    monitor = SystemMonitor(config)
    monitor.run(interval=60)  # Sample every 60 seconds

if __name__ == "__main__":
    main()

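4.4 Code Walkthrough

System information: CPU, memory, and disk usage are read through system-level calls; the wmic commands used are Windows-only
Data storage: monitoring data is saved as both JSON and CSV for later analysis
Threshold checks: each metric is compared against a configured threshold
Alerting: an email alert is sent when a threshold is exceeded
Continuous operation: a loop with sleep keeps the monitor running
Logging: the logging module records monitoring activity and errors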
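
The save_data method above rewrites the whole CSV file on every sample, which gets slower as the day's data grows. A possible alternative is to append one row at a time, as sketched below. The append_sample function, the shortened field list, and the sample values are hypothetical and serve only to illustrate the idea.

```python
import csv
import os
import tempfile

# Shortened, hypothetical field list for the demonstration
FIELDS = ["timestamp", "cpu_usage", "mem_usage", "disk_usage"]


def append_sample(csv_path, sample):
    """Append one monitoring sample, writing the header only for a new file."""
    is_new_file = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new_file:
            writer.writeheader()
        writer.writerow(sample)


# Demonstration in a temporary directory with made-up values
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "monitor.csv")
    append_sample(path, {"timestamp": "2024-01-01T00:00:00", "cpu_usage": 12.5,
                         "mem_usage": 40.0, "disk_usage": 55.0})
    append_sample(path, {"timestamp": "2024-01-01T00:01:00", "cpu_usage": 90.0,
                         "mem_usage": 41.0, "disk_usage": 55.0})
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

print(len(rows), rows[0]["cpu_usage"])
```

Appending keeps the cost of each write constant, at the price of having to check whether the header already exists.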

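5. Best Practices for Comprehensive Standard Library Use

5.1 Choosing Modules

Prefer the standard library: when it meets the need, use it and keep external dependencies down
Consider performance: for performance-critical scenarios, pick the faster module or implementation
Value stability: choose mature, stable modules and avoid experimental features

5.2 Organizing Code

Modular design: put separate features in separate modules to improve reuse
Centralized configuration: keep configuration in one place so it is easy to change and maintain
Exception handling: handle exceptions sensibly to make the program more robust
Logging: use the logging module to record runtime state for debugging and maintenance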
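
One way to centralize configuration is to keep defaults in a single dictionary and let an optional JSON file override them. The sketch below assumes that approach; DEFAULT_CONFIG, load_config, the log_level key, and the logger name file_tool are all hypothetical.

```python
import json
import logging
import os
import tempfile

# Hypothetical defaults; a JSON file can override any of these keys
DEFAULT_CONFIG = {
    "source_dir": "./source_files",
    "output_dir": "./processed_files",
    "log_level": "INFO",
}


def load_config(path):
    """Return the defaults merged with the JSON file at path, if it exists."""
    config = dict(DEFAULT_CONFIG)
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            config.update(json.load(f))
    return config


logger = logging.getLogger("file_tool")

# Demonstration: one config file that overrides a single key, one missing file
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump({"output_dir": "./out"}, f)
    config = load_config(config_path)
    missing = load_config(os.path.join(tmp_dir, "absent.json"))

logging.basicConfig(level=getattr(logging, config["log_level"]))
logger.info("Loaded config: %s", config)
```

Copying the defaults before updating them means a missing or partial config file never leaves a key undefined.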

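5.3 Performance Tips

Reduce I/O: keep file reads and writes, network requests, and other I/O operations to a minimum
Use caching sensibly: cache frequently accessed data to improve performance
Choose efficient algorithms: pick algorithms and data structures that fit the actual requirements
Avoid unnecessary computation: compute only what is needed and skip redundant work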
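
For the caching advice, functools.lru_cache from the standard library is often enough. The sketch below is a minimal illustration; expensive_lookup is a hypothetical stand-in for any slow, repeatable computation, and call_count exists only to show how often the function body actually runs.

```python
from functools import lru_cache

call_count = 0  # Counts real executions of the function body


@lru_cache(maxsize=128)
def expensive_lookup(key):
    """Hypothetical slow function; repeated keys are served from the cache."""
    global call_count
    call_count += 1
    return key.upper()


for key in ["a", "b", "a", "a", "b"]:
    expensive_lookup(key)

print(call_count)                          # 2: the body ran once per distinct key
print(expensive_lookup.cache_info().hits)  # 3: the remaining calls hit the cache
```

lru_cache requires hashable arguments and only makes sense for functions whose result depends on the arguments alone.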

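5.4 Security Considerations

Input validation: validate all external input strictly to prevent injection attacks
Access control: set file and directory permissions sensibly to prevent unauthorized access
Secure transport: use HTTPS or another secure protocol for network communication
No hardcoding: never hardcode sensitive information such as passwords or API keys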
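
Two of these points can be sketched with os alone: rejecting paths that escape a base directory, and reading secrets from the environment. Both helper names below, safe_join and get_secret, are hypothetical, and this is a minimal sketch rather than a complete defense.

```python
import os


def safe_join(base_dir, user_path):
    """Join user_path onto base_dir and reject results outside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Path escapes the base directory: {user_path}")
    return target


def get_secret(name):
    """Read a secret from an environment variable instead of hardcoding it."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# A path that tries to climb out of the output directory is rejected
try:
    safe_join("./processed_files", "../../etc/passwd")
except ValueError as error:
    print(error)
```

os.path.realpath resolves symbolic links and ".." segments before the comparison, so the check applies to the final location rather than to the text of the path.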

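6. Stage Two Summary and Review

6.1 Key Topics in Review

Object-oriented programming: classes and objects, encapsulation, inheritance, polymorphism, advanced features
Advanced data structures and language features: list comprehensions, dict comprehensions, generators, iterators, decorators
File operations: reading and writing files, directory operations, file system interaction
Standard library modules: os, sys, datetime, time, random, math, re, and others

6.2 Skills Gained

Programming mindset: the shift from procedural to object-oriented thinking
Code quality: writing clean, efficient, maintainable code
Problem solving: using Python to solve real problems
Project development: understanding the development workflow and having basic project skills

6.3 Study Advice

Practice a lot: reinforce what you have learned through real projects
Read source code: study well-written Python code to pick up techniques
Join the community: connect with other developers in the Python community and learn from them
Keep learning: the Python ecosystem keeps evolving, so make learning new things a habit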
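7. Exercises

7.1 Basic Exercises

Add content search for document files to the batch file processing tool
Improve the web crawler by adding keyword filtering for crawled content
Extend the system monitoring tool with network connection monitoring

7.2 Intermediate Exercises

Build a complete application, a personal diary manager, with these features:
- Create, edit, and delete diary entries
- Search by date, tag, and keyword
- Back up and restore data
- Diary statistics and analysis
Build a simple command-line tool:
- Support multiple commands and arguments
- Implement batch file renaming
- Support regular expression matching
- Provide detailed help output

7.3 Challenge Exercises

Build a web server on the standard library:
- Serve static files
- Implement simple routing
- Support GET and POST requests
- Handle form data
Build a multithreaded download tool:
- Download with multiple threads at once
- Support resuming interrupted downloads
- Show download progress
- Support batch downloads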

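8. Recommended Learning Resources

Official documentation:
- Python standard library documentation: https://docs.python.org/3/library/
- Python tutorial: https://docs.python.org/3/tutorial/
Online courses:
- Official Python tutorial: https://docs.python.org/3/tutorial/
- Python courses on Coursera
Books:
- Python Cookbook
- Fluent Python
- Python Crash Course (Chinese edition: 《Python编程：从入门到实践》)
Community:
- Stack Overflow: https://stackoverflow.com/
- Official Python forum: https://discuss.python.org/
- GitHub: https://github.com/ (look for high-quality Python projects)

Next stage preview: starting with Episode 101, we enter Stage Three, "Python in Practice", covering network programming, multithreading, database operations, and other practical skills, with real projects to build your programming ability. Stay tuned!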
