Published by xsc699675 on 2026-06-09 16:05:44

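Chapter 02: Advanced Python Features

2.1 Decorators

2.1.1 Function Decorator Basics

A decorator is a callable that takes a function and returns a replacement for it, usually a wrapper that calls the original. This lets you add behavior to a function without editing its body.

Basic syntax: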

def decorator(func):
    def wrapper(*args, **kwargs):
        # Code that runs before the original function is called
        print("Before function call")
        result = func(*args, **kwargs)
        # Code that runs after the original function returns
        print("After function call")
        return result
    return wrapper

@decorator
def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))
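The @ syntax is shorthand for passing the function to the decorator and rebinding the result. The sketch below applies the same decorator by hand; the name decorated_greet is introduced only for illustration.

```python
def decorator(func):
    def wrapper(*args, **kwargs):
        print("Before function call")
        result = func(*args, **kwargs)
        print("After function call")
        return result
    return wrapper

def greet(name):
    return f"Hello, {name}!"

# Manual application: this is what placing @decorator above greet does,
# except that the result is normally bound back to the name greet.
decorated_greet = decorator(greet)

message = decorated_greet("Alice")
print(message)
```

Because the decorator returns wrapper, every call to the decorated name actually runs wrapper, which in turn calls the original function.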

Case 1: Logging decorator

import time
from functools import wraps

def log_execution_time(func):
    """Decorator that reports how long a function call takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        # perf_counter() is a monotonic clock intended for measuring intervals
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        execution_time = end_time - start_time
        print(f"Function {func.__name__} took {execution_time:.4f} seconds")
        return result
    return wrapper

@log_execution_time
def slow_function():
    time.sleep(1)
    return "Done"

slow_function()  # Prints something like: Function slow_function took 1.0001 seconds

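2.1.2 Class-Based Decorators

A class-based decorator implements the decorator as a class: __init__ receives the decorated function, and __call__ makes each instance callable in its place.

Class-based decorator example: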

class DecoratorClass:
    def __init__(self, func):
        self.func = func
    
    def __call__(self, *args, **kwargs):
        print("Before call")
        result = self.func(*args, **kwargs)
        print("After call")
        return result

@DecoratorClass
def say_hello(name):
    return f"Hello, {name}!"

print(say_hello("Bob"))

Case 2: Stateful class-based decorator

class CountCalls:
    """Decorator that counts how many times a function is called."""

    def __init__(self, func):
        self.func = func
        self.count = 0

    def __call__(self, *args, **kwargs):
        self.count += 1
        print(f"Function {self.func.__name__} has been called {self.count} time(s)")
        return self.func(*args, **kwargs)

@CountCalls
def add(a, b):
    return a + b

add(1, 2)  # Prints: Function add has been called 1 time(s); returns 3
add(3, 4)  # Prints: Function add has been called 2 time(s); returns 7

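2.1.3 Decorators with Arguments

A decorator can accept arguments of its own. This requires one more layer: an outer function takes the arguments and returns the actual decorator.

Decorator with arguments:

from functools import wraps

def repeat(times):
    """Call the function the given number of times and collect the results."""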
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            results = []
            for _ in range(times):
                results.append(func(*args, **kwargs))
            return results
        return wrapper
    return decorator

@repeat(times=3)
def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))  # Prints: ['Hello, Alice!', 'Hello, Alice!', 'Hello, Alice!']

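Case 3: Logging decorator with arguments

from functools import wraps

def log_with_level(level="INFO"):
    """Decorator that logs calls with the given level label."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print(f"[{level}] Calling function: {func.__name__}")
            try:
                result = func(*args, **kwargs)
                print(f"[{level}] Function {func.__name__} succeeded")
                return result
            except Exception as e:
                print(f"[{level}] Function {func.__name__} failed: {e}")
                raise
        return wrapper
    return decorator

@log_with_level(level="DEBUG")
def divide(a, b):
    return a / b

divide(10, 2)  # Prints: [DEBUG] Calling function: divide, then [DEBUG] Function divide succeeded
try:
    divide(10, 0)  # Prints: [DEBUG] Calling function: divide, then [DEBUG] Function divide failed: division by zero
except ZeroDivisionError:
    pass  # The decorator logs the failure and then re-raises the exception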

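2.1.4 Why functools.wraps Matters

functools.wraps copies the original function's metadata (such as __name__ and __doc__) onto the wrapper, so help(), introspection and debugging tools still see the original function.

Comparison:

from functools import wraps

def bad_decorator(func):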
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def good_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@bad_decorator
def my_func():
    """This is a test function"""
    pass

@good_decorator
def my_func2():
    """This is another test function"""
    pass

print(my_func.__name__)   # Prints: wrapper (metadata lost)
print(my_func2.__name__)  # Prints: my_func2 (metadata preserved)
print(my_func.__doc__)    # Prints: None (metadata lost)
print(my_func2.__doc__)   # Prints: This is another test function (metadata preserved)
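Besides copying metadata, functools.wraps also stores a reference to the original function in the wrapper's __wrapped__ attribute. A small sketch, with the function name documented chosen only for illustration:

```python
from functools import wraps

def good_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@good_decorator
def documented():
    """Docstring of the original function."""
    return 42

# wraps() records the undecorated function on the wrapper
original = documented.__wrapped__

print(documented.__name__)
print(original is documented)
```

This can be handy in tests, where calling the undecorated function directly avoids the wrapper's side effects.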

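2.2 Generators and Iterators

2.2.1 Generator Basics

A generator is a special kind of iterator, defined by a function that uses yield instead of return to produce values. Each call to next() runs the function until the next yield, hands back that value and pauses; the following call resumes right after that yield.

Generator function: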

def countdown(n):
    while n > 0:
        yield n
        n -= 1

# Using the generator
for num in countdown(5):
    print(num)  # Prints 5, 4, 3, 2, 1 (one per line)

Case 4: Fibonacci generator

def fibonacci_generator():
    """Generator that yields the Fibonacci sequence indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Using the generator
fib = fibonacci_generator()
for _ in range(10):
    print(next(fib), end=" ")  # Prints: 0 1 1 2 3 5 8 13 21 34

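2.2.2 The yield from Syntax

yield from delegates to another generator (or any iterable): the outer generator yields every value the inner one produces before it continues.

yield from example: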

def generator1():
    yield 1
    yield 2
    yield 3

def generator2():
    yield from generator1()  # Delegate to generator1
    yield 4
    yield 5

for num in generator2():
    print(num, end=" ")  # Prints: 1 2 3 4 5
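Two further properties of yield from may be worth seeing in code: it works recursively, and the expression evaluates to the inner generator's return value. The helper names flatten, inner and outer below are illustrative and do not come from the original text.

```python
def flatten(items):
    """Yield the leaves of an arbitrarily nested list."""
    for item in items:
        if isinstance(item, list):
            # Delegate to a recursive call for nested lists
            yield from flatten(item)
        else:
            yield item

def inner():
    yield 1
    yield 2
    return "inner finished"  # Becomes the value of the yield from expression

def outer():
    result = yield from inner()
    yield result

flat = list(flatten([1, [2, [3, 4]], 5]))
values = list(outer())
print(flat)
print(values)
```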

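2.2.3 The send() Method

A generator can receive values from its caller through send(): the sent value becomes the result of the yield expression at which the generator is paused. The generator must first be advanced to its first yield with next().

send() example: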

def echo_generator():
    message = yield "Ready to receive"
    while message is not None:
        message = yield f"Received: {message}"

gen = echo_generator()
print(next(gen))          # Prints: Ready to receive
print(gen.send("Hello"))  # Prints: Received: Hello
print(gen.send("World"))  # Prints: Received: World

Case 5: Generator with feedback

def accumulator():
    """Running-total generator that can be reset on demand."""
    total = 0
    while True:
        value = yield total
        if value is None:
            total = 0  # Reset (note that a bare next() also sends None)
        else:
            total += value

acc = accumulator()
next(acc)              # Prime the generator
print(acc.send(5))     # Prints: 5
print(acc.send(3))     # Prints: 8
print(acc.send(None))  # Resets the total; prints: 0
print(acc.send(10))    # Prints: 10

2.2.4 The itertools Module

itertools provides a collection of efficient building blocks for working with iterators.

Commonly used itertools functions:

import itertools

# Infinite counter
counter = itertools.count(start=1, step=2)
for _ in range(5):
    print(next(counter), end=" ")  # Prints: 1 3 5 7 9

# Cycling iterator
cycler = itertools.cycle(['a', 'b', 'c'])
for _ in range(5):
    print(next(cycler), end=" ")  # Prints: a b c a b

# Repeating iterator
repeated = itertools.repeat(10, times=3)
print(list(repeated))  # Prints: [10, 10, 10]

Case 6: Combinatoric operations with itertools

import itertools

# Permutations and combinations
letters = ['a', 'b', 'c']

# Permutations
permutations = list(itertools.permutations(letters, 2))
print(permutations)  # [('a','b'), ('a','c'), ('b','a'), ('b','c'), ('c','a'), ('c','b')]

# Combinations
combinations = list(itertools.combinations(letters, 2))
print(combinations)  # [('a','b'), ('a','c'), ('b','c')]

# Cartesian product
product = list(itertools.product([1, 2], ['a', 'b']))
print(product)  # [(1,'a'), (1,'b'), (2,'a'), (2,'b')]

# Zip, padding the shorter input with a fill value
zipped = list(itertools.zip_longest([1, 2], ['a', 'b', 'c'], fillvalue='-'))
print(zipped)  # [(1,'a'), (2,'b'), ('-','c')]

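2.2.5 Lazy Evaluation

Generators are evaluated lazily: each value is computed only when it is requested, so the full sequence never has to be held in memory.

Eager versus lazy evaluation:

# List comprehension (eager evaluation)
squares_list = [x**2 for x in range(1_000_000)]  # Builds the whole list in memory

# Generator expression (lazy evaluation)
squares_gen = (x**2 for x in range(1_000_000))  # Uses almost no memory

# Fetch values on demand
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1
print(next(squares_gen))  # 4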
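The memory difference can be observed with sys.getsizeof, and laziness also makes it possible to process an infinite stream. This is a minimal sketch; naturals is a hypothetical helper, and the exact byte counts depend on the Python build, so none are shown.

```python
import sys
from itertools import islice

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

# getsizeof reports the container only (not the integers it references),
# which is already enough to show the gap
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

def naturals():
    """Infinite stream 0, 1, 2, ... (only usable because it is lazy)."""
    n = 0
    while True:
        yield n
        n += 1

# Nothing is computed until islice pulls values through the pipeline
even_squares = (n * n for n in naturals() if n % 2 == 0)
first_five = list(islice(even_squares, 5))
print(first_five)
```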

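2.3 Context Managers

2.3.1 with Statement Basics

The with statement manages resources: it guarantees that the resource is released when the block exits, even if an exception is raised inside it.

Basic syntax:

with open("example.txt", "r") as file:
    content = file.read()
# The file is closed automatically here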

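2.3.2 The __enter__ and __exit__ Methods

A custom context manager implements __enter__, which runs on entry and whose return value is bound by as, and __exit__, which runs on exit and receives any exception raised in the block.

Custom context manager: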

class FileManager:
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode
        self.file = None
    
    def __enter__(self):
        self.file = open(self.filename, self.mode)
        return self.file
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.file:
            self.file.close()
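        # Returning False (or None) lets any exception propagate to the caller;
        # returning True would suppress it
        return False

# Using the custom context manager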
with FileManager("example.txt", "w") as f:
    f.write("Hello, World!")
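The return value of __exit__ decides whether an exception raised inside the with block is swallowed. The sketch below uses a hypothetical SuppressZeroDivision class to show both outcomes.

```python
class SuppressZeroDivision:
    """Hypothetical context manager that swallows ZeroDivisionError only."""

    def __enter__(self):
        self.caught = None
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.caught = exc_type
        # A true return value suppresses the exception; a false one re-raises it
        return exc_type is not None and issubclass(exc_type, ZeroDivisionError)

with SuppressZeroDivision() as guard:
    1 / 0
reached = True  # Execution continues here because the error was suppressed

propagated = False
try:
    with SuppressZeroDivision():
        raise ValueError("not suppressed")
except ValueError:
    propagated = True

print(guard.caught, reached, propagated)
```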

Case 7: Timer context manager

import time

class Timer:
    """Context manager that reports the time spent inside the block."""

    def __enter__(self):
        self.start_time = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.end_time = time.perf_counter()
        self.elapsed_time = self.end_time - self.start_time
        print(f"Elapsed: {self.elapsed_time:.4f} seconds")

# Using the timer
with Timer():
    time.sleep(1)
# Prints something like: Elapsed: 1.0001 seconds

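2.3.3 The contextlib Module

contextlib offers a more concise way to build context managers: with @contextmanager, the code before yield acts as __enter__ and the code after it acts as __exit__.

Using the contextmanager decorator: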

from contextlib import contextmanager

@contextmanager
def file_manager(filename, mode):
    file = open(filename, mode)
    try:
        yield file
    finally:
        file.close()

# Usage
with file_manager("example.txt", "r") as f:
    content = f.read()

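Case 8: Temporary directory context manager

import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def temporary_directory():
    """Context manager that creates a temporary directory and removes it on exit."""
    temp_dir = tempfile.mkdtemp()
    try:
        yield temp_dir
    finally:
        # Clean up the temporary directory
        shutil.rmtree(temp_dir)

# Using the temporary directory
with temporary_directory() as tmp:
    print(f"Temporary directory: {tmp}")
    # Create a file inside the temporary directory
    with open(os.path.join(tmp, "temp.txt"), "w") as f:
        f.write("Temporary content")
# The temporary directory has been removed automatically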

2.3.4 Common contextlib Tools

closing:

from contextlib import closing
import urllib.request

with closing(urllib.request.urlopen('http://example.com')) as page:
    content = page.read()

suppress:

import os
from contextlib import suppress

# Ignore a specific exception
with suppress(FileNotFoundError):
    os.remove("nonexistent.txt")  # No exception is raised if the file is missing

redirect_stdout:

from contextlib import redirect_stdout
import io

f = io.StringIO()
with redirect_stdout(f):
    print("Hello, World!")
output = f.getvalue()
print(output)  # Prints: Hello, World!

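2.4 Metaclasses and Descriptors

2.4.1 Metaclass Basics

A metaclass is the "class" of a class: it is what creates class objects. type is Python's built-in default metaclass.

Creating a class with type:

# Create a class dynamically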
def greet(self):
    return f"Hello, {self.name}!"

Person = type('Person', (object,), {
    'name': 'Default',
    'greet': greet
})

p = Person()
p.name = "Alice"
print(p.greet())  # Hello, Alice!

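2.4.2 The __new__ Method

__new__ is called before __init__ and is responsible for creating the instance; __init__ then initializes the object that __new__ returned.

__new__ example:

class Singleton:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            # object.__new__ accepts no extra arguments, so pass only cls
            cls._instance = super().__new__(cls)
        return cls._instance

s1 = Singleton()
s2 = Singleton()
print(s1 is s2)  # True (the same instance)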
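One consequence of building a singleton on __new__ is that __init__ still runs on every call, so later constructor arguments overwrite earlier state. A short sketch, using a hypothetical Config class:

```python
class Config:
    """Hypothetical singleton used to show the __new__/__init__ call order."""
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        # Runs after every Config(...) call, even when the instance is reused
        self.name = name

first = Config("first")
second = Config("second")

print(first is second)  # True
print(first.name)       # second
```

If re-initialization is unwanted, one option is to guard the body of __init__ with a flag stored on the instance.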

2.4.3 Custom Metaclasses

Custom metaclass example:

class MetaClass(type):
    def __new__(cls, name, bases, attrs):
        # Modify the attributes while the class is being created
        attrs['created_at'] = '2024'
        attrs['class_name'] = name
        return super().__new__(cls, name, bases, attrs)

class MyClass(metaclass=MetaClass):
    pass

obj = MyClass()
print(obj.created_at)  # 2024
print(obj.class_name)  # MyClass

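Case 9: Metaclass that enforces type annotations

import inspect
from functools import wraps

class TypeCheckedMeta(type):
    """Metaclass that checks argument types against annotations at call time."""

    def __new__(cls, name, bases, attrs):
        # Wrap every plain function that declares at least one annotation
        for attr_name, attr_value in list(attrs.items()):
            if inspect.isfunction(attr_value) and attr_value.__annotations__:
                attrs[attr_name] = cls._add_type_check(attr_value)
        return super().__new__(cls, name, bases, attrs)

    @staticmethod
    def _add_type_check(func):
        signature = inspect.signature(func)
        annotations = func.__annotations__

        @wraps(func)
        def wrapper(*args, **kwargs):
            # Bind arguments to parameter names so that self and keyword
            # arguments are matched with the right annotation
            bound = signature.bind(*args, **kwargs)
            for param_name, arg_value in bound.arguments.items():
                param_type = annotations.get(param_name)
                # Only plain classes are checked; generic aliases are skipped
                if isinstance(param_type, type) and not isinstance(arg_value, param_type):
                    raise TypeError(f"Argument {param_name} should be {param_type}, got {type(arg_value)}")
            return func(*args, **kwargs)

        return wrapper

class MyClass(metaclass=TypeCheckedMeta):
    def add(self, a: int, b: int) -> int:
        return a + b

obj = MyClass()
obj.add(1, 2)  # OK
# obj.add("1", "2")  # TypeError: Argument a should be <class 'int'>, got <class 'str'>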

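2.4.4 ABCMeta Abstract Base Classes

ABCMeta is the metaclass for abstract base classes: a subclass can only be instantiated once it overrides every method marked with @abstractmethod.

Abstract base class example: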

from abc import ABCMeta, abstractmethod

class Shape(metaclass=ABCMeta):
    @abstractmethod
    def area(self):
        pass
    
    @abstractmethod
    def perimeter(self):
        pass

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius
    
    def area(self):
        return 3.14159 * self.radius ** 2
    
    def perimeter(self):
        return 2 * 3.14159 * self.radius

# Shape()  # TypeError: Can't instantiate abstract class Shape
circle = Circle(5)
print(circle.area())      # approximately 78.54
print(circle.perimeter()) # approximately 31.42

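2.4.5 The Descriptor Protocol

A descriptor is an object whose class implements at least one of __get__, __set__ or __delete__. When it is stored as a class attribute, attribute access on instances is routed through those methods.

Descriptor example:

class PositiveNumber:
    """Descriptor that only accepts positive numbers."""

    def __get__(self, instance, owner):
        if instance is None:
            return self  # Accessed on the class itself, not on an instance
        return instance.__dict__.get(self.name, 0)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("Value must be positive")
        instance.__dict__[self.name] = value

    def __set_name__(self, owner, name):
        # Called at class creation with the attribute name the descriptor is bound to
        self.name = name

class Product:
    price = PositiveNumber()
    stock = PositiveNumber()

    def __init__(self, name, price, stock):
        self.name = name
        self.price = price
        self.stock = stock

product = Product("Apple", 10, 100)
print(product.price)  # 10
# product.price = -5  # ValueError: Value must be positive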
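To make the lookup path visible, the sketch below stores the validated value in the instance __dict__ and shows that access through the class returns the descriptor object itself. The names Positive and Item are illustrative and not taken from the text.

```python
class Positive:
    """Minimal data descriptor that validates on assignment."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # Class-level access returns the descriptor object
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        instance.__dict__[self.name] = value

class Item:
    price = Positive()

    def __init__(self, price):
        self.price = price  # Routed through Positive.__set__

item = Item(10)
stored = dict(item.__dict__)
class_level = Item.price

rejected = False
try:
    item.price = -5
except ValueError:
    rejected = True

print(stored, item.price, rejected)
```

Because Positive defines __set__, it is a data descriptor and takes precedence over the same-named entry in the instance __dict__, which is why storing the value under the attribute's own name is safe here.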

Case 10: Type-converting descriptor

class TypeConverter:
    """Descriptor that converts assigned values to a target type."""

    def __init__(self, target_type):
        self.target_type = target_type

    def __get__(self, instance, owner):
        if instance is None:
            return self  # Accessed on the class itself, not on an instance
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        try:
            instance.__dict__[self.name] = self.target_type(value)
        except (ValueError, TypeError) as e:
            raise ValueError(f"Cannot convert to {self.target_type}: {e}")

    def __set_name__(self, owner, name):
        self.name = name

class Person:
    age = TypeConverter(int)
    height = TypeConverter(float)

    def __init__(self, age, height):
        self.age = age        # Converted to int
        self.height = height  # Converted to float

person = Person("25", "1.75")
print(type(person.age))    # <class 'int'>
print(type(person.height)) # <class 'float'>

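2.5 The Type System

2.5.1 typing Module Basics

The typing module provides tools for writing type hints. Hints are not enforced at runtime; they are read by static type checkers and IDEs.

Basic type hints: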

from typing import List, Dict, Tuple, Optional, Union

def process_data(
    names: List[str],
    scores: Dict[str, int],
    info: Tuple[str, int, bool],
    optional_value: Optional[str] = None,
    mixed: Union[int, str] = 0
) -> None:
    pass

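2.5.2 Generics

Generics let you define parameterized types, so that one class or function can be typed for many element types.

Generic example: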

from typing import TypeVar, Generic

T = TypeVar('T')

class Container(Generic[T]):
    def __init__(self, value: T):
        self.value = value
    
    def get(self) -> T:
        return self.value

# Using the generic class
int_container = Container[int](42)
str_container = Container[str]("hello")

print(int_container.get())  # 42
print(str_container.get())  # hello

Case 11: Generic stack

from typing import TypeVar, Generic, Optional

T = TypeVar('T')

class Stack(Generic[T]):
    """A generic stack."""
    
    def __init__(self):
        self._items: list[T] = []
    
    def push(self, item: T) -> None:
        self._items.append(item)
    
    def pop(self) -> Optional[T]:
        if not self._items:
            return None
        return self._items.pop()
    
    def peek(self) -> Optional[T]:
        return self._items[-1] if self._items else None
    
    def is_empty(self) -> bool:
        return len(self._items) == 0
    
    def size(self) -> int:
        return len(self._items)

# Using the generic stack
int_stack = Stack[int]()
int_stack.push(1)
int_stack.push(2)
print(int_stack.pop())  # 2

str_stack = Stack[str]()
str_stack.push("hello")
print(str_stack.peek()) # hello

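2.5.3 Protocol

Protocol defines structural types (structural typing): a class satisfies a protocol by having the required methods, without inheriting from it.

Protocol example: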

from typing import Protocol

class Printable(Protocol):
    def print(self) -> None:
        pass

class Document:
    def print(self) -> None:
        print("Printing document...")

class Image:
    def print(self) -> None:
        print("Printing image...")

def print_all(items: list[Printable]) -> None:
    for item in items:
        item.print()

# Document and Image both satisfy the Printable protocol
print_all([Document(), Image()])
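Protocols are normally checked by a static type checker only. If a runtime isinstance check is also wanted, typing provides the runtime_checkable decorator; the sketch below assumes a hypothetical Note class that lacks the method. The runtime check looks only at whether the method exists, not at its signature.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Printable(Protocol):
    def print(self) -> None:
        ...

class Document:
    def print(self) -> None:
        print("Printing document...")

class Note:
    """Hypothetical class without a print method."""

doc_ok = isinstance(Document(), Printable)
note_ok = isinstance(Note(), Printable)
print(doc_ok, note_ok)
```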

2.5.4 TypeVar and Constraints

Constrained TypeVar:

from typing import TypeVar

# Constrained to int or float
Number = TypeVar('Number', int, float)

def add(a: Number, b: Number) -> Number:
    return a + b

print(add(1, 2))     # 3
print(add(1.5, 2.5)) # 4.0

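2.5.5 Literal

Literal restricts a value to a fixed set of literal values. The restriction is checked by static type checkers, not at runtime.

Literal example:

from typing import Literal

def set_mode(mode: Literal['read', 'write', 'append']) -> None:
    print(f"Mode set to: {mode}")

set_mode('read')    # OK
set_mode('write')   # OK
# set_mode('delete')  # Rejected by a type checker; no error is raised at runtime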
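Since Literal is not enforced when the program runs, a function that must reject bad input has to validate explicitly. One possible approach, sketched here with a hypothetical Mode alias, reuses the literal values through typing.get_args so they are written only once.

```python
from typing import Literal, get_args

Mode = Literal['read', 'write', 'append']

def set_mode(mode: Mode) -> str:
    # get_args(Mode) returns the tuple ('read', 'write', 'append')
    if mode not in get_args(Mode):
        raise ValueError(f"Unsupported mode: {mode!r}")
    return f"Mode set to: {mode}"

accepted = set_mode('read')

rejected = False
try:
    set_mode('delete')  # A type checker would also flag this call
except ValueError:
    rejected = True

print(accepted, rejected)
```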

2.5.6 TypedDict

TypedDict describes the expected keys and value types of a dictionary.

TypedDict example:

from typing import TypedDict

class Person(TypedDict):
    name: str
    age: int
    city: str

person: Person = {
    'name': 'Alice',
    'age': 30,
    'city': 'Beijing'
}

2.6 Functional Programming

2.6.1 map/filter/reduce

map:

numbers = [1, 2, 3, 4]
squared = list(map(lambda x: x**2, numbers))
print(squared)  # [1, 4, 9, 16]

filter:

numbers = [1, 2, 3, 4, 5, 6]
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # [2, 4, 6]

reduce:

from functools import reduce

numbers = [1, 2, 3, 4]
product = reduce(lambda x, y: x * y, numbers)
print(product)  # 24
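reduce also accepts an optional third argument, the initial value. It matters for empty inputs, where reduce would otherwise raise TypeError. A brief sketch:

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4]

# With an initial value, the fold starts from 1 instead of the first element
product = reduce(operator.mul, numbers, 1)
empty_product = reduce(operator.mul, [], 1)

failed = False
try:
    reduce(operator.mul, [])  # No initial value and nothing to fold
except TypeError:
    failed = True

print(product, empty_product, failed)
```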

2.6.2 The operator Module

The operator module exposes Python's operators as ordinary functions.

operator example:

import operator

# Arithmetic
print(operator.add(3, 5))      # 8
print(operator.mul(3, 5))      # 15
print(operator.pow(2, 10))     # 1024

# Comparison
print(operator.eq(10, 10))     # True
print(operator.gt(10, 5))      # True

# Sequence operations
my_list = [1, 2, 3]
operator.setitem(my_list, 0, 10)
print(my_list)  # [10, 2, 3]

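2.6.3 partial

partial fixes some of a function's arguments and returns a new callable that takes the rest.

partial example:

from functools import partial

def power(base, exponent):
    return base ** exponent

# Create functions with a fixed exponent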
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(3))  # 9
print(cube(3))    # 27
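A partial object keeps the wrapped function and the frozen arguments as attributes, and keyword arguments frozen this way can still be overridden at call time. A small sketch reusing power from the example:

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)

# The partial object exposes what it wraps
wrapped = square.func
frozen = square.keywords

# A keyword given at call time replaces the frozen one
overridden = square(3, exponent=3)

print(wrapped is power, frozen, overridden)
```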

Case 12: Simplifying calls with partial

from functools import partial
import logging

# Configure logging
logging.basicConfig(level=logging.INFO)

# Create logging functions for different levels, each with a preset extra field
info_log = partial(logging.info, extra={'app': 'myapp'})
error_log = partial(logging.error, extra={'app': 'myapp'})

info_log("This is an info message")
error_log("This is an error message")

2.6.4 Other functools Tools

lru_cache decorator:

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(100))  # Fast, because each value is computed only once
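Functions decorated with lru_cache gain cache_info() and cache_clear() methods, which make the effect of the cache measurable. A sketch with a smaller input:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

value = fibonacci(30)
info = fibonacci.cache_info()  # Named tuple: hits, misses, maxsize, currsize
print(value, info)

# Each distinct argument 0..30 is computed exactly once
misses_after_first_call = info.misses

fibonacci.cache_clear()
cleared_size = fibonacci.cache_info().currsize
```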

wraps decorator (covered in section 2.1)

cmp_to_key:

from functools import cmp_to_key

def custom_sort(a, b):
    # Sort by length first, then alphabetically for equal lengths
    if len(a) != len(b):
        return len(a) - len(b)
    return 0 if a == b else 1 if a > b else -1

words = ['apple', 'banana', 'cherry', 'date']
sorted_words = sorted(words, key=cmp_to_key(custom_sort))
print(sorted_words)  # ['date', 'apple', 'banana', 'cherry']

2.7 Regular Expressions

2.7.1 re Module Basics

Basic matching:

import re

text = "Hello, my email is alice@example.com"

# Match an email address
pattern = r'[\w.-]+@[\w.-]+'
match = re.search(pattern, text)
if match:
    print(match.group())  # alice@example.com

2.7.2 Common Regular Expression Patterns

import re

# Match an 11-digit mobile number (mainland China format)
phone_pattern = r'1[3-9]\d{9}'
text = "My phone number is 13812345678"
match = re.search(phone_pattern, text)
print(match.group())  # 13812345678

# Match a URL
url_pattern = r'https?://[\w.-]+(?:/[\w./-]*)?'
text = "Visit https://www.example.com/path"
match = re.search(url_pattern, text)
print(match.group())  # https://www.example.com/path

# Match an HTML tag
html_pattern = r'<(\w+)>(.*?)</\1>'
text = "<p>Hello</p>"
match = re.search(html_pattern, text)
print(match.group(1))  # p
print(match.group(2))  # Hello

2.7.3 Named Groups

Named group example:

import re

pattern = r'(?P<protocol>https?)://(?P<domain>[\w.-]+)(?P<path>/[\w./-]*)?'
text = "https://www.example.com/path/to/page"

match = re.match(pattern, text)
if match:
    print(match.group('protocol'))  # https
    print(match.group('domain'))    # www.example.com
    print(match.group('path'))      # /path/to/page

2.7.4 Lookahead and Lookbehind Assertions

Positive lookahead:

import re

# Match 'hello' only when it is followed by ' world'
pattern = r'hello(?= world)'
text = "hello world"
match = re.search(pattern, text)
print(match.group())  # hello

Negative lookahead:

import re

# Match 'hello' only when it is not followed by ' world'
pattern = r'hello(?! world)'
text = "hello there"
match = re.search(pattern, text)
print(match.group())  # hello

Positive lookbehind:

import re

# Match 'world' only when it is preceded by 'Hello, '
pattern = r'(?<=Hello, )world'
text = "Hello, world"
match = re.search(pattern, text)
print(match.group())  # world
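The fourth variant, negative lookbehind, uses the (?<!...) syntax and matches only where the preceding text does not match. The sample sentence below is made up for illustration.

```python
import re

# Match whole numbers that are not immediately preceded by a dollar sign
pattern = r'(?<!\$)\b\d+\b'
text = "Order 42 costs $100 and ships in 3 days"

numbers = re.findall(pattern, text)
print(numbers)  # ['42', '3']
```

The \b anchors keep the pattern from matching the tail of 100: after skipping the 1 that follows the dollar sign, the remaining digits do not start at a word boundary.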

Case 13: Regular expressions in practice

import re

def extract_info(text: str) -> dict:
    """Extract several kinds of information from text."""
    result = {}

    # Extract the email address
    email_pattern = r'(?P<email>[\w.-]+@[\w.-]+\.\w+)'
    email_match = re.search(email_pattern, text)
    if email_match:
        result['email'] = email_match.group('email')
    
    # Extract the date (YYYY-MM-DD)
    date_pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
    date_match = re.search(date_pattern, text)
    if date_match:
        result['date'] = {
            'year': date_match.group('year'),
            'month': date_match.group('month'),
            'day': date_match.group('day')
        }
    
    # Extract the price
    price_pattern = r'(?P<currency>¥|\$|€)(?P<amount>\d+(?:\.\d{1,2})?)'
    price_match = re.search(price_pattern, text)
    if price_match:
        result['price'] = {
            'currency': price_match.group('currency'),
            'amount': float(price_match.group('amount'))
        }
    
    return result

# Test
text = """
Contact email: alice@company.com
Order date: 2024-01-15
Item price: ¥99.99
"""
info = extract_info(text)
print(info)
# Prints: {'email': 'alice@company.com', 'date': {'year': '2024', 'month': '01', 'day': '15'}, 'price': {'currency': '¥', 'amount': 99.99}}

2.7.5 Common re Module Functions

import re

text = "apple, banana, cherry, apple"

# findall: return all matches as a list
fruits = re.findall(r'\b\w+\b', text)
print(fruits)  # ['apple', 'banana', 'cherry', 'apple']

# finditer: return an iterator of match objects
matches = re.finditer(r'apple', text)
for match in matches:
    print(f"Found 'apple' at position {match.start()}-{match.end()}")

# sub: replace matches
new_text = re.sub(r'apple', 'orange', text)
print(new_text)  # orange, banana, cherry, orange

# split: split on a pattern
parts = re.split(r',\s*', text)
print(parts)  # ['apple', 'banana', 'cherry', 'apple']
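When the same pattern is used repeatedly, it can be compiled once with re.compile, and the replacement passed to sub or subn may be a function that receives each match object. A short sketch on the same sample text:

```python
import re

text = "apple, banana, cherry, apple"

# Compile once, then reuse the pattern object's methods
apple = re.compile(r'apple')

def shout(match):
    """Replacement callback: receives a match object, returns the new text."""
    return match.group().upper()

# subn works like sub but also reports how many replacements were made
new_text, count = apple.subn(shout, text)
print(new_text, count)
```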

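2.8 Memory Management and Performance

2.8.1 Garbage Collection

CPython manages memory primarily through reference counting, backed by a cyclic garbage collector that reclaims objects caught in reference cycles.

Reference counting example:

import sys

a = [1, 2, 3]
print(sys.getrefcount(a))  # 2 (one reference from a, one from the getrefcount argument)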

b = a
print(sys.getrefcount(a))  # 3

del b
print(sys.getrefcount(a))  # 2
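Reference counting alone cannot free objects that refer to each other, which is where the cyclic collector comes in. The sketch below builds such a cycle with a hypothetical Node class and uses a weak reference as a probe; automatic collection is paused so the result does not depend on timing.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""
    def __init__(self):
        self.partner = None

gc.disable()  # Pause automatic collection so the demo is deterministic
try:
    a = Node()
    b = Node()
    a.partner = b
    b.partner = a  # a and b now keep each other alive
    probe = weakref.ref(a)

    del a, b
    # The names are gone, but the reference counts never reached zero
    alive_before = probe() is not None

    gc.collect()  # The cycle detector finds and frees the pair
    alive_after = probe() is not None
finally:
    gc.enable()

print(alive_before, alive_after)
```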

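2.8.2 __slots__

Declaring __slots__ replaces the per-instance __dict__ with fixed storage for the listed attributes, which reduces the memory used by each instance.

__slots__ example: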

class WithoutSlots:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class WithSlots:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

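# Memory comparison
import sys
obj1 = WithoutSlots(1, 2)
obj2 = WithSlots(1, 2)
# getsizeof() does not count the instance __dict__, so add it for a fair comparison.
# Exact numbers vary across Python versions and platforms.
print(sys.getsizeof(obj1) + sys.getsizeof(obj1.__dict__))
print(sys.getsizeof(obj2))  # Typically smaller than the total above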
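The memory saving comes with a restriction: instances of a slotted class have no __dict__, so attributes that are not listed in __slots__ cannot be added. A minimal sketch with a hypothetical Point class:

```python
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
has_dict = hasattr(p, '__dict__')

rejected = False
try:
    p.z = 3  # 'z' is not declared in __slots__
except AttributeError:
    rejected = True

print(has_dict, rejected)
```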

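2.8.3 weakref Weak References

A weak reference does not increase an object's reference count, so it does not keep the object alive. This suits caches and similar bookkeeping.

weakref example:

import weakref

class MyClass:
    def __del__(self):
        print("Object destroyed")

obj = MyClass()
ref = weakref.ref(obj)

print(ref())  # <__main__.MyClass object at ...>

del obj
print(ref())  # None (in CPython the object is destroyed as soon as its last reference goes away)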

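Case 14: Weak-reference cache

import weakref

class Cache:
    def __init__(self):
        # Entries disappear once their key object is garbage collected
        self._cache = weakref.WeakKeyDictionary()

    def get(self, key):
        return self._cache.get(key)

    def set(self, key, value):
        self._cache[key] = value

    def __len__(self):
        return len(self._cache)

# Using the cache
cache = Cache()
obj = MyClass()  # MyClass is the class from the weakref example above
cache.set(obj, "some data")

print(cache.get(obj))  # some data
print(len(cache))      # 1

del obj
# obj can no longer be used as a key after del, so inspect the cache size instead
print(len(cache))      # 0 in CPython (the entry was removed together with its key)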

2.8.4 Profiling

Using cProfile:

import cProfile

def slow_function():
    total = 0
    for i in range(1_000_000):
        total += i
    return total

# Run the profiler
cProfile.run('slow_function()', sort='cumulative')
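cProfile.run prints directly to standard output. When the report needs to be captured or trimmed, one option is to drive a cProfile.Profile object manually and format the result with pstats. This is a sketch under that assumption; the loop is shortened to keep it quick.

```python
import cProfile
import io
import pstats

def slow_function():
    total = 0
    for i in range(100_000):
        total += i
    return total

profiler = cProfile.Profile()
profiler.enable()
total = slow_function()
profiler.disable()

# Write the report into a string buffer instead of stdout
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('cumulative').print_stats(5)  # Show the top 5 entries only

report = buffer.getvalue()
print(report)
```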

Using timeit:

import timeit

# Compare a list comprehension with an explicit for loop
loop_stmt = """
result = []
for x in range(1000):
    result.append(x**2)
"""
list_comp_time = timeit.timeit('[x**2 for x in range(1000)]', number=1000)
loop_time = timeit.timeit(loop_stmt, number=1000)

print(f"List comprehension: {list_comp_time:.4f}")
print(f"for loop: {loop_time:.4f}")

Case 15: Profiling in practice

import timeit
import functools

def fibonacci_recursive(n):
    if n < 2:
        return n
    return fibonacci_recursive(n-1) + fibonacci_recursive(n-2)

@functools.lru_cache(maxsize=None)
def fibonacci_cached(n):
    if n < 2:
        return n
    return fibonacci_cached(n-1) + fibonacci_cached(n-2)

# Compare performance. The cache persists between runs, so after the first
# call the cached version mostly measures a single cache lookup.
recursive_time = timeit.timeit(lambda: fibonacci_recursive(25), number=10)
cached_time = timeit.timeit(lambda: fibonacci_cached(25), number=10)

print(f"Recursive version: {recursive_time:.4f} seconds")
print(f"Cached version: {cached_time:.4f} seconds")