znlgis 博客

GIS开发与技术分享

第6章:集合运算

集合运算是空间分析的基础能力之一。Shapely 基于点集拓扑学理论,将几何对象视为平面上的点集, 提供并集、交集、差集、对称差集等核心运算。通过灵活运用这些运算,可以实现区域合并、重叠检测、 裁剪分析等丰富的空间操作。本章将全面介绍 Shapely 提供的各类集合运算函数、运算符重载、 批量操作以及性能优化技巧。


集合运算概述

点集拓扑基础

在点集拓扑中,每个几何对象都是平面上的一个点集。集合运算即对两个(或多个)点集进行并、交、差等操作, 生成新的几何对象。

Shapely 的集合运算基于 GEOS 库(JTS 的 C++ 移植版),提供高效且鲁棒的几何运算能力。

Shapely 2.x 中的两种调用方式

Shapely 2.x 同时支持面向对象函数式两种调用方式:

from shapely import union, intersection, difference, symmetric_difference
from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

# 面向对象方式
result1 = a.union(b)

# 函数式方式(推荐,支持向量化)
result2 = union(a, b)

函数式接口是 Shapely 2.x 推荐的方式,因为它支持 NumPy 数组输入,能够进行高效的批量运算。


union:并集运算

基本用法

union(a, b) 返回两个几何对象的并集,即两者覆盖的全部区域。

from shapely import union
from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

result = union(a, b)
print(result)
print(f"类型: {result.geom_type}")
print(f"面积: {result.area}")  # 2*2 + 2*2 - 1*1 = 7.0

不相交多边形的并集

当两个多边形不相交时,并集返回 MultiPolygon

from shapely import union
from shapely.geometry import Polygon

a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
b = Polygon([(3, 3), (4, 3), (4, 4), (3, 4)])

result = union(a, b)
print(result.geom_type)  # MultiPolygon
print(f"包含 {len(result.geoms)} 个多边形")

线与多边形的并集

不同类型几何对象也可进行并集运算:

from shapely import union
from shapely.geometry import Polygon, LineString

poly = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
line = LineString([(1, -1), (1, 3)])

result = union(poly, line)
print(result.geom_type)  # GeometryCollection

union_all:批量并集

基本用法

union_all(geometries) 对一组几何对象执行并集运算,是 Shapely 2.x 推荐的批量并集函数, 替代旧版的 unary_union

from shapely import union_all
from shapely.geometry import Polygon
import numpy as np

polygons = [
    Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    Polygon([(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]),
    Polygon([(1, 1), (2, 1), (2, 2), (1, 2)]),
]

result = union_all(polygons)
print(f"合并后类型: {result.geom_type}")
print(f"合并后面积: {result.area}")

使用 NumPy 数组

from shapely import union_all, box
import numpy as np

# 创建一系列重叠的矩形
boxes = np.array([box(i, 0, i + 1.5, 1) for i in range(5)])
result = union_all(boxes)
print(f"面积: {result.area}")

axis 参数

当输入为多维数组时,可以通过 axis 参数指定沿哪个轴进行合并:

from shapely import union_all, box
import numpy as np

geoms = np.array([
    [box(0, 0, 1, 1), box(1, 0, 2, 1)],
    [box(0, 1, 1, 2), box(1, 1, 2, 2)],
])

# 沿第0轴合并(每列合并)
result_axis0 = union_all(geoms, axis=0)
print(f"沿 axis=0 合并结果数量: {len(result_axis0)}")

# 沿第1轴合并(每行合并)
result_axis1 = union_all(geoms, axis=1)
print(f"沿 axis=1 合并结果数量: {len(result_axis1)}")

unary_union:旧版批量并集

unary_union 是 Shapely 1.x 时代的批量并集函数名称,在 Shapely 2.x 中仍然可用, 但推荐使用 union_all

from shapely.ops import unary_union
from shapely.geometry import Polygon

polygons = [
    Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    Polygon([(0.5, 0), (1.5, 0), (1.5, 1), (0.5, 1)]),
]

# 旧版方式,仍可使用
result = unary_union(polygons)
print(f"面积: {result.area}")

注意:在新项目中建议统一使用 shapely.union_all()


coverage_union / coverage_union_all:快速并集

概述

coverage_unioncoverage_union_all 是专门用于不重叠几何体的快速并集运算。 它们假设输入的几何体之间没有重叠(只允许边界共享),因此可以跳过重叠检测步骤,大幅提升性能。

基本用法

from shapely import coverage_union_all
from shapely.geometry import Polygon

# 两个相邻但不重叠的多边形(共享一条边)
p1 = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
p2 = Polygon([(1, 0), (2, 0), (2, 1), (1, 1)])

result = coverage_union_all([p1, p2])
print(result)
print(f"面积: {result.area}")  # 2.0

适用场景

from shapely import coverage_union_all
from shapely.geometry import Polygon

# 模拟规则网格(瓦片),彼此不重叠
tiles = []
for i in range(10):
    for j in range(10):
        tiles.append(Polygon([
            (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)
        ]))

result = coverage_union_all(tiles)
print(f"合并后面积: {result.area}")  # 100.0

警告:如果输入几何体存在重叠,coverage_union 的结果将不正确。 请确保输入几何体满足不重叠条件(覆盖关系)。


intersection:交集运算

基本用法

intersection(a, b) 返回两个几何对象的交集,即两者共同覆盖的区域。

from shapely import intersection
from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

result = intersection(a, b)
print(result)
print(f"类型: {result.geom_type}")  # Polygon
print(f"面积: {result.area}")  # 1.0

线与线的交集

from shapely import intersection
from shapely.geometry import LineString

line1 = LineString([(0, 0), (2, 2)])
line2 = LineString([(0, 2), (2, 0)])

result = intersection(line1, line2)
print(result)  # POINT (1 1)
print(result.geom_type)  # Point

不相交几何体的交集

from shapely import intersection
from shapely.geometry import Polygon

a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
b = Polygon([(3, 3), (4, 3), (4, 4), (3, 4)])

result = intersection(a, b)
print(result.is_empty)  # True
print(result.geom_type)  # GeometryCollection

intersection_all:批量交集

intersection_all(geometries) 对一组几何对象计算交集。

from shapely import intersection_all
from shapely.geometry import Polygon

polygons = [
    Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
    Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
    Polygon([(2, 2), (5, 2), (5, 5), (2, 5)]),
]

result = intersection_all(polygons)
print(f"类型: {result.geom_type}")
print(f"面积: {result.area}")  # 1.0 (2,2)-(3,3) 的正方形

difference:差集运算

基本用法

difference(a, b) 从 a 中去除与 b 重叠的部分。注意差集是有方向的:difference(a, b) ≠ difference(b, a)

from shapely import difference
from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

# 从 a 中去除 b
result_ab = difference(a, b)
print(f"a - b 面积: {result_ab.area}")  # 4.0 - 1.0 = 3.0

# 从 b 中去除 a
result_ba = difference(b, a)
print(f"b - a 面积: {result_ba.area}")  # 4.0 - 1.0 = 3.0

挖洞操作

差集常用于在多边形中”挖洞”:

from shapely import difference
from shapely.geometry import Polygon, Point

outer = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
hole = Point(5, 5).buffer(2)

result = difference(outer, hole)
print(f"有洞多边形面积: {result.area:.2f}")
print(f"内环数量: {len(list(result.interiors))}")

多次差集

from shapely import difference
from shapely.geometry import Polygon, Point

base = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])

# 挖去多个圆形区域
holes = [Point(2, 2).buffer(1), Point(5, 5).buffer(1.5), Point(8, 8).buffer(1)]
result = base
for hole in holes:
    result = difference(result, hole)

print(f"原始面积: {base.area}")
print(f"挖洞后面积: {result.area:.2f}")

symmetric_difference:对称差集

基本用法

symmetric_difference(a, b) 返回两个几何对象中不重叠的部分,即 “异或” 运算。 结果等价于 union(a, b) - intersection(a, b)

from shapely import symmetric_difference, union, intersection, difference
from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

sym_diff = symmetric_difference(a, b)
print(f"对称差集面积: {sym_diff.area}")  # 7.0 - 1.0 = 6.0

# 验证等价性
manual = difference(union(a, b), intersection(a, b))
print(f"手动计算面积: {manual.area}")  # 应该相同

对称差集的几何类型

from shapely import symmetric_difference
from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

result = symmetric_difference(a, b)
print(f"类型: {result.geom_type}")  # MultiPolygon
if result.geom_type == 'MultiPolygon':
    for i, geom in enumerate(result.geoms):
        print(f"  子多边形 {i}: 面积 = {geom.area}")

完全重叠的对称差集

from shapely import symmetric_difference
from shapely.geometry import Polygon

a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
b = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])

result = symmetric_difference(a, b)
print(f"空几何: {result.is_empty}")  # True

symmetric_difference_all:批量对称差集

from shapely import symmetric_difference_all
from shapely.geometry import Polygon

polygons = [
    Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
    Polygon([(1, 0), (3, 0), (3, 2), (1, 2)]),
    Polygon([(2, 0), (4, 0), (4, 2), (2, 2)]),
]

result = symmetric_difference_all(polygons)
print(f"类型: {result.geom_type}")
print(f"面积: {result.area}")

grid_size 参数:精度网格

概述

Shapely 2.x 的集合运算函数支持 grid_size 参数,用于控制计算精度。 坐标会被捕捉到指定大小的网格上,可以避免浮点精度问题并提升性能。

基本用法

from shapely import intersection
from shapely.geometry import Polygon

a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
b = Polygon([(0.5000001, 0), (1.5, 0), (1.5, 1), (0.5000001, 1)])

# 默认精度
result_default = intersection(a, b)
print(f"默认精度面积: {result_default.area}")

# 使用 grid_size,坐标捕捉到 0.001 的网格
result_grid = intersection(a, b, grid_size=0.001)
print(f"grid_size=0.001 面积: {result_grid.area}")

精度与性能的权衡

from shapely import union, box
import time

a = box(0, 0, 1.00000001, 1.00000001)
b = box(0.99999999, 0, 2, 1)

# 无 grid_size
start = time.time()
for _ in range(1000):
    union(a, b)
t1 = time.time() - start

# 有 grid_size
start = time.time()
for _ in range(1000):
    union(a, b, grid_size=1e-6)
t2 = time.time() - start

print(f"无 grid_size: {t1:.3f}s")
print(f"有 grid_size: {t2:.3f}s")

处理浮点精度问题

from shapely import difference
from shapely.geometry import Polygon

# 由浮点误差导致的问题几何
a = Polygon([(0, 0), (1, 0), (1, 0.3 + 0.3 + 0.3 + 0.1), (0, 1)])
b = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])

result = difference(a, b)
print(f"无 grid_size: {result.area}")

result_grid = difference(a, b, grid_size=1e-10)
print(f"有 grid_size: {result_grid.area}")

运算符重载

Shapely 几何对象重载了 Python 运算符,提供更简洁的集合运算语法:

运算符 等价方法 含义
a \| b a.union(b) 并集
a & b a.intersection(b) 交集
a - b a.difference(b) 差集
a ^ b a.symmetric_difference(b) 对称差集

示例

from shapely.geometry import Polygon

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

# 运算符方式
union_result = a | b
inter_result = a & b
diff_result  = a - b
symd_result  = a ^ b

print(f"并集面积: {union_result.area}")   # 7.0
print(f"交集面积: {inter_result.area}")   # 1.0
print(f"差集面积: {diff_result.area}")    # 3.0
print(f"对称差集面积: {symd_result.area}")  # 6.0

链式运算

from shapely.geometry import Polygon

a = Polygon([(0, 0), (3, 0), (3, 3), (0, 3)])
b = Polygon([(1, 1), (4, 1), (4, 4), (1, 4)])
c = Polygon([(2, 2), (5, 2), (5, 5), (2, 5)])

# 链式运算:三者的并集
result = a | b | c
print(f"三者并集面积: {result.area}")

# a 与 b 的交集,再去除 c
result2 = (a & b) - c
print(f"(a ∩ b) - c 面积: {result2.area}")

空几何体在集合运算中的行为

空几何体(GEOMETRYCOLLECTION EMPTY)在集合运算中表现如同空集:

from shapely import union, intersection, difference, symmetric_difference
from shapely.geometry import Polygon
from shapely import wkt

a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
empty = wkt.loads("GEOMETRYCOLLECTION EMPTY")

# 并集:a ∪ ∅ = a
print(f"a | empty = {union(a, empty).equals(a)}")  # True

# 交集:a ∩ ∅ = ∅
print(f"a & empty 为空: {intersection(a, empty).is_empty}")  # True

# 差集:a - ∅ = a
print(f"a - empty = {difference(a, empty).equals(a)}")  # True

# 对称差集:a ^ ∅ = a
print(f"a ^ empty = {symmetric_difference(a, empty).equals(a)}")  # True

None 值的处理

在 Shapely 2.x 中,None 被视为缺失几何体,运算结果也为 None

from shapely import union
import numpy as np

from shapely.geometry import Point

result = union(Point(0, 0), None)
print(f"与 None 的并集: {result}")  # None

大量代码示例

示例 1:区域合并分析

from shapely import union_all
from shapely.geometry import Point

# 模拟多个传感器覆盖范围
sensors = [
    Point(0, 0).buffer(3),
    Point(4, 0).buffer(3),
    Point(2, 3).buffer(3),
    Point(6, 3).buffer(2),
]

coverage = union_all(sensors)
print(f"总覆盖面积: {coverage.area:.2f}")
print(f"覆盖区域类型: {coverage.geom_type}")

示例 2:重叠面积统计

from shapely import intersection
from shapely.geometry import Polygon

zones = [
    Polygon([(0, 0), (5, 0), (5, 5), (0, 5)]),
    Polygon([(3, 3), (8, 3), (8, 8), (3, 8)]),
    Polygon([(1, 1), (6, 1), (6, 6), (1, 6)]),
]

# 计算两两重叠面积
for i in range(len(zones)):
    for j in range(i + 1, len(zones)):
        overlap = intersection(zones[i], zones[j])
        print(f"区域{i} ∩ 区域{j} 面积: {overlap.area:.2f}")

示例 3:缓冲区裁剪

from shapely import intersection, difference
from shapely.geometry import Polygon, Point

study_area = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
exclusion_zone = Point(5, 5).buffer(3)

# 从研究区域中排除禁区
usable_area = difference(study_area, exclusion_zone)
print(f"可用面积: {usable_area.area:.2f}")
print(f"排除面积比例: {(1 - usable_area.area / study_area.area) * 100:.1f}%")

性能比较和优化建议

union_all vs 逐一合并

from shapely import union, union_all
from shapely.geometry import Point
import time

circles = [Point(i * 0.5, 0).buffer(1) for i in range(50)]

# 方法1:逐一合并(慢)
start = time.time()
result1 = circles[0]
for c in circles[1:]:
    result1 = union(result1, c)
t1 = time.time() - start

# 方法2:union_all(快)
start = time.time()
result2 = union_all(circles)
t2 = time.time() - start

print(f"逐一合并: {t1:.3f}s")
print(f"union_all: {t2:.3f}s")
print(f"加速比: {t1 / t2:.1f}x")

优化建议

  1. 优先使用批量函数union_allintersection_all 等批量函数内部使用级联策略,性能远优于循环调用。
  2. 利用 coverage_union:如果输入几何体不重叠(如瓦片、网格),使用 coverage_union_all 可获得更好性能。
  3. 设置 grid_size:适当设置精度网格可避免浮点问题并提升性能。
  4. 使用向量化操作:利用 NumPy 数组作为输入,Shapely 2.x 会自动向量化运算。
  5. 减少中间对象:尽量一次性完成运算,避免创建大量中间几何对象。
import numpy as np
from shapely import union, box

# 向量化运算示例
a_arr = np.array([box(i, 0, i + 1, 1) for i in range(100)])
b_arr = np.array([box(i + 0.5, 0, i + 1.5, 1) for i in range(100)])

# 批量并集(向量化,快速)
results = union(a_arr, b_arr)
print(f"批量运算结果数量: {len(results)}")

常见问题

拓扑异常处理

当几何对象包含自相交时,集合运算可能抛出 GEOSException

from shapely import union, make_valid
from shapely.geometry import Polygon

# 自相交多边形(蝴蝶结形状)
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
print(f"是否有效: {bowtie.is_valid}")  # False

normal = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])

# 先修复再运算
valid_bowtie = make_valid(bowtie)
result = union(valid_bowtie, normal)
print(f"修复后运算成功,面积: {result.area:.2f}")

处理极小碎片

集合运算有时会产生极小面积的碎片多边形:

from shapely import difference
from shapely.geometry import Polygon

a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
b = Polygon([(0.0000001, 0), (1.0000001, 0), (1.0000001, 1), (0.0000001, 1)])

result = difference(a, b)
print(f"类型: {result.geom_type}")
print(f"面积: {result.area}")

# 使用 grid_size 解决
result_clean = difference(a, b, grid_size=1e-6)
print(f"清理后面积: {result_clean.area}")

混合维度运算

from shapely import intersection
from shapely.geometry import Polygon, LineString, Point

poly = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
line = LineString([(1, -1), (1, 3)])
point = Point(1, 1)

# 多边形与线的交集 -> 线
r1 = intersection(poly, line)
print(f"多边形 ∩ 线 = {r1.geom_type}: {r1}")

# 多边形与点的交集 -> 点
r2 = intersection(poly, point)
print(f"多边形 ∩ 点 = {r2.geom_type}: {r2}")

# 线与点的交集
r3 = intersection(line, point)
print(f"线 ∩ 点 = {r3.geom_type}: {r3}")

本章小结

本章详细介绍了 Shapely 的集合运算功能:

  • 并集union / union_all):合并几何体
  • 交集intersection / intersection_all):提取公共部分
  • 差集difference):从一个几何体中去除另一个
  • 对称差集symmetric_difference):提取非公共部分
  • 运算符重载|&-^ 提供简洁语法
  • grid_size 参数:控制精度和性能
  • coverage_union:不重叠几何体的快速并集

掌握这些运算是进行复杂空间分析的基础。下一章将介绍构造性操作,学习缓冲区分析、凸包、简化等更丰富的空间操作。