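Chapter 3: Getting Started with Your First Pipeline

This chapter walks through a complete hands-on example that takes you from scratch to running your first GeoPipeAgent pipeline, covering the full loop from AI generation to framework execution.

3.1 Preparing test data

First, create the working directory and test data: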

mkdir -p my-gis-project/data
mkdir -p my-gis-project/output
cd my-gis-project

Create a simple test GeoJSON file, data/roads.geojson:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "主干道", "type": "primary"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[116.3, 39.9], [116.4, 39.9], [116.5, 39.95]]
      }
    },
    {
      "type": "Feature",
      "properties": {"name": "次干道", "type": "secondary"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[116.35, 39.85], [116.35, 39.95], [116.45, 39.95]]
      }
    }
  ]
}

3.2 Writing your first YAML pipeline

In the my-gis-project/ directory, create the pipeline file buffer-pipeline.yaml:

pipeline:
  name: "道路缓冲区分析"
  description: "Buffer the road data and save the result as GeoJSON"

  variables:
    input_path: "data/roads.geojson"
    buffer_dist: 0.01
    output_path: "output/road_buffer.geojson"

  steps:
    - id: load-roads
      use: io.read_vector
      params:
        path: "${input_path}"

    - id: buffer-roads
      use: vector.buffer
      params:
        input: "$load-roads"
        distance: "${buffer_dist}"
        cap_style: "round"

    - id: save-result
      use: io.write_vector
      params:
        input: "$buffer-roads"
        path: "${output_path}"
        format: "GeoJSON"

  outputs:
    result: "$save-result"
    feature_count: "$buffer-roads.feature_count"

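Reading the pipeline

Field	Description
`pipeline.name`	Pipeline name; appears in the report
`pipeline.variables`	Reusable variables, referenced as `${variable_name}`
`pipeline.steps`	List of steps, executed in order
`id: load-roads`	Unique step ID; later steps reference its output with `$load-roads`
`use: io.read_vector`	Step type, in the form `category.action`
`params`	Step parameters; may use variable substitution and step references
`$load-roads`	References the output of the `load-roads` step (its `output` field)
`$buffer-roads.feature_count`	References the `feature_count` value in the stats of the `buffer-roads` step
`outputs`	Declares the pipeline's final outputs; appears in the `outputs` section of the JSON report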
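To make the two reference styles in the table concrete, here is a minimal sketch of how `${variable}` substitution and `$step-id` references could be resolved. This is not GeoPipeAgent's actual code: the `resolve` function, the regular expressions, and the shape of `step_results` (an `output` entry plus a `stats` dictionary per step) are assumptions made for illustration.

```python
import re

# Hypothetical resolver for illustration; the framework's own logic may differ.
VAR_PATTERN = re.compile(r"\$\{(\w+)\}")
STEP_PATTERN = re.compile(r"^\$([\w-]+)(?:\.(\w+))?$")


def resolve(value, variables, step_results):
    """Resolve one parameter value against variables and earlier step results."""
    if not isinstance(value, str):
        return value
    # A value that is exactly "${name}" keeps the variable's original type.
    whole = VAR_PATTERN.fullmatch(value)
    if whole:
        return variables[whole.group(1)]
    # "$step-id" yields the step's output; "$step-id.key" yields one stat.
    ref = STEP_PATTERN.match(value)
    if ref:
        step_id, key = ref.groups()
        result = step_results[step_id]
        return result["stats"][key] if key else result["output"]
    # Otherwise substitute any embedded ${name} occurrences as text.
    return VAR_PATTERN.sub(lambda m: str(variables[m.group(1)]), value)


variables = {"input_path": "data/roads.geojson", "buffer_dist": 0.01}
step_results = {
    "buffer-roads": {"output": "<buffered layer>", "stats": {"feature_count": 2}}
}

print(resolve("${buffer_dist}", variables, step_results))
print(resolve("$buffer-roads.feature_count", variables, step_results))
print(resolve("out/${input_path}", variables, step_results))
```

Checking the whole-value `${name}` case first lets a numeric variable such as `buffer_dist` stay a number instead of being turned into a string.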

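A note on coordinate systems: roads.geojson uses WGS84 (EPSG:4326, units in degrees), so a buffer distance of 0.01 means 0.01 degrees, which is roughly 1.1 km in the north-south direction and less in the east-west direction. If you need distances measured in meters, first reproject with vector.reproject to a projected coordinate system (for example EPSG:3857, whose units are meters) and then buffer.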
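The degree-to-distance relationship can be checked with a little arithmetic. The sketch below assumes a spherical Earth with the mean radius given in the code, so the results are approximations. The helper name `degrees_to_meters` is made up for this example.

```python
import math

# Spherical approximation: mean Earth radius in meters (an assumption of this sketch).
EARTH_RADIUS_M = 6_371_008.8


def degrees_to_meters(delta_deg, latitude_deg):
    """Return (north_south_m, east_west_m) spanned by delta_deg at a latitude."""
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    north_south = delta_deg * meters_per_degree
    # Meridians converge toward the poles, so east-west spans shrink by cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = degrees_to_meters(0.01, 39.9)
print(f"north-south: {ns:.0f} m, east-west: {ew:.0f} m")
```

At the latitude of the sample data (about 39.9°N), 0.01 degrees comes out at about 1.1 km north-south but only about 0.85 km east-west, so a buffer defined in degrees is not the same width in every direction on the ground.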

3.3 Running the pipeline

geopipe-agent run buffer-pipeline.yaml

Output of a successful run (JSON format):

{
  "pipeline": "道路缓冲区分析",
  "status": "success",
  "duration": 0.312,
  "steps": [
    {
      "id": "load-roads",
      "step": "io.read_vector",
      "status": "success",
      "duration": 0.089,
      "output_summary": {
        "feature_count": 2,
        "crs": "EPSG:4326",
        "geometry_types": ["LineString"],
        "columns": ["name", "type", "geometry"]
      }
    },
    {
      "id": "buffer-roads",
      "step": "vector.buffer",
      "status": "success",
      "duration": 0.124,
      "output_summary": {
        "feature_count": 2,
        "crs": "EPSG:4326",
        "geometry_types": ["Polygon"],
        "total_area": 0.00024578
      }
    },
    {
      "id": "save-result",
      "step": "io.write_vector",
      "status": "success",
      "duration": 0.098,
      "output_summary": {
        "feature_count": 2,
        "output_path": "output/road_buffer.geojson",
        "format": "GeoJSON"
      }
    }
  ],
  "outputs": {
    "result": "output/road_buffer.geojson",
    "feature_count": 2
  }
}
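Because the report is plain JSON, a script can check it automatically. The following sketch assumes the report text has already been captured (for example from standard output) and reads only fields that appear in the sample report: `status`, `steps[].id`, `steps[].status`, and `steps[].duration`. The function name `summarize_report` is hypothetical.

```python
import json


def summarize_report(report_text):
    """Return (ok, failed_step_ids, total_step_seconds) for a run report."""
    report = json.loads(report_text)
    failed = [s["id"] for s in report["steps"] if s["status"] != "success"]
    step_seconds = sum(s["duration"] for s in report["steps"])
    ok = report["status"] == "success" and not failed
    return ok, failed, round(step_seconds, 3)


# Trimmed copy of the sample report; only the fields read by the function are kept.
sample = """
{
  "pipeline": "道路缓冲区分析",
  "status": "success",
  "duration": 0.312,
  "steps": [
    {"id": "load-roads", "status": "success", "duration": 0.089},
    {"id": "buffer-roads", "status": "success", "duration": 0.124},
    {"id": "save-result", "status": "success", "duration": 0.098}
  ]
}
"""

ok, failed, total = summarize_report(sample)
print(ok, failed, total)
```

The per-step durations add up to slightly less than the top-level `duration`, which is what you would expect if the total also includes overhead outside the individual steps.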

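3.4 Overriding variables with `--var`

Use the --var option to override pipeline variables on the command line, without editing the YAML file:

# Use a different buffer distance
geopipe-agent run buffer-pipeline.yaml --var buffer_dist=0.02

# Override several variables at once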
geopipe-agent run buffer-pipeline.yaml \
    --var input_path=data/highway.geojson \
    --var buffer_dist=0.05 \
    --var output_path=output/highway_buffer.geojson

This is convenient for batch-processing multiple data files: you need only one YAML file and can run it repeatedly with different --var values.
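A batch run of this kind can be scripted. The sketch below only builds and prints the commands, using the `run` subcommand and `--var` option shown above. The input file list, the output naming scheme, and the helper `build_run_command` are assumptions for illustration.

```python
import shlex
from pathlib import Path


def build_run_command(pipeline, input_path, buffer_dist, output_dir="output"):
    """Build the argument list for one run, deriving the output name from the input."""
    stem = Path(input_path).stem
    output_path = Path(output_dir) / f"{stem}_buffer.geojson"
    return [
        "geopipe-agent", "run", pipeline,
        "--var", f"input_path={input_path}",
        "--var", f"buffer_dist={buffer_dist}",
        "--var", f"output_path={output_path.as_posix()}",
    ]


# Hypothetical input files; replace with the datasets you actually have.
inputs = ["data/roads.geojson", "data/highway.geojson"]
commands = [build_run_command("buffer-pipeline.yaml", path, 0.02) for path in inputs]
for command in commands:
    print(shlex.join(command))
    # To execute instead of printing: subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.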

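3.5 Validating a pipeline (without running it)

Before running a pipeline for real, you can use the validate command to check that the YAML syntax and step references are correct:

geopipe-agent validate buffer-pipeline.yaml

Sample output: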

{
  "status": "valid",
  "pipeline": "道路缓冲区分析",
  "steps_count": 3,
  "steps": [
    {"id": "load-roads",    "use": "io.read_vector"},
    {"id": "buffer-roads",  "use": "vector.buffer"},
    {"id": "save-result",   "use": "io.write_vector"}
  ]
}

If there is a syntax error, you will see an error message similar to this:

{
  "status": "invalid",
  "error": "PipelineParseError",
  "message": "Missing 'pipeline' key at the top level. Expected: pipeline:\n  name: ...\n  steps: ..."
}
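To illustrate what a step-reference check involves, here is a minimal sketch that flags references to steps that are not defined earlier in the list. It is not the logic of the validate command itself: the function `find_reference_errors`, the regular expression, and the error wording are all hypothetical. The pipeline's steps are assumed to be already parsed into Python dictionaries.

```python
import re

# Matches "$step-id" or "$step-id.attr", but not "${variable}".
STEP_REF = re.compile(r"^\$([\w-]+)(?:\.\w+)?$")


def find_reference_errors(steps):
    """Return messages for duplicate ids and references to steps not yet defined."""
    errors = []
    seen = set()
    for step in steps:
        for name, value in step.get("params", {}).items():
            match = STEP_REF.match(value) if isinstance(value, str) else None
            if match and match.group(1) not in seen:
                errors.append(
                    f"step '{step['id']}': param '{name}' references "
                    f"unknown or later step '{match.group(1)}'"
                )
        if step["id"] in seen:
            errors.append(f"duplicate step id '{step['id']}'")
        seen.add(step["id"])
    return errors


# The second step misspells the first step's id on purpose.
steps = [
    {"id": "load-roads", "use": "io.read_vector", "params": {"path": "${input_path}"}},
    {"id": "buffer-roads", "use": "vector.buffer", "params": {"input": "$load-road"}},
]
errors = find_reference_errors(steps)
print(errors)
```

Because steps run in order, it is enough to track the IDs seen so far: any reference to an ID outside that set is either a typo or points at a later step.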

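3.6 A complete example with reprojection

The following, more realistic example reprojects the data before buffering, so that the buffer distance is expressed in meters rather than degrees: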

pipeline:
  name: "道路 500 米缓冲区分析（精确距离）"
  description: "Reproject WGS84 to Web Mercator (meter units), then apply a 500 m buffer"

  variables:
    input_path: "data/roads.geojson"
    buffer_dist_m: 500
    output_path: "output/road_buffer_500m.geojson"

  steps:
    - id: load-roads
      use: io.read_vector
      params:
        path: "${input_path}"

    - id: reproject-to-mercator
      use: vector.reproject
      params:
        input: "$load-roads"
        target_crs: "EPSG:3857"

    - id: buffer-500m
      use: vector.buffer
      params:
        input: "$reproject-to-mercator"
        distance: "${buffer_dist_m}"
        cap_style: "round"

    - id: reproject-back
      use: vector.reproject
      params:
        input: "$buffer-500m"
        target_crs: "EPSG:4326"

    - id: save-result
      use: io.write_vector
      params:
        input: "$reproject-back"
        path: "${output_path}"
        format: "GeoJSON"

  outputs:
    result: "$save-result"

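3.7 Debug mode

If a pipeline fails or produces unexpected results, use debug mode to see detailed logs:

# Show DEBUG-level logs
geopipe-agent run buffer-pipeline.yaml --log-level DEBUG

# Emit logs in JSON format (easier for machines to parse)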
geopipe-agent run buffer-pipeline.yaml --json-log

3.8 Inspecting GIS file information

Before writing a pipeline, you can use the info command to get a quick overview of a data file:

geopipe-agent info data/roads.geojson

Output:

{
  "path": "data/roads.geojson",
  "format": "vector",
  "feature_count": 2,
  "crs": "EPSG:4326",
  "geometry_types": ["LineString"],
  "columns": ["name", "type", "geometry"],
  "bounds": [116.3, 39.85, 116.5, 39.95]
}

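This is helpful for identifying the coordinate system (which determines the unit of the buffer distance) and for learning which attribute fields are available.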
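As a cross-check of the `bounds` value, the same numbers can be derived from the test data with the standard library alone. This sketch embeds the contents of data/roads.geojson from section 3.1 and handles only LineString geometries. The helper `line_bounds` is hypothetical, not part of GeoPipeAgent.

```python
import json

roads = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"name": "主干道", "type": "primary"},
   "geometry": {"type": "LineString",
                "coordinates": [[116.3, 39.9], [116.4, 39.9], [116.5, 39.95]]}},
  {"type": "Feature", "properties": {"name": "次干道", "type": "secondary"},
   "geometry": {"type": "LineString",
                "coordinates": [[116.35, 39.85], [116.35, 39.95], [116.45, 39.95]]}}
]}
""")


def line_bounds(collection):
    """Return [min_x, min_y, max_x, max_y] over all LineString vertices."""
    points = [
        pt
        for feature in collection["features"]
        for pt in feature["geometry"]["coordinates"]
    ]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return [min(xs), min(ys), max(xs), max(ys)]


bounds = line_bounds(roads)
print(bounds)
```

The order is minimum longitude, minimum latitude, maximum longitude, maximum latitude, which matches the `bounds` array in the info output.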

3.9 Trying the Cookbook

GeoPipeAgent ships with 7 ready-to-run pipeline examples (in the cookbook/ directory) that you can run directly:

# After cloning the repository
geopipe-agent run cookbook/buffer-analysis.yaml
geopipe-agent run cookbook/vector-qc.yaml
geopipe-agent run cookbook/overlay-analysis.yaml

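These examples cover the most common GIS workflows and are a good starting point for learning the framework.

3.10 Chapter summary

This chapter demonstrated the basic GeoPipeAgent workflow from start to finish:

Prepare data: GeoJSON, Shapefile, and similar formats can be used directly
Write the YAML: define variables, steps, and outputs under pipeline:
Run the pipeline: geopipe-agent run <file> executes it in one command and prints a JSON report
Override variables: --var key=value changes parameters at run time
Validate the pipeline: geopipe-agent validate <file> checks the syntax before execution
Inspect file information: geopipe-agent info <file> shows the basic properties of a data file

The next chapter examines every field and rule of the YAML pipeline format in detail.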
