YG-Rules/skill/yg-rules-pipeline/SKILL.md


name: yg-rules-pipeline
description: Run or maintain an end-to-end YG-Rules local pipeline from a fixed input folder to generated Excel and Markdown outputs. Use when files are placed in a directory and Codex should collect domains, schema, and guidance files, run the existing parser/storage/analysis/rule-generation code directly without Flask, and produce output/rules-{task_id}/ artifacts.

YG-Rules Pipeline

Overview

Use this skill when the user wants a folder-driven local run: collect files from an input directory, populate the project's intermediate state, analyze guidance files, and generate Excel and Markdown outputs.

Do not call Flask routes for this workflow. Run the bundled script, which imports the project code directly.

Input Folder Contract

Default layout:

input/
  domains.xlsx        # or domains.csv / domains.json
  schema.xlsx         # or schema.xls
  guidance/
    过度负债/
      policy.md
      policy.docx
    无关多元/
      policy.pdf
    _all/
      common-policy.md

Rules:

  • domains.* is required and must parse through app.utils.parser.parse_upload_file.
  • schema.* is recommended; when present, it must be .xlsx or .xls.
  • guidance/<domain name>/ files attach only to the matching domain.
  • guidance/_all/ files attach to every domain.
  • Supported guidance extensions match those the app accepts: .txt, .pdf, .doc, .docx, .md.
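
The layout rules above can be sketched as a small discovery helper. This is a minimal illustration, not the code in scripts/run_pipeline.py: the function names (find_one, collect_guidance, discover) are hypothetical, the domains extension set is assumed from the default layout (the real check is whatever parse_upload_file accepts), and the domain names are passed in directly instead of being read from the parsed domains file.

```python
from __future__ import annotations

from pathlib import Path

# Extension sets are taken from the rules above; DOMAIN_EXTS is an assumption
# based on the default layout, since the real filter is parse_upload_file.
DOMAIN_EXTS = {".xlsx", ".csv", ".json"}
SCHEMA_EXTS = {".xlsx", ".xls"}
GUIDANCE_EXTS = {".txt", ".pdf", ".doc", ".docx", ".md"}
SHARED_DIR = "_all"


def find_one(root: Path, stem: str, exts: set[str]) -> Path | None:
    """Return the first file named <stem>.<ext> with an allowed extension, if any."""
    for path in sorted(root.iterdir()):
        if path.is_file() and path.stem == stem and path.suffix.lower() in exts:
            return path
    return None


def collect_guidance(root: Path, domain_names: list[str]) -> dict[str, list[Path]]:
    """Map each domain name to its own guidance files plus the shared _all/ files."""
    guidance_root = root / "guidance"

    def files_in(folder: Path) -> list[Path]:
        if not folder.is_dir():
            return []
        return sorted(
            p for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in GUIDANCE_EXTS
        )

    shared = files_in(guidance_root / SHARED_DIR)
    return {name: files_in(guidance_root / name) + shared for name in domain_names}


def discover(root: Path, domain_names: list[str]) -> dict:
    """Locate domains.*, schema.* and guidance files under the input folder."""
    domains = find_one(root, "domains", DOMAIN_EXTS)
    if domains is None:
        # domains.* is the only required input.
        raise FileNotFoundError(f"no domains.* file found in {root}")
    return {
        "domains": domains,
        "schema": find_one(root, "schema", SCHEMA_EXTS),  # None when absent
        "guidance": collect_guidance(root, domain_names),
    }
```

A domain without its own guidance folder still receives the files from guidance/_all/, and files with unsupported extensions are ignored without raising an error.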

Run

From the repo root:

python skill\yg-rules-pipeline\scripts\run_pipeline.py --input input --limit 2 --create-sql

Useful options:

python skill\yg-rules-pipeline\scripts\run_pipeline.py --input E:\path\to\folder --granularity high --limit 5 --timeout 900
python skill\yg-rules-pipeline\scripts\run_pipeline.py --input input --skip-schema --limit 1

The script prints the task id, output directory, Excel path, Markdown path, skipped domains, skipped rules, and any Markdown error.
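
For orientation, the flags used in the commands above could be declared with argparse roughly as follows. Only the flag names come from this document; the types, defaults, and help texts are assumptions, and the real definitions live in scripts/run_pipeline.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the Run section."""
    parser = argparse.ArgumentParser(
        description="Folder-driven YG-Rules pipeline run (illustrative sketch)"
    )
    parser.add_argument("--input", default="input", help="input folder to scan")
    # 'high' is the only value shown in this document; others are not listed.
    parser.add_argument("--granularity", default=None, help="e.g. 'high'")
    # Assumed to be an integer cap; what it caps is defined by the real script.
    parser.add_argument("--limit", type=int, default=None)
    # Assumed to be a number of seconds.
    parser.add_argument("--timeout", type=int, default=None)
    parser.add_argument("--skip-schema", action="store_true",
                        help="do not save schema.*")
    parser.add_argument("--create-sql", action="store_true",
                        help="forwarded as create_sql to rule generation")
    return parser


# Parse the first example command from the Run section.
args = build_parser().parse_args(["--input", "input", "--limit", "2", "--create-sql"])
print(args.input, args.limit, args.create_sql, args.skip_schema)
```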

Implementation Flow

The script performs these steps directly:

  1. Parse domains.* with parse_upload_file.
  2. Save domains with DomainStorage.save_domains, replacing prior domain state.
  3. Save schema.* with SchemaStorage.save unless --skip-schema is set.
  4. Upload guidance files with DomainStorage.save_guidance_file.
  5. Analyze guidance with DomainStorage.analyze_guidance.
  6. Start rule generation with RuleGenerationService(create_sql=...).
  7. Poll RuleGenerationService.get_status until the task reports done or failed.
  8. Validate sibling .xlsx and .md outputs when possible.
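
Steps 7 and 8 can be sketched in isolation as below. The helper names (wait_for_task, missing_outputs) are hypothetical, and the shape of the status value is an assumption: here it is a dict with a "status" key, whereas the real RuleGenerationService.get_status may return something different. "Sibling" is read as "same directory and file stem", which is also an assumption.

```python
import time
from pathlib import Path
from typing import Callable

TERMINAL_STATES = {"done", "failed"}


def wait_for_task(
    get_status: Callable[[], dict],
    timeout: float = 900.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_status until it reports a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status.get('status')!r} after {timeout}s")
        sleep(interval)


def missing_outputs(excel_path: Path) -> list[Path]:
    """Return the expected sibling outputs (.xlsx and .md) that are missing or empty."""
    expected = [excel_path.with_suffix(".xlsx"), excel_path.with_suffix(".md")]
    return [p for p in expected if not p.is_file() or p.stat().st_size == 0]
```

A caller would wrap the real status call in a closure, for example `wait_for_task(lambda: service.get_status(task_id))`, where `service` and `task_id` stand in for whatever the script actually holds. Passing `sleep` as a parameter keeps the loop testable without real delays.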

Maintenance Rules

  • Keep this pipeline script importing the project code directly; do not rewrite it to use HTTP unless explicitly requested.
  • Keep output artifact rules aligned with skill/yg-rules-output.
  • If app APIs or storage methods change, update references/local-pipeline.md and scripts/run_pipeline.py together.
  • Add script tests or smoke checks when changing input discovery or status polling.

References

  • Read references/local-pipeline.md before changing script behavior or input layout.
  • Use skill/yg-rules-output for details about the final task output directory and the Markdown/Excel contracts.