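feat: add budget expense-control model and reimbursement approval workflow engine

Backend: add the budget expense-control service and the reimbursement
claim approval-flow module, introduce the applicant expense profile
algorithm, optimize the knowledge-base RAG runtime and sync logic,
round out the reimbursement workflow constants and line-item sync, and
update the travel reimbursement rules spreadsheet. Frontend: add budget
analysis components and the digital employee model, refine the approval
dialog and insight panel interactions, and polish the sidebar and top
bar styles. Add unit tests.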
caoxiaozhu
2026-05-27 17:31:27 +08:00
parent cbb98f4469
commit d4d5d40569
75 changed files with 5393 additions and 686 deletions


@@ -0,0 +1,21 @@
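# X-Financial Core Algorithm Derivation Directory
> The directory name `algorithem` keeps the spelling specified by the current task. This directory collects core algorithm derivations, formula definitions, and auditable implementations so that algorithm details are not piled directly into `services`.
## Directory responsibilities
- Hold derivation documents for core algorithms covering budgeting, expenses, risk control, and knowledge retrieval.
- Record formulas, weights, thresholds, input/output contracts, and edge cases.
- Provide the basis for later Python implementations, unit tests, and API contracts.
## Current algorithm topics
- `applicant_expense_profile_formula.md`: quantitative formulas for the applicant expense profile and review recommendations.
- `applicant_expense_profile.py`: first pure-algorithm implementation of applicant expense profile scoring.
## Implementation principles
- An algorithm gets an explainable formula first, then moves into the business service implementation.
- Keep hard rules, scoring weights, and natural-language explanations in separate layers.
- Every core algorithm module must stay within the 800-line limit and be split by responsibility.
- When approval recommendations are involved, output "basis + recommended action"; do not attach negative labels to people directly.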


@@ -0,0 +1,13 @@
"""Core algorithm derivations for X-Financial."""

from .applicant_expense_profile import (
    ApplicantExpenseProfileInput,
    ApplicantExpenseProfileResult,
    evaluate_applicant_expense_profile,
)

__all__ = [
    "ApplicantExpenseProfileInput",
    "ApplicantExpenseProfileResult",
    "evaluate_applicant_expense_profile",
]


@@ -0,0 +1,445 @@
"""Applicant expense profile scoring algorithm.

The module is intentionally pure and framework-free. Service layers can build the
input snapshot from database records, while this module only owns the formula,
scores, thresholds, and explainable result.
"""

from __future__ import annotations

from dataclasses import dataclass, field
from decimal import ROUND_HALF_UP, Decimal, InvalidOperation
from typing import Any

ZERO = Decimal("0")
ONE = Decimal("1")
HUNDRED = Decimal("100")

LEVEL_NORMAL = "normal"
LEVEL_WATCH = "watch"
LEVEL_REVIEW = "review"
LEVEL_ESCALATION = "escalation"


@dataclass(slots=True)
class ApplicantExpenseProfileInput:
    """Inputs for applicant expense behavior scoring.

    Values should be pre-aggregated by a comparable peer group, such as
    department + role + expense type + city grade + project type.
    """

    applicant_claim_count_90d: int = 0
    peer_claim_count_p75_90d: Any = ZERO
    applicant_amount_90d: Any = ZERO
    available_peer_budget_90d: Any = ZERO
    amount_percentile: Any = ZERO
    peer_amount_median_90d: Any = ZERO
    adjusted_or_returned_count_180d: int = 0
    approved_claim_count_180d: int = 0
    requested_days: Any = ZERO
    peer_travel_days_p75: Any = ZERO
    business_buffer_days: Any = ONE
    claim_amount: Any = ZERO
    peer_daily_cost_baseline: Any = ZERO
    tolerance_factor: Any = Decimal("1.20")
    entertainment_amount: Any = ZERO
    attendee_count: int = 0
    entertainment_standard_cap: Any = ZERO
    same_customer_frequency_90d: int = 0
    applicant_entertainment_percentile: Any = ZERO
    hard_rule_score: int = 0


@dataclass(slots=True)
class ApplicantExpenseProfileResult:
    profile_score: int
    profile_level: str
    recommendation_score: int
    recommendation_level: str
    frequency_score: int
    amount_occupancy_score: int
    peer_deviation_score: int
    adjustment_history_score: int
    current_claim_deviation_score: int
    travel_days_deviation_score: int
    daily_cost_deviation_score: int
    entertainment_deviation_score: int
    frequency_ratio: Decimal
    budget_share_ratio: Decimal
    peer_deviation_ratio: Decimal
    adjustment_ratio: Decimal
    travel_days_ratio: Decimal
    daily_cost_ratio: Decimal
    per_capita_entertainment_ratio: Decimal
    suggested_days: Decimal | None
    suggested_amount_cap: Decimal | None
    basis_codes: list[str] = field(default_factory=list)

    def as_dict(self) -> dict[str, Any]:
        return {
            "profile_score": self.profile_score,
            "profile_level": self.profile_level,
            "recommendation_score": self.recommendation_score,
            "recommendation_level": self.recommendation_level,
            "scores": {
                "frequency_score": self.frequency_score,
                "amount_occupancy_score": self.amount_occupancy_score,
                "peer_deviation_score": self.peer_deviation_score,
                "adjustment_history_score": self.adjustment_history_score,
                "current_claim_deviation_score": self.current_claim_deviation_score,
                "travel_days_deviation_score": self.travel_days_deviation_score,
                "daily_cost_deviation_score": self.daily_cost_deviation_score,
                "entertainment_deviation_score": self.entertainment_deviation_score,
            },
            "metrics": {
                "frequency_ratio": _format_decimal(self.frequency_ratio),
                "budget_share_ratio": _format_decimal(self.budget_share_ratio),
                "peer_deviation_ratio": _format_decimal(self.peer_deviation_ratio),
                "adjustment_ratio": _format_decimal(self.adjustment_ratio),
                "travel_days_ratio": _format_decimal(self.travel_days_ratio),
                "daily_cost_ratio": _format_decimal(self.daily_cost_ratio),
                "per_capita_entertainment_ratio": _format_decimal(
                    self.per_capita_entertainment_ratio
                ),
                "suggested_days": _format_decimal(self.suggested_days),
                "suggested_amount_cap": _format_decimal(self.suggested_amount_cap),
            },
            "basis_codes": list(self.basis_codes),
        }


def evaluate_applicant_expense_profile(
    payload: ApplicantExpenseProfileInput,
) -> ApplicantExpenseProfileResult:
    frequency_ratio = _ratio(payload.applicant_claim_count_90d, payload.peer_claim_count_p75_90d)
    frequency_score = _score_frequency_ratio(frequency_ratio)

    budget_share_ratio = _ratio(payload.applicant_amount_90d, payload.available_peer_budget_90d)
    amount_percentile_score = _score_percentile(_to_decimal(payload.amount_percentile))
    budget_share_score = _score_budget_share_ratio(budget_share_ratio)
    amount_occupancy_score = max(amount_percentile_score, budget_share_score)

    peer_deviation_ratio = _ratio(payload.applicant_amount_90d, payload.peer_amount_median_90d)
    peer_deviation_score = _score_peer_deviation_ratio(peer_deviation_ratio)

    adjustment_ratio = _ratio(
        payload.adjusted_or_returned_count_180d,
        payload.approved_claim_count_180d,
    )
    adjustment_history_score = _score_adjustment_ratio(adjustment_ratio)

    travel_days_ratio = _ratio(payload.requested_days, payload.peer_travel_days_p75)
    travel_days_deviation_score = _score_travel_days_ratio(travel_days_ratio)

    requested_days = _to_decimal(payload.requested_days)
    daily_cost = _ratio(payload.claim_amount, requested_days)
    daily_cost_ratio = _ratio(daily_cost, payload.peer_daily_cost_baseline)
    daily_cost_deviation_score = _score_daily_cost_ratio(daily_cost_ratio)

    per_capita_amount = _ratio(payload.entertainment_amount, payload.attendee_count)
    per_capita_entertainment_ratio = _ratio(
        per_capita_amount,
        payload.entertainment_standard_cap,
    )
    entertainment_deviation_score = max(
        _score_entertainment_per_capita_ratio(per_capita_entertainment_ratio),
        _score_same_customer_frequency(payload.same_customer_frequency_90d),
        _score_percentile(_to_decimal(payload.applicant_entertainment_percentile)),
    )

    current_claim_deviation_score = max(
        travel_days_deviation_score,
        daily_cost_deviation_score,
        entertainment_deviation_score,
    )
    profile_score = _weighted_profile_score(
        frequency_score=frequency_score,
        amount_occupancy_score=amount_occupancy_score,
        peer_deviation_score=peer_deviation_score,
        adjustment_history_score=adjustment_history_score,
        current_claim_deviation_score=current_claim_deviation_score,
    )
    hard_rule_score = _clamp_score(payload.hard_rule_score)
    recommendation_score = max(profile_score, current_claim_deviation_score, hard_rule_score)

    suggested_days = _suggest_days(
        requested_days=requested_days,
        baseline_days=_to_decimal(payload.peer_travel_days_p75),
        business_buffer_days=_to_decimal(payload.business_buffer_days),
    )
    suggested_amount_cap = _suggest_amount_cap(
        suggested_days=suggested_days,
        daily_cost_baseline=_to_decimal(payload.peer_daily_cost_baseline),
        tolerance_factor=_to_decimal(payload.tolerance_factor),
    )
    basis_codes = _build_basis_codes(
        frequency_ratio=frequency_ratio,
        amount_percentile=_to_decimal(payload.amount_percentile),
        budget_share_ratio=budget_share_ratio,
        peer_deviation_ratio=peer_deviation_ratio,
        adjustment_ratio=adjustment_ratio,
        travel_days_ratio=travel_days_ratio,
        daily_cost_ratio=daily_cost_ratio,
        per_capita_entertainment_ratio=per_capita_entertainment_ratio,
        same_customer_frequency_90d=payload.same_customer_frequency_90d,
        applicant_entertainment_percentile=_to_decimal(payload.applicant_entertainment_percentile),
        hard_rule_score=hard_rule_score,
    )

    return ApplicantExpenseProfileResult(
        profile_score=profile_score,
        profile_level=level_from_score(profile_score),
        recommendation_score=recommendation_score,
        recommendation_level=level_from_score(recommendation_score),
        frequency_score=frequency_score,
        amount_occupancy_score=amount_occupancy_score,
        peer_deviation_score=peer_deviation_score,
        adjustment_history_score=adjustment_history_score,
        current_claim_deviation_score=current_claim_deviation_score,
        travel_days_deviation_score=travel_days_deviation_score,
        daily_cost_deviation_score=daily_cost_deviation_score,
        entertainment_deviation_score=entertainment_deviation_score,
        frequency_ratio=frequency_ratio,
        budget_share_ratio=budget_share_ratio,
        peer_deviation_ratio=peer_deviation_ratio,
        adjustment_ratio=adjustment_ratio,
        travel_days_ratio=travel_days_ratio,
        daily_cost_ratio=daily_cost_ratio,
        per_capita_entertainment_ratio=per_capita_entertainment_ratio,
        suggested_days=suggested_days,
        suggested_amount_cap=suggested_amount_cap,
        basis_codes=basis_codes,
    )


def level_from_score(score: int) -> str:
    normalized = _clamp_score(score)
    if normalized >= 80:
        return LEVEL_ESCALATION
    if normalized >= 60:
        return LEVEL_REVIEW
    if normalized >= 40:
        return LEVEL_WATCH
    return LEVEL_NORMAL


def _weighted_profile_score(
    *,
    frequency_score: int,
    amount_occupancy_score: int,
    peer_deviation_score: int,
    adjustment_history_score: int,
    current_claim_deviation_score: int,
) -> int:
    weighted = (
        Decimal(frequency_score) * Decimal("0.20")
        + Decimal(amount_occupancy_score) * Decimal("0.25")
        + Decimal(peer_deviation_score) * Decimal("0.25")
        + Decimal(adjustment_history_score) * Decimal("0.15")
        + Decimal(current_claim_deviation_score) * Decimal("0.15")
    )
    return _clamp_score(int(weighted.quantize(Decimal("1"), rounding=ROUND_HALF_UP)))


def _score_frequency_ratio(ratio: Decimal) -> int:
    return _score_ratio(
        ratio,
        [(Decimal("1.0"), 0), (Decimal("1.2"), 30), (Decimal("1.5"), 60), (Decimal("2.0"), 80)],
    )


def _score_budget_share_ratio(ratio: Decimal) -> int:
    return _score_ratio(ratio, [(Decimal("0.10"), 0), (Decimal("0.20"), 40), (Decimal("0.35"), 70)])


def _score_peer_deviation_ratio(ratio: Decimal) -> int:
    return _score_ratio(ratio, [(Decimal("1.0"), 0), (Decimal("1.3"), 40), (Decimal("1.8"), 70)])


def _score_adjustment_ratio(ratio: Decimal) -> int:
    return _score_ratio(ratio, [(Decimal("0.05"), 0), (Decimal("0.15"), 40), (Decimal("0.30"), 70)])


def _score_travel_days_ratio(ratio: Decimal) -> int:
    return _score_ratio(ratio, [(Decimal("1.2"), 0), (Decimal("1.5"), 40), (Decimal("2.0"), 70)])


def _score_daily_cost_ratio(ratio: Decimal) -> int:
    return _score_ratio(ratio, [(Decimal("1.1"), 0), (Decimal("1.3"), 40), (Decimal("1.6"), 70)])


def _score_entertainment_per_capita_ratio(ratio: Decimal) -> int:
    return _score_ratio(ratio, [(Decimal("1.0"), 0), (Decimal("1.2"), 40), (Decimal("1.5"), 70)])


def _score_same_customer_frequency(frequency: int) -> int:
    if frequency >= 5:
        return 100
    if frequency >= 3:
        return 70
    if frequency >= 2:
        return 40
    return 0


def _score_percentile(percentile: Decimal) -> int:
    normalized = max(ZERO, min(HUNDRED, percentile))
    if normalized <= Decimal("75"):
        return 0
    if normalized <= Decimal("85"):
        return 40
    if normalized <= Decimal("95"):
        return 70
    return 100


def _score_ratio(ratio: Decimal, bands: list[tuple[Decimal, int]]) -> int:
    if ratio <= ZERO:
        return 0
    for upper_bound, score in bands:
        if ratio <= upper_bound:
            return score
    return 100


def _suggest_days(
    *,
    requested_days: Decimal,
    baseline_days: Decimal,
    business_buffer_days: Decimal,
) -> Decimal | None:
    if requested_days <= ZERO:
        return None
    if baseline_days <= ZERO:
        return _quantize(requested_days)
    return _quantize(min(requested_days, baseline_days + max(ZERO, business_buffer_days)))


def _suggest_amount_cap(
    *,
    suggested_days: Decimal | None,
    daily_cost_baseline: Decimal,
    tolerance_factor: Decimal,
) -> Decimal | None:
    if suggested_days is None or suggested_days <= ZERO or daily_cost_baseline <= ZERO:
        return None
    factor = tolerance_factor if tolerance_factor > ZERO else ONE
    return _quantize_money(suggested_days * daily_cost_baseline * factor)


def _build_basis_codes(
    *,
    frequency_ratio: Decimal,
    amount_percentile: Decimal,
    budget_share_ratio: Decimal,
    peer_deviation_ratio: Decimal,
    adjustment_ratio: Decimal,
    travel_days_ratio: Decimal,
    daily_cost_ratio: Decimal,
    per_capita_entertainment_ratio: Decimal,
    same_customer_frequency_90d: int,
    applicant_entertainment_percentile: Decimal,
    hard_rule_score: int,
) -> list[str]:
    basis_codes: list[str] = []

    if frequency_ratio > Decimal("1.5"):
        basis_codes.append("applicant.frequency.ratio_review")
    elif frequency_ratio > Decimal("1.2"):
        basis_codes.append("applicant.frequency.ratio_watch")

    if amount_percentile > Decimal("95"):
        basis_codes.append("applicant.amount_percentile.p95")
    elif amount_percentile > Decimal("85"):
        basis_codes.append("applicant.amount_percentile.p85")

    if budget_share_ratio > Decimal("0.35"):
        basis_codes.append("applicant.budget_share.high")
    elif budget_share_ratio > Decimal("0.20"):
        basis_codes.append("applicant.budget_share.watch")

    if peer_deviation_ratio > Decimal("1.8"):
        basis_codes.append("applicant.peer_deviation.escalation")
    elif peer_deviation_ratio > Decimal("1.3"):
        basis_codes.append("applicant.peer_deviation.review")

    if adjustment_ratio > Decimal("0.30"):
        basis_codes.append("applicant.adjustment_history.escalation")
    elif adjustment_ratio > Decimal("0.15"):
        basis_codes.append("applicant.adjustment_history.review")

    if travel_days_ratio > Decimal("2.0"):
        basis_codes.append("travel.days_ratio.escalation")
    elif travel_days_ratio > Decimal("1.5"):
        basis_codes.append("travel.days_ratio.review")
    elif travel_days_ratio > Decimal("1.2"):
        basis_codes.append("travel.days_ratio.watch")

    if daily_cost_ratio > Decimal("1.6"):
        basis_codes.append("travel.daily_cost_ratio.escalation")
    elif daily_cost_ratio > Decimal("1.3"):
        basis_codes.append("travel.daily_cost_ratio.review")
    elif daily_cost_ratio > Decimal("1.1"):
        basis_codes.append("travel.daily_cost_ratio.watch")

    if per_capita_entertainment_ratio > Decimal("1.5"):
        basis_codes.append("entertainment.per_capita.escalation")
    elif per_capita_entertainment_ratio > Decimal("1.2"):
        basis_codes.append("entertainment.per_capita.review")

    if same_customer_frequency_90d >= 5:
        basis_codes.append("entertainment.same_customer.frequency_escalation")
    elif same_customer_frequency_90d >= 3:
        basis_codes.append("entertainment.same_customer.frequency_review")

    if applicant_entertainment_percentile > Decimal("95"):
        basis_codes.append("entertainment.amount_percentile.p95")
    elif applicant_entertainment_percentile > Decimal("85"):
        basis_codes.append("entertainment.amount_percentile.p85")

    if hard_rule_score >= 80:
        basis_codes.append("hard_rule.escalation")
    elif hard_rule_score >= 60:
        basis_codes.append("hard_rule.review")

    return basis_codes


def _ratio(numerator: Any, denominator: Any) -> Decimal:
    divisor = _to_decimal(denominator)
    if divisor <= ZERO:
        return ZERO
    return _quantize(_to_decimal(numerator) / divisor)


def _to_decimal(value: Any) -> Decimal:
    if value is None:
        return ZERO
    if isinstance(value, Decimal):
        return value
    try:
        return Decimal(str(value))
    except (InvalidOperation, ValueError):
        return ZERO


def _quantize(value: Decimal) -> Decimal:
    return value.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


def _quantize_money(value: Decimal) -> Decimal:
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def _format_decimal(value: Decimal | None) -> str | None:
    if value is None:
        return None
    return format(value.normalize(), "f")


def _clamp_score(score: int) -> int:
    return max(0, min(100, int(score)))


@@ -0,0 +1,224 @@
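# Applicant Expense Profile and Review Recommendation Formulas
## Goal
The applicant expense profile answers three questions:
- Is the applicant's recent expense pace above the peer-group baseline?
- Do the days, amount, or entertainment frequency of the current claim deviate noticeably?
- Should the approver pass the claim normally, pass it with caution, review it closely, or escalate it?
The profile result is only a reference for approval; it must not be used to label an employee as anomalous.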
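## Peer-group baseline
The peer-group baseline must be aggregated over a business-comparable scope:
```text
peer_group =
    department
  + role / job grade
  + expense type
  + city grade / business region
  + project type / customer stage
  + trailing 90-day or 180-day window
```
Do not compare different roles such as sales, R&D, and project managers directly against a company-wide average.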
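One way to keep comparisons inside a comparable group is to make the grouping dimensions an explicit, hashable key. The sketch below is illustrative only: `PeerGroupKey`, its field names, and `median_amount_by_group` are assumptions, not names from this commit.

```python
from collections import defaultdict
from collections.abc import Iterable
from dataclasses import dataclass
from decimal import Decimal
from statistics import median


@dataclass(frozen=True)
class PeerGroupKey:
    """Hypothetical grouping key; frozen so it can be used as a dict key."""

    department: str
    role: str
    expense_type: str
    city_grade: str
    project_type: str
    window_days: int


def median_amount_by_group(
    rows: Iterable[tuple[PeerGroupKey, Decimal]],
) -> dict[PeerGroupKey, Decimal]:
    """Compute the median amount per peer group, never across groups."""
    grouped: dict[PeerGroupKey, list[Decimal]] = defaultdict(list)
    for key, amount in rows:
        grouped[key].append(amount)
    return {key: median(values) for key, values in grouped.items()}


sales = PeerGroupKey("sales", "manager", "travel", "tier1", "delivery", 90)
rd = PeerGroupKey("rd", "engineer", "travel", "tier1", "delivery", 90)
rows = [
    (sales, Decimal("1000")),
    (sales, Decimal("3000")),
    (sales, Decimal("2000")),
    (rd, Decimal("500")),
]
baselines = median_amount_by_group(rows)
```

Because the key carries every dimension, a sales applicant is only ever compared with the sales baseline, even when R&D rows share the same expense type and city grade.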
## Profile risk score
```text
profile_score =
    frequency_score * 0.20
  + amount_occupancy_score * 0.25
  + peer_deviation_score * 0.25
  + adjustment_history_score * 0.15
  + current_claim_deviation_score * 0.15
```
Levels:
```text
0-39    normal      normal
40-59   watch       needs attention
60-79   review      focused review
80-100  escalation  strong review / escalated approval
```
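As a worked example of the weighted sum and level bands above, here is a minimal sketch. The sub-score values are made up for illustration, and the half-up rounding to an integer follows the Python module in this commit rather than anything stated in this document.

```python
from decimal import ROUND_HALF_UP, Decimal

# Weights copied from the formula above; they sum to 1.00.
WEIGHTS = {
    "frequency_score": Decimal("0.20"),
    "amount_occupancy_score": Decimal("0.25"),
    "peer_deviation_score": Decimal("0.25"),
    "adjustment_history_score": Decimal("0.15"),
    "current_claim_deviation_score": Decimal("0.15"),
}


def profile_score(sub_scores: dict[str, int]) -> int:
    weighted = sum(
        (Decimal(sub_scores[name]) * weight for name, weight in WEIGHTS.items()),
        Decimal("0"),
    )
    # Round half up to an integer, then clamp to the 0..100 range.
    rounded = int(weighted.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    return max(0, min(100, rounded))


def level(score: int) -> str:
    if score >= 80:
        return "escalation"
    if score >= 60:
        return "review"
    if score >= 40:
        return "watch"
    return "normal"


example = {
    "frequency_score": 60,
    "amount_occupancy_score": 70,
    "peer_deviation_score": 70,
    "adjustment_history_score": 40,
    "current_claim_deviation_score": 70,
}
# 12 + 17.5 + 17.5 + 6 + 10.5 = 63.5, which rounds half up to 64.
score = profile_score(example)
print(score, level(score))  # 64 review
```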
## Sub-score calculation
### High-frequency claim score
```text
frequency_ratio = applicant_claim_count_90d / peer_claim_count_p75_90d
frequency_score =
    0    if frequency_ratio <= 1.0
    30   if 1.0 < frequency_ratio <= 1.2
    60   if 1.2 < frequency_ratio <= 1.5
    80   if 1.5 < frequency_ratio <= 2.0
    100  if frequency_ratio > 2.0
```
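The banded rule above can be expressed as a lookup over inclusive upper bounds. This is a sketch only: the function name is hypothetical, and returning 0 when no usable P75 baseline exists is an assumption that mirrors how the Python module in this commit treats a non-positive denominator.

```python
from decimal import Decimal

# (inclusive upper bound, score) pairs taken from the formula above.
FREQUENCY_BANDS = [
    (Decimal("1.0"), 0),
    (Decimal("1.2"), 30),
    (Decimal("1.5"), 60),
    (Decimal("2.0"), 80),
]


def frequency_score(applicant_claims: int, peer_p75_claims: Decimal) -> int:
    if peer_p75_claims <= 0:
        return 0  # assumption: without a baseline there is nothing to compare
    ratio = Decimal(applicant_claims) / peer_p75_claims
    for upper_bound, score in FREQUENCY_BANDS:
        if ratio <= upper_bound:
            return score
    return 100  # ratio above the last band


# 9 claims against a peer P75 of 6 gives a ratio of exactly 1.5.
print(frequency_score(9, Decimal("6")))  # 60
```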
### High amount occupancy score
```text
budget_share_ratio = applicant_amount_90d / available_peer_budget_90d
amount_percentile = percentile of applicant_amount_90d within the peer group
amount_occupancy_score =
    max(
        percentile_score(amount_percentile),
        ratio_score(budget_share_ratio)
    )
```
Suggested percentile rule for the first version:
```text
P0-P75    -> 0
P75-P85   -> 40
P85-P95   -> 70
P95-P100  -> 100
```
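The sketch below combines the two signals with `max`, as the formula describes. Two points are assumptions here: the percentile bands treat each upper bound as inclusive, and the budget-share bands (0.10 / 0.20 / 0.35) are taken from the Python module in this commit, since this document does not list them.

```python
from decimal import Decimal


def percentile_score(percentile: Decimal) -> int:
    # Upper bounds are inclusive: exactly P75 still scores 0.
    if percentile <= 75:
        return 0
    if percentile <= 85:
        return 40
    if percentile <= 95:
        return 70
    return 100


def budget_share_score(ratio: Decimal) -> int:
    # Bands borrowed from the committed Python module (assumption for this doc).
    if ratio <= Decimal("0.10"):
        return 0
    if ratio <= Decimal("0.20"):
        return 40
    if ratio <= Decimal("0.35"):
        return 70
    return 100


def amount_occupancy_score(
    percentile: Decimal, applicant_amount: Decimal, peer_budget: Decimal
) -> int:
    share = applicant_amount / peer_budget if peer_budget > 0 else Decimal("0")
    return max(percentile_score(percentile), budget_share_score(share))


# P88 scores 70; a 12% budget share scores 40; the higher signal wins.
print(amount_occupancy_score(Decimal("88"), Decimal("12000"), Decimal("100000")))  # 70
```

Taking the maximum rather than an average means a single strong signal is not diluted by a quiet one.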
### Peer deviation score
```text
peer_deviation_ratio =
    applicant_amount_90d / peer_amount_median_90d
peer_deviation_score =
    0    if peer_deviation_ratio <= 1.0
    40   if 1.0 < peer_deviation_ratio <= 1.3
    70   if 1.3 < peer_deviation_ratio <= 1.8
    100  if peer_deviation_ratio > 1.8
```
### Historical adjustment and return score
```text
adjustment_ratio =
    adjusted_or_returned_count_180d / approved_claim_count_180d
adjustment_history_score =
    0    if adjustment_ratio <= 0.05
    40   if 0.05 < adjustment_ratio <= 0.15
    70   if 0.15 < adjustment_ratio <= 0.30
    100  if adjustment_ratio > 0.30
```
### Current claim deviation score
```text
current_claim_deviation_score =
    max(
        travel_days_deviation_score,
        daily_cost_deviation_score,
        entertainment_deviation_score
    )
```
## Travel days recommendation
```text
baseline_days = peer_group.travel_days_p75
travel_days_ratio = requested_days / baseline_days
suggested_days = min(
    requested_days,
    baseline_days + business_buffer_days
)
```
Suggested thresholds:
```text
travel_days_ratio <= 1.2    normal
1.2 < ratio <= 1.5          flag for attention
1.5 < ratio <= 2.0          suggest shortening the trip or adding justification
ratio > 2.0                 focused review / escalated approval
```
## Travel cost recommendation
```text
daily_cost = claim_amount / requested_days
daily_cost_ratio = daily_cost / peer_city_grade_daily_cost_baseline
suggested_amount_cap =
    suggested_days
  * peer_city_grade_daily_cost_baseline
  * tolerance_factor
```
For the first version, `tolerance_factor` is suggested to be in the range `1.15` to `1.20`.
Suggested thresholds:
```text
daily_cost_ratio <= 1.1    normal
1.1 < ratio <= 1.3         flag for attention
1.3 < ratio <= 1.6         suggest reducing the expense standard
ratio > 1.6                focused review / escalated approval
```
## Entertainment expense recommendation
```text
per_capita_entertainment_amount = entertainment_amount / attendee_count
per_capita_ratio = per_capita_entertainment_amount / entertainment_standard_cap
same_customer_frequency_90d = count(customer_id, applicant_id, 90d)
applicant_entertainment_percentile =
    percentile of applicant_entertainment_amount_90d within the peer group
```
Suggested thresholds:
```text
per_capita_ratio <= 1.0             normal
1.0 < ratio <= 1.2                  flag for attention
1.2 < ratio <= 1.5                  suggest lowering the entertainment standard
ratio > 1.5                         focused review / escalated approval
same_customer_frequency_90d >= 3    require the customer engagement stage to be supplied
percentile >= P85                   focused review
percentile >= P95                   escalated approval
```
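The thresholds above only name actions, not scores. The sketch below assigns the 0 / 40 / 70 / 100 scores used by the Python module in this commit and combines the signals with `max`, as that module does. The percentile signal is left out for brevity, and the function names are hypothetical.

```python
from decimal import Decimal


def per_capita_score(ratio: Decimal) -> int:
    if ratio <= Decimal("1.0"):
        return 0
    if ratio <= Decimal("1.2"):
        return 40
    if ratio <= Decimal("1.5"):
        return 70
    return 100


def same_customer_score(frequency: int) -> int:
    if frequency >= 5:
        return 100
    if frequency >= 3:
        return 70
    if frequency >= 2:
        return 40
    return 0


def entertainment_deviation_score(
    amount: Decimal,
    attendee_count: int,
    standard_cap: Decimal,
    same_customer_frequency_90d: int,
) -> int:
    if attendee_count <= 0 or standard_cap <= 0:
        ratio = Decimal("0")  # assumption: missing inputs contribute no score
    else:
        ratio = amount / attendee_count / standard_cap
    return max(per_capita_score(ratio), same_customer_score(same_customer_frequency_90d))


# 3000 across 5 attendees is 600 per head, 1.2 times a cap of 500 (score 40);
# three events with the same customer in 90 days score 70, so the result is 70.
print(entertainment_deviation_score(Decimal("3000"), 5, Decimal("500"), 3))  # 70
```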
## Review recommendation strength
```text
recommendation_strength =
    max(
        level(profile_score),
        level(current_claim_deviation_score),
        level(hard_rule_score)
    )
```
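Taking the `max` over levels needs an explicit ordering, since level names do not sort by severity as plain strings. A minimal sketch, with `LEVEL_ORDER` and the function names as assumptions:

```python
# Levels listed from least to most severe.
LEVEL_ORDER = ["normal", "watch", "review", "escalation"]


def level(score: int) -> str:
    if score >= 80:
        return "escalation"
    if score >= 60:
        return "review"
    if score >= 40:
        return "watch"
    return "normal"


def recommendation_strength(
    profile_score: int,
    current_claim_deviation_score: int,
    hard_rule_score: int,
) -> str:
    levels = [
        level(profile_score),
        level(current_claim_deviation_score),
        level(hard_rule_score),
    ]
    # Pick the most severe level according to LEVEL_ORDER.
    return max(levels, key=LEVEL_ORDER.index)


print(recommendation_strength(66, 70, 85))  # escalation
```

Because `level` is monotonic in the score, this gives the same result as taking the maximum of the three scores first and mapping it to a level afterwards, which is how the Python module in this commit computes `recommendation_level`.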
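The output text must cite the metrics that triggered it:
```text
The applicant's expense occupancy over the last 90 days is at P88 of the peer group,
and the requested travel days are 1.67 times the P75 of comparable trips.
Consider adjusting the travel days from 5 to 4,
or add the customer visit schedule and a note on the project stage.
```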
## Suggested output fields
```json
{
  "profile_score": 66,
  "profile_level": "review",
  "peer_percentile": 88,
  "travel_days_ratio": "1.67",
  "daily_cost_ratio": "1.34",
  "same_customer_frequency_90d": 3,
  "suggested_days": 4,
  "suggested_amount_cap": "6800.00",
  "basis_codes": [
    "applicant.peer_percentile.p85",
    "travel.days_ratio.review"
  ]
}
```