Compare commits

24 Commits

Author SHA1 Message Date
Developer
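6aa271c4f7 refactor: restructure the frontend architecture - extract CSS and logic into standalone modules
Frontend refactor:
- Remove the old oversized Vue components (HomeView, FileManage, TextSplit, etc.)
- Remove the old composables (useFormatters, useModels, useProjects)
- Add a modular directory structure: core/, page-logic/, pages/, shared/
- Extract CSS into the styles/pages/ directory
- Add global styles variables.css and common.css

Backend API updates:
- chunks: enhance the semantic splitting API
- files: update the file processing API
- models: update the model management API
- questions: update the Q&A management API
- database: optimize database connections
- semantic_embedding: optimize the semantic embedding service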

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 14:23:34 +08:00
Developer
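a280b4f014 feat(backend): update the file processing and semantic splitting APIs
- chunks API: support semantic splitting mode and embedding configuration
- files API: optimize asynchronous file processing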

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 10:11:59 +08:00
Developer
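135f75e6be feat(frontend): enhance the background animation and sidebar interaction
- App.vue: enhance the sci-fi background animation; the mesh gradient glow drifts slowly
- ProjectView.vue: remove the sidebar "Back to home" button
- TextSplit.vue: rework multi-select interaction on the split generation page
- DeleteDialog.vue: delete confirmation dialog component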

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 10:11:52 +08:00
Developer
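45b77a44c6 style(frontend): unify the styling of the file management, evaluation management, and Q&A management views
- Evaluation management: add stat cards with a glow effect, an orbit animation for the empty state, and multi-select in the table layout
- Q&A management: adopt the same gradient title, stat cards, and empty-state animation as file management
- File management: fine-tune style details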

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 10:11:44 +08:00
Developer
fa7829657f chore: remove obsolete files
- Remove bug修复.md
- Remove the obsolete home.scss stylesheet

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 16:08:31 +08:00
Developer
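3e2d07a502 refactor(frontend): update the project view and text splitting page
- App.vue: update styles and route configuration
- ProjectView.vue: layout adjustments
- TextSplit.vue: complete the splitting feature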

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 16:08:16 +08:00
Developer
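df70c09fe2 feat(frontend): improve the file management upload flow and UI experience
- Show the file list immediately after upload, without waiting
- Add a polling mechanism that updates processing status automatically
- Remove the fixed height limit so the table height adapts
- Play the animation only on first load to avoid flicker on refresh
- Hide the empty state while an upload is in progress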

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 16:08:12 +08:00
Developer
cc2e73c595 feat(backend): update APIs to support semantic splitting and embedding configuration
- chunks API: add embedding configuration fields
- projects API: update routes and methods

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 16:08:08 +08:00
Developer
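da2887d913 feat(backend): add semantic-embedding text splitting
- Add the semantic_embedding.py module, which splits text semantically based on embedding similarity
- Integrate it into the get_splitter factory function in splitter.py
- Support configuring the embedding model and similarity threshold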

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 16:08:04 +08:00
Developer
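1cf44ac6f7 fix(backend): fix asynchronous processing failing after file upload
- Fix the undefined async_session_maker error by switching to AsyncSessionLocal
- Ensure uploaded files are correctly converted to Markdown asynchronously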

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 16:08:00 +08:00
Developer
9a12907f25 feat(frontend): add composables utility functions and the crawler page
- Add the useFormatters, useModels, and useProjects composables
- Add stylesheets index.scss and pages/home.scss
- Add the CrawlerView crawler page view

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 10:45:36 +08:00
Developer
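a1342b7634 feat: complete frontend features; add the crawler page and project pagination
- Add the CrawlerView crawler page
- Complete paginated display in HomeView (9 per page)
- Update the ProjectCard component icon
- Improve the API client and type definitions
- Restructure stylesheets into a separate directory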

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 10:45:32 +08:00
Developer
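68453cead8 feat(backend): improve the logging system with per-date directories
- Implement the logs/YYYY-MM-DD/ date folder structure
- Add dedicated success.log and failure.log files
- Use TimedRotatingFileHandler for daily rotation
- Add log_success and log_failure convenience functions
- Integrate markitdown for file conversion
- Improve file storage paths, grouping files by project ID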

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 10:44:09 +08:00
Developer
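7514e7e763 feat: complete the model management feature
- Add model API routes with CRUD and connection testing
- Support three providers: MiniMax, GLM, and OpenAI Compatible
- Persist connection status (untested/connected/disconnected)
- Fix CORS and database model compatibility issues
- Frontend UI improvement: auto-fill each provider's default API address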

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 23:02:43 +08:00
Developer
15846a0f7a refactor(frontend): replace the JavaScript API client with the TypeScript version
- Remove the old JavaScript API client (index.js)
- Use the new TypeScript version (index.ts)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:30:32 +08:00
Developer
5f56eec248 feat(frontend): update dependencies and route configuration
- Update npm dependencies (package.json)
- Update route configuration (router/index.js)
- Update Vite build configuration (vite.config.js)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:30:21 +08:00
Developer
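47d1da7cea feat(backend): update core modules and file processing
- Update the configuration module (config.py)
- Update the database connection (database.py)
- Update the main application entry point (main.py)
- Update data models (models.py)
- Update base schemas (base.py)
- Update file processors (docx, excel, pdf)
- Update the Dockerfile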

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:30:11 +08:00
Developer
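db11429290 feat(backend): update API endpoint implementations
- Update the Chunks API endpoints
- Update the Datasets API endpoints
- Update the Evaluation API endpoints
- Update the Files API endpoints
- Update the Projects API endpoints
- Update the Questions API endpoints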

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:29:58 +08:00
Developer
eac10a9d95 chore: add a one-command startup script
- Add the start.sh startup script
- Start the frontend and backend with one command
- Support custom port configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:29:34 +08:00
Developer
e6aa585e06 chore: add project configuration files
- Add Python project configuration (pyproject.toml)
- Add an environment variable example (.env.example)
- Add a Docker ignore file (.dockerignore)
- Add TypeScript configuration (tsconfig.json, tsconfig.node.json)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:29:24 +08:00
Developer
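2b2e1a67c8 feat(frontend): improve page features and UI
- Add the model configuration page (ModelSettingsView.vue)
- Improve project list display and deletion on the home page (HomeView.vue)
- Improve the project detail page (ProjectView.vue)
- Improve the project settings page (Settings.vue)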

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:29:11 +08:00
Developer
66d251dcc4 feat(frontend): add TypeScript type definitions and components
- Add the TypeScript API client (api/index.ts)
- Add global styles (styles/)
- Add type definitions (types/)
- Add Vue components (components/)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:28:58 +08:00
Developer
3eb5d47bd3 feat(backend): add API schema definitions
- Add the Chunk data structure (chunk.py)
- Add the Dataset schema (dataset.py)
- Add the Evaluation schema (eval.py)
- Add the File schema (file.py)
- Add the Project schema (project.py)
- Add the Question schema (question.py)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:28:47 +08:00
Developer
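efe5d240ae feat(backend): add core architecture modules
- Add the authentication module (auth.py)
- Add basic CRUD operations (crud.py)
- Add exception handling (exceptions.py)
- Add the logging module (logging.py)
- Add response formatting (response.py)
- Add dependency injection (dependencies.py)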

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:28:36 +08:00
116 changed files with 31544 additions and 3928 deletions
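
Commit da2887d913 above describes splitting text wherever the embedding similarity between neighbouring segments falls below a configurable threshold. The semantic_embedding.py module itself does not appear in this comparison, so the following is only a minimal sketch of that idea under stated assumptions: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, `semantic_split` is an illustrative name rather than the project's API, and `min_chunk_size` is assumed to be measured in characters.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List


def toy_embed(sentence: str) -> Dict[str, float]:
    """Hypothetical stand-in for an embedding model: a bag-of-words vector."""
    return dict(Counter(re.findall(r"\w+", sentence.lower())))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_split(
    sentences: List[str],
    embed: Callable[[str], Dict[str, float]] = toy_embed,
    similarity_threshold: float = 0.3,
    min_chunk_size: int = 1,
) -> List[str]:
    """Start a new chunk when adjacent sentences are less similar than the threshold."""
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        if current:
            similarity = cosine(embed(current[-1]), embed(sentence))
            current_length = len(" ".join(current))
            # Only cut when the chunk built so far is long enough to stand alone.
            if similarity < similarity_threshold and current_length >= min_chunk_size:
                chunks.append(" ".join(current))
                current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A real splitter would presumably call the configured embedding provider in place of `toy_embed`; the boundary rule stays the same.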

151
.daily-work-cache.json Normal file

@@ -0,0 +1,151 @@
{
"date": "2026-03-17",
"project": "YG-Datasets",
"projectPath": "/data/code/YG-Datasets",
"entries": [
{
"time": "17:28",
"type": "feature",
"title": "Backend core architecture modules",
"files": ["backend/app/core/auth.py", "backend/app/core/crud.py", "backend/app/core/exceptions.py", "backend/app/core/logging.py"],
"description": "Add the authentication module, basic CRUD operations, exception handling, and the logging module",
"source": "git-commit"
},
{
"time": "17:28",
"type": "feature",
"title": "Backend API schema definitions",
"files": ["backend/app/schemas/chunk.py", "backend/app/schemas/dataset.py", "backend/app/schemas/eval.py"],
"description": "Add schema definitions for the data structures",
"source": "git-commit"
},
{
"time": "17:28",
"type": "feature",
"title": "Frontend TypeScript type definitions and components",
"files": ["frontend/src/api/index.ts", "frontend/src/components/", "frontend/src/types/"],
"description": "Add the TypeScript API client and components",
"source": "git-commit"
},
{
"time": "17:29",
"type": "feature",
"title": "Frontend page features and UI improvements",
"files": ["frontend/src/views/ModelSettingsView.vue", "frontend/src/views/HomeView.vue"],
"description": "Add the model configuration page; improve the project list and deletion",
"source": "git-commit"
},
{
"time": "17:29",
"type": "change",
"title": "Project configuration files",
"files": ["backend/pyproject.toml", "frontend/tsconfig.json"],
"description": "Add project configuration files",
"source": "git-commit"
},
{
"time": "17:29",
"type": "change",
"title": "One-command startup script",
"files": ["start.sh"],
"description": "Add a one-command startup script",
"source": "git-commit"
},
{
"time": "17:29",
"type": "feature",
"title": "Backend API endpoint implementations",
"files": ["backend/app/api/v1/projects/__init__.py", "backend/app/api/v1/datasets/__init__.py"],
"description": "Update API endpoint implementations",
"source": "git-commit"
},
{
"time": "17:30",
"type": "feature",
"title": "Backend core modules and file processing",
"files": ["backend/app/core/config.py", "backend/app/main.py", "backend/app/models/models.py"],
"description": "Update core modules and file processors",
"source": "git-commit"
},
{
"time": "17:30",
"type": "change",
"title": "Frontend dependencies and route configuration",
"files": ["frontend/package.json", "frontend/src/router/index.js", "frontend/vite.config.js"],
"description": "Update dependencies and route configuration",
"source": "git-commit"
},
{
"time": "17:30",
"type": "refactor",
"title": "Frontend API client refactor",
"files": ["frontend/src/api/index.js", "frontend/src/api/index.ts"],
"description": "Replace the JavaScript API client with the TypeScript version",
"source": "git-commit"
},
{
"time": "17:35",
"type": "bugfix",
"title": "Fix white background covering the back button",
"files": ["frontend/src/views/ModelSettingsView.vue"],
"description": "Fix the white background that covered the back button on hover on the model configuration page",
"source": "manual"
},
{
"time": "17:40",
"type": "change",
"title": "Git push",
"files": [],
"description": "Push all code changes to the remote repository",
"source": "manual"
},
{
"time": "22:40",
"type": "bugfix",
"title": "Fix database initialization",
"files": ["backend/app/core/database.py", "backend/app/main.py"],
"description": "Fix database tables not being created; import models so that Base.metadata includes every model",
"source": "manual"
},
{
"time": "22:42",
"type": "bugfix",
"title": "Fix API response serialization error",
"files": ["backend/app/api/v1/models/__init__.py", "backend/app/schemas/model.py"],
"description": "Fix SQLAlchemy ORM objects failing to serialize to JSON by converting them with model_validate()",
"source": "manual"
},
{
"time": "22:45",
"type": "feature",
"title": "Add default provider API Base URLs",
"files": ["frontend/src/views/ModelSettingsView.vue"],
"description": "Auto-fill a default API Base URL for the three providers MiniMax, GLM, and OpenAI Compatible",
"source": "manual"
},
{
"time": "22:50",
"type": "feature",
"title": "Implement model connection testing",
"files": ["backend/app/api/v1/models/__init__.py", "frontend/src/views/ModelSettingsView.vue", "frontend/src/api/index.ts"],
"description": "Add a test-connection API on the backend; the frontend calls it and shows the connection status (connected/disconnected/untested)",
"source": "manual"
},
{
"time": "22:55",
"type": "feature",
"title": "Create the git-commit skill",
"files": ["/root/.claude/skills/git-commit/SKILL.md"],
"description": "Create a skill for batched Git commits that analyzes git status, groups files by feature, and generates conventional commit messages",
"source": "manual"
},
{
"time": "23:00",
"type": "change",
"title": "Frontend UI style adjustments",
"files": ["frontend/src/App.vue", "frontend/src/main.js"],
"description": "Add the Ant Design Vue component library and adjust the dark style of the Select component",
"source": "manual"
}
]
}

50
backend/.dockerignore Normal file

@@ -0,0 +1,50 @@
# Git
.git
.gitignore
# Python
__pycache__
*.py[cod]
*$py.class
*.so
.Python
venv/
.venv/
env/
.env
*.egg-info/
dist/
build/
# IDE
.vscode/
.idea/
*.swp
*.swo
# Testing
.pytest_cache/
.coverage
htmlcov/
# Logs
*.log
logs/
# Database
*.db
*.sqlite
*.sqlite3
# Uploads (should be persisted separately)
uploads/*
!uploads/.gitkeep
# Docker
Dockerfile
.dockerignore
# Misc
.DS_Store
*.md
docs/

24
backend/.env.example Normal file

@@ -0,0 +1,24 @@
# Application
APP_NAME=YG-Dataset
DEBUG=true
HOST=0.0.0.0
PORT=8000
ALLOWED_ORIGINS=*
# Database - Use SQLite for development, PostgreSQL for production
DATABASE_URL=sqlite+aiosqlite:///./ygdataset.db
DATABASE_URL_SYNC=sqlite:///./ygdataset.db
# Security
SECRET_KEY=your-secret-key-change-in-production
# File Storage
UPLOAD_DIR=./uploads
MAX_FILE_SIZE=104857600
# LLM Settings
DEFAULT_MODEL_PROVIDER=openai
DEFAULT_MODEL_NAME=gpt-4o-mini
# Logging
LOG_LEVEL=INFO
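
How the backend reads these variables is not shown here (config.py is outside this excerpt), so the following is a hedged, standard-library-only sketch of parsing a few of them. `Settings` and `load_settings` are hypothetical names, and the fallback defaults simply mirror the example values above.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    app_name: str
    debug: bool
    port: int
    allowed_origins: List[str]
    database_url: str
    max_file_size: int  # bytes


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment-like mapping (hypothetical helper)."""
    origins = env.get("ALLOWED_ORIGINS", "*")
    return Settings(
        app_name=env.get("APP_NAME", "YG-Dataset"),
        # Environment values are strings, so booleans need explicit parsing.
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        port=int(env.get("PORT", "8000")),
        allowed_origins=[o.strip() for o in origins.split(",") if o.strip()],
        database_url=env.get("DATABASE_URL", "sqlite+aiosqlite:///./ygdataset.db"),
        # 104857600 bytes in the example file is 100 MiB.
        max_file_size=int(env.get("MAX_FILE_SIZE", str(100 * 1024 * 1024))),
    )
```

Passing a plain dict instead of `os.environ` keeps the parsing easy to test in isolation.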

@@ -1,27 +1,60 @@
FROM python:3.11-slim
# Multi-stage build for Python FastAPI application
# Stage 1: Base image
FROM python:3.11-slim as base
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PYTHONFAULTHANDLER=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
WORKDIR /app
# Stage 2: Dependencies
FROM base as deps
# Install system dependencies
RUN apt-get update && apt-get install -y \
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libpq-dev \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements
COPY requirements.txt .
# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Create uploads directory
RUN mkdir -p uploads
# Stage 3: Production
FROM base
# Install system dependencies for production
RUN apt-get update && apt-get install -y --no-install-recommends \
libpq5 \
&& rm -rf /var/lib/apt/lists/*
# Copy virtual environment from deps stage
COPY --from=deps /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
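# Create the non-root user referenced by --chown and USER below
RUN groupadd --system app && useradd --system --gid app --create-home app
# Copy application
COPY --chown=app:app . /app
RUN mkdir -p /app/uploads /app/logs && chown -R app:app /app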
# Switch to non-root user
USER app
# Expose port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1
# Run application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]

@@ -0,0 +1,20 @@
"""
API Dependencies
"""
from typing import Annotated, Optional

from fastapi import Depends

from app.core.auth import verify_api_key

# Type alias for API key dependency
ApiKey = Annotated[str, Depends(verify_api_key)]


# Optional API key (for endpoints that can work with or without auth)
async def get_optional_api_key(api_key: Optional[str] = None) -> Optional[str]:
    """Get optional API key"""
    return api_key


OptionalApiKey = Annotated[Optional[str], Depends(get_optional_api_key)]

@@ -0,0 +1,75 @@
"""
API Response Wrapper
Unified API response format
"""
from datetime import datetime
from typing import Any, Generic, List, Optional, TypeVar
from uuid import UUID

from pydantic import BaseModel, ConfigDict, Field

T = TypeVar("T")


class ApiResponse(BaseModel, Generic[T]):
    """Unified API response format"""
    model_config = ConfigDict(from_attributes=True)

    success: bool = True
    message: str = "Success"
    data: Optional[T] = None
    error: Optional[dict] = None
    timestamp: datetime = Field(default_factory=datetime.utcnow)

    @classmethod
    def ok(cls, data: Optional[T] = None, message: str = "Success") -> "ApiResponse[T]":
        """Success response"""
        return cls(success=True, message=message, data=data)

    @classmethod
    def fail(cls, message: str, error: Optional[dict] = None) -> "ApiResponse[None]":
        """Failure response"""
        return cls(success=False, message=message, error=error)


class PaginatedResponse(BaseModel, Generic[T]):
    """Paginated response format"""
    model_config = ConfigDict(from_attributes=True)

    success: bool = True
    message: str = "Success"
    data: List[T] = Field(default_factory=list)
    pagination: dict = Field(default_factory=lambda: {
        "page": 1,
        "page_size": 20,
        "total": 0,
        "total_pages": 0
    })

    @classmethod
    def ok(
        cls,
        items: List[T],
        page: int = 1,
        page_size: int = 20,
        total: int = 0
    ) -> "PaginatedResponse[T]":
        """Create a paginated response"""
        # Ceiling division: a partially filled last page still counts as a page
        total_pages = (total + page_size - 1) // page_size if page_size > 0 else 0
        return cls(
            success=True,
            data=items,
            pagination={
                "page": page,
                "page_size": page_size,
                "total": total,
                "total_pages": total_pages
            }
        )


class ErrorDetail(BaseModel):
    """Error details"""
    code: str
    message: str
    details: Optional[dict] = None
    field: Optional[str] = None
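
The pagination arithmetic in `PaginatedResponse.ok`, together with the `skip = (page - 1) * page_size` offset used by the list endpoints later in this comparison, can be checked in isolation. The sketch below uses only the standard library; `page_window` is a hypothetical helper name, not part of the project.

```python
from typing import Dict


def page_window(page: int, page_size: int, total: int) -> Dict[str, int]:
    """Return the offset and page count for one page of results."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    # Ceiling division without floats, the same expression as PaginatedResponse.ok.
    total_pages = (total + page_size - 1) // page_size
    return {
        "skip": (page - 1) * page_size,  # rows to skip before this page
        "limit": page_size,
        "total": total,
        "total_pages": total_pages,
    }
```

For example, 45 rows at 20 per page give three pages, and page 3 starts at offset 40.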

@@ -4,14 +4,17 @@ API v1 Router
 from fastapi import APIRouter
-from app.api.v1 import files, projects, chunks, questions, datasets, eval
+from app.api.v1 import files, projects, chunks, questions, datasets, eval, models
 api_router = APIRouter()
 # Include sub-routers
 api_router.include_router(projects.router, prefix="/projects", tags=["projects"])
-api_router.include_router(files.router, prefix="/files", tags=["files"])
-api_router.include_router(chunks.router, prefix="/chunks", tags=["chunks"])
-api_router.include_router(questions.router, prefix="/questions", tags=["questions"])
-api_router.include_router(datasets.router, prefix="/datasets", tags=["datasets"])
-api_router.include_router(eval.router, prefix="/eval", tags=["eval"])
+# files, chunks, questions, datasets, and eval need to be nested under projects
+# handled via sub-routes under the projects route
+api_router.include_router(files.router, prefix="/projects/{project_id}/files", tags=["files"])
+api_router.include_router(chunks.router, prefix="/projects/{project_id}/chunks", tags=["chunks"])
+api_router.include_router(questions.router, prefix="/projects/{project_id}/questions", tags=["questions"])
+api_router.include_router(datasets.router, prefix="/projects/{project_id}/datasets", tags=["datasets"])
+api_router.include_router(eval.router, prefix="/projects/{project_id}/eval", tags=["eval"])
+api_router.include_router(models.router, prefix="/models", tags=["models"])

@@ -1,93 +1,124 @@
"""
Chunks API Router
"""
import asyncio
from pathlib import Path
from typing import List, Optional
from uuid import UUID
from pydantic import BaseModel
from pydantic import BaseModel, Field
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from app.core.database import get_db
from app.api.response import ApiResponse, PaginatedResponse
from app.core.database import get_db, AsyncSessionLocal
from app.core.exceptions import NotFoundException
from app.core.crud import CRUDBase
from app.core.logging import log_success, log_failure
from app.models.models import Chunk, File
from app.schemas.base import ChunkCreate, ChunkResponse
from app.schemas.chunk import ChunkResponse
from app.schemas.chunk import ChunkCreateSchema, ChunkUpdateSchema
from app.services.text_splitter.splitter import get_splitter
from app.services.file_processor.pdf_processor import process_pdf
from app.services.file_processor.docx_processor import process_docx
from app.services.file_processor.excel_processor import process_csv, process_excel
from markitdown import MarkItDown
router = APIRouter()
# Initialize CRUD
chunk_crud = CRUDBase(Chunk)
# Initialize markitdown
markitdown = MarkItDown()
def get_project_ready_dir(project_id: str) -> Path:
"""Get the project's ready-file directory"""
base_dir = Path("/data/code/YG-Datasets/data") / project_id / "ready"
base_dir.mkdir(parents=True, exist_ok=True)
return base_dir
class SplitRequest(BaseModel):
"""Request model for splitting text"""
file_id: Optional[UUID] = None
file_id: UUID
method: str = "recursive"
chunk_size: int = 500
overlap: int = 50
chunk_size: int = Field(500, ge=50, le=5000)
overlap: int = Field(50, ge=0, le=500)
separator: Optional[str] = None
# Embedding parameters (used by the semantic_embedding method)
embedding_provider: Optional[str] = Field(None, description="embedding provider: openai, minimax")
embedding_api_key: Optional[str] = Field(None, description="API key for embedding")
embedding_base_url: Optional[str] = Field(None, description="API base URL")
embedding_model: Optional[str] = Field(None, description="Embedding model name")
# Semantic splitting parameters
similarity_threshold: float = Field(0.3, ge=0.0, le=1.0, description="Similarity threshold for semantic split")
min_chunk_size: int = Field(100, ge=10, le=1000, description="Minimum chunk size")
class ChunkListResponse(BaseModel):
"""Response for chunk list"""
chunks: List[ChunkResponse]
total: int
def process_file_by_type(file: File) -> str:
"""Process file based on its type"""
async def process_file_by_type(file: File) -> str:
"""Process file based on its type, convert to markdown"""
if not file.file_path:
raise HTTPException(status_code=400, detail="File path not found")
raise NotFoundException("File", file.id)
processors = {
"pdf": process_pdf,
"docx": process_docx,
"xlsx": process_excel,
"csv": process_csv,
}
# Supported types for markitdown
markitdown_types = ["pdf", "docx", "doc", "pptx", "ppt", "xlsx", "xls", "htm", "html"]
if file.file_type in markitdown_types:
# Use markitdown to convert to markdown
loop = asyncio.get_event_loop()
result = await loop.run_in_executor(
None,
lambda: markitdown.convert(file.file_path)
)
return result.text_content
processor = processors.get(file.file_type)
if not processor:
# Return raw text for txt, md files
with open(file.file_path, 'r', encoding='utf-8') as f:
return f.read()
return processor(file.file_path)
loop = asyncio.get_event_loop()
content = await loop.run_in_executor(
None,
lambda: Path(file.file_path).read_text(encoding='utf-8')
)
return content
@router.post("/split", response_model=dict)
async def split_text(
async def process_split_async(
project_id: UUID,
request: SplitRequest,
db: AsyncSession = Depends(get_db)
):
"""Split text into chunks"""
# Get file
if request.file_id:
"""Run chunk splitting in background."""
async with AsyncSessionLocal() as db:
file = None
try:
result = await db.execute(
select(File).where(File.id == request.file_id, File.project_id == project_id)
)
file = result.scalar_one_or_none()
if not file:
raise HTTPException(status_code=404, detail="File not found")
return
# Process file
text = process_file_by_type(file)
text = await process_file_by_type(file)
# Update file status
file.status = "processing"
await db.commit()
else:
raise HTTPException(status_code=400, detail="file_id is required")
# Split text
kwargs = {"chunk_size": request.chunk_size, "overlap": request.overlap}
if request.method == "custom" and request.separator:
kwargs["separator"] = request.separator
if request.method == "semantic_embedding":
kwargs["embedding_provider_type"] = request.embedding_provider or "openai"
kwargs["embedding_api_key"] = request.embedding_api_key
kwargs["embedding_base_url"] = request.embedding_base_url or "https://api.minimax.chat/v1"
kwargs["embedding_model"] = request.embedding_model or "text-embedding-3-small"
kwargs["similarity_threshold"] = request.similarity_threshold
kwargs["min_chunk_size"] = request.min_chunk_size
splitter = get_splitter(request.method, **kwargs)
split_results = splitter.split(text)
# Save chunks
await db.execute(
Chunk.__table__.delete().where(
Chunk.project_id == project_id,
Chunk.file_id == file.id
)
)
chunks = []
for chunk_data in split_results:
db_chunk = Chunk(
@@ -102,81 +133,182 @@ async def split_text(
await db.commit()
# Update file status
ready_dir = get_project_ready_dir(str(project_id))
# Delete old markdown files (two naming formats are possible)
old_md_files = list(ready_dir.glob(f"{file.id}*.md"))
for old_file in old_md_files:
try:
old_file.unlink()
except Exception:
pass
md_filename = f"{file.id}.md"
md_path = ready_dir / md_filename
loop = asyncio.get_event_loop()
await loop.run_in_executor(
None,
lambda: md_path.write_text(text, encoding='utf-8')
)
file.file_path = str(md_path)
file.status = "completed"
await db.commit()
return {"chunks": len(chunks), "message": f"Successfully split into {len(chunks)} chunks"}
log_success(
"文件分割完成",
project_id=str(project_id),
file_id=str(file.id),
filename=file.filename,
method=request.method,
chunk_count=len(chunks),
text_length=len(text),
ready_path=str(md_path)
)
except Exception as e:
if file:
file.status = "failed"
await db.commit()
log_failure(
"文件分割失败",
project_id=str(project_id),
file_id=str(request.file_id),
method=request.method,
error=str(e)
)
@router.get("/", response_model=dict)
@router.post("/split", response_model=ApiResponse)
async def split_text(
project_id: UUID,
request: SplitRequest,
db: AsyncSession = Depends(get_db)
):
"""Split text into chunks"""
try:
result = await db.execute(
select(File).where(File.id == request.file_id, File.project_id == project_id)
)
file = result.scalar_one_or_none()
if not file:
raise NotFoundException("File", request.file_id)
# Log the start of processing
log_success(
"开始处理文件",
project_id=str(project_id),
file_id=str(file.id),
filename=file.filename,
method=request.method,
chunk_size=request.chunk_size,
overlap=request.overlap
)
file.status = "processing"
await db.commit()
asyncio.create_task(
process_split_async(
project_id=project_id,
request=request,
)
)
return ApiResponse.ok(
data={"file_id": str(file.id), "status": file.status},
message="Split task started, processing in background"
)
except Exception as e:
if 'file' in locals() and file:
file.status = "failed"
await db.commit()
log_failure(
"分割任务启动失败",
project_id=str(project_id),
file_id=str(request.file_id),
error=str(e)
)
raise
@router.get("", response_model=PaginatedResponse)
async def list_chunks(
project_id: UUID,
file_id: Optional[UUID] = Query(None),
page: int = Query(1, ge=1),
page_size: int = Query(20, ge=1, le=100),
db: AsyncSession = Depends(get_db)
):
"""List chunks for a project"""
query = select(Chunk).where(Chunk.project_id == project_id)
filters = {"project_id": project_id}
if file_id:
query = query.where(Chunk.file_id == file_id)
filters["file_id"] = file_id
query = query.order_by(Chunk.created_at.desc())
result = await db.execute(query)
chunks = result.scalars().all()
return {
"chunks": [ChunkResponse.model_validate(c) for c in chunks],
"total": len(chunks)
}
@router.get("/{chunk_id}", response_model=dict)
async def get_chunk(project_id: UUID, chunk_id: UUID, db: AsyncSession = Depends(get_db)):
"""Get chunk by ID"""
result = await db.execute(
select(Chunk).where(Chunk.id == chunk_id, Chunk.project_id == project_id)
skip = (page - 1) * page_size
chunks, total = await chunk_crud.get_multi(
db,
skip=skip,
limit=page_size,
filters=filters,
order_by="created_at",
descending=False
)
chunk_responses = [ChunkResponse.model_validate(c) for c in chunks]
return PaginatedResponse.ok(
items=chunk_responses,
page=page,
page_size=page_size,
total=total
)
chunk = result.scalar_one_or_none()
if not chunk:
raise HTTPException(status_code=404, detail="Chunk not found")
return ChunkResponse.model_validate(chunk)
@router.put("/{chunk_id}", response_model=dict)
@router.get("/{chunk_id}", response_model=ApiResponse)
async def get_chunk(
project_id: UUID,
chunk_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get chunk by ID"""
chunk = await chunk_crud.get(db, chunk_id)
if not chunk or chunk.project_id != project_id:
raise NotFoundException("Chunk", chunk_id)
return ApiResponse.ok(data=ChunkResponse.model_validate(chunk))
@router.put("/{chunk_id}", response_model=ApiResponse)
async def update_chunk(
project_id: UUID,
chunk_id: UUID,
chunk: ChunkCreate,
chunk: ChunkUpdateSchema,
db: AsyncSession = Depends(get_db)
):
"""Update chunk"""
result = await db.execute(
select(Chunk).where(Chunk.id == chunk_id, Chunk.project_id == project_id)
db_chunk = await chunk_crud.get(db, chunk_id)
if not db_chunk or db_chunk.project_id != project_id:
raise NotFoundException("Chunk", chunk_id)
updated_chunk = await chunk_crud.update(db, db_chunk, chunk)
return ApiResponse.ok(
data=ChunkResponse.model_validate(updated_chunk),
message="Chunk updated successfully"
)
db_chunk = result.scalar_one_or_none()
if not db_chunk:
raise HTTPException(status_code=404, detail="Chunk not found")
for key, value in chunk.model_dump(exclude_unset=True).items():
setattr(db_chunk, key, value)
await db.commit()
await db.refresh(db_chunk)
return ChunkResponse.model_validate(db_chunk)
@router.delete("/{chunk_id}", response_model=dict)
async def delete_chunk(project_id: UUID, chunk_id: UUID, db: AsyncSession = Depends(get_db)):
@router.delete("/{chunk_id}", response_model=ApiResponse)
async def delete_chunk(
project_id: UUID,
chunk_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Delete chunk"""
result = await db.execute(
select(Chunk).where(Chunk.id == chunk_id, Chunk.project_id == project_id)
)
chunk = result.scalar_one_or_none()
if not chunk:
raise HTTPException(status_code=404, detail="Chunk not found")
chunk = await chunk_crud.get(db, chunk_id)
if not chunk or chunk.project_id != project_id:
raise NotFoundException("Chunk", chunk_id)
await db.delete(chunk)
await db.commit()
return {"message": "Chunk deleted successfully"}
await chunk_crud.delete(db, chunk_id)
return ApiResponse.ok(message="Chunk deleted successfully")
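
The new `/split` endpoint schedules `process_split_async` with `asyncio.create_task` and returns immediately. The asyncio documentation notes that the event loop keeps only weak references to tasks, so a fire-and-forget task is usually kept alive by holding a strong reference until it finishes. The sketch below illustrates that pattern with the standard library only; `spawn`, `_background_tasks`, `fake_split`, and `start_split` are hypothetical names, and the in-memory `status` dict stands in for the `File.status` column.

```python
import asyncio
from typing import Any, Coroutine, Dict, Set

# Strong references to in-flight tasks so they are not garbage-collected mid-run.
_background_tasks: Set["asyncio.Task[Any]"] = set()


def spawn(coro: Coroutine[Any, Any, Any]) -> "asyncio.Task[Any]":
    """Schedule a coroutine and keep a reference until it completes."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def fake_split(file_id: str, status: Dict[str, str]) -> None:
    """Stand-in for the real splitting work; records the final status."""
    try:
        await asyncio.sleep(0.01)  # placeholder for conversion and splitting
        status[file_id] = "completed"
    except Exception:
        status[file_id] = "failed"
        raise


async def start_split(file_id: str, status: Dict[str, str]) -> Dict[str, str]:
    """Mark the file as processing, start the work, and return at once."""
    status[file_id] = "processing"
    spawn(fake_split(file_id, status))
    return {"file_id": file_id, "status": status[file_id]}


async def demo() -> Dict[str, str]:
    status: Dict[str, str] = {}
    response = await start_split("file-1", status)
    assert response["status"] == "processing"
    # Wait for the background work only so the demo can show the final state.
    await asyncio.gather(*_background_tasks)
    return status
```

Running `asyncio.run(demo())` shows the caller receiving "processing" first and the stored status changing once the background task finishes.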

@@ -3,94 +3,107 @@ Datasets API Router
"""
from typing import List, Optional
from uuid import UUID
from pydantic import BaseModel
from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel, Field
from fastapi import APIRouter, Depends, Query
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func
from app.api.response import ApiResponse, PaginatedResponse
from app.core.database import get_db
from app.models.models import Dataset, Question
from app.schemas.base import DatasetCreate, DatasetResponse
from app.core.exceptions import NotFoundException
from app.core.crud import CRUDBase
from app.models.models import Dataset
from app.schemas.dataset import DatasetResponse
from app.schemas.dataset import DatasetCreateSchema
router = APIRouter()
# Initialize CRUD
dataset_crud = CRUDBase(Dataset)
class ExportRequest(BaseModel):
"""Export request schema"""
format: str = "alpaca" # alpaca, sharegpt, llama_factory, json
format: str = Field("alpaca", pattern="^(alpaca|sharegpt|llama_factory|json)$")
@router.get("/", response_model=dict)
async def list_datasets(project_id: UUID, db: AsyncSession = Depends(get_db)):
@router.get("", response_model=PaginatedResponse)
async def list_datasets(
project_id: UUID,
page: int = Query(1, ge=1),
page_size: int = Query(20, ge=1, le=100),
db: AsyncSession = Depends(get_db)
):
"""List datasets for a project"""
result = await db.execute(
select(Dataset).where(Dataset.project_id == project_id).order_by(Dataset.created_at.desc())
skip = (page - 1) * page_size
datasets, total = await dataset_crud.get_multi(
db,
skip=skip,
limit=page_size,
filters={"project_id": project_id},
order_by="created_at",
descending=True
)
datasets = result.scalars().all()
# Get question count for each dataset
dataset_list = []
for dataset in datasets:
dataset_data = DatasetResponse.model_validate(dataset)
# TODO: Count questions in dataset
dataset_data.question_count = 0
dataset_list.append(dataset_data)
return {"datasets": dataset_list}
dataset_responses = [DatasetResponse.model_validate(d) for d in datasets]
return PaginatedResponse.ok(
items=dataset_responses,
page=page,
page_size=page_size,
total=total
)
@router.post("/", response_model=dict)
@router.post("", response_model=ApiResponse)
async def create_dataset(
project_id: UUID,
dataset: DatasetCreate,
dataset: DatasetCreateSchema,
db: AsyncSession = Depends(get_db)
):
"""Create a new dataset"""
db_dataset = Dataset(project_id=project_id, **dataset.model_dump())
# Add project_id to the dataset
dataset_dict = dataset.model_dump()
dataset_dict["project_id"] = project_id
db_dataset = Dataset(**dataset_dict)
db.add(db_dataset)
await db.commit()
await db.refresh(db_dataset)
return {"id": str(db_dataset.id)}
return ApiResponse.ok(
data={"id": str(db_dataset.id)},
message="Dataset created successfully"
)
@router.get("/{dataset_id}", response_model=dict)
@router.get("/{dataset_id}", response_model=ApiResponse)
async def get_dataset(
project_id: UUID,
dataset_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get dataset by ID"""
result = await db.execute(
select(Dataset).where(Dataset.id == dataset_id, Dataset.project_id == project_id)
)
dataset = result.scalar_one_or_none()
if not dataset:
raise HTTPException(status_code=404, detail="Dataset not found")
dataset = await dataset_crud.get(db, dataset_id)
if not dataset or dataset.project_id != project_id:
raise NotFoundException("Dataset", dataset_id)
return DatasetResponse.model_validate(dataset)
return ApiResponse.ok(data=DatasetResponse.model_validate(dataset))
@router.delete("/{dataset_id}", response_model=dict)
@router.delete("/{dataset_id}", response_model=ApiResponse)
async def delete_dataset(
project_id: UUID,
dataset_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Delete dataset"""
result = await db.execute(
select(Dataset).where(Dataset.id == dataset_id, Dataset.project_id == project_id)
)
dataset = result.scalar_one_or_none()
if not dataset:
raise HTTPException(status_code=404, detail="Dataset not found")
dataset = await dataset_crud.get(db, dataset_id)
if not dataset or dataset.project_id != project_id:
raise NotFoundException("Dataset", dataset_id)
await db.delete(dataset)
await db.commit()
return {"message": "Dataset deleted successfully"}
await dataset_crud.delete(db, dataset_id)
return ApiResponse.ok(message="Dataset deleted successfully")
@router.post("/{dataset_id}/export")
@router.post("/{dataset_id}/export", response_model=ApiResponse)
async def export_dataset(
project_id: UUID,
dataset_id: UUID,
@@ -98,18 +111,9 @@ async def export_dataset(
db: AsyncSession = Depends(get_db)
):
"""Export dataset in specified format"""
# TODO: Implement actual export logic
# Get dataset
result = await db.execute(
select(Dataset).where(Dataset.id == dataset_id, Dataset.project_id == project_id)
)
dataset = result.scalar_one_or_none()
if not dataset:
raise HTTPException(status_code=404, detail="Dataset not found")
# Get questions for this dataset (placeholder)
# In real implementation, would link questions to datasets
dataset = await dataset_crud.get(db, dataset_id)
if not dataset or dataset.project_id != project_id:
raise NotFoundException("Dataset", dataset_id)
# Return sample data based on format
sample_data = [
@@ -121,6 +125,9 @@ async def export_dataset(
]
if request.format == "json":
return sample_data
return ApiResponse.ok(data=sample_data)
return {"data": sample_data, "format": request.format}
return ApiResponse.ok(
data={"data": sample_data, "format": request.format},
message="Dataset exported successfully"
)
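
Note on the response envelope: the dataset endpoints above now return `ApiResponse.ok(...)` instead of bare dicts. The module `app.api.response` is not part of this diff, so the following is only a minimal sketch of what such an envelope might look like. The class name `ApiResponseSketch` and the `success` field are assumptions for illustration; only `data`, `message`, and the `ok` constructor are visible in the diff.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ApiResponseSketch:
    """Hypothetical stand-in for app.api.response.ApiResponse (not the real class)."""

    success: bool = True          # assumed field, not shown in the diff
    message: str = "OK"
    data: Optional[Any] = None

    @classmethod
    def ok(cls, data: Any = None, message: str = "OK") -> "ApiResponseSketch":
        # Build a successful envelope; mirrors the ApiResponse.ok(...) calls in the diff.
        return cls(success=True, message=message, data=data)


def create_dataset_response(dataset_id: str) -> dict:
    # Same shape as the new return value of the dataset "create" endpoint.
    envelope = ApiResponseSketch.ok(
        data={"id": dataset_id},
        message="Dataset created successfully",
    )
    return asdict(envelope)
```

With an envelope like this, clients can read `data` and `message` from one consistent shape on every endpoint rather than handling a different dict per route.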


@@ -1,24 +1,32 @@
"""
Evaluation API Router
"""
from typing import List, Optional
from typing import Optional
from uuid import UUID
from pydantic import BaseModel
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel, Field
from fastapi import APIRouter, Depends, Query
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from app.api.response import ApiResponse, PaginatedResponse
from app.core.database import get_db
from app.core.exceptions import NotFoundException
from app.core.crud import CRUDBase
from app.models.models import EvalDataset, Task
from app.schemas.base import EvalDatasetCreate, EvalDatasetResponse, TaskResponse
from app.schemas.eval import EvalDatasetResponse, TaskResponse
from app.schemas.eval import EvalDatasetCreateSchema
router = APIRouter()
# Initialize CRUD
eval_crud = CRUDBase(EvalDataset)
task_crud = CRUDBase(Task)
class GenerateEvalRequest(BaseModel):
"""Request for generating evaluation dataset"""
name: str
question_type: str = "mixed"
count: int = 50
name: str = Field(..., min_length=1, max_length=255)
question_type: str = Field("mixed", pattern="^(mixed|fact|reasoning|summary)$")
count: int = Field(50, ge=1, le=500)
class RunEvalRequest(BaseModel):
@@ -26,18 +34,34 @@ class RunEvalRequest(BaseModel):
model_config_id: Optional[UUID] = None
@router.get("/", response_model=dict)
async def list_eval_datasets(project_id: UUID, db: AsyncSession = Depends(get_db)):
@router.get("", response_model=ApiResponse)
async def list_eval_datasets(
project_id: UUID,
page: int = Query(1, ge=1),
page_size: int = Query(20, ge=1, le=100),
db: AsyncSession = Depends(get_db)
):
"""List evaluation datasets"""
result = await db.execute(
select(EvalDataset).where(EvalDataset.project_id == project_id).order_by(EvalDataset.created_at.desc())
skip = (page - 1) * page_size
datasets, total = await eval_crud.get_multi(
db,
skip=skip,
limit=page_size,
filters={"project_id": project_id},
order_by="created_at",
descending=True
)
datasets = result.scalars().all()
return {"datasets": [EvalDatasetResponse.model_validate(d) for d in datasets]}
dataset_responses = [EvalDatasetResponse.model_validate(d) for d in datasets]
return PaginatedResponse.ok(
items=dataset_responses,
page=page,
page_size=page_size,
total=total
)
@router.post("/", response_model=dict)
@router.post("", response_model=ApiResponse)
async def create_eval_dataset(
project_id: UUID,
request: GenerateEvalRequest,
@@ -53,10 +77,27 @@ async def create_eval_dataset(
await db.commit()
await db.refresh(db_dataset)
return {"id": str(db_dataset.id)}
return ApiResponse.ok(
data={"id": str(db_dataset.id)},
message="Evaluation dataset created successfully"
)
@router.post("/{eval_id}/evaluate", response_model=dict)
@router.get("/{eval_id}", response_model=ApiResponse)
async def get_eval_dataset(
project_id: UUID,
eval_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get evaluation dataset by ID"""
dataset = await eval_crud.get(db, eval_id)
if not dataset or dataset.project_id != project_id:
raise NotFoundException("Evaluation Dataset", eval_id)
return ApiResponse.ok(data=EvalDatasetResponse.model_validate(dataset))
@router.post("/{eval_id}/evaluate", response_model=ApiResponse)
async def run_evaluation(
project_id: UUID,
eval_id: UUID,
@@ -65,12 +106,9 @@ async def run_evaluation(
):
"""Run evaluation on dataset"""
# Check dataset exists
result = await db.execute(
select(EvalDataset).where(EvalDataset.id == eval_id, EvalDataset.project_id == project_id)
)
dataset = result.scalar_one_or_none()
if not dataset:
raise HTTPException(status_code=404, detail="Evaluation dataset not found")
dataset = await eval_crud.get(db, eval_id)
if not dataset or dataset.project_id != project_id:
raise NotFoundException("Evaluation Dataset", eval_id)
# Create evaluation task
task = Task(
@@ -82,19 +120,21 @@ async def run_evaluation(
await db.commit()
await db.refresh(task)
# TODO: Start evaluation in background
return {"task_id": str(task.id), "message": "Evaluation task started"}
@router.get("/results", response_model=dict)
async def get_eval_results(project_id: UUID, task_id: UUID, db: AsyncSession = Depends(get_db)):
"""Get evaluation results"""
result = await db.execute(
select(Task).where(Task.id == task_id, Task.project_id == project_id)
return ApiResponse.ok(
data={"task_id": str(task.id)},
message="Evaluation task started"
)
task = result.scalar_one_or_none()
if not task:
raise HTTPException(status_code=404, detail="Task not found")
return TaskResponse.model_validate(task)
@router.get("/results", response_model=ApiResponse)
async def get_eval_results(
project_id: UUID,
task_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get evaluation results"""
task = await task_crud.get(db, task_id)
if not task or task.project_id != project_id:
raise NotFoundException("Task", task_id)
return ApiResponse.ok(data=TaskResponse.model_validate(task))
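
Note on pagination: the list endpoints above compute `skip = (page - 1) * page_size` and pass it to `CRUDBase.get_multi`, whose implementation is not shown in this diff. The sketch below illustrates the same arithmetic on an in-memory sequence. The function name `paginate` and the `total_pages` field are assumptions; `items`, `page`, `page_size`, and `total` are the keyword arguments passed to `PaginatedResponse.ok` in the diff.

```python
from math import ceil
from typing import Sequence, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], page: int = 1, page_size: int = 20) -> dict:
    """Slice a sequence the way the list endpoints slice a query (illustrative only)."""
    # Same bounds as Query(1, ge=1) and Query(20, ge=1, le=100) in the diff.
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")

    skip = (page - 1) * page_size          # number of rows to skip
    window = list(items[skip:skip + page_size])
    total = len(items)
    return {
        "items": window,
        "page": page,
        "page_size": page_size,
        "total": total,
        "total_pages": ceil(total / page_size),  # assumed convenience field
    }
```

For example, `paginate(list(range(45)), page=3, page_size=20)` yields the last five items, since the first two pages consume 40 rows.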


@@ -2,24 +2,48 @@
Files API Router
"""
import os
import aiofiles
import asyncio
from pathlib import Path
from typing import List
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Form
from typing import Optional
from uuid import UUID, uuid4
from fastapi import APIRouter, Depends, UploadFile, File, Query
from fastapi.responses import FileResponse, PlainTextResponse
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from app.core.database import get_db
from app.api.response import ApiResponse, PaginatedResponse
from app.core.config import get_settings
from app.models.models import File
from app.schemas.base import FileResponse
from app.core.database import get_db
from app.core.exceptions import ValidationException, NotFoundException
from app.core.crud import CRUDBase
from app.core.logging import log_success, log_failure
from app.models.models import File as FileModel
from app.models.models import Chunk, Question
from app.schemas.file import FileResponse, FileCreateSchema
from markitdown import MarkItDown
settings = get_settings()
router = APIRouter()
# Ensure upload directory exists
UPLOAD_DIR = Path(settings.UPLOAD_DIR)
UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
# Initialize CRUD
file_crud = CRUDBase(FileModel)
# Initialize markitdown
markitdown = MarkItDown()
def get_project_raw_dir(project_id: str) -> Path:
"""Get the project's raw file directory"""
base_dir = Path("/data/code/YG-Datasets/data") / project_id / "raw"
base_dir.mkdir(parents=True, exist_ok=True)
return base_dir
def get_project_ready_dir(project_id: str) -> Path:
"""Get the project's ready file directory (processed files)"""
base_dir = Path("/data/code/YG-Datasets/data") / project_id / "ready"
base_dir.mkdir(parents=True, exist_ok=True)
return base_dir
def get_file_type(filename: str) -> str:
@@ -40,71 +64,342 @@ def get_file_type(filename: str) -> str:
return type_map.get(ext, 'txt')
@router.post("/upload", response_model=dict)
# Allowed file extensions
ALLOWED_EXTENSIONS = {'pdf', 'docx', 'doc', 'xlsx', 'xls', 'csv', 'epub', 'md', 'txt'}
def validate_file(filename: str, file_size: int) -> None:
"""Validate file extension and size"""
ext = filename.rsplit('.', 1)[-1].lower() if '.' in filename else ''
if ext not in ALLOWED_EXTENSIONS:
raise ValidationException(
f"File type '{ext}' not allowed",
field="file"
)
if file_size > settings.MAX_FILE_SIZE:
raise ValidationException(
f"File size exceeds maximum allowed size of {settings.MAX_FILE_SIZE // (1024*1024)}MB",
field="file"
)
async def save_file_async(file: UploadFile, destination: Path) -> None:
"""Save uploaded file asynchronously"""
content = await file.read()
loop = asyncio.get_event_loop()
await loop.run_in_executor(None, lambda: destination.write_bytes(content))
@router.post("/upload", response_model=ApiResponse)
async def upload_file(
project_id: UUID,
file: UploadFile = File(...),
db: AsyncSession = Depends(get_db)
):
"""Upload a file"""
# Save file to disk
file_path = UPLOAD_DIR / f"{project_id}_{file.filename}"
async with aiofiles.open(file_path, 'wb') as f:
try:
# Read file content for validation
content = await file.read()
await f.write(content)
file_size = len(content)
# Validate file
validate_file(file.filename, file_size)
# Save file to disk - use the project's raw directory
safe_filename = f"{uuid4().hex[:8]}_{file.filename}"
project_dir = get_project_raw_dir(str(project_id))
file_path = project_dir / safe_filename
# Write file asynchronously
await asyncio.get_event_loop().run_in_executor(
None,
lambda: file_path.write_bytes(content)
)
# Create file record
db_file = File(
db_file = FileModel(
project_id=project_id,
filename=file.filename,
file_type=get_file_type(file.filename),
file_path=str(file_path),
size=len(content),
status="pending"
size=file_size,
status="processing"
)
db.add(db_file)
await db.commit()
await db.refresh(db_file)
return {"id": str(db_file.id), "filename": db_file.filename, "status": db_file.status}
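# Process the file asynchronously: return immediately without waiting for processing to finish
async def process_file_async(file_id: UUID, file_path_obj: Path, file_type: str, filename: str, project_id_val: UUID):
"""Process the file asynchronously in the background"""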
from sqlalchemy.ext.asyncio import AsyncSession
from app.core.database import AsyncSessionLocal
async with AsyncSessionLocal() as processing_db:
try:
# Re-fetch the file record
file_record = await file_crud.get(processing_db, file_id)
if not file_record:
return
# File types that markitdown can convert
markitdown_types = ["pdf", "docx", "doc", "pptx", "ppt", "xlsx", "xls", "htm", "html"]
text_content = ""
if file_type in markitdown_types:
# Convert to markdown with markitdown
loop = asyncio.get_event_loop()
result = await loop.run_in_executor(
None,
lambda: markitdown.convert(str(file_path_obj))
)
text_content = result.text_content
else:
# Read txt, md, and similar files directly
text_content = file_path_obj.read_text(encoding='utf-8')
# Save to the ready directory using the {uuid}.md naming format
ready_dir = get_project_ready_dir(str(project_id_val))
ready_filename = f"{file_id}.md"
ready_path = ready_dir / ready_filename
ready_path.write_text(text_content, encoding='utf-8')
# Update the file status to completed
file_record.status = "completed"
await processing_db.commit()
log_success(
"文件处理完成",
project_id=str(project_id_val),
file_id=str(file_id),
filename=filename,
ready_path=str(ready_path)
)
except Exception as e:
# Update the file status to failed
file_record = await file_crud.get(processing_db, file_id)
if file_record:
file_record.status = "failed"
await processing_db.commit()
log_failure(
"文件处理失败",
project_id=str(project_id_val),
file_id=str(file_id),
filename=filename,
error=str(e)
)
# Start the async task that processes the file
asyncio.create_task(
process_file_async(
db_file.id,
file_path,
db_file.file_type,
file.filename,
project_id
)
)
return ApiResponse.ok(
data={"id": str(db_file.id), "filename": db_file.filename, "status": db_file.status},
message="File uploaded successfully, processing in background"
)
except Exception as e:
# Log the failure
log_failure(
"文件上传失败",
project_id=str(project_id),
filename=file.filename if 'file' in locals() else "unknown",
error=str(e)
)
raise
@router.get("/", response_model=dict)
async def list_files(project_id: UUID, db: AsyncSession = Depends(get_db)):
@router.get("", response_model=ApiResponse)
async def list_files(
project_id: UUID,
page: int = Query(1, ge=1),
page_size: int = Query(20, ge=1, le=100),
db: AsyncSession = Depends(get_db)
):
"""List files for a project"""
result = await db.execute(
select(File).where(File.project_id == project_id).order_by(File.created_at.desc())
skip = (page - 1) * page_size
files, total = await file_crud.get_multi(
db,
skip=skip,
limit=page_size,
filters={"project_id": project_id},
order_by="created_at",
descending=True
)
file_responses = [FileResponse.model_validate(f) for f in files]
return PaginatedResponse.ok(
items=file_responses,
page=page,
page_size=page_size,
total=total
)
files = result.scalars().all()
return {"files": [FileResponse.model_validate(f) for f in files]}
@router.get("/{file_id}", response_model=dict)
async def get_file(project_id: UUID, file_id: UUID, db: AsyncSession = Depends(get_db)):
@router.get("/{file_id}", response_model=ApiResponse)
async def get_file(
project_id: UUID,
file_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get file by ID"""
result = await db.execute(
select(File).where(File.id == file_id, File.project_id == project_id)
file = await file_crud.get(db, file_id)
if not file or file.project_id != project_id:
raise NotFoundException("File", file_id)
return ApiResponse.ok(data=FileResponse.model_validate(file))
@router.get("/{file_id}/raw")
async def get_file_raw(
project_id: UUID,
file_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get raw file content for preview"""
file = await file_crud.get(db, file_id)
if not file or file.project_id != project_id:
raise NotFoundException("File", file_id)
# Read the original file from the raw directory
raw_path = Path(file.file_path)
if not raw_path.exists():
raise NotFoundException("File not found on disk", file_id)
# Return different content depending on the file type
if file.file_type in ['txt', 'md', 'markdown', 'csv']:
content = raw_path.read_text(encoding='utf-8')
return PlainTextResponse(content=content, media_type="text/plain; charset=utf-8")
elif file.file_type == 'pdf':
# Return the PDF so the browser can display it inline
import base64
content = raw_path.read_bytes()
b64 = base64.b64encode(content).decode('utf-8')
return PlainTextResponse(
content=f"data:application/pdf;base64,{b64}",
media_type="text/plain"
)
file = result.scalar_one_or_none()
if not file:
raise HTTPException(status_code=404, detail="File not found")
return FileResponse.model_validate(file)
else:
# Other binary files: return file information
size_mb = file.size / (1024 * 1024)
content = f"""[二进制文件]
文件名: {file.filename}
文件类型: {file.file_type.upper()}
文件大小: {size_mb:.2f} MB
此文件为二进制格式,请下载后查看。
"""
return PlainTextResponse(content=content, media_type="text/plain; charset=utf-8")
@router.delete("/{file_id}", response_model=dict)
async def delete_file(project_id: UUID, file_id: UUID, db: AsyncSession = Depends(get_db)):
"""Delete file"""
result = await db.execute(
select(File).where(File.id == file_id, File.project_id == project_id)
@router.get("/{file_id}/content")
async def get_file_content(
project_id: UUID,
file_id: UUID,
db: AsyncSession = Depends(get_db)
) -> PlainTextResponse:
"""Get file content (markdown)"""
file = await file_crud.get(db, file_id)
if not file or file.project_id != project_id:
raise NotFoundException("File", file_id)
# Read the markdown file from the ready directory
ready_path = Path("/data/code/YG-Datasets/data") / str(project_id) / "ready" / f"{file_id}.md"
if ready_path.exists():
content = ready_path.read_text(encoding='utf-8')
return PlainTextResponse(content=content, media_type="text/plain; charset=utf-8")
else:
raise NotFoundException("File content", file_id)
@router.delete("/{file_id}", response_model=ApiResponse)
async def delete_file(
project_id: UUID,
file_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Delete file and all related data (markdown, chunks, questions)"""
file = await file_crud.get(db, file_id)
if not file or file.project_id != project_id:
raise NotFoundException("File", file_id)
# Delete related chunks and their questions (explicit deletion for safety)
chunks_result = await db.execute(
select(Chunk).where(Chunk.file_id == file_id)
)
file = result.scalar_one_or_none()
if not file:
raise HTTPException(status_code=404, detail="File not found")
chunks = chunks_result.scalars().all()
for chunk in chunks:
# Delete questions related to this chunk
questions_result = await db.execute(
select(Question).where(Question.chunk_id == chunk.id)
)
questions = questions_result.scalars().all()
for question in questions:
await db.delete(question)
# Delete chunk
await db.delete(chunk)
# Delete file from disk
# Delete file from raw directory
if file.file_path and os.path.exists(file.file_path):
os.remove(file.file_path)
await asyncio.get_event_loop().run_in_executor(
None,
os.remove,
file.file_path
)
await db.delete(file)
# Delete file from ready directory (processed markdown) - try both naming conventions
ready_dir = Path("/data/code/YG-Datasets/data") / str(project_id) / "ready"
if ready_dir.exists():
# Try file_id.md (from upload process)
ready_path = ready_dir / f"{file_id}.md"
if ready_path.exists():
await asyncio.get_event_loop().run_in_executor(
None,
os.remove,
str(ready_path)
)
# Try file_id_filename.md (from split process)
for md_file in ready_dir.glob(f"{file_id}_*.md"):
await asyncio.get_event_loop().run_in_executor(
None,
os.remove,
str(md_file)
)
await file_crud.delete(db, file_id)
await db.commit()
return {"message": "File deleted successfully"}
return ApiResponse.ok(message="File deleted successfully")
@router.get("/{file_id}/download")
async def download_file(
project_id: UUID,
file_id: UUID,
db: AsyncSession = Depends(get_db)
) -> FileResponse:
"""Download file"""
file = await file_crud.get(db, file_id)
if not file or file.project_id != project_id:
raise NotFoundException("File", file_id)
if not file.file_path or not os.path.exists(file.file_path):
raise ValidationException("File not found on disk", field="file")
return FileResponse(
path=file.file_path,
filename=file.filename,
media_type=f"application/{file.file_type}"
)
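
Note on upload file names: the upload handler builds `safe_filename` as `f"{uuid4().hex[:8]}_{file.filename}"`, which keeps whatever the client sent, including any directory components. The sketch below combines the extension and size check from `validate_file` with a step that strips directory parts before the prefix is added. It is only an illustration under stated assumptions: `MAX_FILE_SIZE` is a made-up 50 MB value (the real limit lives in `settings.MAX_FILE_SIZE`), `ValueError` stands in for the project's `ValidationException`, and `validate_upload` and `build_stored_name` are hypothetical names.

```python
from pathlib import PureWindowsPath
from uuid import uuid4

ALLOWED_EXTENSIONS = {"pdf", "docx", "doc", "xlsx", "xls", "csv", "epub", "md", "txt"}
MAX_FILE_SIZE = 50 * 1024 * 1024  # assumed limit; the real value comes from settings


def validate_upload(filename: str, file_size: int) -> str:
    """Return the lowercase extension, or raise ValueError if the file is rejected."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"File type '{ext}' not allowed")
    if file_size > MAX_FILE_SIZE:
        raise ValueError("File size exceeds the maximum allowed size")
    return ext


def build_stored_name(filename: str) -> str:
    """Strip directory components, then add a short random prefix."""
    # PureWindowsPath treats both "/" and "\\" as separators, so this drops
    # POSIX-style and Windows-style directory parts alike.
    base = PureWindowsPath(filename).name
    if base in {"", ".", ".."}:
        raise ValueError("File name is empty after removing directory parts")
    return f"{uuid4().hex[:8]}_{base}"
```

A client-supplied name such as `../../etc/report.pdf` would then be stored as `<8 hex chars>_report.pdf` inside the project's raw directory.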


@@ -0,0 +1,305 @@
"""
Model API Router
"""
import uuid
import httpx
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, update
from app.core.database import get_db
from app.api.response import ApiResponse
from app.models.models import ModelConfig
from app.schemas.model import ModelCreate, ModelUpdate, ModelResponse
router = APIRouter()
VALID_MODEL_TYPES = {"chat", "vlm", "embedding", "rerank"}
def normalize_model_type(model_type: str | None, model_name: str | None) -> str:
"""Normalize model type, with keyword fallback for legacy records."""
if model_type in VALID_MODEL_TYPES and model_type != "chat":
return model_type
normalized_name = (model_name or "").strip().lower()
rerank_keywords = ("rerank", "bce-reranker", "gte-rerank")
embedding_keywords = (
"embedding",
"embed",
"text-embedding",
"bge-",
"bge_m3",
"gte-",
"m3e",
"e5-",
"jina-embeddings",
)
vlm_keywords = ("vl", "vision", "visual", "multimodal", "qwen-vl", "gpt-4o")
if any(keyword in normalized_name for keyword in rerank_keywords):
return "rerank"
if any(keyword in normalized_name for keyword in embedding_keywords):
return "embedding"
if any(keyword in normalized_name for keyword in vlm_keywords):
return "vlm"
return model_type if model_type in VALID_MODEL_TYPES else "chat"
async def test_model_connection(model: ModelConfig) -> dict:
"""Test model connection by calling the API"""
if not model.api_key:
return {"success": False, "message": "API Key is missing"}
api_base = model.api_base or ""
provider = model.provider
model_name = model.model_name
model_type = normalize_model_type(model.model_type, model_name)
api_key = model.api_key
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
}
try:
async with httpx.AsyncClient(timeout=10.0) as client:
if model_type in {"chat", "vlm"} and provider in {"openai", "ali"}:
# OpenAI compatible API test
response = await client.post(
f"{api_base.rstrip('/')}/chat/completions",
headers=headers,
json={
"model": model_name,
"messages": [{"role": "user", "content": "Hi"}],
"max_tokens": 5
}
)
elif model_type in {"chat", "vlm"} and provider == "minimax":
# MiniMax API test
response = await client.post(
f"{api_base.rstrip('/')}/chat/completions_v2",
headers={
**headers,
"Authorization": f"Bearer {api_key}"
},
json={
"model": model_name,
"messages": [{"role": "user", "content": "Hi"}]
}
)
elif model_type in {"chat", "vlm"} and provider == "glm":
# GLM API test
response = await client.post(
f"{api_base.rstrip('/')}/chat/completions",
headers=headers,
json={
"model": model_name,
"messages": [{"role": "user", "content": "Hi"}]
}
)
elif model_type == "embedding" and provider in {"openai", "ali", "glm"}:
response = await client.post(
f"{api_base.rstrip('/')}/embeddings",
headers=headers,
json={
"model": model_name,
"input": "test"
}
)
elif model_type == "embedding" and provider == "minimax":
return {"success": False, "message": "MiniMax embedding 自动测试暂未接入,请手动确认端点与模型"}
elif model_type == "rerank":
return {"success": False, "message": "Rerank 自动测试暂未接入,请先保存配置并在实际流程中验证"}
else:
return {"success": False, "message": f"Unsupported provider/type: {provider}/{model_type}"}
if response.status_code == 200:
return {"success": True, "message": "Connection successful"}
else:
return {"success": False, "message": f"API error: {response.status_code} - {response.text[:100]}"}
except httpx.TimeoutException:
return {"success": False, "message": "Connection timeout"}
except Exception as e:
return {"success": False, "message": f"Connection failed: {str(e)}"}
# Helper to convert string to UUID
def parse_uuid(id_str: str) -> uuid.UUID:
"""Parse string to UUID"""
try:
return uuid.UUID(id_str)
except ValueError:
raise HTTPException(status_code=400, detail="Invalid UUID format")
@router.get("", response_model=ApiResponse)
async def list_models(db: AsyncSession = Depends(get_db)):
"""Get all models"""
result = await db.execute(
select(ModelConfig).where(ModelConfig.project_id == None) # noqa: E711
)
models = result.scalars().all()
# Convert to Pydantic schema
model_responses = [ModelResponse.model_validate(m) for m in models]
return ApiResponse(data=model_responses)
@router.post("", response_model=ApiResponse)
async def create_model(model: ModelCreate, db: AsyncSession = Depends(get_db)):
"""Create a new model"""
# If setting as default, unset other defaults first
if model.is_default == "true":
await db.execute(
update(ModelConfig)
.where(ModelConfig.project_id == None) # noqa: E711
.values(is_default="false")
)
db_model = ModelConfig(
provider=model.provider,
model_type=model.model_type,
model_name=model.model_name,
api_key=model.api_key,
api_base=model.api_base,
is_default=model.is_default,
project_id=None # Global model config
)
db.add(db_model)
await db.commit()
await db.refresh(db_model)
# Convert to Pydantic schema
response = ModelResponse.model_validate(db_model)
return ApiResponse(data=response)
@router.get("/{model_id}", response_model=ApiResponse)
async def get_model(model_id: str, db: AsyncSession = Depends(get_db)):
"""Get a model by ID"""
model_uuid = parse_uuid(model_id)
result = await db.execute(
select(ModelConfig).where(
ModelConfig.id == model_uuid,
ModelConfig.project_id == None # noqa: E711
)
)
model = result.scalar_one_or_none()
if not model:
raise HTTPException(status_code=404, detail="Model not found")
response = ModelResponse.model_validate(model)
return ApiResponse(data=response)
@router.put("/{model_id}", response_model=ApiResponse)
async def update_model(model_id: str, model_update: ModelUpdate, db: AsyncSession = Depends(get_db)):
"""Update a model"""
model_uuid = parse_uuid(model_id)
result = await db.execute(
select(ModelConfig).where(
ModelConfig.id == model_uuid,
ModelConfig.project_id == None # noqa: E711
)
)
model = result.scalar_one_or_none()
if not model:
raise HTTPException(status_code=404, detail="Model not found")
# If setting as default, unset other defaults first
if model_update.is_default == "true":
await db.execute(
update(ModelConfig)
.where(
ModelConfig.project_id == None, # noqa: E711
ModelConfig.id != model_uuid
)
.values(is_default="false")
)
update_data = model_update.model_dump(exclude_unset=True)
for key, value in update_data.items():
setattr(model, key, value)
await db.commit()
await db.refresh(model)
response = ModelResponse.model_validate(model)
return ApiResponse(data=response)
@router.delete("/{model_id}", response_model=ApiResponse)
async def delete_model(model_id: str, db: AsyncSession = Depends(get_db)):
"""Delete a model"""
model_uuid = parse_uuid(model_id)
result = await db.execute(
select(ModelConfig).where(
ModelConfig.id == model_uuid,
ModelConfig.project_id == None # noqa: E711
)
)
model = result.scalar_one_or_none()
if not model:
raise HTTPException(status_code=404, detail="Model not found")
await db.delete(model)
await db.commit()
return ApiResponse(message="Model deleted successfully")
@router.post("/{model_id}/set-default", response_model=ApiResponse)
async def set_default_model(model_id: str, db: AsyncSession = Depends(get_db)):
"""Set a model as default"""
model_uuid = parse_uuid(model_id)
result = await db.execute(
select(ModelConfig).where(
ModelConfig.id == model_uuid,
ModelConfig.project_id == None # noqa: E711
)
)
model = result.scalar_one_or_none()
if not model:
raise HTTPException(status_code=404, detail="Model not found")
# Unset all other defaults
await db.execute(
update(ModelConfig)
.where(
ModelConfig.project_id == None, # noqa: E711
ModelConfig.id != model_uuid
)
.values(is_default="false")
)
model.is_default = "true"
await db.commit()
await db.refresh(model)
response = ModelResponse.model_validate(model)
return ApiResponse(data=response)
@router.post("/{model_id}/test", response_model=ApiResponse)
async def test_model(model_id: str, db: AsyncSession = Depends(get_db)):
"""Test model connection"""
model_uuid = parse_uuid(model_id)
result = await db.execute(
select(ModelConfig).where(
ModelConfig.id == model_uuid,
ModelConfig.project_id == None # noqa: E711
)
)
model = result.scalar_one_or_none()
if not model:
raise HTTPException(status_code=404, detail="Model not found")
# Test the connection
test_result = await test_model_connection(model)
# Save connection status to database
model.model_type = normalize_model_type(model.model_type, model.model_name)
model.connection_status = "connected" if test_result["success"] else "disconnected"
await db.commit()
await db.refresh(model)
# Return updated model
response = ModelResponse.model_validate(model)
return ApiResponse(data={"test_result": test_result, "model": response})
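
Note on `normalize_model_type`: the function added in this file keeps any explicit non-chat type and otherwise falls back to keyword matching on the model name. The block below restates that logic with indentation restored (using `Optional[str]` in place of `str | None` so it also runs on older Python versions) and is meant only to make the matching order easier to follow.

```python
from typing import Optional

VALID_MODEL_TYPES = {"chat", "vlm", "embedding", "rerank"}


def normalize_model_type(model_type: Optional[str], model_name: Optional[str]) -> str:
    """Normalize model type, with keyword fallback for legacy records."""
    # An explicit non-chat type is trusted as-is.
    if model_type in VALID_MODEL_TYPES and model_type != "chat":
        return model_type

    normalized_name = (model_name or "").strip().lower()
    rerank_keywords = ("rerank", "bce-reranker", "gte-rerank")
    embedding_keywords = (
        "embedding", "embed", "text-embedding", "bge-", "bge_m3",
        "gte-", "m3e", "e5-", "jina-embeddings",
    )
    vlm_keywords = ("vl", "vision", "visual", "multimodal", "qwen-vl", "gpt-4o")

    # Order matters: rerank is checked before embedding, embedding before vlm.
    if any(keyword in normalized_name for keyword in rerank_keywords):
        return "rerank"
    if any(keyword in normalized_name for keyword in embedding_keywords):
        return "embedding"
    if any(keyword in normalized_name for keyword in vlm_keywords):
        return "vlm"
    return model_type if model_type in VALID_MODEL_TYPES else "chat"
```

Because rerank keywords are checked first, a name like `gte-rerank-v2` resolves to `rerank` even though it also contains the embedding keyword `gte-`. The short keyword `vl` is a plain substring match, so an unrelated chat model whose name happens to contain those letters (for example `devlog-7b`) would also be classified as `vlm`.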


@@ -1,74 +1,120 @@
"""
Projects API Router
"""
from typing import List
import logging
import shutil
from pathlib import Path
from typing import List, Optional
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException
from fastapi import APIRouter, Depends, Query
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from app.api.response import ApiResponse, PaginatedResponse
from app.core.database import get_db
from app.core.exceptions import NotFoundException
from app.core.crud import CRUDBase
from app.models.models import Project
from app.schemas.base import (
from app.schemas.project import (
ProjectCreate,
ProjectUpdate,
ProjectResponse
ProjectResponse,
ProjectCreateSchema,
ProjectUpdateSchema
)
router = APIRouter()
logger = logging.getLogger("yg_dataset.projects")
# Initialize CRUD
project_crud = CRUDBase(Project)
@router.get("/", response_model=dict)
async def list_projects(db: AsyncSession = Depends(get_db)):
"""List all projects"""
result = await db.execute(select(Project).order_by(Project.created_at.desc()))
projects = result.scalars().all()
return {"projects": [ProjectResponse.model_validate(p) for p in projects]}
@router.get("", response_model=PaginatedResponse)
async def list_projects(
page: int = Query(1, ge=1, description="Page number"),
page_size: int = Query(20, ge=1, le=100, description="Page size"),
db: AsyncSession = Depends(get_db)
):
"""List all projects with pagination"""
logger.info(f"Listing projects - page: {page}, page_size: {page_size}")
skip = (page - 1) * page_size
projects, total = await project_crud.get_multi(
db,
skip=skip,
limit=page_size,
order_by="created_at",
descending=True
)
logger.info(f"Found {total} projects, returning {len(projects)} items")
project_responses = [ProjectResponse.model_validate(p) for p in projects]
return PaginatedResponse.ok(
items=project_responses,
page=page,
page_size=page_size,
total=total
)
@router.post("/", response_model=dict)
async def create_project(project: ProjectCreate, db: AsyncSession = Depends(get_db)):
@router.post("", response_model=ApiResponse)
async def create_project(
project: ProjectCreateSchema,
db: AsyncSession = Depends(get_db)
):
"""Create a new project"""
db_project = Project(**project.model_dump())
db.add(db_project)
await db.commit()
await db.refresh(db_project)
return {"id": str(db_project.id)}
logger.info(f"Creating project: name={project.name}, description={project.description}")
db_project = await project_crud.create(db, project)
logger.info(f"Project created successfully: id={db_project.id}")
return ApiResponse.ok(
data={"id": str(db_project.id)},
message="Project created successfully"
)
@router.get("/{project_id}", response_model=dict)
async def get_project(project_id: UUID, db: AsyncSession = Depends(get_db)):
@router.get("/{project_id}", response_model=ApiResponse)
async def get_project(
project_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Get project by ID"""
result = await db.execute(select(Project).where(Project.id == project_id))
project = result.scalar_one_or_none()
if not project:
raise HTTPException(status_code=404, detail="Project not found")
return ProjectResponse.model_validate(project)
logger.info(f"Getting project: id={project_id}")
project = await project_crud.get_or_raise(db, project_id, "Project")
logger.info(f"Found project: name={project.name}")
return ApiResponse.ok(data=ProjectResponse.model_validate(project))
@router.put("/{project_id}", response_model=dict)
async def update_project(project_id: UUID, project: ProjectUpdate, db: AsyncSession = Depends(get_db)):
@router.put("/{project_id}", response_model=ApiResponse)
async def update_project(
project_id: UUID,
project: ProjectUpdateSchema,
db: AsyncSession = Depends(get_db)
):
"""Update project"""
result = await db.execute(select(Project).where(Project.id == project_id))
db_project = result.scalar_one_or_none()
if not db_project:
raise HTTPException(status_code=404, detail="Project not found")
for key, value in project.model_dump(exclude_unset=True).items():
setattr(db_project, key, value)
await db.commit()
await db.refresh(db_project)
return ProjectResponse.model_validate(db_project)
logger.info(f"Updating project: id={project_id}")
db_project = await project_crud.get_or_raise(db, project_id, "Project")
updated_project = await project_crud.update(db, db_project, project)
logger.info(f"Project updated: name={updated_project.name}")
return ApiResponse.ok(
data=ProjectResponse.model_validate(updated_project),
message="Project updated successfully"
)
@router.delete("/{project_id}", response_model=dict)
async def delete_project(project_id: UUID, db: AsyncSession = Depends(get_db)):
@router.delete("/{project_id}", response_model=ApiResponse)
async def delete_project(
project_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Delete project"""
result = await db.execute(select(Project).where(Project.id == project_id))
project = result.scalar_one_or_none()
if not project:
raise HTTPException(status_code=404, detail="Project not found")
logger.info(f"Deleting project: id={project_id}")
await project_crud.get_or_raise(db, project_id, "Project")
await project_crud.delete(db, project_id)
await db.delete(project)
await db.commit()
return {"message": "Project deleted successfully"}
# Delete the project's local data directory
project_data_dir = Path("/data/code/YG-Datasets/data") / str(project_id)
if project_data_dir.exists():
shutil.rmtree(project_data_dir)
logger.info(f"Project data directory deleted: {project_data_dir}")
logger.info(f"Project deleted: id={project_id}")
return ApiResponse.ok(message="Project deleted successfully")
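
The new delete handler removes a directory under a hard-coded absolute path. The following is a minimal sketch of resolving that directory from a configurable root and refusing to delete anything outside it. The `data_root` parameter and the helper names `project_data_dir` and `remove_project_data` are hypothetical and not part of this diff.

```python
import shutil
import tempfile
from pathlib import Path
from uuid import UUID, uuid4


def project_data_dir(data_root: Path, project_id: UUID) -> Path:
    """Return the per-project directory under data_root (assumed layout)."""
    return data_root / str(project_id)


def remove_project_data(data_root: Path, project_id: UUID) -> bool:
    """Delete the project's data directory; return True if one was removed."""
    target = project_data_dir(data_root, project_id).resolve()
    # Guard: only delete directories that sit directly inside data_root.
    if target.parent != data_root.resolve():
        raise ValueError(f"Refusing to delete outside data root: {target}")
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pid = uuid4()
        (root / str(pid)).mkdir()
        print(remove_project_data(root, pid))  # the directory existed
        print(remove_project_data(root, pid))  # nothing left to remove
```

Passing the root in (for example from a settings field) would keep the handler independent of the machine it was developed on.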


@@ -1,122 +1,402 @@
"""
Questions API Router
"""
import asyncio
import json
import re
from typing import List, Optional
from uuid import UUID
from pydantic import BaseModel
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.ext.asyncio import AsyncSession
import httpx
from fastapi import APIRouter, Depends, Query
from pydantic import BaseModel, Field
from sqlalchemy import select
from app.core.database import get_db
from app.models.models import Question, Chunk
from app.schemas.base import QuestionCreate, QuestionResponse
from sqlalchemy.ext.asyncio import AsyncSession
from app.api.response import ApiResponse, PaginatedResponse
from app.core.crud import CRUDBase
from app.core.database import AsyncSessionLocal, get_db
from app.core.exceptions import NotFoundException, ValidationException
from app.core.logging import log_failure, log_success
from app.models.models import Chunk, ModelConfig, Question
from app.schemas.question import QuestionCreateSchema, QuestionResponse
router = APIRouter()
# Initialize CRUD
question_crud = CRUDBase(Question)
VALID_MODEL_TYPES = {"chat", "vlm", "embedding", "rerank"}
DEFAULT_PRESET_PROMPT = (
"你是一名高质量中文问答数据构建助手。"
"请基于给定 chunk 内容生成准确、自然、可用于训练的数据集问答对。"
"问题必须清晰具体,答案必须直接来自内容或基于内容做合理概括,"
"不要编造原文没有的信息,不要输出与目录、导航、页眉页脚、噪声文字相关的问题。"
)
class GenerateRequest(BaseModel):
"""Request model for generating questions"""
chunk_ids: List[UUID] = []
count: int = 5
question_types: List[str] = ["fact", "summary"]
chunk_ids: List[UUID] = Field(..., min_length=1)
model_id: UUID
count: int = Field(3, ge=1, le=10)
dirty_data_filter: bool = True
thinking_mode: bool = True
preset_prompt: str = Field(default=DEFAULT_PRESET_PROMPT, min_length=1, max_length=4000)
@router.post("/generate", response_model=dict)
def normalize_model_type(model_type: str | None, model_name: str | None) -> str:
"""Normalize model type, with keyword fallback for legacy records."""
if model_type in VALID_MODEL_TYPES and model_type != "chat":
return model_type
normalized_name = (model_name or "").strip().lower()
rerank_keywords = ("rerank", "bce-reranker", "gte-rerank")
embedding_keywords = (
"embedding",
"embed",
"text-embedding",
"bge-",
"bge_m3",
"gte-",
"m3e",
"e5-",
"jina-embeddings",
)
vlm_keywords = ("vl", "vision", "visual", "multimodal", "qwen-vl", "gpt-4o")
if any(keyword in normalized_name for keyword in rerank_keywords):
return "rerank"
if any(keyword in normalized_name for keyword in embedding_keywords):
return "embedding"
if any(keyword in normalized_name for keyword in vlm_keywords):
return "vlm"
return model_type if model_type in VALID_MODEL_TYPES else "chat"
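
The keyword fallback in `normalize_model_type` depends on check order (rerank, then embedding, then vlm) and on plain substring matching. The sketch below restates the function from this diff with indentation restored so its behavior can be exercised in isolation; the module-level keyword constants are a hypothetical rearrangement, not the diff's layout.

```python
from typing import Optional

VALID_MODEL_TYPES = {"chat", "vlm", "embedding", "rerank"}
RERANK_KEYWORDS = ("rerank", "bce-reranker", "gte-rerank")
EMBEDDING_KEYWORDS = (
    "embedding", "embed", "text-embedding", "bge-", "bge_m3",
    "gte-", "m3e", "e5-", "jina-embeddings",
)
VLM_KEYWORDS = ("vl", "vision", "visual", "multimodal", "qwen-vl", "gpt-4o")


def normalize_model_type(model_type: Optional[str], model_name: Optional[str]) -> str:
    """Return a valid model type, guessing from the name for legacy records."""
    # An explicit non-chat type is trusted as-is.
    if model_type in VALID_MODEL_TYPES and model_type != "chat":
        return model_type
    name = (model_name or "").strip().lower()
    # Order matters: a reranker named "bge-reranker" also matches "bge-".
    if any(keyword in name for keyword in RERANK_KEYWORDS):
        return "rerank"
    if any(keyword in name for keyword in EMBEDDING_KEYWORDS):
        return "embedding"
    if any(keyword in name for keyword in VLM_KEYWORDS):
        return "vlm"
    return model_type if model_type in VALID_MODEL_TYPES else "chat"


if __name__ == "__main__":
    for name in ("bge-reranker-v2-m3", "text-embedding-3-small", "qwen-vl-max", "vllm-qwen2"):
        print(name, "->", normalize_model_type("chat", name))
```

Because `"vl"` is matched as a substring, a name such as `vllm-qwen2` is classified as `vlm`; a stricter token match may be worth considering if that is not intended.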
def is_dirty_chunk(content: str) -> bool:
"""Heuristic dirty-data filter for low-value chunks."""
normalized = re.sub(r"\s+", " ", (content or "")).strip()
if len(normalized) < 40:
return True
if len(re.sub(r"[^\u4e00-\u9fffA-Za-z0-9]", "", normalized)) < 24:
return True
lowered = normalized.lower()
if lowered in {"目录", "contents", "table of contents"}:
return True
lines = [line.strip() for line in (content or "").splitlines() if line.strip()]
if lines:
short_lines = sum(1 for line in lines if len(line) <= 18)
dotted_lines = sum(1 for line in lines if re.search(r"[·•…\.]{3,}|\s\d+$", line))
if short_lines / len(lines) > 0.7 and len(lines) >= 3:
return True
if dotted_lines / len(lines) > 0.4:
return True
punctuation_ratio = sum(1 for ch in normalized if not ch.isalnum() and not ("\u4e00" <= ch <= "\u9fff")) / max(len(normalized), 1)
if punctuation_ratio > 0.45:
return True
return False
def build_generation_prompt(chunk: Chunk, request: GenerateRequest) -> str:
"""Build user prompt for QA generation."""
thinking_instruction = (
"请先对内容做简短分析,识别核心事实、概念、关系与潜在考点,然后再生成问答。"
"分析过程只用于提高质量,不要在最终输出中暴露你的思维链。"
if request.thinking_mode
else "直接基于内容生成高质量问答。"
)
return (
f"{request.preset_prompt}\n\n"
"输出要求:\n"
f"1. 生成 {request.count} 组问答。\n"
"2. 只输出 JSON 数组不要输出解释、标题、Markdown。\n"
'3. 每个对象结构为 {"question":"...","answer":"...","question_type":"fact|summary|reasoning"}。\n'
"4. 问题避免重复,答案避免空泛。\n"
"5. 如果内容不足以生成高质量问答,请返回空数组 []。\n"
f"6. {thinking_instruction}\n\n"
f"Chunk 名称:{chunk.name or '未命名分片'}\n"
f"Chunk 内容:\n{chunk.content}"
)
def extract_text_from_response(data: dict) -> str:
"""Extract response text from provider response."""
choices = data.get("choices") or []
if choices:
message = choices[0].get("message") or {}
content = message.get("content")
if isinstance(content, str):
return content
if isinstance(content, list):
parts = [item.get("text", "") for item in content if isinstance(item, dict)]
return "\n".join(part for part in parts if part)
return ""
def parse_generated_questions(raw_text: str) -> List[dict]:
"""Parse JSON array from model output."""
text = (raw_text or "").strip()
if not text:
return []
fenced_match = re.search(r"```json\s*(.*?)\s*```", text, flags=re.S)
if fenced_match:
text = fenced_match.group(1).strip()
if not text.startswith("["):
array_match = re.search(r"(\[\s*\{.*\}\s*\])", text, flags=re.S)
if array_match:
text = array_match.group(1)
try:
parsed = json.loads(text)
except json.JSONDecodeError:
return []
if not isinstance(parsed, list):
return []
normalized = []
for item in parsed:
if not isinstance(item, dict):
continue
question = str(item.get("question", "")).strip()
answer = str(item.get("answer", "")).strip()
question_type = str(item.get("question_type", "fact")).strip() or "fact"
if not question or not answer:
continue
normalized.append({
"question": question,
"answer": answer,
"question_type": question_type
})
return normalized
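
The extraction step in `parse_generated_questions` tolerates three output shapes: a bare JSON array, an array inside a fenced `json` block, and an array embedded in surrounding prose. A minimal standalone sketch of just that step follows; `extract_json_array` is a hypothetical name, and the per-item normalization from the diff is left out.

```python
import json
import re
from typing import List

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def extract_json_array(raw_text: str) -> List[dict]:
    """Pull a list of dicts out of model output, or return [] if none is found."""
    text = (raw_text or "").strip()
    if not text:
        return []
    # Case 1: the array is wrapped in a fenced json block.
    fenced = re.search(FENCE + r"json\s*(.*?)\s*" + FENCE, text, flags=re.S)
    if fenced:
        text = fenced.group(1).strip()
    # Case 2: the array is embedded in surrounding prose.
    if not text.startswith("["):
        embedded = re.search(r"(\[\s*\{.*\}\s*\])", text, flags=re.S)
        if embedded:
            text = embedded.group(1)
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    return [item for item in parsed if isinstance(item, dict)]


if __name__ == "__main__":
    sample = "Here you go:\n" + FENCE + 'json\n[{"question": "q1", "answer": "a1"}]\n' + FENCE
    print(extract_json_array(sample))
```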
async def call_generation_model(model: ModelConfig, prompt: str) -> str:
"""Call configured chat model for question generation."""
provider = model.provider
api_base = (model.api_base or "").rstrip("/")
api_key = model.api_key or ""
model_name = model.model_name
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
}
payload = {
"model": model_name,
"messages": [
{
"role": "system",
"content": "你是问答数据构建助手。严格按 JSON 输出,不要输出额外说明。"
},
{
"role": "user",
"content": prompt
}
],
"temperature": 0.4,
"response_format": {"type": "json_object"}
}
async with httpx.AsyncClient(timeout=120.0) as client:
if provider == "minimax":
response = await client.post(
f"{api_base}/chat/completions_v2",
headers=headers,
json={k: v for k, v in payload.items() if k != "response_format"}
)
else:
response = await client.post(
f"{api_base}/chat/completions",
headers=headers,
json=payload
)
response.raise_for_status()
data = response.json()
content = extract_text_from_response(data)
if not content:
raise ValueError("Model returned empty content")
if content.lstrip().startswith("{"):
obj = json.loads(content)
if isinstance(obj, dict) and isinstance(obj.get("questions"), list):
return json.dumps(obj["questions"], ensure_ascii=False)
return content
async def process_generate_async(project_id: UUID, request: GenerateRequest):
"""Generate QA pairs in background."""
async with AsyncSessionLocal() as db:
try:
model_result = await db.execute(
select(ModelConfig).where(ModelConfig.id == request.model_id, ModelConfig.project_id == None) # noqa: E711
)
model = model_result.scalar_one_or_none()
if not model:
return
model_type = normalize_model_type(model.model_type, model.model_name)
if model_type not in {"chat", "vlm"}:
raise ValidationException("Selected model must be chat/vlm type", field="model_id")
chunk_result = await db.execute(
select(Chunk).where(Chunk.id.in_(request.chunk_ids), Chunk.project_id == project_id)
)
chunks = chunk_result.scalars().all()
if not chunks:
return
created_count = 0
skipped_count = 0
for chunk in chunks:
if request.dirty_data_filter and is_dirty_chunk(chunk.content):
skipped_count += 1
continue
prompt = build_generation_prompt(chunk, request)
raw_text = await call_generation_model(model, prompt)
qa_pairs = parse_generated_questions(raw_text)[:request.count]
if not qa_pairs:
skipped_count += 1
continue
for item in qa_pairs:
db.add(Question(
project_id=project_id,
chunk_id=chunk.id,
content=item["question"],
answer=item["answer"],
question_type=item["question_type"],
source="generated"
))
created_count += 1
await db.commit()
log_success(
"QA batch generation completed",
project_id=str(project_id),
model_id=str(model.id),
chunk_count=len(chunks),
created_questions=created_count,
skipped_chunks=skipped_count
)
except Exception as e:
log_failure(
"QA batch generation failed",
project_id=str(project_id),
model_id=str(request.model_id),
error=str(e)
)
@router.post("/generate", response_model=ApiResponse)
async def generate_questions(
project_id: UUID,
request: GenerateRequest,
db: AsyncSession = Depends(get_db)
):
"""Generate questions from chunks using LLM"""
# TODO: Implement LLM-based question generation
# This is a placeholder that creates sample questions
if not request.chunk_ids:
raise HTTPException(status_code=400, detail="chunk_ids is required")
# Get chunks
result = await db.execute(
select(Chunk).where(Chunk.id.in_(request.chunk_ids), Chunk.project_id == project_id)
"""Generate questions from chunks using LLM in background."""
model_result = await db.execute(
select(ModelConfig).where(ModelConfig.id == request.model_id, ModelConfig.project_id == None) # noqa: E711
)
chunks = result.scalars().all()
model = model_result.scalar_one_or_none()
if not model:
raise ValidationException("Selected model not found", field="model_id")
if not chunks:
raise HTTPException(status_code=404, detail="No chunks found")
model_type = normalize_model_type(model.model_type, model.model_name)
if model_type not in {"chat", "vlm"}:
raise ValidationException("Selected model must be chat/vlm type", field="model_id")
if not model.api_key:
raise ValidationException("Selected model is missing API Key", field="model_id")
# Create sample questions (placeholder)
created_questions = []
for chunk in chunks:
for i in range(request.count):
question = Question(
project_id=project_id,
chunk_id=chunk.id,
content=f"这是关于「{chunk.name}」的问题 {i+1}",
answer=f"这是问题 {i+1} 的答案。",
question_type=request.question_types[0] if request.question_types else "fact",
source="generated"
chunk_result = await db.execute(
select(Chunk.id).where(Chunk.id.in_(request.chunk_ids), Chunk.project_id == project_id)
)
db.add(question)
created_questions.append(question)
valid_chunk_ids = [row[0] for row in chunk_result.all()]
if not valid_chunk_ids:
raise ValidationException("No valid chunks found", field="chunk_ids")
await db.commit()
request_payload = request.model_copy(update={"chunk_ids": valid_chunk_ids})
asyncio.create_task(process_generate_async(project_id, request_payload))
return {
"questions": len(created_questions),
"message": f"Successfully generated {len(created_questions)} questions"
}
return ApiResponse.ok(
data={"chunk_count": len(valid_chunk_ids), "status": "processing"},
message="Question generation started in background"
)
@router.get("/", response_model=dict)
@router.get("", response_model=ApiResponse)
async def list_questions(
project_id: UUID,
chunk_id: Optional[UUID] = Query(None),
page: int = Query(1, ge=1),
page_size: int = Query(20, ge=1, le=100),
db: AsyncSession = Depends(get_db)
):
"""List questions for a project"""
query = select(Question).where(Question.project_id == project_id)
filters = {"project_id": project_id}
if chunk_id:
query = query.where(Question.chunk_id == chunk_id)
filters["chunk_id"] = chunk_id
result = await db.execute(query)
questions = result.scalars().all()
skip = (page - 1) * page_size
questions, total = await question_crud.get_multi(
db,
skip=skip,
limit=page_size,
filters=filters,
order_by="created_at",
descending=True
)
return {"questions": [QuestionResponse.model_validate(q) for q in questions]}
question_responses = [QuestionResponse.model_validate(q) for q in questions]
return PaginatedResponse.ok(
items=question_responses,
page=page,
page_size=page_size,
total=total
)
@router.put("/{question_id}", response_model=dict)
@router.put("/{question_id}", response_model=ApiResponse)
async def update_question(
project_id: UUID,
question_id: UUID,
question: QuestionCreate,
question: QuestionCreateSchema,
db: AsyncSession = Depends(get_db)
):
"""Update question"""
result = await db.execute(
select(Question).where(Question.id == question_id, Question.project_id == project_id)
db_question = await question_crud.get(db, question_id)
if not db_question or db_question.project_id != project_id:
raise NotFoundException("Question", question_id)
updated_question = await question_crud.update(db, db_question, question)
return ApiResponse.ok(
data=QuestionResponse.model_validate(updated_question),
message="Question updated successfully"
)
db_question = result.scalar_one_or_none()
if not db_question:
raise HTTPException(status_code=404, detail="Question not found")
for key, value in question.model_dump(exclude_unset=True).items():
setattr(db_question, key, value)
await db.commit()
await db.refresh(db_question)
return QuestionResponse.model_validate(db_question)
@router.delete("/{question_id}", response_model=dict)
async def delete_question(project_id: UUID, question_id: UUID, db: AsyncSession = Depends(get_db)):
@router.delete("/{question_id}", response_model=ApiResponse)
async def delete_question(
project_id: UUID,
question_id: UUID,
db: AsyncSession = Depends(get_db)
):
"""Delete question"""
result = await db.execute(
select(Question).where(Question.id == question_id, Question.project_id == project_id)
)
question = result.scalar_one_or_none()
if not question:
raise HTTPException(status_code=404, detail="Question not found")
question = await question_crud.get(db, question_id)
if not question or question.project_id != project_id:
raise NotFoundException("Question", question_id)
await db.delete(question)
await db.commit()
return {"message": "Question deleted successfully"}
await question_crud.delete(db, question_id)
return ApiResponse.ok(message="Question deleted successfully")

backend/app/core/auth.py Normal file

@@ -0,0 +1,38 @@
"""
API Key Authentication
API key authentication middleware
"""
from typing import Optional
from fastapi import Header, HTTPException, Request
from fastapi.security import APIKeyHeader
from app.core.config import get_settings
settings = get_settings()
# API Key header
API_KEY_HEADER = APIKeyHeader(name="X-API-Key", auto_error=False)
async def verify_api_key(api_key: Optional[str] = Header(None)) -> str:
"""Verify API key from header"""
if not api_key:
raise HTTPException(status_code=401, detail="API key is required")
# In production, you would validate against a database or cache
# For development, we can use a simple validation
if settings.DEBUG and api_key == "dev-api-key":
return api_key
# TODO: Implement proper API key validation
# This is a placeholder - in production, validate against stored keys
if len(api_key) < 32:
raise HTTPException(status_code=401, detail="Invalid API key")
return api_key
def create_api_key() -> str:
"""Generate a new API key"""
import secrets
return secrets.token_hex(32)
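
The placeholder check in `verify_api_key` accepts any string of 32 or more characters. Separately, because the parameter is declared as `api_key: Optional[str] = Header(None)` without an alias, FastAPI's default underscore-to-hyphen conversion makes it read a header named `api-key` rather than `X-API-Key`. The sketch below shows one way the TODO could be filled in: store only hashes of issued keys and compare them in constant time. `hash_api_key`, `is_valid_api_key`, and the idea of a stored hash list are assumptions for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Iterable


def hash_api_key(api_key: str) -> str:
    """Return a hex SHA-256 digest so raw keys never need to be stored."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def is_valid_api_key(candidate: str, stored_hashes: Iterable[str]) -> bool:
    """Check candidate against every stored hash without short-circuiting."""
    candidate_hash = hash_api_key(candidate)
    matched = False
    for stored in stored_hashes:
        # compare_digest takes time independent of where the strings differ.
        matched |= hmac.compare_digest(candidate_hash, stored)
    return matched


if __name__ == "__main__":
    issued = secrets.token_hex(32)  # same generator as create_api_key in the diff
    store = [hash_api_key(issued)]
    print(is_valid_api_key(issued, store))
    print(is_valid_api_key("x" * 64, store))
```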


@@ -4,7 +4,7 @@ Application Configuration
from functools import lru_cache
from pydantic_settings import BaseSettings
from pydantic import Field
from pydantic import Field, field_validator
class Settings(BaseSettings):
@@ -15,12 +15,16 @@ class Settings(BaseSettings):
DEBUG: bool = True
HOST: str = "0.0.0.0"
PORT: int = 8000
ALLOWED_ORIGINS: str = Field(
default="*",
description="Comma-separated list of allowed CORS origins"
)
# Database - SQLite is used for development/testing
# Production can switch to PostgreSQL
DATABASE_URL: str = Field(
default="sqlite:///./ygdataset.db",
description="Database connection URL (sqlite:// or postgresql+asyncpg://)"
default="sqlite+aiosqlite:///./ygdataset.db",
description="Database connection URL (sqlite+aiosqlite:// or postgresql+asyncpg://)"
)
DATABASE_URL_SYNC: str = Field(
default="sqlite:///./ygdataset.db",
@@ -38,8 +42,31 @@ class Settings(BaseSettings):
DEFAULT_MODEL_PROVIDER: str = "openai"
DEFAULT_MODEL_NAME: str = "gpt-4o-mini"
# Security
SECRET_KEY: str = Field(
default="your-secret-key-change-in-production",
description="Secret key for JWT and other security operations"
)
API_KEY_HEADER: str = "X-API-Key"
# Pagination
DEFAULT_PAGE_SIZE: int = 20
MAX_PAGE_SIZE: int = 100
# Logging
LOG_LEVEL: str = "INFO"
@field_validator("MAX_FILE_SIZE")
@classmethod
def validate_max_file_size(cls, v: int) -> int:
"""Validate max file size (max 500MB)"""
if v > 500 * 1024 * 1024:
return 500 * 1024 * 1024
return v
class Config:
env_file = ".env"
env_file_encoding = "utf-8"
extra = "allow"
@@ -47,3 +74,7 @@ class Settings(BaseSettings):
def get_settings() -> Settings:
"""Get cached settings"""
return Settings()
# Create global settings instance
settings = get_settings()
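
`ALLOWED_ORIGINS` is declared as a comma-separated string, while `main.py` in this same diff still passes a hard-coded `["*"]` to the CORS middleware. A minimal sketch of turning the setting into the list the middleware expects; `parse_allowed_origins` is a hypothetical helper.

```python
from typing import List


def parse_allowed_origins(raw: str) -> List[str]:
    """Split a comma-separated origins string, dropping blanks and whitespace."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


if __name__ == "__main__":
    print(parse_allowed_origins("*"))
    print(parse_allowed_origins("http://localhost:5173, https://example.com"))
```

The result could then be passed as `allow_origins=parse_allowed_origins(settings.ALLOWED_ORIGINS)`.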

backend/app/core/crud.py Normal file

@@ -0,0 +1,178 @@
"""
Database CRUD Operations
Generic CRUD operations for the database
"""
from typing import Any, Generic, List, Optional, Type, TypeVar
from uuid import UUID
from sqlalchemy import select, func, and_
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import selectinload, joinedload
from app.core.exceptions import NotFoundException, DuplicateException
from app.core.logging import LoggerMixin
ModelType = TypeVar("ModelType")
class CRUDBase(Generic[ModelType], LoggerMixin):
"""Base CRUD class with common operations"""
def __init__(self, model: Type[ModelType]):
"""Initialize CRUD with model"""
self.model = model
async def get(
self,
db: AsyncSession,
id: UUID,
load_relations: Optional[List[str]] = None
) -> Optional[ModelType]:
"""Get single record by ID"""
query = select(self.model).where(self.model.id == id)
# Load relationships if specified
if load_relations:
for relation in load_relations:
if hasattr(self.model, relation):
query = query.options(selectinload(getattr(self.model, relation)))
result = await db.execute(query)
return result.scalar_one_or_none()
async def get_or_raise(
self,
db: AsyncSession,
id: UUID,
resource_name: str = "Resource",
load_relations: Optional[List[str]] = None
) -> ModelType:
"""Get single record by ID or raise NotFoundException"""
obj = await self.get(db, id, load_relations)
if not obj:
raise NotFoundException(resource_name, id)
return obj
async def get_multi(
self,
db: AsyncSession,
skip: int = 0,
limit: int = 20,
load_relations: Optional[List[str]] = None,
filters: Optional[dict] = None,
order_by: Optional[str] = "created_at",
descending: bool = True
) -> tuple[List[ModelType], int]:
"""Get multiple records with pagination"""
query = select(self.model)
count_query = select(func.count()).select_from(self.model)
# Apply filters
if filters:
conditions = []
for key, value in filters.items():
if hasattr(self.model, key):
conditions.append(getattr(self.model, key) == value)
if conditions:
query = query.where(and_(*conditions))
count_query = count_query.where(and_(*conditions))
# Load relationships if specified
if load_relations:
for relation in load_relations:
if hasattr(self.model, relation):
query = query.options(selectinload(getattr(self.model, relation)))
# Count total
total_result = await db.execute(count_query)
total = total_result.scalar() or 0
# Apply ordering
if order_by and hasattr(self.model, order_by):
order_column = getattr(self.model, order_by)
if descending:
query = query.order_by(order_column.desc())
else:
query = query.order_by(order_column.asc())
# Apply pagination
query = query.offset(skip).limit(limit)
result = await db.execute(query)
items = result.scalars().all()
return list(items), total
async def create(
self,
db: AsyncSession,
obj_in: Any,
commit: bool = True
) -> ModelType:
"""Create new record"""
obj_data = obj_in.model_dump() if hasattr(obj_in, 'model_dump') else obj_in.dict()
db_obj = self.model(**obj_data)
db.add(db_obj)
if commit:
await db.commit()
await db.refresh(db_obj)
self.log.debug(f"Created {self.model.__name__}: {db_obj.id}")
return db_obj
async def update(
self,
db: AsyncSession,
db_obj: ModelType,
obj_in: Any,
commit: bool = True
) -> ModelType:
"""Update existing record"""
if hasattr(obj_in, 'model_dump'):
obj_data = obj_in.model_dump(exclude_unset=True)
elif hasattr(obj_in, 'dict'):
obj_data = obj_in.dict(exclude_unset=True)
else:
obj_data = obj_in
for field, value in obj_data.items():
if hasattr(db_obj, field):
setattr(db_obj, field, value)
if commit:
await db.commit()
await db.refresh(db_obj)
self.log.debug(f"Updated {self.model.__name__}: {db_obj.id}")
return db_obj
async def delete(
self,
db: AsyncSession,
id: UUID,
commit: bool = True
) -> bool:
"""Delete record by ID"""
obj = await self.get(db, id)
if obj:
await db.delete(obj)
if commit:
await db.commit()
self.log.debug(f"Deleted {self.model.__name__}: {id}")
return True
return False
async def exists(
self,
db: AsyncSession,
filters: dict
) -> bool:
"""Check if record exists"""
query = select(func.count()).select_from(self.model)
for key, value in filters.items():
if hasattr(self.model, key):
query = query.where(getattr(self.model, key) == value)
result = await db.execute(query)
count = result.scalar() or 0
return count > 0
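
`get_multi` takes `skip`/`limit` and returns `(items, total)`, while the list endpoints accept `page`/`page_size`. The conversion between the two can be sketched as below; `page_window` and `total_pages` are hypothetical helpers, not part of `CRUDBase`.

```python
from math import ceil
from typing import Tuple


def page_window(page: int, page_size: int) -> Tuple[int, int]:
    """Translate a 1-based page number into (skip, limit) for get_multi."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return (page - 1) * page_size, page_size


def total_pages(total: int, page_size: int) -> int:
    """Number of pages needed to show `total` records."""
    return ceil(total / page_size) if total > 0 else 0


if __name__ == "__main__":
    skip, limit = page_window(page=3, page_size=20)
    print(skip, limit, total_pages(41, 20))
```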


@@ -2,25 +2,32 @@
Database Configuration and Session Management
Supports SQLite and PostgreSQL
"""
import logging
from contextlib import asynccontextmanager
from typing import AsyncGenerator
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy import create_engine
from sqlalchemy import create_engine, event, inspect, text
from sqlalchemy.pool import NullPool
from app.core.config import get_settings
logger = logging.getLogger(__name__)
settings = get_settings()
def get_engine_config():
"""Return engine configuration based on the database type"""
if settings.DATABASE_URL.startswith("sqlite"):
return {"echo": settings.DEBUG}
return {"echo": settings.DEBUG, "poolclass": NullPool}
else:
return {
"echo": settings.DEBUG,
"pool_pre_ping": True,
"pool_size": 10,
"max_overflow": 20,
"pool_recycle": 3600,
"pool_timeout": 30,
}
@@ -30,14 +37,14 @@ async_engine = create_async_engine(
**get_engine_config()
)
# Sync engine for migrations
# Sync engine for migrations (use NullPool for SQLite)
sync_engine = create_engine(
settings.DATABASE_URL_SYNC,
echo=settings.DEBUG,
pool_pre_ping=True,
poolclass=NullPool if settings.DATABASE_URL_SYNC.startswith("sqlite") else None,
)
# Async session factory
AsyncSessionLocal = async_sessionmaker(
async_engine,
@@ -55,8 +62,50 @@ class Base(DeclarativeBase):
async def init_db():
"""Initialize database tables"""
logger.info("Initializing database...")
async with async_engine.begin() as conn:
await conn.run_sync(Base.metadata.create_all)
await conn.run_sync(_ensure_legacy_columns)
logger.info("Database initialized successfully")
def _ensure_legacy_columns(sync_conn):
"""Patch legacy tables with newly introduced columns."""
inspector = inspect(sync_conn)
if "model_configs" not in inspector.get_table_names():
return
columns = {column["name"] for column in inspector.get_columns("model_configs")}
if "model_type" in columns:
return
logger.info("Adding missing model_type column to model_configs table")
dialect = sync_conn.dialect.name
if dialect == "postgresql":
sync_conn.execute(text("ALTER TABLE model_configs ADD COLUMN model_type VARCHAR(50) NOT NULL DEFAULT 'chat'"))
else:
sync_conn.execute(text("ALTER TABLE model_configs ADD COLUMN model_type VARCHAR(50) NOT NULL DEFAULT 'chat'"))
async def close_db():
"""Close database connections"""
logger.info("Closing database connections...")
await async_engine.dispose()
logger.info("Database connections closed")
@asynccontextmanager
async def get_db_session() -> AsyncGenerator[AsyncSession, None]:
"""Context manager for database sessions with automatic cleanup"""
session = AsyncSessionLocal()
try:
yield session
except Exception as e:
logger.error(f"Database session error: {str(e)}")
await session.rollback()
raise
finally:
await session.close()
async def get_db() -> AsyncSession:
@@ -64,5 +113,14 @@ async def get_db() -> AsyncSession:
async with AsyncSessionLocal() as session:
try:
yield session
except Exception as e:
logger.error(f"Database error in dependency: {str(e)}")
await session.rollback()
raise
finally:
await session.close()
# Import all models to register them with Base.metadata
# This ensures all models are loaded before create_all is called
from app.models.models import * # noqa: F401, F403, E402
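
`_ensure_legacy_columns` follows an inspect-then-alter pattern: check whether the table exists, check whether the column is present, and add it only if missing (both dialect branches in the diff issue the same `ALTER TABLE`). The same idea can be tried with the standard-library `sqlite3` module alone; `ensure_column` is a hypothetical helper and does not use SQLAlchemy's inspector.

```python
import sqlite3


def ensure_column(conn: sqlite3.Connection, table: str, column: str, ddl: str) -> bool:
    """Add `column` to `table` if it is missing; return True when it was added.

    `table`, `column`, and `ddl` are interpolated into SQL, so they must be
    trusted identifiers defined in code, never user input.
    """
    tables = {row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in tables:
        return False
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in columns:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE model_configs (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO model_configs (name) VALUES ('legacy-model')")
    print(ensure_column(conn, "model_configs", "model_type", "VARCHAR(50) NOT NULL DEFAULT 'chat'"))
    print(conn.execute("SELECT name, model_type FROM model_configs").fetchall())
```

For anything beyond a single column, a migration tool would likely be easier to maintain than ad hoc patches at startup.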


@@ -0,0 +1,119 @@
"""
Custom Exceptions
Custom exception classes
"""
from typing import Any, Optional
class AppException(Exception):
"""Base application exception"""
def __init__(
self,
message: str,
code: str = "INTERNAL_ERROR",
status_code: int = 500,
details: Optional[dict] = None
):
self.message = message
self.code = code
self.status_code = status_code
self.details = details
super().__init__(self.message)
class NotFoundException(AppException):
"""Resource not found exception"""
def __init__(self, resource: str, resource_id: Any = None):
message = f"{resource} not found"
if resource_id:
message = f"{resource} with id '{resource_id}' not found"
super().__init__(
message=message,
code="NOT_FOUND",
status_code=404
)
class ValidationException(AppException):
"""Validation exception"""
def __init__(self, message: str, field: str = None, details: dict = None):
super().__init__(
message=message,
code="VALIDATION_ERROR",
status_code=422,
details={"field": field, **(details or {})}
)
class DuplicateException(AppException):
"""Duplicate resource exception"""
def __init__(self, resource: str, field: str = None):
message = f"{resource} already exists"
if field:
message = f"{resource} with {field} already exists"
super().__init__(
message=message,
code="DUPLICATE",
status_code=409
)
class UnauthorizedException(AppException):
"""Unauthorized exception"""
def __init__(self, message: str = "Unauthorized"):
super().__init__(
message=message,
code="UNAUTHORIZED",
status_code=401
)
class ForbiddenException(AppException):
"""Forbidden access exception"""
def __init__(self, message: str = "Forbidden"):
super().__init__(
message=message,
code="FORBIDDEN",
status_code=403
)
class RateLimitException(AppException):
"""Rate limit exception"""
def __init__(self, message: str = "Rate limit exceeded"):
super().__init__(
message=message,
code="RATE_LIMIT",
status_code=429
)
class FileProcessingException(AppException):
"""File processing exception"""
def __init__(self, message: str, file_name: str = None):
details = {"file_name": file_name} if file_name else None
super().__init__(
message=message,
code="FILE_PROCESSING_ERROR",
status_code=422,
details=details
)
class DatabaseException(AppException):
"""Database exception"""
def __init__(self, message: str = "Database operation failed"):
super().__init__(
message=message,
code="DATABASE_ERROR",
status_code=500
)
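
Each exception class carries `message`, `code`, `status_code`, and `details`, which is what the `AppException` handler in `main.py` needs to build an error response. The sketch below restates two of the classes with indentation restored and shows how they map to a response body. The `to_error_body` helper and its envelope shape are assumptions, since `ApiResponse.fail` is not shown in this part of the diff.

```python
from typing import Any, Optional


class AppException(Exception):
    """Base application exception (restated from the diff)."""

    def __init__(self, message: str, code: str = "INTERNAL_ERROR",
                 status_code: int = 500, details: Optional[dict] = None):
        self.message = message
        self.code = code
        self.status_code = status_code
        self.details = details
        super().__init__(self.message)


class NotFoundException(AppException):
    def __init__(self, resource: str, resource_id: Any = None):
        message = f"{resource} not found"
        if resource_id:
            message = f"{resource} with id '{resource_id}' not found"
        super().__init__(message=message, code="NOT_FOUND", status_code=404)


class ValidationException(AppException):
    def __init__(self, message: str, field: Optional[str] = None, details: Optional[dict] = None):
        super().__init__(message=message, code="VALIDATION_ERROR", status_code=422,
                         details={"field": field, **(details or {})})


def to_error_body(exc: AppException) -> dict:
    """Hypothetical envelope; the real shape comes from ApiResponse.fail."""
    return {"message": exc.message, "error": {"code": exc.code, "details": exc.details}}


if __name__ == "__main__":
    exc = NotFoundException("Question", "abc")
    print(exc.status_code, to_error_body(exc))
```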

backend/app/core/logging.py Normal file

@@ -0,0 +1,139 @@
"""
Logging Configuration
"""
import logging
import sys
from datetime import datetime
from typing import Any
from logging.handlers import RotatingFileHandler, TimedRotatingFileHandler
from pathlib import Path
from app.core.config import get_settings
settings = get_settings()
# Log directory
LOG_DIR = Path("./logs")
LOG_DIR.mkdir(exist_ok=True)
# Date stamp used for the log directory name
LOG_DATE = datetime.now().strftime("%Y-%m-%d")
# Log directory for the current day
CURRENT_LOG_DIR = LOG_DIR / LOG_DATE
CURRENT_LOG_DIR.mkdir(exist_ok=True)
def get_log_path(filename: str) -> Path:
"""Return the path of a log file inside the current day's log directory"""
return CURRENT_LOG_DIR / filename
def setup_logging(name: str = "yg_dataset") -> logging.Logger:
"""Setup application logging"""
logger = logging.getLogger(name)
logger.setLevel(logging.DEBUG if settings.DEBUG else logging.INFO)
# Avoid duplicate handlers
if logger.handlers:
return logger
# Console handler
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.DEBUG if settings.DEBUG else logging.INFO)
console_formatter = logging.Formatter(
fmt="%(asctime)s | %(levelname)-8s | %(name)s:%(funcName)s:%(lineno)d | %(message)s",
datefmt="%Y-%m-%d %H:%M:%S"
)
console_handler.setFormatter(console_formatter)
logger.addHandler(console_handler)
# Main log file handler - app.log
main_file_handler = TimedRotatingFileHandler(
get_log_path("app.log"),
when="midnight",
interval=1,
backupCount=30,
encoding="utf-8"
)
main_file_handler.setLevel(logging.INFO)
main_file_formatter = logging.Formatter(
fmt="%(asctime)s | %(levelname)-8s | %(name)s:%(funcName)s:%(lineno)d | %(message)s",
datefmt="%Y-%m-%d %H:%M:%S"
)
main_file_handler.setFormatter(main_file_formatter)
logger.addHandler(main_file_handler)
return logger
# Create default logger
logger = setup_logging()
# ============== Success Logger ==============
def get_success_logger() -> logging.Logger:
"""Return the success logger"""
success_logger = logging.getLogger("yg_dataset.success")
if not success_logger.handlers:
handler = RotatingFileHandler(
get_log_path("success.log"),
maxBytes=10 * 1024 * 1024,
backupCount=30,
encoding="utf-8"
)
handler.setLevel(logging.INFO)
formatter = logging.Formatter(
fmt="%(asctime)s | %(message)s",
datefmt="%Y-%m-%d %H:%M:%S"
)
handler.setFormatter(formatter)
success_logger.addHandler(handler)
success_logger.setLevel(logging.INFO)
return success_logger
# ============== Failure Logger ==============
def get_failure_logger() -> logging.Logger:
"""Return the failure logger"""
failure_logger = logging.getLogger("yg_dataset.failure")
if not failure_logger.handlers:
handler = RotatingFileHandler(
get_log_path("failure.log"),
maxBytes=10 * 1024 * 1024,
backupCount=30,
encoding="utf-8"
)
handler.setLevel(logging.WARNING)
formatter = logging.Formatter(
fmt="%(asctime)s | %(levelname)-8s | %(name)s:%(funcName)s:%(lineno)d | %(message)s",
datefmt="%Y-%m-%d %H:%M:%S"
)
handler.setFormatter(formatter)
failure_logger.addHandler(handler)
failure_logger.setLevel(logging.WARNING)
return failure_logger
# ============== Convenience functions ==============
def log_success(message: str, **kwargs):
"""Write a success log entry"""
extra_info = " | ".join([f"{k}={v}" for k, v in kwargs.items()]) if kwargs else ""
full_message = f"{message} | {extra_info}" if extra_info else message
get_success_logger().info(full_message)
def log_failure(message: str, **kwargs):
"""Write a failure log entry"""
extra_info = " | ".join([f"{k}={v}" for k, v in kwargs.items()]) if kwargs else ""
full_message = f"{message} | {extra_info}" if extra_info else message
get_failure_logger().warning(full_message)
class LoggerMixin:
"""Mixin to add logging capability to classes"""
@property
def log(self) -> logging.Logger:
"""Get logger for this class"""
return logging.getLogger(self.__class__.__module__ + "." + self.__class__.__name__)
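
`LOG_DATE` and `CURRENT_LOG_DIR` are computed once at import time, so a process that runs past midnight keeps writing into the directory named after its start date. If per-day directories are the goal, the path could instead be resolved when it is needed. A minimal sketch, where `dated_log_path` is a hypothetical replacement for `get_log_path`:

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def dated_log_path(log_dir: Path, filename: str, today: Optional[date] = None) -> Path:
    """Return log_dir/YYYY-MM-DD/filename, creating the dated directory if needed."""
    day = (today or date.today()).strftime("%Y-%m-%d")
    directory = log_dir / day
    directory.mkdir(parents=True, exist_ok=True)
    return directory / filename


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(dated_log_path(Path(tmp), "app.log"))
```

Handlers that are already open would still need to be reattached after a date change, which this sketch does not cover.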


@@ -3,23 +3,74 @@ YG-Dataset Backend Application
FastAPI-based API server for dataset generation platform
"""
import logging
import time
import uuid
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from fastapi.exceptions import RequestValidationError
from starlette.middleware.base import BaseHTTPMiddleware
from sqlalchemy.exc import SQLAlchemyError
from app.api.v1 import api_router
from app.api.response import ApiResponse
from app.core.config import settings
from app.core.database import init_db
from app.core.database import init_db, close_db
from app.core.exceptions import AppException
from app.core.logging import logger
# Import all models to register them with Base.metadata
from app.models.models import * # noqa: F401, F403
class RequestIDMiddleware(BaseHTTPMiddleware):
"""Middleware to add request ID to each request"""
async def dispatch(self, request: Request, call_next):
request_id = str(uuid.uuid4())
request.state.request_id = request_id
# Add request ID to response headers
response = await call_next(request)
response.headers["X-Request-ID"] = request_id
return response
class TimingMiddleware(BaseHTTPMiddleware):
"""Middleware to measure request processing time"""
async def dispatch(self, request: Request, call_next):
start_time = time.time()
# Log request
logger.info(f"{request.method} {request.url.path}")
response = await call_next(request)
process_time = time.time() - start_time
response.headers["X-Process-Time"] = str(process_time)
# Log response
logger.info(f"{request.method} {request.url.path} | Status: {response.status_code} | Time: {process_time:.3f}s")
return response
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Application lifespan events"""
# Startup
logger.info("Starting YG-Dataset application...")
await init_db()
logger.info("Database initialized successfully")
yield
# Shutdown
pass
logger.info("Shutting down YG-Dataset application...")
await close_db()
logger.info("Database connections closed")
app = FastAPI(
@@ -29,15 +80,83 @@ app = FastAPI(
lifespan=lifespan,
)
# CORS
# Add custom middleware (order matters: last added = first executed)
app.add_middleware(TimingMiddleware)
app.add_middleware(RequestIDMiddleware)
# CORS - Configure properly for production
# For development, you can use ["*"] but for production, specify exact origins
ALLOWED_ORIGINS = ["*"]
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_origins=ALLOWED_ORIGINS,
allow_credentials=True,
allow_methods=["*"],
allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"],
allow_headers=["*"],
)
# Exception handlers
@app.exception_handler(AppException)
async def app_exception_handler(request: Request, exc: AppException):
"""Handle custom application exceptions"""
logger.warning(f"App exception: {exc.message} | Code: {exc.code}")
return JSONResponse(
status_code=exc.status_code,
content=ApiResponse.fail(
message=exc.message,
error={"code": exc.code, "details": exc.details}
).model_dump(mode='json')
)
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
"""Handle validation exceptions"""
errors = []
for error in exc.errors():
errors.append({
"field": ".".join(str(loc) for loc in error["loc"]),
"message": error["msg"],
"type": error["type"]
})
logger.warning(f"Validation error: {errors}")
return JSONResponse(
status_code=422,
content=ApiResponse.fail(
message="Validation error",
error={"code": "VALIDATION_ERROR", "details": {"errors": errors}}
).model_dump(mode='json')
)
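The validation handler above flattens each error's `loc` tuple into a dotted field path. A minimal sketch of that transformation on its own, assuming error dicts with `loc`, `msg`, and `type` keys as the handler expects; the `format_errors` helper and the sample payload are hypothetical and only illustrate the shape:

```python
from typing import Any, Dict, List


def format_errors(raw_errors: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Flatten validation errors into field/message/type records."""
    return [
        {
            # loc may mix strings and list indices, so cast each part to str
            "field": ".".join(str(loc) for loc in err["loc"]),
            "message": err["msg"],
            "type": err["type"],
        }
        for err in raw_errors
    ]


# Hypothetical sample: a missing "name" inside the first element of "items"
SAMPLE_ERRORS = [
    {"loc": ("body", "items", 0, "name"), "msg": "Field required", "type": "missing"},
]
```

Casting every `loc` part to `str` matters because list positions arrive as integers, and `str.join` would otherwise raise a `TypeError`.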
@app.exception_handler(SQLAlchemyError)
async def database_exception_handler(request: Request, exc: SQLAlchemyError):
"""Handle database exceptions"""
logger.error(f"Database error: {str(exc)}", exc_info=True)
return JSONResponse(
status_code=500,
content=ApiResponse.fail(
message="Database operation failed",
error={"code": "DATABASE_ERROR"}
).model_dump(mode='json')
)
@app.exception_handler(Exception)
async def general_exception_handler(request: Request, exc: Exception):
"""Handle unhandled exceptions"""
logger.error(f"Unhandled exception: {str(exc)}", exc_info=True)
return JSONResponse(
status_code=500,
content=ApiResponse.fail(
message="Internal server error",
error={"code": "INTERNAL_ERROR"}
).model_dump(mode='json')
)
# Include API routes
app.include_router(api_router, prefix="/api/v1")
@@ -45,7 +164,10 @@ app.include_router(api_router, prefix="/api/v1")
@app.get("/health")
async def health_check():
"""Health check endpoint"""
return {"status": "healthy", "version": "1.0.0"}
return ApiResponse.ok(
data={"status": "healthy", "version": "1.0.0"},
message="Service is running"
)
if __name__ == "__main__":
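The "last added = first executed" comment in this file can be checked without FastAPI. A minimal sketch, assuming each middleware simply wraps the application built so far; `make_middleware` and `build_app` are hypothetical stand-ins for illustration, not Starlette's API:

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    """Build a hypothetical middleware that records when it runs."""
    def wrap(inner: Handler) -> Handler:
        def handler(trace: List[str]) -> None:
            trace.append(f"{name}:before")
            inner(trace)
            trace.append(f"{name}:after")
        return handler
    return wrap


def build_app(endpoint: Handler, middlewares: List[Callable[[Handler], Handler]]) -> Handler:
    """Wrap the endpoint in registration order, so the last one added is outermost."""
    app = endpoint
    for wrap in middlewares:
        app = wrap(app)
    return app


def endpoint(trace: List[str]) -> None:
    trace.append("endpoint")


def run_request() -> List[str]:
    # Same registration order as in the diff: timing first, request ID second.
    app = build_app(endpoint, [make_middleware("timing"), make_middleware("request_id")])
    trace: List[str] = []
    app(trace)
    return trace
```

Under this model the request ID middleware runs before the timing middleware, so `request.state.request_id` would already be set by the time the timing middleware logs the request.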

@@ -14,6 +14,7 @@ class Project(Base, UUIDMixin, TimestampMixin):
name = Column(String(255), nullable=False)
description = Column(Text)
type = Column(String(50), default="qa") # qa, table, database
# Relationships
files = relationship("File", back_populates="project", cascade="all, delete-orphan")
@@ -51,7 +52,7 @@ class Chunk(Base, UUIDMixin, TimestampMixin):
content = Column(Text, nullable=False)
summary = Column(Text)
word_count = Column(Integer)
metadata = Column(JSON) # store additional info like headings, page numbers
extra_data = Column(JSON) # store additional info like headings, page numbers
# Relationships
project = relationship("Project", back_populates="chunks")
@@ -112,7 +113,7 @@ class Dataset(Base, UUIDMixin, TimestampMixin):
name = Column(String(255), nullable=False)
description = Column(Text)
dataset_type = Column(String(50)) # qa, conversation, instruction
metadata = Column(JSON)
extra_data = Column(JSON)
# Relationships
project = relationship("Project", back_populates="datasets")
@@ -125,7 +126,7 @@ class EvalDataset(Base, UUIDMixin, TimestampMixin):
project_id = Column(UUID(as_uuid=True), ForeignKey("projects.id", ondelete="CASCADE"), nullable=False)
name = Column(String(255), nullable=False)
question_type = Column(String(50)) # mixed, fact, reasoning
metadata = Column(JSON)
extra_data = Column(JSON)
# Relationships
project = relationship("Project", back_populates="eval_datasets")
@@ -135,12 +136,14 @@ class ModelConfig(Base, UUIDMixin, TimestampMixin):
"""Model configuration for LLM providers"""
__tablename__ = "model_configs"
project_id = Column(UUID(as_uuid=True), ForeignKey("projects.id", ondelete="CASCADE"), nullable=False)
provider = Column(String(50), nullable=False) # openai, anthropic, ollama, custom
project_id = Column(UUID(as_uuid=True), ForeignKey("projects.id", ondelete="CASCADE"), nullable=True)
provider = Column(String(50), nullable=False) # minimax, glm, openai, ali
model_type = Column(String(50), nullable=False, default="chat") # chat, vlm, embedding, rerank
model_name = Column(String(100))
api_key = Column(String(500))
api_base = Column(String(500))
is_default = Column(String(10), default="false")
connection_status = Column(String(20), default="untested") # untested, connected, disconnected
# Relationships
project = relationship("Project", back_populates="model_configs")

@@ -1,3 +1,101 @@
"""
Pydantic Schemas
"""
from app.schemas.base import (
TimestampMixin,
UUIDMixin,
)
from app.schemas.project import (
ProjectBase,
ProjectCreate,
ProjectUpdate,
ProjectResponse,
)
from app.schemas.file import (
FileBase,
FileCreate,
FileUpdate,
FileResponse,
)
from app.schemas.chunk import (
ChunkBase,
ChunkCreate,
ChunkUpdate,
ChunkResponse,
)
from app.schemas.question import (
QuestionBase,
QuestionCreate,
QuestionUpdate,
QuestionResponse,
)
from app.schemas.dataset import (
DatasetBase,
DatasetCreate,
DatasetUpdate,
DatasetResponse,
)
from app.schemas.eval import (
EvalDatasetBase,
EvalDatasetCreate,
EvalDatasetUpdate,
EvalDatasetResponse,
TaskBase,
TaskResponse,
)
from app.schemas.model import (
ModelBase,
ModelCreate,
ModelUpdate,
ModelResponse,
)
__all__ = [
# Base
"TimestampMixin",
"UUIDMixin",
# Project
"ProjectBase",
"ProjectCreate",
"ProjectUpdate",
"ProjectResponse",
# File
"FileBase",
"FileCreate",
"FileUpdate",
"FileResponse",
# Chunk
"ChunkBase",
"ChunkCreate",
"ChunkUpdate",
"ChunkResponse",
# Question
"QuestionBase",
"QuestionCreate",
"QuestionUpdate",
"QuestionResponse",
# Dataset
"DatasetBase",
"DatasetCreate",
"DatasetUpdate",
"DatasetResponse",
# Eval
"EvalDatasetBase",
"EvalDatasetCreate",
"EvalDatasetUpdate",
"EvalDatasetResponse",
"TaskBase",
"TaskResponse",
# Model
"ModelBase",
"ModelCreate",
"ModelUpdate",
"ModelResponse",
]

@@ -4,7 +4,7 @@ Base Pydantic schemas
from datetime import datetime
from typing import Optional, Any
from uuid import UUID
from pydantic import BaseModel, ConfigDict
from pydantic import BaseModel, ConfigDict, Field
class TimestampMixin(BaseModel):
@@ -18,153 +18,3 @@ class UUIDMixin(BaseModel):
model_config = ConfigDict(from_attributes=True)
id: UUID
class ProjectBase(BaseModel):
"""Base project schema"""
name: str
description: Optional[str] = None
class ProjectCreate(ProjectBase):
"""Project create schema"""
pass
class ProjectUpdate(ProjectBase):
"""Project update schema"""
pass
class ProjectResponse(ProjectBase, UUIDMixin, TimestampMixin):
"""Project response schema"""
pass
class FileBase(BaseModel):
"""Base file schema"""
filename: str
file_type: str
size: Optional[int] = None
class FileResponse(FileBase, UUIDMixin, TimestampMixin):
"""File response schema"""
status: str
class ChunkBase(BaseModel):
"""Base chunk schema"""
name: Optional[str] = None
content: str
summary: Optional[str] = None
word_count: Optional[int] = None
class ChunkCreate(ChunkBase):
"""Chunk create schema"""
file_id: Optional[UUID] = None
class ChunkResponse(ChunkBase, UUIDMixin, TimestampMixin):
"""Chunk response schema"""
pass
class QuestionBase(BaseModel):
"""Base question schema"""
content: str
answer: Optional[str] = None
question_type: Optional[str] = None
class QuestionCreate(QuestionBase):
"""Question create schema"""
chunk_id: Optional[UUID] = None
class QuestionResponse(QuestionBase, UUIDMixin, TimestampMixin):
"""Question response schema"""
source: str
class DatasetBase(BaseModel):
"""Base dataset schema"""
name: str
description: Optional[str] = None
dataset_type: Optional[str] = None
class DatasetCreate(DatasetBase):
"""Dataset create schema"""
pass
class DatasetResponse(DatasetBase, UUIDMixin, TimestampMixin):
"""Dataset response schema"""
question_count: Optional[int] = None
class EvalDatasetBase(BaseModel):
"""Base eval dataset schema"""
name: str
question_type: Optional[str] = None
class EvalDatasetCreate(EvalDatasetBase):
"""Eval dataset create schema"""
pass
class EvalDatasetResponse(EvalDatasetBase, UUIDMixin, TimestampMixin):
"""Eval dataset response schema"""
pass
class TagBase(BaseModel):
"""Base tag schema"""
label: str
parent_id: Optional[UUID] = None
color: Optional[str] = None
class TagCreate(TagBase):
"""Tag create schema"""
pass
class TagResponse(TagBase, UUIDMixin, TimestampMixin):
"""Tag response schema"""
pass
class ModelConfigBase(BaseModel):
"""Base model config schema"""
provider: str
model_name: Optional[str] = None
api_key: Optional[str] = None
api_base: Optional[str] = None
is_default: Optional[str] = "false"
class ModelConfigCreate(ModelConfigBase):
"""Model config create schema"""
pass
class ModelConfigResponse(ModelConfigBase, UUIDMixin, TimestampMixin):
"""Model config response schema"""
pass
class TaskBase(BaseModel):
"""Base task schema"""
task_type: str
status: Optional[str] = "pending"
progress: Optional[int] = 0
class TaskResponse(TaskBase, UUIDMixin, TimestampMixin):
"""Task response schema"""
result: Optional[Any] = None
error: Optional[str] = None

@@ -0,0 +1,46 @@
"""
Chunk Schemas
"""
from datetime import datetime
from typing import Optional, Any
from uuid import UUID
from pydantic import BaseModel, ConfigDict, Field
class ChunkBase(BaseModel):
"""Base chunk schema"""
name: Optional[str] = Field(None, max_length=255)
content: str = Field(..., min_length=1)
summary: Optional[str] = None
word_count: Optional[int] = None
extra_data: Optional[dict] = None
class ChunkCreate(ChunkBase):
"""Chunk create schema"""
project_id: Optional[UUID] = None
file_id: Optional[UUID] = None
class ChunkUpdate(BaseModel):
"""Chunk update schema"""
name: Optional[str] = Field(None, max_length=255)
content: Optional[str] = Field(None, min_length=1)
summary: Optional[str] = None
extra_data: Optional[dict] = None
class ChunkResponse(ChunkBase):
"""Chunk response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
project_id: UUID
file_id: Optional[UUID]
created_at: datetime
updated_at: datetime
# Alias for CRUD
ChunkCreateSchema = ChunkCreate
ChunkUpdateSchema = ChunkUpdate
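The `*Update` schemas in these files make every field optional so that a client can send only the fields it wants to change. A minimal sketch of applying such a payload, assuming a plain dict stands in for the stored row and `None` means "not supplied"; the `apply_update` helper is hypothetical and not part of this change set (with Pydantic v2 models, `model_dump(exclude_unset=True)` is the more usual way to get only the supplied fields):

```python
from typing import Any, Dict, Optional


def apply_update(record: Dict[str, Any], changes: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Return a copy of record with only the supplied (non-None) fields replaced."""
    updated = dict(record)
    for field, value in changes.items():
        if value is not None:
            updated[field] = value
    return updated


# Hypothetical stored chunk and a partial update that only touches the summary
stored_chunk = {"name": "intro", "content": "Hello world", "summary": None}
partial_update = {"name": None, "content": None, "summary": "A greeting"}
```

One limitation of treating `None` as "not supplied" is that a client cannot explicitly clear a field; tracking which fields were actually set avoids that ambiguity.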

@@ -0,0 +1,43 @@
"""
Dataset Schemas
"""
from datetime import datetime
from typing import Optional, Any
from uuid import UUID
from pydantic import BaseModel, ConfigDict, Field
class DatasetBase(BaseModel):
"""Base dataset schema"""
name: str = Field(..., min_length=1, max_length=255)
description: Optional[str] = Field(None, max_length=2000)
dataset_type: Optional[str] = Field(None, max_length=50)
extra_data: Optional[dict] = None
class DatasetCreate(DatasetBase):
"""Dataset create schema"""
pass
class DatasetUpdate(BaseModel):
"""Dataset update schema"""
name: Optional[str] = Field(None, min_length=1, max_length=255)
description: Optional[str] = Field(None, max_length=2000)
dataset_type: Optional[str] = Field(None, max_length=50)
extra_data: Optional[dict] = None
class DatasetResponse(DatasetBase):
"""Dataset response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
project_id: UUID
created_at: datetime
updated_at: datetime
# Alias for CRUD
DatasetCreateSchema = DatasetCreate
DatasetUpdateSchema = DatasetUpdate

@@ -0,0 +1,60 @@
"""
Evaluation Dataset Schemas
"""
from datetime import datetime
from typing import Optional, Any
from uuid import UUID
from pydantic import BaseModel, ConfigDict, Field
class EvalDatasetBase(BaseModel):
"""Base eval dataset schema"""
name: str = Field(..., min_length=1, max_length=255)
question_type: Optional[str] = Field("mixed", max_length=50)
extra_data: Optional[dict] = None
class EvalDatasetCreate(EvalDatasetBase):
"""Eval dataset create schema"""
pass
class EvalDatasetUpdate(BaseModel):
"""Eval dataset update schema"""
name: Optional[str] = Field(None, min_length=1, max_length=255)
question_type: Optional[str] = Field(None, max_length=50)
extra_data: Optional[dict] = None
class EvalDatasetResponse(EvalDatasetBase):
"""Eval dataset response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
project_id: UUID
created_at: datetime
updated_at: datetime
class TaskBase(BaseModel):
"""Base task schema"""
task_type: str = Field(..., max_length=50)
status: Optional[str] = "pending"
progress: Optional[int] = Field(0, ge=0, le=100)
result: Optional[Any] = None
error: Optional[str] = None
class TaskResponse(TaskBase):
"""Task response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
project_id: Optional[UUID]
created_at: datetime
updated_at: datetime
# Alias for CRUD
EvalDatasetCreateSchema = EvalDatasetCreate
EvalDatasetUpdateSchema = EvalDatasetUpdate

@@ -0,0 +1,43 @@
"""
File Schemas
"""
from datetime import datetime
from typing import Optional
from uuid import UUID
from pydantic import BaseModel, ConfigDict, Field
class FileBase(BaseModel):
"""Base file schema"""
filename: str = Field(..., min_length=1, max_length=255)
file_type: str = Field(..., max_length=50)
size: Optional[int] = None
class FileCreate(FileBase):
"""File create schema"""
project_id: UUID
file_path: Optional[str] = None
status: str = "pending"
class FileUpdate(BaseModel):
"""File update schema"""
status: Optional[str] = None
class FileResponse(FileBase):
"""File response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
project_id: UUID
file_path: Optional[str]
status: str
created_at: datetime
updated_at: datetime
# Alias for CRUD
FileCreateSchema = FileCreate
FileUpdateSchema = FileUpdate

@@ -0,0 +1,43 @@
"""
Model Schema
"""
from pydantic import BaseModel, Field, ConfigDict
from typing import Optional
from datetime import datetime
from uuid import UUID
class ModelBase(BaseModel):
"""Base model schema"""
provider: str = Field(..., description="Model provider: minimax, glm, openai, ali")
model_type: str = Field(default="chat", description="Model type: chat, vlm, embedding, rerank")
model_name: str = Field(..., description="Model name")
api_key: Optional[str] = Field(None, description="API key")
api_base: Optional[str] = Field(None, description="API base URL")
is_default: str = Field(default="false", description="Is default model: true/false")
class ModelCreate(ModelBase):
"""Model creation schema"""
pass
class ModelUpdate(BaseModel):
"""Model update schema"""
provider: Optional[str] = None
model_type: Optional[str] = None
model_name: Optional[str] = None
api_key: Optional[str] = None
api_base: Optional[str] = None
is_default: Optional[str] = None
class ModelResponse(ModelBase):
"""Model response schema"""
id: UUID
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
project_id: Optional[UUID] = None
connection_status: Optional[str] = Field(default="untested")
model_config = ConfigDict(from_attributes=True)

@@ -0,0 +1,40 @@
"""
Project Schemas
"""
from datetime import datetime
from typing import Optional
from uuid import UUID
from pydantic import BaseModel, ConfigDict, Field
class ProjectBase(BaseModel):
"""Base project schema"""
name: str = Field(..., min_length=1, max_length=255)
description: Optional[str] = Field(None, max_length=2000)
type: str = Field(default="qa") # qa, table, database
class ProjectCreate(ProjectBase):
"""Project create schema"""
pass
class ProjectUpdate(BaseModel):
"""Project update schema"""
name: Optional[str] = Field(None, min_length=1, max_length=255)
description: Optional[str] = Field(None, max_length=2000)
type: Optional[str] = Field(None)
class ProjectResponse(ProjectBase):
"""Project response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
created_at: datetime
updated_at: datetime
# Alias for CRUD
ProjectCreateSchema = ProjectCreate
ProjectUpdateSchema = ProjectUpdate

@@ -0,0 +1,43 @@
"""
Question Schemas
"""
from datetime import datetime
from typing import Optional
from uuid import UUID
from pydantic import BaseModel, ConfigDict, Field
class QuestionBase(BaseModel):
"""Base question schema"""
content: str = Field(..., min_length=1)
answer: Optional[str] = None
question_type: Optional[str] = Field(None, max_length=50)
source: Optional[str] = "manual"
class QuestionCreate(QuestionBase):
"""Question create schema"""
chunk_id: Optional[UUID] = None
class QuestionUpdate(BaseModel):
"""Question update schema"""
content: Optional[str] = Field(None, min_length=1)
answer: Optional[str] = None
question_type: Optional[str] = Field(None, max_length=50)
class QuestionResponse(QuestionBase):
"""Question response schema"""
model_config = ConfigDict(from_attributes=True)
id: UUID
project_id: UUID
chunk_id: Optional[UUID]
created_at: datetime
updated_at: datetime
# Alias for CRUD
QuestionCreateSchema = QuestionCreate
QuestionUpdateSchema = QuestionUpdate

@@ -1,8 +1,9 @@
"""
DOCX Text Extractor
"""
import asyncio
from typing import Dict
from docx import Document
from typing import Dict, List
class DOCXProcessor:
@@ -26,6 +27,12 @@ class DOCXProcessor:
return "\n\n".join(text_parts)
async def extract_text_async(self, file_path: str) -> str:
"""Extract all text from DOCX asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_text, file_path
)
def extract_with_metadata(self, file_path: str) -> Dict:
"""Extract text with DOCX metadata"""
doc = Document(file_path)
@@ -46,8 +53,14 @@ class DOCXProcessor:
return result
async def extract_with_metadata_async(self, file_path: str) -> Dict:
"""Extract with metadata asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_with_metadata, file_path
)
def process_docx(file_path: str) -> str:
async def process_docx(file_path: str) -> str:
"""Process DOCX file and return text"""
processor = DOCXProcessor()
return processor.extract_text(file_path)
return await processor.extract_text_async(file_path)
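The `*_async` wrappers in this change push blocking parsing onto the default thread pool. A minimal sketch of the same pattern, assuming a stand-in `blocking_extract` function (hypothetical) in place of the real parser; it uses `asyncio.get_running_loop()`, which is the preferred call inside a coroutine:

```python
import asyncio
import time
from typing import List


def blocking_extract(file_path: str) -> str:
    """Hypothetical stand-in for a blocking parser such as DOCXProcessor.extract_text."""
    time.sleep(0.05)  # simulate slow file parsing
    return f"text from {file_path}"


async def extract_async(file_path: str) -> str:
    """Run the blocking call in the default executor so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_extract, file_path)


async def extract_many(paths: List[str]) -> List[str]:
    """Extract several files concurrently; results keep the input order."""
    return await asyncio.gather(*(extract_async(p) for p in paths))


def demo() -> List[str]:
    return asyncio.run(extract_many(["a.docx", "b.docx"]))
```

Because the work runs in threads, several files can be parsed while the event loop keeps serving other requests; CPU-bound parsing is still limited by the interpreter lock.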

@@ -1,8 +1,9 @@
"""
Excel/CSV Text Extractor
"""
import pandas as pd
import asyncio
from typing import Dict, List
import pandas as pd
class ExcelProcessor:
@@ -13,6 +14,12 @@ class ExcelProcessor:
df = pd.read_csv(file_path)
return self._dataframe_to_text(df)
async def extract_csv_async(self, file_path: str) -> str:
"""Extract CSV asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_csv, file_path
)
def extract_excel(self, file_path: str, sheet_name: str = None) -> str:
"""Extract text from Excel file"""
if sheet_name:
@@ -27,6 +34,12 @@ class ExcelProcessor:
text_parts.append(self._dataframe_to_text(df))
return "\n\n".join(text_parts)
async def extract_excel_async(self, file_path: str, sheet_name: str = None) -> str:
"""Extract Excel asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_excel, file_path, sheet_name
)
def _dataframe_to_text(self, df: pd.DataFrame) -> str:
"""Convert DataFrame to readable text"""
text_parts = []
@@ -48,19 +61,25 @@ class ExcelProcessor:
sheets = pd.read_excel(file_path, sheet_name=None)
return {name: self._dataframe_to_text(df) for name, df in sheets.items()}
async def extract_all_sheets_async(self, file_path: str) -> Dict[str, str]:
"""Extract all sheets asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_all_sheets, file_path
)
def get_sheet_names(self, file_path: str) -> List[str]:
"""Get all sheet names from Excel file"""
xl = pd.ExcelFile(file_path)
return xl.sheet_names
def process_csv(file_path: str) -> str:
async def process_csv(file_path: str) -> str:
"""Process CSV file and return text"""
processor = ExcelProcessor()
return processor.extract_csv(file_path)
return await processor.extract_csv_async(file_path)
def process_excel(file_path: str) -> str:
async def process_excel(file_path: str) -> str:
"""Process Excel file and return text"""
processor = ExcelProcessor()
return processor.extract_excel(file_path)
return await processor.extract_excel_async(file_path)

@@ -1,8 +1,9 @@
"""
PDF Text Extractor
"""
import asyncio
from typing import Dict, List
import pdfplumber
from typing import Dict, List, Optional
class PDFProcessor:
@@ -20,6 +21,12 @@ class PDFProcessor:
return "\n\n".join(text_parts)
async def extract_text_async(self, file_path: str) -> str:
"""Extract all text from PDF asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_text, file_path
)
def extract_pages(self, file_path: str) -> List[Dict]:
"""Extract text page by page with metadata"""
pages = []
@@ -36,6 +43,12 @@ class PDFProcessor:
return pages
async def extract_pages_async(self, file_path: str) -> List[Dict]:
"""Extract pages asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_pages, file_path
)
def extract_with_metadata(self, file_path: str) -> Dict:
"""Extract text with PDF metadata"""
result = {
@@ -58,8 +71,14 @@ class PDFProcessor:
return result
async def extract_with_metadata_async(self, file_path: str) -> Dict:
"""Extract with metadata asynchronously"""
return await asyncio.get_event_loop().run_in_executor(
None, self.extract_with_metadata, file_path
)
def process_pdf(file_path: str) -> str:
async def process_pdf(file_path: str) -> str:
"""Process PDF file and return text"""
processor = PDFProcessor()
return processor.extract_with_metadata(file_path)["text"]
return (await processor.extract_with_metadata_async(file_path))["text"]

@@ -0,0 +1,407 @@
"""
Semantic Text Splitter using Online Embedding APIs
"""
import re
import asyncio
import httpx
import numpy as np
from typing import List, Dict, Optional
from abc import ABC, abstractmethod
from langchain_text_splitters import RecursiveCharacterTextSplitter
class EmbeddingProvider(ABC):
"""Base class for embedding API providers"""
@abstractmethod
async def get_embeddings(self, texts: List[str]) -> List[List[float]]:
"""Return embedding vectors for the given texts"""
pass
class OpenAIEmbedding(EmbeddingProvider):
"""OpenAI-compatible embedding API"""
def __init__(self, api_key: str, base_url: str, model: str = "text-embedding-3-small"):
self.api_key = api_key
self.base_url = base_url.rstrip('/')
self.model = model
async def get_embeddings(self, texts: List[str]) -> List[List[float]]:
"""Call an OpenAI-compatible embedding API"""
async with httpx.AsyncClient(timeout=60.0) as client:
headers = {
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json"
}
# OpenAI request format
payload = {
"input": texts,
"model": self.model
}
response = await client.post(
f"{self.base_url}/embeddings",
headers=headers,
json=payload
)
response.raise_for_status()
data = response.json()
# Extract the embeddings
return [item["embedding"] for item in data["data"]]
class MiniMaxEmbedding(EmbeddingProvider):
"""MiniMax Embedding API"""
def __init__(self, api_key: str, base_url: str = "https://api.minimax.chat/v1"):
self.api_key = api_key
self.base_url = base_url.rstrip('/')
async def get_embeddings(self, texts: List[str]) -> List[List[float]]:
"""Call the MiniMax embedding API"""
async with httpx.AsyncClient(timeout=60.0) as client:
headers = {
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json"
}
# MiniMax request format
payload = {
"texts": texts,
"model": "embo-01"
}
response = await client.post(
f"{self.base_url}/text_embeddings",
headers=headers,
json=payload
)
response.raise_for_status()
data = response.json()
# MiniMax's response format may differ and may need adapting
if "data" in data:
return [item["embedding"] for item in data["data"]]
return []
class EmbeddingSplitter:
"""Base class for embedding-based semantic splitters"""
def __init__(
self,
chunk_size: int = 500,
overlap: int = 50,
embedding_provider: Optional[EmbeddingProvider] = None,
similarity_threshold: float = 0.3,
min_chunk_size: int = 100,
window_size: int = 3
):
self.chunk_size = chunk_size
self.overlap = overlap
self.embedding_provider = embedding_provider
self.similarity_threshold = similarity_threshold
self.min_chunk_size = min_chunk_size
self.window_size = window_size
def _tokenize_sentences(self, text: str) -> List[str]:
"""Split text into sentences"""
paragraphs = re.split(r'\n\s*\n+', text)
sentences = []
for para in paragraphs:
para = para.strip()
if not para:
continue
parts = re.split(r'(?<=[。!?;.!?])\s+|(?<=[。!?;])', para)
buffer = []
for part in parts:
part = part.strip()
if not part:
continue
# Merge very short fragments into the preceding sentence so the embedding granularity is not too fine
if len(part) < 8 and buffer:
buffer[-1] = f"{buffer[-1]} {part}".strip()
else:
buffer.append(part)
sentences.extend(buffer)
return sentences
def _compute_similarities(self, embeddings: List[List[float]]) -> List[float]:
"""Compute the cosine similarity between adjacent sentences"""
similarities = []
for i in range(len(embeddings) - 1):
# Cosine similarity
vec1 = np.array(embeddings[i])
vec2 = np.array(embeddings[i + 1])
# Normalize
vec1 = vec1 / (np.linalg.norm(vec1) + 1e-8)
vec2 = vec2 / (np.linalg.norm(vec2) + 1e-8)
# After normalization, the dot product equals the cosine similarity
sim = np.dot(vec1, vec2)
similarities.append(float(sim))
return similarities
def _smooth_similarities(self, similarities: List[float]) -> List[float]:
"""Smooth the similarities with a sliding window"""
if not similarities:
return []
window = max(1, self.window_size)
smoothed = []
for i in range(len(similarities)):
start = max(0, i - window)
end = min(len(similarities), i + window + 1)
window_vals = similarities[start:end]
smoothed.append(sum(window_vals) / len(window_vals))
return smoothed
def _detect_boundaries(self, similarities: List[float], sentence_lengths: List[int]) -> List[int]:
"""Detect split points (positions where similarity drops sharply)"""
if not similarities:
return [0]
smoothed = self._smooth_similarities(similarities)
if len(smoothed) <= 1:
return [0]
mean_sim = float(np.mean(smoothed))
std_sim = float(np.std(smoothed))
dynamic_threshold = max(0.0, min(0.95, mean_sim - 0.5 * std_sim))
effective_threshold = max(self.similarity_threshold, dynamic_threshold)
boundaries = [0] # starting point
accumulated_chars = 0
for i, sim in enumerate(smoothed):
accumulated_chars += sentence_lengths[i]
left_sim = smoothed[i - 1] if i > 0 else 1.0
right_sim = smoothed[i + 1] if i < len(smoothed) - 1 else 1.0
is_local_min = sim <= left_sim and sim <= right_sim
has_enough_context = accumulated_chars >= self.min_chunk_size
oversize_guard = accumulated_chars >= self.chunk_size
if (is_local_min and has_enough_context and sim <= effective_threshold) or oversize_guard:
boundaries.append(i + 1)
accumulated_chars = 0
boundaries.append(len(sentence_lengths))
return sorted(list(set(boundaries)))
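The boundary detection above rests on three small calculations: adjacent cosine similarity, sliding-window smoothing, and a threshold derived from the mean and standard deviation. A minimal standard-library sketch of those calculations, assuming the same constants as the diff (`1e-8` epsilon, `0.5` standard-deviation factor, `0.95` cap); the function names are hypothetical, and `statistics.pstdev` is used because `np.std` computes the population standard deviation by default:

```python
import math
from statistics import mean, pstdev
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a)) + 1e-8
    norm_b = math.sqrt(sum(y * y for y in b)) + 1e-8
    return dot / (norm_a * norm_b)


def smooth(values: List[float], window: int) -> List[float]:
    """Average each value with up to `window` neighbours on each side."""
    window = max(1, window)
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def effective_threshold(smoothed: List[float], floor: float) -> float:
    """Return max(floor, mean - 0.5 * std), with the dynamic part clamped to [0, 0.95]."""
    dynamic = max(0.0, min(0.95, mean(smoothed) - 0.5 * pstdev(smoothed)))
    return max(floor, dynamic)
```

Because the effective threshold is the larger of the two values, the configured `similarity_threshold` acts as a floor: when the text is uniformly similar, the dynamic value rises with the mean, and only local minima at or below it become split points.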
def _assemble_chunks(self, sentences: List[str], boundaries: List[int]) -> List[Dict]:
"""Assemble chunks from the split points"""
if not sentences:
return []
# Re-anchor boundaries so they start at 0 and end at the sentence count
if not boundaries or boundaries[0] != 0:
boundaries = [0] + boundaries
if boundaries[-1] != len(sentences):
boundaries.append(len(sentences))
chunks = []
for i in range(len(boundaries) - 1):
start = boundaries[i]
end = boundaries[i + 1]
if start >= end:
continue
chunk_text = ' '.join(sentences[start:end]).strip()
if not chunk_text:
continue
# If the chunk is too large, split it further
if len(chunk_text) > self.chunk_size * 1.5:
# Split again by fixed length
sub_chunks = self._split_large_chunk(sentences[start:end])
for j, sub in enumerate(sub_chunks):
chunks.append({
"index": len(chunks),
"content": sub.strip(),
"word_count": len(sub.split()),
"char_count": len(sub)
})
else:
chunks.append({
"index": len(chunks),
"content": chunk_text.strip(),
"word_count": len(chunk_text.split()),
"char_count": len(chunk_text)
})
# Merge adjacent chunks that are too small
chunks = self._merge_small_chunks(chunks)
return chunks
def _split_large_chunk(self, sentences: List[str]) -> List[str]:
"""Split an oversized chunk"""
# Split by fixed length
result = []
current = ""
for sent in sentences:
if len(current) + len(sent) > self.chunk_size:
if current:
result.append(current)
current = sent
else:
current += " " + sent if current else sent
if current:
result.append(current)
return result
def _merge_small_chunks(self, chunks: List[Dict]) -> List[Dict]:
"""Merge adjacent chunks that are too small"""
if len(chunks) <= 1:
return chunks
merged = [chunks[0]]
for chunk in chunks[1:]:
previous = merged[-1]
should_merge = (
previous["char_count"] < self.min_chunk_size or
chunk["char_count"] < self.min_chunk_size
)
if should_merge and previous["char_count"] + chunk["char_count"] <= self.chunk_size * 1.5:
previous["content"] += " " + chunk["content"]
previous["word_count"] += chunk["word_count"]
previous["char_count"] += chunk["char_count"]
else:
merged.append(chunk)
for index, chunk in enumerate(merged):
chunk["index"] = index
return merged
async def split_with_embedding(self, text: str) -> List[Dict]:
"""Split text semantically using embeddings"""
# 1. Split into sentences
sentences = self._tokenize_sentences(text)
if not sentences:
return []
# Filter out pure-noise fragments while keeping normal short sentences
sentences = [s for s in sentences if len(s.strip()) >= 4]
if not sentences:
return []
# 2. If there is only one sentence, return it directly
if len(sentences) == 1:
return [{
"index": 0,
"content": sentences[0],
"word_count": len(sentences[0].split()),
"char_count": len(sentences[0])
}]
# 3. Call the embedding API
try:
if self.embedding_provider is None:
raise ValueError("embedding provider is not configured")
embeddings = await self.embedding_provider.get_embeddings(sentences)
except Exception as e:
# If embedding fails, fall back to rule-based splitting
print(f"Embedding failed, falling back to rule-based: {e}")
return self._fallback_split(text)
if len(embeddings) != len(sentences):
return self._fallback_split(text)
# 4. Compute similarities
similarities = self._compute_similarities(embeddings)
# 5. Detect split points
boundaries = self._detect_boundaries(similarities, [len(sentence) for sentence in sentences])
# 6. Assemble chunks
chunks = self._assemble_chunks(sentences, boundaries)
return chunks
def _fallback_split(self, text: str) -> List[Dict]:
"""Fall back to rule-based splitting"""
# Use langchain's RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
chunk_size=self.chunk_size,
chunk_overlap=self.overlap,
separators=["\n\n", "\n", "。", "!", "?", ". ", "! ", "? "]
)
chunks = splitter.split_text(text)
return [{
"index": i,
"content": c.strip(),
"word_count": len(c.split()),
"char_count": len(c)
} for i, c in enumerate(chunks)]
class SemanticEmbeddingSplitter(EmbeddingSplitter):
"""Semantic splitter backed by an online embedding API"""
def __init__(
self,
chunk_size: int = 500,
overlap: int = 50,
embedding_provider: Optional[EmbeddingProvider] = None,
similarity_threshold: float = 0.3,
min_chunk_size: int = 100,
window_size: int = 3
):
super().__init__(
chunk_size=chunk_size,
overlap=overlap,
embedding_provider=embedding_provider,
similarity_threshold=similarity_threshold,
min_chunk_size=min_chunk_size,
window_size=window_size
)
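def split(self, text: str) -> List[Dict]:
"""Synchronous interface that calls the async implementation internally"""
# split() is synchronous, so an event loop is needed to run the coroutine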
try:
loop = asyncio.get_event_loop()
if loop.is_running():
# Inside a running event loop: run the coroutine in a separate thread
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor() as pool:
future = pool.submit(asyncio.run, self.split_with_embedding(text))
return future.result()
else:
return loop.run_until_complete(self.split_with_embedding(text))
except RuntimeError:
# No event loop available: create one directly
return asyncio.run(self.split_with_embedding(text))
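The synchronous `split` wrapper has to work both with and without a running event loop. A minimal sketch of that bridge, assuming a trivial `compute` coroutine (hypothetical) in place of `split_with_embedding`; it probes with `asyncio.get_running_loop()` rather than `get_event_loop()`, which is a swapped-in simplification and not the code in the diff:

```python
import asyncio
import concurrent.futures


async def compute(value: int) -> int:
    """Hypothetical coroutine standing in for split_with_embedding."""
    await asyncio.sleep(0)
    return value * 2


def run_sync(value: int) -> int:
    """Run the coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run can own one.
        return asyncio.run(compute(value))
    # A loop is already running here; give the coroutine its own loop in a worker thread.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, compute(value)).result()


async def call_from_async(value: int) -> int:
    # Note: this blocks the running loop until the worker thread finishes.
    return run_sync(value)
```

The worker-thread path keeps the call correct but still blocks the caller's event loop while waiting, so async callers are usually better served by awaiting the coroutine directly.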
def create_embedding_provider(provider: str, api_key: str, base_url: str, model: str = None) -> EmbeddingProvider:
"""Create an embedding provider"""
if provider in ["openai", "compatible", "ali", "glm"]:
return OpenAIEmbedding(api_key, base_url, model or "text-embedding-3-small")
elif provider == "minimax":
return MiniMaxEmbedding(api_key, base_url)
else:
raise ValueError(f"Unsupported embedding provider: {provider}")

@@ -3,6 +3,7 @@ Text Splitter
"""
import re
from typing import List, Dict, Optional
from langchain_text_splitters import RecursiveCharacterTextSplitter
class TextSplitter:
@@ -18,51 +19,29 @@ class TextSplitter:
class RecursiveTextSplitter(TextSplitter):
"""Recursive character text splitter"""
"""Recursive character text splitter using langchain"""
def __init__(self, chunk_size: int = 500, overlap: int = 50, separators: List[str] = None):
super().__init__(chunk_size, overlap)
self.separators = separators or ["\n\n", "\n", ". ", " ", ""]
self.splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=overlap,
separators=separators or [
"\n\n", "\n", ". ", " ", ",", ""
]
)
def split(self, text: str) -> List[Dict]:
"""Split text recursively"""
chunks = []
current_chunk = ""
chunk_index = 0
for separator in self.separators:
if separator in text:
parts = text.split(separator)
for part in parts:
if len(current_chunk) + len(part) > self.chunk_size:
if current_chunk:
chunks.append({
"index": chunk_index,
"content": current_chunk.strip(),
"word_count": len(current_chunk.split())
chunks = self.splitter.split_text(text)
result = []
for i, chunk in enumerate(chunks):
result.append({
"index": i,
"content": chunk.strip(),
"word_count": len(chunk.split())
})
chunk_index += 1
# Handle overlap
if self.overlap > 0 and chunks:
overlap_text = " ".join(chunks[-1]["content"].split()[-self.overlap:])
current_chunk = overlap_text + separator + part
else:
current_chunk = part
else:
current_chunk += separator + part if current_chunk else part
if current_chunk:
chunks.append({
"index": chunk_index,
"content": current_chunk.strip(),
"word_count": len(current_chunk.split())
})
break
else:
continue
return chunks
return result
class MarkdownStructureSplitter(TextSplitter):
@@ -236,13 +215,199 @@ class CustomSplitter(TextSplitter):
def get_splitter(method: str, **kwargs) -> TextSplitter:
"""Get text splitter by method name"""
# Import the embedding splitter
from .semantic_embedding import (
SemanticEmbeddingSplitter,
create_embedding_provider
)
splitters = {
"recursive": RecursiveTextSplitter,
"markdown_structure": MarkdownStructureSplitter,
"token": TokenSplitter,
"code": CodeSplitter,
"custom": CustomSplitter
"custom": CustomSplitter,
"semantic": SemanticSentenceSplitter, # semantic split (paragraphs first, then sentences)
"semantic_embedding": None, # needs special handling
"sentence": SentenceSplitter, # strict one-sentence-per-chunk split
"paragraph": ParagraphSplitter, # split by paragraph
}
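# Special handling for the embedding splitter
if method == "semantic_embedding":
# Extract the embedding-related parameters
embedding_provider = kwargs.pop('embedding_provider', None)
if embedding_provider is None:
# No provider was passed in, so fall back to the default configuration
# Read the model configuration from kwargs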
provider = kwargs.pop('embedding_provider_type', 'openai')
api_key = kwargs.pop('embedding_api_key', '')
base_url = kwargs.pop('embedding_base_url', 'https://api.minimax.chat/v1')
model = kwargs.pop('embedding_model', 'text-embedding-3-small')
if api_key:
embedding_provider = create_embedding_provider(
provider, api_key, base_url, model
)
# Create the splitter
if embedding_provider:
return SemanticEmbeddingSplitter(
embedding_provider=embedding_provider,
**kwargs
)
else:
# No embedding provider available; fall back to "semantic"
method = "semantic"
splitter_class = splitters.get(method, RecursiveTextSplitter)
return splitter_class(**kwargs)
class SemanticSentenceSplitter(TextSplitter):
    """Semantic splitter: splits on paragraphs first, then on sentences."""

    def __init__(self, chunk_size: int = 500, overlap: int = 50):
        super().__init__(chunk_size, overlap)
        self.splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=overlap,
            separators=[
                "\n\n",  # paragraph break, highest priority
                "。",  # Chinese full stop
                "!",  # Chinese exclamation mark
                "?",  # Chinese question mark
                ". ",  # English full stop
                "! ",  # English exclamation mark
                "? ",  # English question mark
                "\n",  # line break
                " ",  # space
            ],
            length_function=self._count_chars
        )

    def _count_chars(self, text: str) -> int:
        # Weighted length: each CJK ideograph counts as 1, any other character as 1.5
        chinese_chars = len(re.findall(r'[\u4e00-\u9fff]', text))
        other_chars = len(re.sub(r'[\u4e00-\u9fff]', '', text))
        return chinese_chars + int(other_chars * 1.5)

    def split(self, text: str) -> List[Dict]:
        chunks = self.splitter.split_text(text)
        result = []
        for i, chunk in enumerate(chunks):
            result.append({
                "index": i,
                "content": chunk.strip(),
                "word_count": len(chunk.split()),
                "char_count": len(chunk)
            })
        return result
class SentenceSplitter(TextSplitter):
    """Strict sentence-level splitting: each chunk is a single sentence."""

    def __init__(self, chunk_size: int = 200, overlap: int = 0):
        super().__init__(chunk_size, overlap)
        # Split only on sentence terminators, without merging
        self.splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=overlap,
            separators=[
                "。",  # Chinese full stop
                "!",  # Chinese exclamation mark
                "?",  # Chinese question mark
                ". ",  # English full stop
                "! ",  # English exclamation mark
                "? ",  # English question mark
                "\n",  # line break
                " ",  # space
            ],
            length_function=len
        )

    def split(self, text: str) -> List[Dict]:
        chunks = self.splitter.split_text(text)
        result = []
        for chunk in chunks:
            chunk = chunk.strip()
            if chunk:  # skip empty chunks
                result.append({
                    # len(result) keeps indices contiguous when a chunk is skipped
                    "index": len(result),
                    "content": chunk,
                    "word_count": len(chunk.split()),
                    "char_count": len(chunk)
                })
        return result
class ParagraphSplitter(TextSplitter):
    """Paragraph splitter: paragraphs are separated by blank lines."""

    def __init__(self, chunk_size: int = 2000, overlap: int = 100):
        overlap = min(overlap, chunk_size // 2)  # cap overlap at half of chunk_size so it stays below chunk_size
        super().__init__(chunk_size, overlap)

    def split(self, text: str) -> List[Dict]:
        # Split into paragraphs on blank lines
        paragraphs = re.split(r'\n\s*\n', text)
        result = []
        current_chunk = ""
        chunk_index = 0
        for para in paragraphs:
            para = para.strip()
            if not para:
                continue
            # If a single paragraph exceeds chunk_size, split it recursively
            if len(para) > self.chunk_size:
                if current_chunk:
                    result.append({
                        "index": chunk_index,
                        "content": current_chunk.strip(),
                        "word_count": len(current_chunk.split()),
                        "char_count": len(current_chunk)
                    })
                    chunk_index += 1
                    current_chunk = ""
                # Recursively split the oversized paragraph
                sub_splitter = RecursiveCharacterTextSplitter(
                    chunk_size=self.chunk_size,
                    chunk_overlap=self.overlap,
                    separators=["\n", "。", "!", "?", ". ", "! ", "? "]
                )
                sub_chunks = sub_splitter.split_text(para)
                for sub in sub_chunks:
                    result.append({
                        "index": chunk_index,
                        "content": sub.strip(),
                        "word_count": len(sub.split()),
                        "char_count": len(sub)
                    })
                    chunk_index += 1
            else:
                if len(current_chunk) + len(para) > self.chunk_size:
                    if current_chunk:
                        result.append({
                            "index": chunk_index,
                            "content": current_chunk.strip(),
                            "word_count": len(current_chunk.split()),
                            "char_count": len(current_chunk)
                        })
                        chunk_index += 1
                        current_chunk = ""
                current_chunk += para + "\n\n"
        # Append the final chunk
        if current_chunk.strip():
            result.append({
                "index": chunk_index,
                "content": current_chunk.strip(),
                "word_count": len(current_chunk.split()),
                "char_count": len(current_chunk)
            })
        return result
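
The `_count_chars` length function in `SemanticSentenceSplitter` changes what `chunk_size` means: a CJK ideograph costs 1 unit and every other character costs 1.5. The sketch below restates that rule as a standalone function so the arithmetic can be checked in isolation; `weighted_length` and `CJK_PATTERN` are illustrative names, not part of the project.

```python
import re

# Same character range as the splitter uses for Chinese text.
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def weighted_length(text: str, other_weight: float = 1.5) -> int:
    """Count CJK ideographs as 1 unit and all other characters as `other_weight`."""
    cjk = len(CJK_PATTERN.findall(text))
    other = len(text) - cjk  # everything that is not a CJK ideograph
    return cjk + int(other * other_weight)


# Two ideographs cost 2 units; six other characters (" world") cost int(6 * 1.5) = 9.
example = weighted_length("你好 world")
```

Under this weighting, a `chunk_size` of 500 leaves room for about 500 Chinese characters but only about 333 non-CJK characters, so chunks of English text come out shorter in raw characters than chunks of Chinese text.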

File diff suppressed because it is too large Load Diff

125
backend/pyproject.toml Normal file
View File

@@ -0,0 +1,125 @@
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "yg-dataset"
version = "1.0.0"
description = "Dataset Generation Platform API"
readme = "README.md"
requires-python = ">=3.11"
license = {text = "MIT"}
authors = [
{name = "YG-Dataset Team", email = "team@yg-dataset.com"}
]
keywords = ["dataset", "machine-learning", "llm", "data-generation"]
classifiers = [
"Development Status :: 4 - Beta",
"Framework :: FastAPI",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
dependencies = [
"fastapi>=0.115.0",
"uvicorn[standard]>=0.30.0",
"python-multipart>=0.0.9",
"sqlalchemy>=2.0.0",
"alembic>=1.13.0",
"pydantic>=2.0.0",
"pydantic-settings>=2.0.0",
"pdfplumber>=0.10.4",
"python-docx>=1.1.0",
"openpyxl>=3.1.2",
"pandas>=2.2.0",
"ebooklib>=0.5",
"PyMuPDF>=1.24.0",
"langchain>=0.3.0",
"langchain-community>=0.2.0",
"langchain-openai>=0.1.0",
"tiktoken>=0.7.0",
"python-dotenv>=1.0.0",
"python-dateutil>=2.8.2",
"httpx>=0.27.0",
"aiofiles>=23.2.1",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"pytest-cov>=4.1.0",
"ruff>=0.3.0",
"mypy>=1.8.0",
"pre-commit>=3.6.0",
"black>=24.2.0",
]
[tool.uv]
dev-dependencies = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"pytest-cov>=4.1.0",
"ruff>=0.3.0",
"mypy>=1.8.0",
]
[tool.ruff]
line-length = 100
target-version = "py311"
[tool.ruff.lint]
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # pyflakes
"I", # isort
"B", # flake8-bugbear
"C4", # flake8-comprehensions
"UP", # pyupgrade
]
ignore = [
"E501", # line too long (handled by formatter)
"B008", # do not perform function calls in argument defaults
]
[tool.mypy]
python_version = "3.11"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = false
disallow_incomplete_defs = false
check_untyped_defs = true
no_implicit_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_no_return = true
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
addopts = "-v --tb=short"
[tool.coverage.run]
source = ["app"]
omit = [
"*/tests/*",
"*/venv/*",
]
[tool.coverage.report]
exclude_lines = [
"pragma: no cover",
"def __repr__",
"raise AssertionError",
"raise NotImplementedError",
"if __name__ == .__main__.:",
]
[project.scripts]
yg-dataset = "app.main:main"
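
The `exclude_lines` entries under `[tool.coverage.report]` are regular expressions that coverage.py matches against source lines, which is why the main-guard entry uses `.` in place of the quote characters. The short check below is only an illustration of how that one pattern behaves; `is_excluded` is a hypothetical helper and not part of coverage.py.

```python
import re

# The pattern exactly as written in the exclude_lines list.
EXCLUDE_MAIN_GUARD = re.compile(r"if __name__ == .__main__.:")


def is_excluded(line: str) -> bool:
    """Return True when the line matches the main-guard exclusion pattern."""
    return EXCLUDE_MAIN_GUARD.search(line) is not None
```

Each `.` matches any single character, so the pattern covers both the double-quoted and the single-quoted form of the guard with one entry.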

1
backend/uploads/.gitkeep Normal file
View File

@@ -0,0 +1 @@
# This file ensures the uploads directory is tracked in git

3657
backend/uv.lock generated Normal file

File diff suppressed because it is too large Load Diff

BIN
backend/ygdataset.db Normal file

Binary file not shown.

View File

@@ -1,20 +0,0 @@
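# Bug Fix Log
## 2026-03-17
### Initial project creation
- Created the YG-Dataset refactoring project
- Set up the basic FastAPI + Vue 3 architecture
---
## Fix record format
### Date
**Problem description:**
**Cause:**
**Fix:**
---
*Continuously updated...*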

97
daily-work/2026-03-17.md Normal file
View File

@@ -0,0 +1,97 @@
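# Work Log - 2026-03-17
## Project Info
- Project: YG-Datasets
- Path: /data/code/YG-Datasets
## Summary
16 tasks completed today
## Details
### 1. 🟣 Feature Backend core architecture modules
- Time: 17:28
- Files: backend/app/core/auth.py, backend/app/core/crud.py, backend/app/core/exceptions.py, backend/app/core/logging.py
- Description: Added the authentication module, base CRUD operations, exception handling, and the logging module
### 2. 🟣 Feature Backend API schema definitions
- Time: 17:28
- Files: backend/app/schemas/chunk.py, backend/app/schemas/dataset.py, backend/app/schemas/eval.py
- Description: Added schema definitions for the data structures
### 3. 🟣 Feature Frontend TypeScript type definitions and components
- Time: 17:28
- Files: frontend/src/api/index.ts, frontend/src/components/, frontend/src/types/
- Description: Added the TypeScript API client and components
### 4. 🟣 Feature Frontend page features and UI improvements
- Time: 17:29
- Files: frontend/src/views/ModelSettingsView.vue, frontend/src/views/HomeView.vue
- Description: Added the model settings page; improved the project list and the delete function
### 5. ✅ Change Project configuration files
- Time: 17:29
- Files: backend/pyproject.toml, frontend/tsconfig.json
- Description: Added project configuration files
### 6. ✅ Change One-command startup script
- Time: 17:29
- Files: start.sh
- Description: Added a one-command startup script
### 7. 🟣 Feature Backend API endpoint implementation
- Time: 17:29
- Files: backend/app/api/v1/projects/__init__.py, backend/app/api/v1/datasets/__init__.py
- Description: Updated the API endpoint implementations
### 8. 🟣 Feature Backend core modules and file processing
- Time: 17:30
- Files: backend/app/core/config.py, backend/app/main.py, backend/app/models/models.py
- Description: Updated the core modules and the file processor
### 9. ✅ Change Frontend dependencies and router configuration
- Time: 17:30
- Files: frontend/package.json, frontend/src/router/index.js, frontend/vite.config.js
- Description: Updated dependencies and router configuration
### 10. 🔄 Refactor Frontend API client refactor
- Time: 17:30
- Files: frontend/src/api/index.js, frontend/src/api/index.ts
- Description: Replaced the JavaScript API client with a TypeScript version
### 11. 🔴 Bugfix Fix white background covering the back button
- Time: 17:35
- Files: frontend/src/views/ModelSettingsView.vue
- Description: Fixed the white background that covered the back button on hover on the model settings page
### 12. 🔴 Bugfix Fix database initialization
- Time: 22:40
- Files: backend/app/core/database.py, backend/app/main.py
- Description: Fixed database tables not being created; added an import of models so that Base.metadata includes every model
### 13. 🔴 Bugfix Fix API response serialization error
- Time: 22:42
- Files: backend/app/api/v1/models/__init__.py, backend/app/schemas/model.py
- Description: Fixed SQLAlchemy ORM objects failing to serialize to JSON by converting them with model_validate()
### 14. 🟣 Feature Add default API Base URLs for providers
- Time: 22:45
- Files: frontend/src/views/ModelSettingsView.vue
- Description: Added default API Base URLs, filled in automatically, for three providers: MiniMax, GLM, and OpenAI Compatible
### 15. 🟣 Feature Implement model connection testing
- Time: 22:50
- Files: backend/app/api/v1/models/__init__.py, frontend/src/views/ModelSettingsView.vue, frontend/src/api/index.ts
- Description: The backend adds a connection-test API; the frontend calls it and shows the connection status (connected / not connected / untested)
### 16. 🟣 Feature Create the git-commit skill
- Time: 22:55
- Files: /root/.claude/skills/git-commit/SKILL.md
- Description: Created a skill for batched Git commits that analyzes git status, groups files by feature, and generates conventional commit messages
---
## Other Work
- ✅ Change: Frontend UI style adjustments - added the Ant Design Vue component library and adjusted the dark style of the Select component
- 📝 Git: Pushed all code changes to the remote repository, 10 commits in total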

73
daily-work/2026-03-18.md Normal file
View File

@@ -0,0 +1,73 @@
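# Work Log - 2026-03-18
## Project Info
- Project: YG-Datasets
- Path: /data/code/YG-Datasets
## Summary
11 tasks completed today
## Details
### 1. 🟣 Feature Improve the logging system with per-date log directories
- Time: 10:44
- Files: backend/app/core/logging.py, backend/app/main.py
- Description: The logging system now stores logs in per-date directories, which makes them easier to manage and analyze
### 2. 🟣 Feature Extend the frontend with a crawler page and project pagination
- Time: 10:45
- Files: frontend/src/views/HomeView.vue, frontend/src/views/CrawlerView.vue
- Description: Added the crawler page, composables utility functions, and pagination for the project list
### 3. 🟣 Feature Add composables utility functions and the crawler page
- Time: 10:45
- Files: frontend/src/composables/index.ts, frontend/src/composables/useFormatters.ts, frontend/src/composables/useProjects.ts
- Description: Added frontend composables that provide reusable logic for projects, models, formatting, and more
### 4. 🔴 Bugfix Fix asynchronous processing failing after file upload
- Time: 16:08
- Files: backend/app/api/v1/files/__init__.py, backend/app/core/database.py
- Description: Fixed an incorrect async_session_maker reference so that asynchronous file processing runs correctly
### 5. 🟣 Feature Add semantic-embedding text splitting
- Time: 16:08
- Files: backend/app/services/text_splitter/semantic_embedding.py, backend/app/services/text_splitter/splitter.py
- Description: Implemented a text splitting algorithm based on semantic embeddings for smarter chunking
### 6. 🟣 Feature Update the API to support semantic splitting and embedding configuration
- Time: 16:08
- Files: backend/app/api/v1/chunks/__init__.py, backend/app/schemas/
- Description: The backend API supports the semantic splitting mode and embedding parameter configuration
### 7. ✅ Change Improve the file management upload flow and UI experience
- Time: 16:08
- Files: frontend/src/views/project/FileManage.vue
- Description: Improved the file upload flow; added upload status polling, empty-state handling, and animation tuning
### 8. 🔄 Refactor Update the project view and the text splitting page
- Time: 16:08
- Files: frontend/src/views/ProjectView.vue, frontend/src/views/project/TextSplit.vue
- Description: Refactored the project view to remove the "back to home" button; improved the styling and interaction logic of the TextSplit page
### 9. 🧹 Chore Delete obsolete files
- Time: 16:08
- Files: "bug修复.md"
- Description: Cleaned up obsolete files
---
## Additional Work (done during the session, not yet committed)
### 10. ✅ Change Align the evaluation management UI with file management
- Time: 17:32
- Files: frontend/src/views/project/EvalManage.vue
- Description: The evaluation management UI now uses the same style as file management: stat cards with a glow effect, an orbit animation for the empty state, and a table layout with multi-select
### 11. ✅ Change Align the Q&A management UI with file management
- Time: 17:40
- Files: frontend/src/views/project/QuestionManage.vue
- Description: The Q&A management UI now uses the same style as file management: gradient title, stat cards, empty-state animation, and table multi-select with batch operations
---
*Generated at: 2026-03-18 17:45*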
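
Entry 1 of this log describes storing logs in per-date directories. The log does not show how `backend/app/core/logging.py` does this, so the following is only a minimal sketch of one way to get that layout with the standard library; `dated_log_path`, `build_logger`, the `app.log` file name, and the logger name are all assumptions made for illustration.

```python
import logging
from datetime import date
from pathlib import Path
from typing import Optional


def dated_log_path(base_dir: Path, day: date, filename: str = "app.log") -> Path:
    """Return base_dir/YYYY-MM-DD/filename, creating the directory if needed."""
    day_dir = base_dir / day.isoformat()
    day_dir.mkdir(parents=True, exist_ok=True)
    return day_dir / filename


def build_logger(base_dir: Path, day: Optional[date] = None) -> logging.Logger:
    """Create (or reuse) a logger that writes into the directory for `day`."""
    day = day or date.today()
    logger = logging.getLogger(f"yg_dataset.{day.isoformat()}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = logging.FileHandler(dated_log_path(base_dir, day), encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

This sketch fixes the directory when the logger is built, so a long-running process that crosses midnight would keep writing to the previous day's directory unless the logger is rebuilt.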

View File

@@ -0,0 +1,266 @@
Document No.: YG-CMMI-CM-PD07
Release date: 2023-06-30
Current version: 1.3
Commercial confidential [Medium]
Baseline Library Management Specification
Revision History
| Date | Version | Description | Author/Modifier | Reviewer | Approver |
| ----------- | -------- | -------- | ------- | ---- | ---- |
| 2012-11-14 | 1.0 | New | 吴建春 | 李锋 | 卢晓民 |
| 2013-7-23 | 1.1 | Added the EAM product content and the standard baselining email to the specification description | 陈来方 | 卢晓民 | 周立 |
| 2014-09-23 | 1.2 | Adjusted the "Applicable departments" and "Related documents" sections; updated the department names cited in this document to match the latest organization structure | 吴建春 | 卢晓民 | 李美平 |
| 2023-06-30 | 1.3 | Revised the trademark reference in the page header; updated the specification description | 李锋 | 刘娟 | 向万红 |
Issued by 远光软件股份有限公司
Document No.: YG-CMMI-CM-PD07
远光软件股份有限公司
Release date: 2014-09-23
Current version: 1.2
Baseline Library Management Specification
Page 1 of 7
Contents
1. Introduction
1.1 Purpose
1.2 Scope
1.2.1 Applicable departments
1.2.2 Applicable business
1.3 Terms and abbreviations
2. Roles and responsibilities
3. Specification description
3.1 Work products under baseline management
3.2 Roles that request baselining
3.3 When to request baselining
3.4 Document storage locations
3.5 Permission management
3.6 Standards for baselined documents
3.6.1 Template references
3.6.2 Format requirements
3.7 Sufficient conditions for baselining documents and changing baselines
3.7.1 "XX materials have been baselined" email
4. Related documents
5. References
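1. Introduction
1.1 Purpose
This supplementary specification strengthens the management of how artifacts produced at each stage of the R&D lines are entered into the configuration baseline library. It defines which artifacts must be entered into the baseline library and how promptly they must be entered.
1.2 Scope
1.2.1 Applicable departments
Applies to the company's product R&D departments and wholly owned subsidiaries; holding subsidiaries should follow it as a reference.
1.2.2 Applicable business
Management of the artifacts that each product R&D line produces at each stage and that must be entered into the baseline library.
1.3 Terms and abbreviations
| Term/Abbreviation | Explanation |
| --- | --- |
| CM (Configuration Management) | A discipline in software engineering, comprising the related tools and applied techniques (processes and methods), that the company uses to manage changes to software assets. |
| BL (Base Line) | A milestone in the software development process, marked by the delivery of one or more software configuration items. A baseline consists of a specification or product that has passed formal review and approval; it can therefore serve as the basis for further development and can be changed only through a formal change control process. |
| Artifact (工件) | A work product of any stage of the software R&D life cycle. |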
2. Roles and responsibilities
| No. | Role | Responsibilities |
| --- | --- | --- |
| 1 | Configuration administrator | Enters files into the baseline library after they pass review; re-enters changed files into the baseline library |
| 2 | Project manager | Determines which of the project's work products must be managed in the baseline library; manages the configuration administrator and requires each stage's work products to be entered into the baseline library as required; applies to baseline the project development initiation and release materials |
| 3 | Requirements/design/development/testing owners | Take part in the review held before an artifact is baselined; apply to enter each stage's work products into the baseline library |
| 4 | QA engineer | Audits whether each stage's work products are entered into the baseline library as required; audits the completeness and conformance of each work product |
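3. Specification description
3.1 Work products under baseline management
| Artifact | Stage |
| --------- | ----- |
| Product requirements specification | Requirements |
| Requirements specification | Requirements |
| Detailed requirements specification | Requirements |
| Functional design document | Design |
| Detailed design document | Design |
| Test cases | Testing |
| Project development initiation materials | Initiation |
| Project closure materials | Closure |
Notes:
1) Code management in the development stage follows 《YG-CMMI-CM-PD04 配置管理规范》 (Configuration Management Specification).
2) Release stage: after integration testing ends and before the configuration item checklist is checked.
3.2 Roles that request baselining
| Artifact | Baselining role |
| -------- | ------ |
| Product requirements specification | Requirements owner |
| Requirements specification | Requirements owner |
| Detailed requirements specification | Requirements owner |
| Functional design document | Detailed designer |
| Detailed design document | Detailed designer |
| Test cases | Test owner |
| Project development initiation materials | Project manager |
| Project closure materials | Project manager |
3.3 When to request baselining
| Artifact | When to baseline |
| --- | --- |
| Product requirements specification | After the product requirements review |
| Requirements specification | After the design review |
| Detailed requirements specification | After the detailed requirements specification review |
| Functional design document | After review |
| Detailed design document | After review |
| Test cases | After the test case review |
| Project development initiation materials | After the initiation review |
| Project closure materials | After the closure review |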
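3.4 Document storage locations
1) Requirements stage
Requirements specification:
http:// 10.50.0.13/FMISdoc//baseline/ 需求规格说明书
2) Design stage
Functional design and detailed design documents:
http:// 10.50.0.13/FMISdoc//baseline/设计文档/
3) Testing stage
Test cases:
Testing-stage artifacts are managed in VSS.
4) Code management in the development stage
Managed uniformly in ClearCase, following 《YG-CMMI-CM-PD04 配置管理规范》 (Configuration Management Specification).
5) Project development initiation materials
Project development initiation materials:
http:// 10.50.0.13/FMISdoc//baseline/XXX(项目名称)/ 项目开发立项材料
6) Project closure materials
Project closure materials:
http:// 10.50.0.13/FMISdoc//baseline/XXX(项目名称)/ 项目结项材料
Note:
The addresses above are for reference only; the actual storage location of project documents is the one required by the project's initiation plan.
3.5 Permission management
Only the project's own configuration administrator may add, delete, or modify that project's documents in the baseline library.
Other project stakeholders have read-only access to the artifacts within their own scope of work.
3.6 Standards for baselined documents
3.6.1 Template references
Artifacts of each stage must be written using the corresponding standard templates published in the company's CMMI R&D system.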
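3.6.2 Format requirements
Besides following the standard templates, every artifact must meet these general requirements:
1) The default display zoom of the document must be 100%;
2) Formal documents must record the reviewer and the approver;
3) In every template, a section that has no content must say "None" ("无"); deleting the section or leaving it blank is not allowed.
3.7 Sufficient conditions for baselining documents and changing baselines
When a document is baselined, or a change to a baselined file is requested, a baselining request (change request) must be submitted and the approval record retained.
The baselining request (change request) and the approval record must be submitted formally: either filled in and submitted on the corresponding form under the RTC process, or sent by formal email. Any request made orally or through other communication tools is treated as non-conforming.
3.7.1 "XX materials have been baselined" email
From: project-level configuration manager
To: author (change handler)
Cc: configuration manager, project manager, users of the materials, QA
Subject: Notice that XX materials have been baselined
Attachment: none
The email body template is as follows:
Author (change handler):
Hello!
XX materials were baselined on MM-DD (weekday X). They are available at: XXX.
Configuration management engineer: XXX
Date: 20XX-X-X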
4. Related documents
《YG-CMMI-CM-PC04 配置管理过程》 (Configuration Management Process)
《YG-CMMI-CM-PD04 配置管理规范》 (Configuration Management Specification)
《YG-CMMI-PP-TEMP02 项目计划模板》 (Project Plan Template)
《YG-CMMI-CM-TEMP13 配置管理计划模板》 (Configuration Management Plan Template)
5. References
| Name | Source | Version/Date |
| --- | --- | ------ |
| | | |

View File

@@ -0,0 +1,394 @@
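Document No.: YG-CMMI-CM-PD04
Release date: 2023-06-30
Current version: 2.2
Commercial confidential [Medium]
Code Commit Specification
Revision History
| Date | Version | Description |
| --- | --- | --- |
| 2013-08-5 | 1.0 | Added the code commit specification |
| 2014-03-24 | 1.1 | Added the requirement to write a comment when committing code, so that the build system can identify change sets |
| 2017-08-30 | 1.2 | Added general rules for committing code; added the note "GRIS module code re-review instructions and operations" |
| 2022-09-9 | 2.0 | Revised 1.2.1 Applicable departments; added 2 Roles and responsibilities; added 3 Permission management; revised 4.1 Code commit principles; revised 4.3.1 RTC code commit rules; added 4.3.3 GIT code commit rules |
| 2023-02-20 | 2.1 | Revised the scope and the code commit principles; added the code repository detail table |
| 2023-06-30 | 2.2 | Revised the trademark reference in the page header; removed the RTC code commit rules section |
Version 1.0: author/modifier 张金金, reviewer 李剑/陈明有, approver 周立.
Author/modifier, reviewer, and approver names for versions 1.1 to 2.2, in the order they appear in the flattened source table (row assignment not recoverable): 张羡, 杨莹, 陈金银, 陈斯华, 李美平, 姚国全, 陈金银 刘发/王优, 陈金银 黄德海, 贾士中, 向万红, 姚国全, 李锋, 刘娟
Issued by 远光软件股份有限公司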
远光软件股份有限公司
代码提交规范
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-PD04
2023-06-30
2.2
第 1 页 共 6 页
Contents
1. Introduction
1.1. Purpose
1.2. Scope
1.2.1. Applicable departments
1.2.2. Applicable business
2. Roles and responsibilities
3. Permission management
3.1. User management and authorization principles
3.2. Procedure for granting or revoking code permissions
4. Specification
4.1. Code submission principles
4.2. Tool-specific rules for code repository management tools
4.2.1. CC (ClearCase) code submission rules
4.2.2. GIT code submission rules
5. Related attachments
6. Related documents
7. Appendix: code repository inventory (for reference)
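1. Introduction
1.1. Purpose
This specification is established to standardize user management and authorization control for code repositories, to regulate code submission, and to ensure the security, integrity, and traceability of code. It is intended to keep code submission activities in code management tools such as GIT and CC (ClearCase) orderly.
1.2. Scope
1.2.1. Applicable departments
Applies to the company's product R&D departments and wholly owned subsidiaries; controlled (majority-owned) subsidiaries follow it as a reference.
1.2.2. Applicable business
Project code delivery.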
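2. Roles and responsibilities
No. / Role / Responsibilities
1  Department manager
   - Approves cross-project-group code permission requests raised by employees of the department (including seconded employees).
2  Project manager / development manager
   - Approves code permission requests raised by employees in the project group (including seconded employees).
   - Promptly revokes the code permissions of employees who are transferred out of the project group or who leave the company.
3  Development engineer
   - Initiates requests to grant or revoke code permissions.
   - States the scope of the repository permissions to be granted or revoked.
   - Understands the usage rules and requirements of the code repository.
   - Seconded employees who join a project group are subject to that group's unified management.
   - Applies promptly for permission revocation when transferred or leaving the company.
4  Configuration management engineer
   - Sets code repository permissions.
   - Helps project members use the code repository normally.
   - Fills in the relevant ticket information and changes the ticket status.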
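3. Permission management
3.1. User management and authorization principles
1) A user's code permissions must be approved by the project manager (development manager) or the department manager; permissions may be set only after approval.
2) Users who have been granted code permissions must safeguard their account and password and must not lend them to others, to avoid the resulting risk of leaks and of code that is hard to trace.
3) Users who have been granted code permissions and discover that their account or password has leaked must change the password promptly; the new password must comply with the company's security rules. Non-compliance that causes serious consequences is handled according to the relevant company requirements.
4) Users who have been granted code permissions must submit code strictly according to this specification and the operation manual of the corresponding tool, ensuring the correctness, integrity, and traceability of the code.
5) Users who have been granted code permissions and are transferred or leave the company must deliver their code in full and then apply for permission revocation through the procedure.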
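3.2. Procedure for granting or revoking code permissions
Entry criteria
1
Inputs
1  A code repository permission needs to be granted or revoked
Process steps
1.1 The applicant creates a Code Repository Permission Request Form (《代码库权限申请单》) on the company's R&D management platform and submits a request to grant or revoke code permissions.
1.2 The applicant fills in the form, including the repository address, a description of the permission, the person who will carry out the change, and anything else that needs explaining. Requests for repositories of the applicant's own project group are approved by the project manager (development manager); requests for repositories of other project groups in the department are approved by the department manager who owns the project.
1.3 The applicant submits the form to the department manager or the project manager (development manager) for approval.
1.4 Once the department manager or the project manager (development manager) approves, the ticket is transferred to the configuration management engineer for permission setup.
1.5 The configuration management engineer sets the code permissions according to the ticket, completes the ticket information, and changes the ticket status.
Outputs
1  Code Repository Permission Request Form
Exit criteria
1  The Code Repository Permission Request Form is completely filled in, the permissions are set, the ticket content is complete, and the ticket status has been changed to closed.
Tailoring
Tailoring content: cannot be tailored
Tailoring criteria: none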
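The approval routing in step 1.2 and the ticket life cycle in steps 1.1 to 1.5 can be sketched as below. This is only an illustration: the type names, field names, and state names are assumptions, since the specification names the roles and the order of steps but no data model.

```typescript
// All identifiers here are hypothetical; the specification defines only the
// roles and the order of steps 1.1 to 1.5.
type Approver = "projectManager" | "departmentManager";

interface PermissionRequest {
  applicantProjectGroup: string; // project group the applicant belongs to
  targetProjectGroup: string;    // project group that owns the repository
  action: "grant" | "revoke";
}

// Step 1.2: a request for the applicant's own project group goes to the
// project manager (development manager); a request for another project group
// in the department goes to the department manager who owns that project.
function selectApprover(request: PermissionRequest): Approver {
  return request.applicantProjectGroup === request.targetProjectGroup
    ? "projectManager"
    : "departmentManager";
}

// Ticket states implied by steps 1.1 to 1.5 (state names are assumptions).
type TicketState = "drafted" | "submitted" | "approved" | "closed";

const NEXT_STATE: Record<TicketState, TicketState | null> = {
  drafted: "submitted",  // 1.3: applicant submits the form for approval
  submitted: "approved", // 1.4: manager approves and transfers the ticket
  approved: "closed",    // 1.5: permissions set, ticket completed
  closed: null,
};

function advance(state: TicketState): TicketState {
  const next = NEXT_STATE[state];
  if (next === null) {
    throw new Error(`ticket is already ${state}`);
  }
  return next;
}
```

Keeping the routing rule in one small function makes it easy to check against the exit criterion: a ticket only reaches the closed state after passing through approval.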
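4. Specification
4.1. Code submission principles
1) Comply with all code management principles in the published document "YG-CMMI-CM-PD04 Configuration Management Specification" (《YG-CMMI-CM-PD04 配置管理规范》).
2) Deliver code against tickets in the R&D management platform. Code for requirement tickets or work tickets that have not been included in a version must not be submitted to the repository.
3) When delivering code, the comment must be filled in correctly with the ticket number. The comment must contain the R&D management platform ticket number and follow one of three formats:
Format 1: ticket number. Example: 843186
Format 2: ticket number + space + comment. Example: 843186 comment text
Format 3: one change set may correspond to several tickets, all of which must appear in the comment. Example: 843186 843187 comment text
Supplementary notes:
Code changed to fix a compile error is linked to the ticket of the change set that caused the compile error.
A change set produced by resolving a merge conflict is linked to the ticket of the change set that produced the conflict.
4) Code may be delivered to the repository only after it builds successfully locally.
5) All code submitted to release streams (branches) and controlled streams (branches) must be re-reviewed before it is formally submitted. For the specific management requirements of each code stream (or branch), follow all code management principles in the published document 《YG-CMMI-CM-PD04 配置管理规范(试行).pdf》 (Configuration Management Specification, trial version).
6) In principle, the release streams (branches) and controlled streams (branches) of all repositories of the company's product R&D departments are created and managed centrally by the configuration management engineer.
7) Code problems caused by failing to submit and manage code according to the principles above are handled according to the relevant company rules.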
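The three comment formats in principle 3) can be checked mechanically, for example in a commit hook. The sketch below is a minimal illustration under stated assumptions: ticket numbers are treated as runs of digits (the only example given is 843186), any run of whitespace is accepted as the separator, and the function and type names are invented here rather than taken from the specification.

```typescript
// Hypothetical helper: splits a submission comment into leading ticket
// numbers and the free-text note that follows them.
interface ParsedComment {
  ticketIds: string[]; // one or more R&D platform ticket numbers
  note: string;        // may be empty (format 1)
}

function parseSubmitComment(comment: string): ParsedComment | null {
  const tokens = comment.trim().split(/\s+/);
  const ticketIds: string[] = [];
  const noteTokens: string[] = [];
  let inNote = false;
  for (const token of tokens) {
    // Leading all-digit tokens are ticket numbers; everything after the
    // first non-numeric token belongs to the note.
    if (!inNote && /^\d+$/.test(token)) {
      ticketIds.push(token);
    } else {
      inNote = true;
      noteTokens.push(token);
    }
  }
  // A comment without a leading ticket number violates principle 3).
  if (ticketIds.length === 0) {
    return null;
  }
  return { ticketIds, note: noteTokens.join(" ") };
}
```

One limitation of this reading: a note that itself begins with a number would be parsed as an extra ticket number, so a real check would probably also validate each number against the R&D management platform.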
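4.2. Tool-specific rules for code repository management tools
4.2.1. CC (ClearCase) code submission rules
1) Apply for permission on the R&D management platform by ticket number; the permission is granted after approval by the department manager of the R&D Management Department.
2) To submit code in CC, the development engineer selects the files to check out and enters the R&D management platform ticket number under which the files are modified. If several files need to be modified together, select them all, check them out together, and enter a single ticket number.
3) CC records each ticket number once it has been created. If the ticket number shown in the prompt is correct, there is no need to create it again; if it is not shown, create it. When several files are modified under the same ticket number, the number does not need to be created more than once.
4) For detailed ClearCase instructions, see the attachment 《YG-CMMI-CM-GD01 ClearcaseLT 客户端操作指南.doc》 (ClearcaseLT Client Operation Guide).
4.2.2. GIT code submission rules
1) Because GIT branches are flexible, GIT repositories cannot be controlled uniformly. If a project group needs to create a personal branch for any reason, it must apply to the project manager by email, and the branch may be created only after the project manager approves. The configuration management engineer does not include such personal branches in the overall code stream management; the project group must manage them properly and clean them up regularly. Code problems and serious consequences caused by disorderly management of such branches are handled according to the relevant company rules.
2) For the detailed GIT manual, see the attachment 《YG-CMMI-CM-GD05 Git 使用手册-开发工程师》 (Git User Manual for Development Engineers).
5. Related attachments
《YG-CMMI-CM-GD04 Git 安装指南-开发工程师》 (Git Installation Guide for Development Engineers)
《YG-CMMI-CM-GD05 Git 使用手册-开发工程师》 (Git User Manual for Development Engineers)
《YG-CMMI-CM-GD06 GAP 模块代码复审操作说明》 (GAP Module Code Re-review Operating Instructions)
《RTC 操作手册-开发工程师》 (RTC Operation Manual for Development Engineers)
6. Related documents
《YG-CMMI-CM-PD04 配置管理规范》 (Configuration Management Specification)
《YG-CMMI-CM-PD03 配置项标识规范》 (Configuration Item Identification Specification)
《YG-CMMI-CM-GD01 ClearcaseLT 客户端操作指南》 (ClearcaseLT Client Operation Guide)
7. Appendix: code repository inventory (for reference)
代码库明细表.xls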


View File

@@ -0,0 +1,279 @@
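Document No.: YG-CMMI-CM-PD03
Release date: 2023-06-30
Current version: 1.2
Confidentiality: 商密【中】 (trade secret, medium)
Configuration Item Identification Specification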
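Revision history
Date | Version | Description | Author/Reviser | Reviewer | Approver
2008-06-19 | 1.0 | New document, used to standardize configuration item identification for configuration management | 杨莹 | 周立 | 周立
2010-07-01 | 1.1 | Added the confidentiality level | 吴建春 | 周立 | 周立
2023-06-30 | 1.2 | Revised the trademark reference in the page header | 李锋 | 刘娟 | 向万红
Issued by 远光软件股份有限公司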
Contents
1. Introduction
1.1. Purpose
1.1.1. Applicable departments
1.1.2. Applicable business
1.2. Overview
1.3. Terms and abbreviations
2. Roles and responsibilities
3. Specification
3.1. Configuration item naming
3.1.1. Naming rule for configuration items/units
3.1.2. Naming configuration items within a configuration unit
3.2. Identification of configuration repository objects
3.2.1. Version management
3.2.2. Baseline management
3.2.3. Project
3.2.4. Development stream
3.2.5. Activity
3.2.6. View
3.2.7. VOB
3.2.8. Component
3.2.9. Files and folders
4. Related documents
5. References
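1. Introduction
1.1. Purpose
Configuration items placed in the configuration repository are identified uniformly so that they are easy to manage and find. Configuration item identifiers are used for configuration management and must not be confused with document numbers.
1.1.1. Applicable departments
The configuration management department.
1.1.2. Applicable business
Configuration item identification activities.
1.2. Overview
Describes how project or product artifacts are named, labeled, and numbered. The identification scheme covers hardware, system software, and all application development artifacts listed in the product directory structure, for example plans, models, components, test software, results and data, and executables.
1.3. Terms and abbreviations
Term/abbreviation | Explanation
SCI (Software Configuration Item) | A work product that configuration management treats as a single entity (for example, the documents, programs, and data produced in various forms and versions during each phase of the software life cycle), together with the software tools and support systems needed to complete the work product.
BL (BaseLine) | A milestone in software development, marked by the delivery of one or more software configuration items. A baseline consists of a specification or product that has passed formal review and approval; it can therefore serve as the basis for further development and can be changed only through a formal change control process.
2. Roles and responsibilities
3. Specification
Identification is defined from two angles, configuration items and configuration repository objects, so that the configuration repository stays tidy and orderly.
3.1. Configuration item naming
3.1.1. Naming rule for configuration items/units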
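The naming rule for configuration items/units is: [project abbreviation]_[configuration item/unit name]
Project abbreviation: [customer abbreviation]-[project name abbreviation]; it should be a combination of English letters and hyphens, at most 8 characters long. (The project abbreviation is provided by the project manager.)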
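A small validator makes the naming rule in 3.1.1 concrete. The sketch below assumes that "English letters and hyphens" means the ASCII ranges A-Z and a-z plus "-", and that at least one letter is required; the function names and the sample values are invented for illustration.

```typescript
// Hypothetical check for the project abbreviation: letters and hyphens only,
// 1 to 8 characters, and at least one letter (the last condition is an
// assumption, added so that a bare "-" is not accepted).
function isValidProjectAbbrev(abbrev: string): boolean {
  return /^[A-Za-z-]{1,8}$/.test(abbrev) && /[A-Za-z]/.test(abbrev);
}

// Builds "[project abbreviation]_[configuration item/unit name]".
function configItemName(projectAbbrev: string, itemName: string): string {
  if (!isValidProjectAbbrev(projectAbbrev)) {
    throw new Error(`invalid project abbreviation: ${projectAbbrev}`);
  }
  if (itemName.length === 0) {
    throw new Error("configuration item/unit name must not be empty");
  }
  return `${projectAbbrev}_${itemName}`;
}

// Example with made-up values: customer "ACME", project "ERP".
const sampleName = configItemName("ACME-ERP", "design-spec");
```

Here `sampleName` is "ACME-ERP_design-spec"; an abbreviation such as "ACME-ERP2" would be rejected both for its length and for the digit.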
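3.1.2. Naming configuration items within a configuration unit
For configuration units such as test cases, the naming and numbering rules of the configuration items they contain may be agreed per project; the names must reflect the content of the configuration item and the functional module it belongs to.
Document configuration items are identified by document name; take care that identifiers are unique and traceable.
Source code configuration items may use the program name directly as the identifier.
3.2. Identification of configuration repository objects
3.2.1. Version management
Each time a configuration item is checked out and checked in, its version number increases by one, and this is directly visible. Versions must not be deleted casually, especially versions that carry a label or belong to a baseline.
3.2.2. Baseline management
See the document 《基线发布控制规程》 (Baseline Release Control Procedure).
3.2.3. Project
The project name is provided by the project manager and is normally the project abbreviation.
3.2.4. Development stream
Development stream identifiers are all lowercase. Naming convention: [main project]_[purpose identifier], for example ygerp_3.1_intergration, ygerp_3.1_report, ygerp_3.1_release.
3.2.5. Activity
Named by ticket type: R plus the requirement ticket number, Y plus the optimization ticket number, W plus the work ticket number. Other types are decided after reporting to the organization-level configuration management engineer.
3.2.6. View
Everyone creates views using the following pattern:
domain user name + stream name, for example admin_ygerp_3.1_report, which is the view created by user admin on ygerp_3.1_report. Creating several views on the same stream is discouraged.
3.2.7. VOB
Named by the type of data stored, for example A.分析软件 (analysis software).
3.2.8. Component
Management components uniformly use CBL.[project name], for example CBL.ygerp_3.1. Ordinary components are named by function.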
3.2.9. Files and folders
Use lowercase letters for all names.
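The stream, activity, and view conventions in 3.2.4 to 3.2.6 compose in a simple way, which the following sketch illustrates. The function names are hypothetical, and two details are assumptions read off the examples: the prefix letter is concatenated directly with the ticket number, and the user name and stream name are joined with an underscore as in admin_ygerp_3.1_report.

```typescript
// 3.2.4: [main project]_[purpose identifier], always lowercase.
function streamName(mainProject: string, purpose: string): string {
  return `${mainProject}_${purpose}`.toLowerCase();
}

// 3.2.5: R + requirement ticket, Y + optimization ticket, W + work ticket.
type TicketType = "requirement" | "optimization" | "work";

const ACTIVITY_PREFIX: Record<TicketType, string> = {
  requirement: "R",
  optimization: "Y",
  work: "W",
};

function activityName(type: TicketType, ticketNo: string): string {
  return `${ACTIVITY_PREFIX[type]}${ticketNo}`;
}

// 3.2.6: domain user name + stream name.
function viewName(domainUser: string, stream: string): string {
  return `${domainUser}_${stream}`;
}

const reportStream = streamName("ygerp_3.1", "report");
const adminView = viewName("admin", reportStream);
```

With the document's own example values, `reportStream` is "ygerp_3.1_report" and `adminView` is "admin_ygerp_3.1_report". Ticket types outside the three listed are left out on purpose, since the specification says they are decided case by case.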
4. Related documents
5. References
Name | Source | Version/Date


View File

@@ -0,0 +1,434 @@
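Document No.: YG-CMMI-CM-GD04
Release date: 2023-06-30
Current version: 1.1
Confidentiality: 商密【中】 (trade secret, medium)
Guide to Managing Code Submission to Yunxiao (云效)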
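Revision history
Date | Version | Description | Author/Reviser | Reviewer | Approver
2022-08-03 | 1.0 | New | 王优、刘发、胡玲、张金金 | 黄德海 | 姚国全
2023-06-30 | 1.1 | Revised the trademark reference in the page header | 李锋 | 刘娟 | 向万红
Issued by 远光软件股份有限公司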
Contents
Chapter 1 General provisions
Article 1 Purpose
Article 2 Scope
Article 3 Terms and abbreviations
Article 4 Division of responsibilities
Chapter 2 Yunxiao build process
Article 5 Code submission approval process
Article 6 Platform code submission process
Article 7 Product code submission process
Chapter 3 Miscellaneous
Article 8 Dependency package whitelist
Article 9 Security code scanning
Chapter 4 Supplementary provisions
Article 10 This guide is revised and interpreted by the DAP R&D Center
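Chapter 1 General provisions
Article 1 Purpose
This guide is established to standardize the process for submitting code to the Yunxiao system of the finance middle platform (财务中台), to promote efficient collaboration between departments, and to improve the quality and efficiency of code submissions by R&D project teams.
Article 2 Scope
Applies to all R&D departments involved in submitting finance middle platform product code to the Yunxiao system.
Article 3 Terms and abbreviations
Term/abbreviation | Explanation
Yunxiao (云效) | Alibaba's one-stop BizDevOps platform for the cloud-native era; Group A has adopted it for end-to-end management of the whole process.
Second-party package (二方包) | A package developed in-house by an R&D project team.
Third-party package (三方包) | A third-party open-source package or a purchased commercial package.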
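Article 4 Division of responsibilities
No. / Role / Responsibilities
1  R&D project team
   1. The project manager is responsible for requesting code submission to Yunxiao;
   2. Tests platform/product functions and ensures code quality;
   3. Outputs static resources to the designated directory;
   4. The department manager for the project team approves code submission requests.
2  DAP R&D Center - Testing and Delivery Center (hereafter: Testing and Delivery Center)
   1. Prepares the test environment;
   2. Performs integration testing and automated testing of product functions;
   3. After the platform smoke test passes, updates the test environment and organizes the product smoke test.
3  Xinchuang Platform Department (信创平台部) - Testing Department (hereafter: Platform Testing Department)
   1. Prepares the test environment;
   2. Performs integration testing and automated testing of platform functions, and notifies the relevant stakeholders by email once testing passes.
4  DAP R&D Center - R&D Management Department - Configuration Group (hereafter: Configuration Group)
   1. Verifies that the code compiles locally and extracts the dependency packages;
   2. Uploads the code, dependency packages, static resources, and image dependency files to Yunxiao;
   3. Compiles in Yunxiao and outputs the image.
5  Power Industry Technical Support Center (hereafter: Technical Support Center)
   1. Provides the image dependency files required by the deployment environment;
   2. Organizes image deployment in the simulation and production environments;
   3. Approves code submissions to Yunxiao.
6  Power Industry Department 1 (电力行业一部)
   1. Organizes testing in the simulation and production environments.
7  Xinchuang Platform Department - Public Services Department - Security Lab (hereafter: Security Lab)
   1. Performs security scans of the code that R&D project teams submit to Yunxiao.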
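Chapter 2 Yunxiao build process
Code is submitted to the Yunxiao system in one of two update modes:
1. Incremental update: requested as needed; suitable for code submissions that do not affect the functions used by other teams.
2. Full update: performed per iteration or version and initiated by the Technical Support Center. System testing and security testing must pass before the update.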
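Article 5 Code submission approval process
1. This process applies to incremental updates. Full updates are initiated directly by the Technical Support Center, based on its assessment of the specific issue.
2. Request: the project manager of the R&D project team requests to submit code to Yunxiao.
3. Department review: the department manager for the project team judges whether the newly submitted code would require other project teams to recompile. Only a submission that has no such impact, or whose impact can be listed explicitly as an update checklist and has been verified by testing, may be passed to the head of the Technical Support Center for approval. For platform code, the review must consider the impact on other platform teams and also on other products (whether other products need to adjust code or recompile); if there is an impact, the request is not approved. If the update is still needed, the Technical Support Center decides, based on the specific issue, whether to initiate a full update.
4. Technical Support Center approval: the head of the Technical Support Center decides whether to approve based on the update time, update content, and so on. Code submission may start only after approval.
5. Start of the Yunxiao code submission process: after approval, the Configuration Group extracts the code, dependency packages, and static resources and starts the submission work.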
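The department review in item 3 is essentially a decision rule, sketched below. The field names and result values are assumptions. The sketch also adopts one reading of the platform clause, namely that platform code affecting other products is rejected outright, while impact on other teams follows the general "listed and tested" rule; the text leaves room for a stricter reading.

```typescript
// Hypothetical shape of an incremental-update request under review.
interface IncrementalUpdateRequest {
  isPlatformCode: boolean;
  requiresOtherTeamsToRecompile: boolean;
  // True when the impact is listed as an update checklist and verified by testing.
  impactListedAndTested: boolean;
  // Only meaningful for platform code: other products must adjust code or recompile.
  affectsOtherProducts: boolean;
}

type ReviewResult = "forwardToTechnicalSupportCenter" | "rejected";

function departmentReview(request: IncrementalUpdateRequest): ReviewResult {
  // Platform code that affects other products is not approved (assumed reading).
  if (request.isPlatformCode && request.affectsOtherProducts) {
    return "rejected";
  }
  // No impact on other teams, or impact that is listed and tested, may proceed.
  if (!request.requiresOtherTeamsToRecompile || request.impactListedAndTested) {
    return "forwardToTechnicalSupportCenter";
  }
  return "rejected";
}
```

A rejected request is not necessarily the end of the change: per item 3, the Technical Support Center may still decide to ship it as part of a full update.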
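Article 6 Platform code submission process
1. Submit code: the platform R&D project team tests and verifies the code, dependency packages, static resources (which must also be output to the designated directory), and other items to be submitted to Yunxiao. This requires a successful local build, and the submission must pass the smoke test of the Platform Testing Department as well as the smoke test of the Testing and Delivery Center.
Platform static resource path:
\\10.50.9.101\server\modular whgz\ECP\ECPV8.0.0\finalZipFile
2. Build: the Configuration Group compiles the code provided by the platform (currently the boot package) in a clean environment and confirms the dependency packages obtained (second-party and third-party packages). For an incremental update, the initiator provides the lists of changed second-party packages, third-party packages, and static resources; for a full update, these lists are not required.
3. Obtain image dependency files: the Technical Support Center provides the Configuration Group with the dockerfile, configuration files, base image, and other dependency files for the deployment environment.
4. Yunxiao submission: based on the dependency packages obtained, the Configuration Group clears the relevant second-party packages from the Yunxiao maven repository, submits the updated second-party packages, third-party packages, static resources, and boot code, and gets them through Yunxiao compilation.
5. Generate and push the image: the Configuration Group compiles in Yunxiao, packages the result with the image dependency files, outputs the image, and pushes the generated image to the Yunxiao image registry.
6. Simulation environment image deployment: the Technical Support Center organizes image deployment in the simulation environment.
7. Simulation environment testing: Power Industry Department 1 organizes functional test verification in the simulation environment.
8. Production environment image deployment: the Technical Support Center is responsible for deploying the image to the production environment.
9. Production environment testing: Power Industry Department 1 organizes functional test verification in the production environment.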
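The nine steps above form a fixed hand-off sequence between teams, which can be written down as data. The sketch below is illustrative only: the type and constant names are invented, the step and owner labels are English renderings of the roles in Article 4, and the guide itself prescribes no tooling.

```typescript
type UpdateMode = "incremental" | "full";

interface Step {
  order: number;
  name: string;
  owner: string;
}

// Platform code submission steps 1 to 9 and the team responsible for each.
const PLATFORM_STEPS: readonly Step[] = [
  { order: 1, name: "Submit code", owner: "Platform R&D project team" },
  { order: 2, name: "Build", owner: "Configuration Group" },
  { order: 3, name: "Obtain image dependency files", owner: "Technical Support Center" },
  { order: 4, name: "Yunxiao submission", owner: "Configuration Group" },
  { order: 5, name: "Generate and push image", owner: "Configuration Group" },
  { order: 6, name: "Simulation image deployment", owner: "Technical Support Center" },
  { order: 7, name: "Simulation testing", owner: "Power Industry Department 1" },
  { order: 8, name: "Production image deployment", owner: "Technical Support Center" },
  { order: 9, name: "Production testing", owner: "Power Industry Department 1" },
];

// Step 2: only an incremental update requires the initiator to supply change lists.
function requiredChangeLists(mode: UpdateMode): string[] {
  return mode === "incremental"
    ? ["second-party packages", "third-party packages", "static resources"]
    : [];
}

// Returns the step numbers a given team is responsible for.
function stepsOwnedBy(owner: string): number[] {
  return PLATFORM_STEPS.filter((step) => step.owner === owner).map((step) => step.order);
}
```

For example, `stepsOwnedBy("Configuration Group")` yields steps 2, 4, and 5, which matches the Configuration Group's responsibilities of local compilation, upload, and image output.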
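Article 7 Product code submission process
1. Submit code: the product R&D project team tests and verifies the code, dependency packages, and static resources (which must also be output to the designated directory) to be submitted to Yunxiao. This requires a successful local build, and the submission must pass the product integration testing and automated testing of the Testing and Delivery Center.
Product static resource path:
\\10.50.9.101\server\modular whgz\dap9.0.0master_sp\fmp_resource\finalZipFile
2. Build: the Configuration Group compiles the code submitted by the project team in a clean environment and confirms the dependency packages obtained (second-party and third-party packages). For an incremental update, the initiator provides the lists of changed second-party packages, third-party packages, and static resources; for a full update, these lists are not required.
3. Obtain the dependency files needed for the image: the Technical Support Center provides the Configuration Group with the dockerfile, configuration files, base image, and other dependency files for the deployment environment.
4. Yunxiao submission: based on the dependency packages obtained, the Configuration Group clears the relevant second-party packages from the Yunxiao maven repository, submits the updated second-party packages, third-party packages, static resources, and code, and gets them through Yunxiao compilation.
5. Generate and push the image: the Configuration Group compiles in Yunxiao, packages the result with the image dependency files, outputs the image, and pushes the generated image to the Yunxiao image registry.
6. Simulation environment image deployment: the Technical Support Center organizes image deployment in the simulation environment.
7. Simulation environment testing: Power Industry Department 1 organizes functional test verification in the simulation environment.
8. Production environment image deployment: the Technical Support Center is responsible for deploying the image to the production environment.
9. Production environment testing: Power Industry Department 1 organizes functional test verification in the production environment.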
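Chapter 3 Miscellaneous
Article 8 Dependency package whitelist
The dependency packages needed to compile code that project teams submit to Yunxiao must be on the dependency package whitelist provided by the Configuration Group, and the code must compile with maven.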
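A whitelist check of this kind can be sketched as a set lookup. The guide does not say how whitelist entries are keyed, so the sketch assumes full Maven coordinates in the form groupId:artifactId:version; the type and function names are hypothetical, and the sample coordinates are made up.

```typescript
// A dependency identified by its Maven coordinates.
interface Dependency {
  groupId: string;
  artifactId: string;
  version: string;
}

// Assumed whitelist key format: "groupId:artifactId:version".
function coordinate(dep: Dependency): string {
  return `${dep.groupId}:${dep.artifactId}:${dep.version}`;
}

// Returns the dependencies that are missing from the whitelist; an empty
// result means the submission satisfies the whitelist requirement.
function findNonWhitelisted(
  deps: readonly Dependency[],
  whitelist: ReadonlySet<string>,
): Dependency[] {
  return deps.filter((dep) => !whitelist.has(coordinate(dep)));
}

// Usage with made-up coordinates.
const sampleWhitelist = new Set(["com.example:demo-core:1.0.0"]);
const sampleViolations = findNonWhitelisted(
  [
    { groupId: "com.example", artifactId: "demo-core", version: "1.0.0" },
    { groupId: "com.example", artifactId: "demo-extra", version: "2.1.0" },
  ],
  sampleWhitelist,
);
```

In this usage example `sampleViolations` contains only the demo-extra entry. If the real whitelist ignores versions, the key function would drop the version part.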
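Article 9 Security code scanning
As required for production environment updates, the code that R&D project teams submit to Yunxiao must pass a security scan by the Security Lab's tools.
Chapter 4 Supplementary provisions
Article 10 This guide is revised and interpreted by the DAP R&D Center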

View File

@@ -0,0 +1,434 @@
文件编号YG-CMMI-CM-GD04
发布日期2023-06-30
现行版本1.1
商密【中】
关于云效代码提交管理指南
修订历史记录
日期
2022-08-03
版本
1.0 新增
说明
作者/修改人
王优、刘发、胡玲、张金金
审核
黄德海
批准
姚国全
2023-06-30
1.1
修订页眉中的商
标引用
李锋
刘娟
向万红
远光软件股份有限公司 发布
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 1 页 共 8 页
目 录
第一章 总则 ........................................................................................................................................... 2
第一条 目的 ........................................................................................................................................................2
第二条 适用范围 ................................................................................................................................................2
第三条 术语和缩略语 ........................................................................................................................................2
第四条 职责分工 ................................................................................................................................................2
第二章 云效构建流程 ............................................................................................................................. 3
第五条 代码提交审批流程 ................................................................................................................................4
第六条 平台代码提交流程 ................................................................................................................................5
第七条 产品代码提交流程 ................................................................................................................................7
第三章 其他 ........................................................................................................................................... 8
第八条 依赖包白名单 ........................................................................................................................................8
第九条 安全代码扫描 ........................................................................................................................................8
第四章 附则 ........................................................................................................................................... 8
第十条 本指南由 DAP 研发中心负责修订、解释 ............................................................................................8
商密【中】
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 2 页 共 8 页
第一章 总则
第一条 目的
为规范财务中台云效系统代码提交流程,促进各部门高效协作,提高研发项目组提交代码的质量和
效率,特制定本指南。
第二条 适用范围
适用于财务中台产品代码提交云效系统涉及的各研发部门。
第三条 术语和缩略语
术语/缩略语
解释
 云效是阿里巴巴云原生时代一站式 BizDevOps 平台A 集团引
入使用,进行端到端的全流程管理。
 研发项目组自研程序包。
 第三方开源程序包或付费购买的程序包。
云效
二方包
三方包
第四条 职责分工
序号
角色
职责
1
研发项目组
1. 项目经理负责云效代码提交申请;
2. 负责平台/产品功能测试,保证代码质量;
3. 负责输出静态资源到指定目录;
4. 项目组对应的部门经理负责审批代码提交申请。
2
DAP 研发中心-测
试 及 交 付 中 心
(下文简称:测
1. 负责准备测试环境;
2. 负责产品功能的联调测试和自动化测试;
商密【中】
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 3 页 共 8 页
试及交付中心)
3. 平台冒烟通过后,负责更新测试环境,组织产品冒烟。
信创平台部-测试
部(下文简称:
平台测试部)
DAP 研发中心-研
发管理部-配置组
(下文简称:配
置组)
电力行业技术支
持中心(下文简
称:技术支持中
心)
1. 负责准备测试环境;
2. 负责平台功能的联调测试和自动化测试,测试通过后邮件知会相关干系
人。
1. 负责代码本地编译验证并提取依赖包;
2. 负责上传代码、依赖包、静态资源及镜像依赖文件至云效;
3. 负责云效编译并输出镜像。
1. 负责提供部署环境需要的镜像依赖文件;
2. 负责组织在仿真环境和生产环境开展镜像部署;
3. 负责审批云效代码的提交。
电力行业一部
1. 负责组织在仿真环境和生产环境开展测试。
信创平台部-公共
服务部-安全实验
室(下文简称:
安全实验室)
1. 负责对研发项目组提交云效的代码进行安全扫描。
3
4
5
6
7
第二章 云效构建流程
代码提交到云效系统有两种更新方式:
1、 增量更新:按需发起申请,适用于不影响其他组功能使用的代码提交;
2、 全量更新:按迭代或版本进行更新,由技术支持中心发起。更新前必须通过系统测试和安全测
试。
商密【中】
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 4 页 共 8 页
第五条 代码提交审批流程
1、 此流程适用增量更新时使用,全量更新由技术支持中心根据具体问题判断,直接发起。
2、 发起申请:由研发项目组的项目经理申请提交代码到云效。
3、 部门审核:由项目组对应的部门经理判断新提交代码是否影响其他项目组重新编译,只有不影
响或能明确列出影响范围更新清单并通过测试验证的,才能提交技术支持中心负责人审批;平
台代码不仅需要判断对平台其他组是否有影响,还需判断对其他产品是否有影响(是否需要其
他产品调整代码或重新编译),如有影响审批不通过;如需更新,由技术支持中心根据具体问
题判断是否发起全量更新。
4、 技术支持中心审批:由技术支持中心负责人根据更新时间、更新内容等确定是否审批,审批通
过后方可启动提交代码。
商密【中】
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 5 页 共 8 页
5、 启动云效代码提交流程:审批通过后,由配置组提取代码、依赖包和静态资源,启动提交代码
的工作。
第六条 平台代码提交流程
1、 提交代码:由平台研发项目组对准备提交云效的代码、依赖包、静态资源(需同步输出到指定
目录)等进行测试验证,此过程需要本地编译构建成功,通过平台测试部的冒烟测试,同时通
过测试及交付中心的冒烟测试。
商密【中】
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 6 页 共 8 页
平台静态资源路径:
\\10.50.9.101\server\modular whgz\ECP\ECPV8.0.0\finalZipFile
2、 编译构建:由配置组根据平台提供的代码(当前为 boot 包),进行纯环境编译,确认获取依赖
包(二方包、三方包)。如果选择增量更新,由发起者提供当前变更二方包列表、三方包列表
和静态资源列表,如果选择全量更新,则无需提供。
3、 获取镜像依赖文件:由技术支持中心按部署环境提供 dockerfile、配置文件和基础镜像等依赖文
件给配置组。
4、 云效提交:由配置组根据获取的依赖包,清除云效 maven 库相关二方包,提交更新的二方包、
三方包、静态资源和 boot 代码并通过云效编译。
5、 生成并推送镜像:由配置组云效编译并获取镜像依赖文件打包,输出镜像,并将生成的镜像推
送云效镜像库。
6、 仿真环境镜像部署:由技术支持中心组织在仿真环境进行镜像部署。
7、 仿真环境测试:由电力行业一部组织在仿真环境开展功能测试验证。
8、 生产环境镜像部署:由技术支持中心负责将镜像部署到生产环境,并在生产环境进行镜像部署。
9、 生产环境测试:由电力行业一部组织在生产环境开展功能测试验证。
商密【中】
远光软件股份有限公司
关于云效代码提交管理指南
文件编号
发布日期
现行版本
页 次
YG-CMMI-CM-GD04
2023-06-30
V1.1
第 7 页 共 8 页
第七条 产品代码提交流程
1、 提交代码:由产品研发项目组对准备提交云效的代码、依赖包、静态资源(需同步输出到指定
目录)进行测试验证。此过程需要本地编译构建成功,通过测试及交付中心的产品联调测试和
自动化测试。
产品静态资源路径:
\\10.50.9.101\server\modular whgz\dap9.0.0master_sp\fmp_resource\finalZipFile
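2. Build: the Configuration Group compiles the code submitted by the project team in a clean environment and confirms the dependency packages to obtain (second-party and third-party packages). For an incremental update, the initiator provides lists of the changed second-party packages, third-party packages, and static resources; for a full update, no lists are required.
3. Obtain the dependency files required for the image: the Technical Support Center provides the Configuration Group with the dockerfile, configuration files, base image, and other dependency files for each deployment environment.
4. Submit to Yunxiao: based on the dependency packages obtained, the Configuration Group clears the related second-party packages from the Yunxiao maven repository, submits the updated second-party packages, third-party packages, static resources, and code, and confirms that the Yunxiao build passes.
5. Build and push the image: the Configuration Group runs the Yunxiao build, packages the result with the image dependency files, outputs the image, and pushes it to the Yunxiao image registry.
6. Image deployment in the simulation environment: organized by the Technical Support Center.
7. Testing in the simulation environment: Power Industry Department 1 organizes functional test verification in the simulation environment.
8. Image deployment in the production environment: the Technical Support Center deploys the image to the production environment.
9. Testing in the production environment: Power Industry Department 1 organizes functional test verification in the production environment.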
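Chapter 3 Miscellaneous
Article 8 Dependency Package Whitelist
Every dependency package needed to compile the code that a project team submits to Yunxiao must be on the dependency package whitelist provided by the Configuration Group, and the code must compile with maven.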
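A whitelist rule like the one in Article 8 can be checked mechanically before a submission. The sketch below is only an illustration: the function `findNonWhitelisted` and the `group:artifact:version` coordinates are hypothetical, and the guide does not specify the whitelist format.

```typescript
// Returns every dependency that does not appear on the whitelist.
// Coordinates are compared as exact strings (an assumption of this sketch).
function findNonWhitelisted(dependencies: string[], whitelist: string[]): string[] {
  return dependencies.filter(dep => whitelist.indexOf(dep) === -1);
}

// Hypothetical example data.
const whitelist = ["org.example:core-lib:1.0.0", "org.example:http-client:2.3.1"];
const requested = ["org.example:core-lib:1.0.0", "org.example:unreviewed-lib:0.1.0"];

const rejected = findNonWhitelisted(requested, whitelist);
console.log(rejected.length === 0 ? "all dependencies allowed" : "not on whitelist: " + rejected.join(", "));
```

Exact string matching means a version bump also needs a whitelist entry; whether that is the intended policy is not stated in the guide.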
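Article 9 Security Code Scanning
In line with the production environment update requirements, code that R&D project teams submit to Yunxiao must pass a security scan by the Security Laboratory's tools.
Chapter 4 Supplementary Provisions
Article 10 The DAP R&D Center is responsible for revising and interpreting this guide.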

View File

@@ -8,6 +8,6 @@
 </head>
 <body>
   <div id="app"></div>
-  <script type="module" src="/src/main.js"></script>
+  <script type="module" src="/src/main.ts"></script>
 </body>
 </html>

File diff suppressed because it is too large

View File

@@ -8,17 +8,22 @@
     "preview": "vite preview"
   },
   "dependencies": {
+    "@element-plus/icons-vue": "^2.3.0",
+    "@vueuse/core": "^11.0.0",
+    "ant-design-vue": "^4.2.6",
+    "axios": "^1.7.0",
+    "element-plus": "^2.8.0",
+    "pinia": "^2.2.0",
     "vue": "^3.5.0",
     "vue-router": "^4.4.0",
-    "pinia": "^2.2.0",
-    "element-plus": "^2.8.0",
-    "@element-plus/icons-vue": "^2.3.0",
-    "axios": "^1.7.0",
-    "@vueuse/core": "^11.0.0",
     "xlsx": "^0.18.5"
   },
   "devDependencies": {
-    "@vitejs/plugin-vue": "^5.0.0",
-    "vite": "^6.0.0"
+    "@vitejs/plugin-vue": "^5.2.4",
+    "sass": "^1.77.0",
+    "sass-embedded": "^1.98.0",
+    "typescript": "^5.9.3",
+    "vite": "^6.0.0",
+    "vue-tsc": "^3.2.5"
   }
 }

View File

@@ -33,12 +33,12 @@ const locale = ref(zhCn)
 @import url('https://fonts.googleapis.com/css2?family=Outfit:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap');
 :root {
-  /* Core Colors - Deep Space */
-  --bg-primary: #030407;
-  --bg-secondary: #0a0a0f;
-  --bg-tertiary: #12121a;
-  --bg-elevated: #1a1a24;
-  --bg-hover: #22222e;
+  /* Core Colors - Deep Space with Gradient */
+  --bg-primary: #0a0a14;
+  --bg-secondary: #12121f;
+  --bg-tertiary: #1a1a2a;
+  --bg-elevated: #222233;
+  --bg-hover: #2a2a3d;
   /* Accent - Cyan Violet */
   --accent-primary: #00d4ff;
@@ -69,8 +69,8 @@ const locale = ref(zhCn)
   /* Effects */
   --glow-primary: 0 0 30px rgba(0, 212, 255, 0.25);
   --glow-secondary: 0 0 30px rgba(124, 58, 237, 0.25);
-  --glass-bg: rgba(18, 18, 26, 0.6);
-  --glass-border: rgba(255, 255, 255, 0.06);
+  --glass-bg: rgba(26, 26, 42, 0.65);
+  --glass-border: rgba(255, 255, 255, 0.08);
   /* Spacing */
   --radius-sm: 6px;
@@ -110,6 +110,12 @@ html, body, #app {
   min-height: 100vh;
   position: relative;
   overflow: hidden;
+  background:
+    radial-gradient(ellipse 100% 80% at 50% 120%, rgba(99, 102, 241, 0.15), transparent 50%),
+    radial-gradient(ellipse 80% 60% at 80% 10%, rgba(0, 212, 255, 0.12), transparent 40%),
+    radial-gradient(ellipse 70% 50% at 10% 80%, rgba(124, 58, 237, 0.1), transparent 40%),
+    radial-gradient(ellipse 60% 40% at 90% 90%, rgba(236, 72, 153, 0.08), transparent 40%),
+    linear-gradient(180deg, var(--bg-primary) 0%, #0f0f1a 100%);
 }
 .bg-mesh {
@@ -123,50 +129,66 @@ html, body, #app {
 .mesh-gradient {
   position: absolute;
   border-radius: 50%;
-  filter: blur(120px);
+  filter: blur(100px);
   opacity: 0.5;
-  animation: float 25s ease-in-out infinite;
+  animation: sciFiFloat 15s ease-in-out infinite;
 }
 .mesh-1 {
-  width: 700px;
-  height: 700px;
-  background: radial-gradient(circle, rgba(0, 212, 255, 0.35) 0%, transparent 70%);
-  top: -250px;
-  right: -150px;
+  width: 900px;
+  height: 900px;
+  background: radial-gradient(circle, rgba(0, 212, 255, 0.5) 0%, rgba(6, 182, 212, 0.25) 40%, transparent 70%);
+  top: -400px;
+  left: -300px;
   animation-delay: 0s;
+  animation-duration: 18s;
 }
 .mesh-2 {
-  width: 600px;
-  height: 600px;
-  background: radial-gradient(circle, rgba(124, 58, 237, 0.3) 0%, transparent 70%);
-  bottom: -200px;
-  left: -150px;
-  animation-delay: -8s;
+  width: 800px;
+  height: 800px;
+  background: radial-gradient(circle, rgba(124, 58, 237, 0.5) 0%, rgba(139, 92, 246, 0.25) 40%, transparent 70%);
+  bottom: -300px;
+  right: -200px;
+  animation-delay: -5s;
+  animation-duration: 20s;
 }
 .mesh-3 {
-  width: 500px;
-  height: 500px;
-  background: radial-gradient(circle, rgba(6, 182, 212, 0.2) 0%, transparent 70%);
-  top: 40%;
-  left: 30%;
-  animation-delay: -16s;
+  width: 700px;
+  height: 700px;
+  background: radial-gradient(circle, rgba(236, 72, 153, 0.4) 0%, rgba(168, 85, 247, 0.2) 40%, transparent 70%);
+  top: 50%;
+  left: 50%;
+  transform: translate(-50%, -50%);
+  animation-delay: -10s;
+  animation-duration: 22s;
 }
-@keyframes float {
-  0%, 100% { transform: translate(0, 0) scale(1); }
-  25% { transform: translate(40px, -40px) scale(1.08); }
-  50% { transform: translate(-30px, 30px) scale(0.95); }
-  75% { transform: translate(-40px, -25px) scale(1.03); }
+@keyframes sciFiFloat {
+  0%, 100% {
+    transform: translate(0, 0) scale(1) rotate(0deg);
+    opacity: 0.4;
+  }
+  25% {
+    transform: translate(200px, -150px) scale(1.2) rotate(10deg);
+    opacity: 0.65;
+  }
+  50% {
+    transform: translate(-150px, 100px) scale(0.85) rotate(-5deg);
+    opacity: 0.5;
+  }
+  75% {
+    transform: translate(100px, 150px) scale(1.15) rotate(8deg);
+    opacity: 0.6;
+  }
 }
 .noise-overlay {
   position: absolute;
   inset: 0;
-  background-image: url("data:image/svg+xml,%3Csvg viewBox='0 0 256 256' xmlns='http://www.w3.org/2000/svg'%3E%3Cfilter id='noise'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='0.9' numOctaves='4' stitchTiles='stitch'/%3E%3C/filter%3E%3Crect width='100%25' height='100%25' filter='url(%23noise)'/%3E%3C/svg%3E");
-  opacity: 0.025;
+  background-image: url("data:image/svg+xml,%3Csvg viewBox='0 0 256 256' xmlns='http://www.w3.org/2000/svg'%3E%3Cfilter id='noise'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='0.8' numOctaves='4' stitchTiles='stitch'/%3E%3C/filter%3E%3Crect width='100%25' height='100%25' filter='url(%23noise)'/%3E%3C/svg%3E");
+  opacity: 0.035;
 }
 .app-content {
@@ -248,6 +270,36 @@ html, body, #app {
   background: var(--bg-tertiary);
 }
+/* Ant Design Vue Select Dropdown */
+.ant-select-dropdown {
+  background: var(--bg-elevated) !important;
+  border: 1px solid var(--border-subtle) !important;
+  border-radius: 8px !important;
+  box-shadow: 0 6px 16px rgba(0, 0, 0, 0.3) !important;
+}
+.ant-select-item {
+  color: var(--text-primary) !important;
+  border-radius: 6px !important;
+}
+.ant-select-item-option-active:not(.ant-select-item-option-disabled) {
+  background: var(--bg-hover) !important;
+}
+.ant-select-item-option-selected:not(.ant-select-item-option-disabled) {
+  background: var(--accent-primary-muted) !important;
+  color: var(--accent-primary) !important;
+}
+.ant-select-selection-item {
+  color: var(--text-primary) !important;
+}
+.ant-select-selection-placeholder {
+  color: var(--text-secondary) !important;
+}
 .el-textarea__inner {
   background: var(--bg-tertiary) !important;
   border: 1px solid var(--border-subtle) !important;

View File

@@ -1,81 +0,0 @@
import axios from 'axios'
const request = axios.create({
baseURL: '/api/v1',
timeout: 60000
})
// Request interceptor
request.interceptors.request.use(
config => {
return config
},
error => {
return Promise.reject(error)
}
)
// Response interceptor
request.interceptors.response.use(
response => {
return response.data
},
error => {
const message = error.response?.data?.message || error.message || '请求失败'
console.error('API Error:', message)
return Promise.reject(error)
}
)
export const projectApi = {
list: () => request.get('/projects/'),
get: (id) => request.get(`/projects/${id}`),
create: (data) => request.post('/projects/', data),
update: (id, data) => request.put(`/projects/${id}`, data),
delete: (id) => request.delete(`/projects/${id}`)
}
export const fileApi = {
upload: (projectId, formData) =>
request.post(`/projects/${projectId}/files/upload`, formData, {
headers: { 'Content-Type': 'multipart/form-data' }
}),
list: (projectId) => request.get(`/projects/${projectId}/files/`),
get: (projectId, fileId) => request.get(`/projects/${projectId}/files/${fileId}`),
delete: (projectId, fileId) => request.delete(`/projects/${projectId}/files/${fileId}`)
}
export const chunkApi = {
split: (projectId, data) => request.post(`/projects/${projectId}/chunks/split`, data),
list: (projectId, params) => request.get(`/projects/${projectId}/chunks/`, { params }),
get: (projectId, chunkId) => request.get(`/projects/${projectId}/chunks/${chunkId}`),
update: (projectId, chunkId, data) => request.put(`/projects/${projectId}/chunks/${chunkId}`, data),
delete: (projectId, chunkId) => request.delete(`/projects/${projectId}/chunks/${chunkId}`)
}
export const questionApi = {
generate: (projectId, data) => request.post(`/projects/${projectId}/generate-questions`, data),
list: (projectId, params) => request.get(`/projects/${projectId}/chunks/${params.chunkId}/questions`),
update: (projectId, questionId, data) => request.put(`/projects/${projectId}/questions/${questionId}`, data),
delete: (projectId, questionId) => request.delete(`/projects/${projectId}/questions/${questionId}`)
}
export const datasetApi = {
list: (projectId) => request.get(`/projects/${projectId}/datasets/`),
create: (projectId, data) => request.post(`/projects/${projectId}/datasets/`, data),
get: (projectId, datasetId) => request.get(`/projects/${projectId}/datasets/${datasetId}`),
delete: (projectId, datasetId) => request.delete(`/projects/${projectId}/datasets/${datasetId}`),
export: (projectId, datasetId, data) =>
request.post(`/projects/${projectId}/datasets/${datasetId}/export`, data, {
responseType: 'blob'
})
}
export const evalApi = {
list: (projectId) => request.get(`/projects/${projectId}/eval-datasets/`),
create: (projectId, data) => request.post(`/projects/${projectId}/eval-datasets/`, data),
run: (projectId, evalId) => request.post(`/projects/${projectId}/eval-datasets/${evalId}/evaluate`),
getResults: (projectId, taskId) => request.get(`/projects/${projectId}/eval-tasks/${taskId}`)
}
export default request

View File

@@ -0,0 +1,122 @@
import axios from 'axios'
import type { AxiosInstance } from 'axios'
import type { Project, ProjectCreate, ProjectUpdate, Model, ModelCreate } from '@/shared/types'
const apiBaseURL = import.meta.env.VITE_API_BASE_URL
|| (import.meta.env.PROD
? '/api/v1'
: `${window.location.protocol}//${window.location.hostname}:8000/api/v1`)
const request: AxiosInstance = axios.create({
baseURL: apiBaseURL,
timeout: 60000
})
// Request interceptor
request.interceptors.request.use(
config => {
return config
},
error => {
return Promise.reject(error)
}
)
// Response interceptor
request.interceptors.response.use(
response => {
const data = response.data
// Handle new ApiResponse format
if (data.success !== undefined) {
if (data.success) {
// Check if this is a paginated response by checking for pagination field
if (data.pagination) {
// Return full response with pagination for paginated endpoints
return {
items: data.data,
total: data.pagination.total,
page: data.pagination.page,
page_size: data.pagination.page_size,
total_pages: data.pagination.total_pages
}
}
return data.data // Return the actual data
} else {
return Promise.reject(new Error(data.message || data.error || '请求失败'))
}
}
return data
},
error => {
const message = error.response?.data?.message || error.message || '请求失败'
console.error('API Error:', message)
return Promise.reject(error)
}
)
export const projectApi = {
list: (params?: { page?: number; page_size?: number }) =>
request.get<{ items: Project[]; pagination: { total: number } }>('/projects', { params }),
get: (id: string) => request.get<Project>(`/projects/${id}`),
create: (data: ProjectCreate) => request.post<{ id: string }>('/projects', data),
update: (id: string, data: ProjectUpdate) => request.put<Project>(`/projects/${id}`, data),
delete: (id: string) => request.delete(`/projects/${id}`)
}
export const fileApi = {
upload: (projectId: string, formData: FormData) =>
request.post(`/projects/${projectId}/files/upload`, formData, {
headers: { 'Content-Type': 'multipart/form-data' }
}),
list: (projectId: string) => request.get(`/projects/${projectId}/files`),
get: (projectId: string, fileId: string) => request.get(`/projects/${projectId}/files/${fileId}`),
delete: (projectId: string, fileId: string) => request.delete(`/projects/${projectId}/files/${fileId}`)
}
export const chunkApi = {
split: (projectId: string, data: any) =>
request.post(`/projects/${projectId}/chunks/split`, data, {
timeout: 300000
}),
list: (projectId: string, params?: any) => request.get(`/projects/${projectId}/chunks`, { params }),
get: (projectId: string, chunkId: string) => request.get(`/projects/${projectId}/chunks/${chunkId}`),
update: (projectId: string, chunkId: string, data: any) => request.put(`/projects/${projectId}/chunks/${chunkId}`, data),
delete: (projectId: string, chunkId: string) => request.delete(`/projects/${projectId}/chunks/${chunkId}`)
}
export const questionApi = {
generate: (projectId: string, data: any) => request.post(`/projects/${projectId}/questions/generate`, data),
list: (projectId: string, params?: any) => request.get(`/projects/${projectId}/questions`, { params }),
update: (projectId: string, questionId: string, data: any) => request.put(`/projects/${projectId}/questions/${questionId}`, data),
delete: (projectId: string, questionId: string) => request.delete(`/projects/${projectId}/questions/${questionId}`)
}
export const datasetApi = {
list: (projectId: string) => request.get(`/projects/${projectId}/datasets`),
create: (projectId: string, data: any) => request.post(`/projects/${projectId}/datasets`, data),
get: (projectId: string, datasetId: string) => request.get(`/projects/${projectId}/datasets/${datasetId}`),
delete: (projectId: string, datasetId: string) => request.delete(`/projects/${projectId}/datasets/${datasetId}`),
export: (projectId: string, datasetId: string, data: any) =>
request.post(`/projects/${projectId}/datasets/${datasetId}/export`, data, {
responseType: 'blob'
})
}
export const evalApi = {
list: (projectId: string) => request.get(`/projects/${projectId}/eval-datasets/`),
create: (projectId: string, data: any) => request.post(`/projects/${projectId}/eval-datasets/`, data),
run: (projectId: string, evalId: string) => request.post(`/projects/${projectId}/eval-datasets/${evalId}/evaluate`),
getResults: (projectId: string, taskId: string) => request.get(`/projects/${projectId}/eval-tasks/${taskId}`)
}
export const modelApi = {
list: () => request.get<Model[]>('/models/'),
get: (id: string) => request.get<Model>(`/models/${id}`),
create: (data: ModelCreate) => request.post<{ id: string }>('/models/', data),
update: (id: string, data: Partial<Model>) => request.put<Model>(`/models/${id}`, data),
delete: (id: string) => request.delete(`/models/${id}`),
setDefault: (id: string) => request.post(`/models/${id}/set-default`),
test: (id: string) => request.post<{ success: boolean; message: string }>(`/models/${id}/test`)
}
export default request
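The response interceptor in this file unwraps an `ApiResponse` envelope before callers see it. The standalone sketch below restates that logic as a pure function so the three cases (plain envelope, paginated envelope, legacy body) are easy to follow. The names `ApiEnvelope`, `isEnvelope`, and `unwrapEnvelope` are assumptions introduced here, and unlike the interceptor the sketch throws on failure instead of returning a rejected promise.

```typescript
interface Pagination {
  total: number;
  page: number;
  page_size: number;
  total_pages: number;
}

// Assumed envelope shape, inferred from the fields the interceptor reads.
interface ApiEnvelope {
  success: boolean;
  data?: unknown;
  message?: string;
  error?: string;
  pagination?: Pagination;
}

// A body counts as an envelope when it is an object with a defined `success` field.
function isEnvelope(body: unknown): body is ApiEnvelope {
  return typeof body === "object" && body !== null
    && (body as { success?: unknown }).success !== undefined;
}

function unwrapEnvelope(body: unknown): unknown {
  // Legacy responses without an envelope pass through unchanged.
  if (!isEnvelope(body)) {
    return body;
  }
  if (!body.success) {
    throw new Error(body.message || body.error || "请求失败");
  }
  // Paginated responses are flattened into a single object.
  if (body.pagination) {
    return {
      items: body.data,
      total: body.pagination.total,
      page: body.pagination.page,
      page_size: body.pagination.page_size,
      total_pages: body.pagination.total_pages
    };
  }
  return body.data;
}
```

One consequence worth noting for callers: a paginated list resolves to `{ items, total, page, page_size, total_pages }`, not to an object with a nested `pagination` field.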

View File

@@ -0,0 +1,67 @@
import { createRouter, createWebHistory, type RouteRecordRaw } from 'vue-router'
const routes: RouteRecordRaw[] = [
{
path: '/',
name: 'Home',
component: () => import('@/pages/HomePage.vue')
},
{
path: '/project/:id',
name: 'Project',
component: () => import('@/pages/ProjectPage.vue'),
children: [
{
path: '',
redirect: to => `/project/${String(to.params.id)}/files`
},
{
path: 'files',
name: 'ProjectFiles',
component: () => import('@/pages/ProjectFilePage.vue')
},
{
path: 'split',
name: 'ProjectSplit',
component: () => import('@/pages/ProjectTextSplitPage.vue')
},
{
path: 'questions',
name: 'ProjectQuestions',
component: () => import('@/pages/ProjectQuestionPage.vue')
},
{
path: 'datasets',
name: 'ProjectDatasets',
component: () => import('@/pages/ProjectDatasetPage.vue')
},
{
path: 'eval',
name: 'ProjectEval',
component: () => import('@/pages/ProjectEvalPage.vue')
},
{
path: 'settings',
name: 'ProjectSettings',
component: () => import('@/pages/ProjectSettingsPage.vue')
}
]
},
{
path: '/models',
name: 'ModelSettings',
component: () => import('@/pages/ModelSettingsPage.vue')
},
{
path: '/crawler',
name: 'Crawler',
component: () => import('@/pages/CrawlerPage.vue')
}
]
const router = createRouter({
history: createWebHistory(),
routes
})
export default router

View File

@@ -2,10 +2,12 @@ import { createApp } from 'vue'
 import { createPinia } from 'pinia'
 import ElementPlus from 'element-plus'
 import 'element-plus/dist/index.css'
+import Antd from 'ant-design-vue'
+import 'ant-design-vue/dist/reset.css'
 import * as ElementPlusIconsVue from '@element-plus/icons-vue'
 import App from './App.vue'
-import router from './router'
+import router from './core/router'
 const app = createApp(App)
@@ -17,5 +19,6 @@ for (const [key, component] of Object.entries(ElementPlusIconsVue)) {
 app.use(createPinia())
 app.use(router)
 app.use(ElementPlus)
+app.use(Antd)
 app.mount('#app')

View File

@@ -0,0 +1,260 @@
import { defineComponent, ref, reactive, watch, onMounted } from 'vue'
import { useRouter } from 'vue-router'
import { ElMessage } from 'element-plus'
import type { ModelConfig, ProviderOption, ModelCreate, ModelType } from '@/shared/types'
import { modelApi } from '@/core/api'
export default defineComponent({
name: 'ModelSettingsView',
setup() {
const router = useRouter()
// State
const loading = ref(false)
const submitting = ref(false)
const deleting = ref(false)
const showAddDialog = ref(false)
const deleteDialogVisible = ref(false)
const modelToDelete = ref<ModelConfig | null>(null)
const models = ref<ModelConfig[]>([])
// Form
const modelForm = reactive<ModelCreate>({
provider: 'minimax',
model_type: 'chat',
model_name: '',
api_key: '',
api_base: 'https://api.minimax.chat/v1',
is_default: false
})
// Default API base URL for each provider
const providerDefaultUrls: Record<string, string> = {
minimax: 'https://api.minimax.chat/v1',
glm: 'https://open.bigmodel.cn/api/paas/v4',
openai: 'https://api.openai.com/v1',
ali: 'https://dashscope.aliyuncs.com/compatible-mode/v1'
}
type ProviderOptionItem = ProviderOption & {
desc: string
}
type ModelTypeOption = {
value: ModelType
label: string
abbr: string
desc: string
}
// Providers
const providers: ProviderOptionItem[] = [
{ value: 'minimax', label: 'MiniMax', abbr: 'MM', desc: '适合国内接入,默认官方端点' },
{ value: 'glm', label: 'GLM', abbr: 'GL', desc: '智谱接口,兼容常见模型配置' },
{ value: 'openai', label: 'OpenAI Compatible', abbr: 'OP', desc: '适配 OpenAI 及兼容协议服务' },
{ value: 'ali', label: '阿里云百炼', abbr: 'AL', desc: '默认走 DashScope 兼容模式端点' }
]
const modelTypes: ModelTypeOption[] = [
{ value: 'chat', label: 'Chat', abbr: 'CH', desc: '标准对话生成模型' },
{ value: 'vlm', label: 'VLM', abbr: 'VL', desc: '视觉语言模型,适合图文输入' },
{ value: 'embedding', label: 'Embedding', abbr: 'EM', desc: '文本向量化与语义检索' },
{ value: 'rerank', label: 'Rerank', abbr: 'RR', desc: '重排模型,用于检索结果排序' }
]
const normalizeModelType = (modelType?: string, modelName?: string): ModelType => {
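// Trust an explicit, known, non-chat type. 'chat' (the form default) and unknown types fall through to name-based inference.
if (modelType && modelTypes.some(type => type.value === modelType) && modelType !== 'chat') {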
return modelType as ModelType
}
const normalizedName = (modelName || '').trim().toLowerCase()
if (['rerank', 'bce-reranker', 'gte-rerank'].some(keyword => normalizedName.includes(keyword))) {
return 'rerank'
}
if (
[
'embedding',
'embed',
'text-embedding',
'bge-',
'bge_m3',
'gte-',
'm3e',
'e5-',
'jina-embeddings'
].some(keyword => normalizedName.includes(keyword))
) {
return 'embedding'
}
if (['vl', 'vision', 'visual', 'multimodal', 'qwen-vl', 'gpt-4o'].some(keyword => normalizedName.includes(keyword))) {
return 'vlm'
}
return 'chat'
}
// Watch for provider changes and set the default API base URL automatically
watch(() => modelForm.provider, (newProvider) => {
if (providerDefaultUrls[newProvider]) {
modelForm.api_base = providerDefaultUrls[newProvider]
}
})
// Methods
const goHome = () => router.push('/')
const getProviderAbbr = (provider: string) => {
const p = providers.find(p => p.value === provider)
return p?.abbr || '?'
}
const getModelTypeLabel = (modelType?: string, modelName?: string) => {
const item = modelTypes.find(type => type.value === normalizeModelType(modelType, modelName))
return item?.label || 'Chat'
}
const fetchModels = async () => {
loading.value = true
try {
const res = await modelApi.list()
// Handle different response formats
if (Array.isArray(res)) {
models.value = res.map(model => ({
...model,
model_type: normalizeModelType(model.model_type, model.model_name)
}))
} else if (res?.data && Array.isArray(res.data)) {
models.value = res.data.map((model: ModelConfig) => ({
...model,
model_type: normalizeModelType(model.model_type, model.model_name)
}))
} else {
models.value = []
}
} catch (error: any) {
console.error('获取模型列表失败:', error)
ElMessage.error(error?.message || '加载失败')
} finally {
loading.value = false
}
}
const openAddDialog = () => {
modelForm.provider = 'minimax'
modelForm.model_type = 'chat'
modelForm.model_name = ''
modelForm.api_key = ''
modelForm.api_base = providerDefaultUrls['minimax']
modelForm.is_default = false
showAddDialog.value = true
}
const addModel = async () => {
if (!modelForm.model_name || !modelForm.api_key) {
ElMessage.warning('请填写模型名称和 API Key')
return
}
submitting.value = true
try {
// Convert is_default from boolean to string
const data = {
provider: modelForm.provider,
model_type: modelForm.model_type,
model_name: modelForm.model_name,
api_key: modelForm.api_key,
api_base: modelForm.api_base,
is_default: modelForm.is_default ? 'true' : 'false'
}
await modelApi.create(data)
ElMessage.success('添加成功')
showAddDialog.value = false
fetchModels()
} catch (error: any) {
console.error('添加模型失败:', error)
ElMessage.error(error?.message || '添加失败')
} finally {
submitting.value = false
}
}
const confirmDelete = (model: ModelConfig) => {
modelToDelete.value = model
deleteDialogVisible.value = true
}
const handleDelete = async () => {
if (!modelToDelete.value?.id) return
deleting.value = true
try {
await modelApi.delete(modelToDelete.value.id)
ElMessage.success('删除成功')
deleteDialogVisible.value = false
modelToDelete.value = null
fetchModels()
} catch (error: any) {
console.error('删除模型失败:', error)
ElMessage.error(error?.message || '删除失败')
} finally {
deleting.value = false
}
}
const testConnection = async (model: ModelConfig) => {
ElMessage.info(`正在测试 ${model.model_name}...`)
try {
const res = await modelApi.test(model.id)
// Update model connection status from response
const modelItem = models.value.find(m => m.id === model.id)
if (modelItem && res?.model) {
modelItem.connection_status = res.model.connection_status
if (res.test_result?.success) {
ElMessage.success('连接成功!')
} else {
ElMessage.error(res.test_result?.message || '连接失败')
}
}
} catch (error: any) {
console.error('测试连接失败:', error)
const modelItem = models.value.find(m => m.id === model.id)
if (modelItem) {
modelItem.connection_status = 'disconnected'
}
ElMessage.error(error?.message || '连接失败')
}
}
onMounted(() => fetchModels())
return {
router,
loading,
submitting,
deleting,
showAddDialog,
deleteDialogVisible,
modelToDelete,
models,
modelForm,
providerDefaultUrls,
providers,
modelTypes,
normalizeModelType,
goHome,
getProviderAbbr,
getModelTypeLabel,
fetchModels,
openAddDialog,
addModel,
confirmDelete,
handleDelete,
testConnection
}
}
})
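The type inference in `normalizeModelType` depends on the order of its keyword checks. The standalone sketch below reproduces that order outside the component so it can be exercised with sample names. The function name `inferModelType` and the sample model names in the usage lines are assumptions for illustration; the keyword lists are copied from the component.

```typescript
type ModelType = "chat" | "vlm" | "embedding" | "rerank";

const KNOWN_TYPES: ModelType[] = ["chat", "vlm", "embedding", "rerank"];
const RERANK_KEYWORDS = ["rerank", "bce-reranker", "gte-rerank"];
const EMBEDDING_KEYWORDS = [
  "embedding", "embed", "text-embedding", "bge-", "bge_m3",
  "gte-", "m3e", "e5-", "jina-embeddings"
];
const VLM_KEYWORDS = ["vl", "vision", "visual", "multimodal", "qwen-vl", "gpt-4o"];

function inferModelType(modelType?: string, modelName?: string): ModelType {
  // An explicit, known, non-chat type wins over any name-based guess.
  if (modelType && modelType !== "chat" && KNOWN_TYPES.some(t => t === modelType)) {
    return modelType as ModelType;
  }
  const name = (modelName || "").trim().toLowerCase();
  const hasAny = (keywords: string[]) => keywords.some(k => name.indexOf(k) !== -1);
  // Rerank is checked first so that names like "gte-rerank" are not caught by "gte-".
  if (hasAny(RERANK_KEYWORDS)) return "rerank";
  if (hasAny(EMBEDDING_KEYWORDS)) return "embedding";
  if (hasAny(VLM_KEYWORDS)) return "vlm";
  return "chat";
}

// Hypothetical sample names.
console.log(inferModelType("chat", "bge-m3"));           // embedding
console.log(inferModelType(undefined, "gte-rerank-v2")); // rerank
console.log(inferModelType("chat", "qwen-vl-max"));      // vlm
console.log(inferModelType("chat", "glm-4"));            // chat
```

Because matching is by substring, short keywords such as `vl` can match unrelated names, so an explicit non-chat `model_type` from the backend remains the more reliable signal.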

View File

@@ -0,0 +1,377 @@
import { defineComponent, ref, computed, onMounted } from 'vue'
import { useRoute } from 'vue-router'
import { ElMessage } from 'element-plus'
import { fileApi } from '@/core/api'
import DeleteDialog from '@/shared/components/common/DeleteDialog.vue'
export default defineComponent({
name: 'FileManage',
components: { DeleteDialog },
setup() {
const route = useRoute()
const projectId = computed(() => route.params.id)
const loading = ref(false)
const files = ref([])
const filterStatus = ref('')
const isInitialLoad = ref(true)
const filteredFiles = computed(() => {
if (!filterStatus.value) return files.value
return files.value.filter(f => f.status === filterStatus.value)
})
const uploadDialogVisible = ref(false)
const uploading = ref(false)
const uploadRef = ref(null)
const fileList = ref([])
const deleteDialogVisible = ref(false)
const pendingDeleteFile = ref(null)
const deletingFile = ref(false)
// Multi-select
const selectedFiles = ref([])
const isAllSelected = computed(() => filteredFiles.value.length > 0 && selectedFiles.value.length === filteredFiles.value.length)
const selectedCount = computed(() => selectedFiles.value.length)
const toggleSelectAll = () => {
if (isAllSelected.value) {
selectedFiles.value = []
} else {
selectedFiles.value = filteredFiles.value.map(f => f.id)
}
}
const toggleSelect = (fileId) => {
const index = selectedFiles.value.indexOf(fileId)
if (index === -1) {
selectedFiles.value.push(fileId)
} else {
selectedFiles.value.splice(index, 1)
}
}
const isSelected = (fileId) => selectedFiles.value.includes(fileId)
const clearSelection = () => {
selectedFiles.value = []
}
const batchDeleteDialogVisible = ref(false)
const batchDeleting = ref(false)
const batchDeleteFiles = ref([])
const batchDelete = async () => {
if (selectedFiles.value.length === 0) return
batchDeleteFiles.value = files.value.filter(f => selectedFiles.value.includes(f.id))
batchDeleteDialogVisible.value = true
}
const executeBatchDelete = async () => {
if (selectedFiles.value.length === 0) return
batchDeleting.value = true
try {
for (const fileId of selectedFiles.value) {
await fileApi.delete(projectId.value, fileId)
}
ElMessage.success(`已删除 ${selectedFiles.value.length} 个文件`)
selectedFiles.value = []
batchDeleteDialogVisible.value = false
fetchFiles()
} catch (error) {
ElMessage.error('删除失败')
} finally {
batchDeleting.value = false
}
}
// Preview
const previewVisible = ref(false)
const previewFile = ref(null)
const previewContent = ref('')
const previewLoading = ref(false)
const previewMode = ref('source') // 'source' | 'markdown'
const isPdfPreview = ref(false)
const pdfDataUrl = ref('')
const previewError = ref('')
const completedFiles = computed(() => files.value.filter(f => f.status === 'completed').length)
const processingFiles = computed(() => files.value.filter(f => f.status === 'processing' || f.status === 'pending'))
const failedFiles = computed(() => files.value.filter(f => f.status === 'failed').length)
const fetchFiles = async () => {
const wasInitial = isInitialLoad.value
loading.value = true
try {
const res = await fileApi.list(projectId.value)
files.value = res || []
} catch (error) {
files.value = []
} finally {
loading.value = false
if (wasInitial) {
isInitialLoad.value = false
}
}
}
const handleUpload = () => {
fileList.value = []
uploadDialogVisible.value = true
}
const handleChange = (file, files) => { fileList.value = files }
const handleRemove = (file, files) => { fileList.value = files }
const triggerUpload = () => {
const input = uploadRef.value?.$el?.querySelector('input')
if (input) {
input.click()
}
}
const submitUpload = async () => {
if (fileList.value.length === 0) {
ElMessage.warning('请先选择文件')
return
}
// Remember how many files were listed before this upload
const prevFileCount = files.value.length
// Close the dialog first
uploadDialogVisible.value = false
ElMessage.success('已开始上传文件')
// Mark as uploading so the empty state is not shown
uploading.value = true
// Upload the files in the background (not awaited here)
const uploadPromises = fileList.value.map(async (item) => {
try {
const formData = new FormData()
formData.append('file', item.raw)
await fileApi.upload(projectId.value, formData)
} catch (error) {
console.error('上传失败:', error)
}
})
// Refresh the list right away so newly added files (status: processing) show up
await fetchFiles()
// If there were no files before, wait for the upload promises to finish, then refresh once more
if (prevFileCount === 0) {
await Promise.all(uploadPromises)
await fetchFiles()
}
// Keep polling the file list until no file is still pending or processing
let pollTimeout = null
const pollInterval = setInterval(async () => {
await fetchFiles()
// Check whether any file is still waiting for or undergoing processing
const hasProcessing = files.value.some(f => f.status === 'processing' || f.status === 'pending')
if (!hasProcessing) {
clearInterval(pollInterval)
clearTimeout(pollTimeout)
uploading.value = false
}
}, 2000)
// Poll for at most 60 seconds
pollTimeout = setTimeout(() => {
clearInterval(pollInterval)
uploading.value = false
}, 60000)
}
const handleDelete = async (file) => {
try {
deletingFile.value = true
await fileApi.delete(projectId.value, file.id)
ElMessage.success('删除成功')
fetchFiles()
} catch (error) {
ElMessage.error('删除失败')
} finally {
deletingFile.value = false
deleteDialogVisible.value = false
pendingDeleteFile.value = null
}
}
const openDeleteDialog = (file) => {
pendingDeleteFile.value = file
deleteDialogVisible.value = true
}
const confirmDeleteFile = async () => {
if (!pendingDeleteFile.value) return
await handleDelete(pendingDeleteFile.value)
}
const handlePreview = async (file) => {
previewFile.value = file
previewVisible.value = true
previewContent.value = ''
previewError.value = ''
previewLoading.value = true
previewMode.value = 'source'
try {
await loadPreviewContent()
} finally {
previewLoading.value = false
}
}
const loadPreviewContent = async () => {
if (!previewFile.value) return
previewLoading.value = true
previewContent.value = ''
previewError.value = ''
isPdfPreview.value = false
pdfDataUrl.value = ''
try {
const endpoint = previewMode.value === 'source' ? 'raw' : 'content'
const response = await fetch(`/api/v1/projects/${projectId.value}/files/${previewFile.value.id}/${endpoint}`)
if (response.ok) {
const text = await response.text()
if (text.startsWith('data:application/pdf;base64,')) {
isPdfPreview.value = true
pdfDataUrl.value = text
previewContent.value = ''
} else {
previewContent.value = text
}
} else if (response.status === 404) {
previewError.value = previewMode.value === 'source'
? '源文件不存在或已被删除'
: 'Markdown 内容不存在,请等待处理完成'
} else if (response.status === 500) {
previewError.value = '服务器内部错误,请稍后重试'
} else {
previewError.value = `加载失败 (${response.status})`
}
} catch (error) {
previewError.value = '网络错误,请检查网络连接'
} finally {
previewLoading.value = false
}
}
const switchPreviewMode = async (mode) => {
previewMode.value = mode
await loadPreviewContent()
}
const getFileIcon = (type) => {
const map = { pdf: 'Document', docx: 'Document', xlsx: 'Grid', csv: 'Document', epub: 'Notebook', md: 'Document', txt: 'Document' }
return map[type] || 'Document'
}
const getTypeColor = (type) => {
const map = {
pdf: '#ef4444',
docx: '#3b82f6',
xlsx: '#22c55e',
csv: '#22c55e',
epub: '#f59e0b',
md: '#8b5cf6',
txt: '#6b7280'
}
return map[type] || '#6b7280'
}
const getFileExt = (filename) => {
if (!filename) return ''
const ext = filename.split('.').pop()?.toLowerCase()
return ext ? '.' + ext : ''
}
const getStatusText = (status) => {
const map = {
processing: '处理中',
completed: '已完成',
failed: '失败',
pending: '待处理'
}
return map[status] || '未知'
}
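const formatSize = (bytes) => {
if (!bytes) return '0 B'
const k = 1024
const sizes = ['B', 'KB', 'MB', 'GB']
// Clamp the unit index so sizes below 1 byte or above the largest unit still map to a valid label
const i = Math.min(Math.max(Math.floor(Math.log(bytes) / Math.log(k)), 0), sizes.length - 1)
return parseFloat((bytes / Math.pow(k, i)).toFixed(1)) + ' ' + sizes[i]
}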
const formatDate = (date) => {
if (!date) return ''
return new Date(date).toLocaleDateString('zh-CN', { month: 'short', day: 'numeric', hour: '2-digit', minute: '2-digit' })
}
onMounted(() => fetchFiles())
return {
route,
projectId,
loading,
files,
filterStatus,
isInitialLoad,
filteredFiles,
uploadDialogVisible,
uploading,
uploadRef,
fileList,
deleteDialogVisible,
pendingDeleteFile,
deletingFile,
selectedFiles,
isAllSelected,
selectedCount,
toggleSelectAll,
toggleSelect,
isSelected,
clearSelection,
batchDeleteDialogVisible,
batchDeleting,
batchDeleteFiles,
batchDelete,
executeBatchDelete,
previewVisible,
previewFile,
previewContent,
previewLoading,
previewMode,
isPdfPreview,
pdfDataUrl,
previewError,
completedFiles,
processingFiles,
failedFiles,
fetchFiles,
handleUpload,
handleChange,
handleRemove,
triggerUpload,
submitUpload,
handleDelete,
openDeleteDialog,
confirmDeleteFile,
handlePreview,
loadPreviewContent,
switchPreviewMode,
getFileIcon,
getTypeColor,
getFileExt,
getStatusText,
formatSize,
formatDate
}
}
})
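The `formatSize` helper in this component picks a unit from the base-1024 logarithm of the byte count. The standalone, typed sketch below shows that technique with the unit index clamped to the available labels, so inputs larger than the biggest unit still produce a readable string. The constant name `UNITS` and the sample inputs are assumptions made for this example.

```typescript
const UNITS = ["B", "KB", "MB", "GB"];

function formatSize(bytes: number): string {
  // Treat missing, zero, and negative sizes as empty.
  if (!bytes || bytes < 0) return "0 B";
  const k = 1024;
  // floor(log_1024(bytes)) gives the unit index: 0 for B, 1 for KB, and so on.
  const rawIndex = Math.floor(Math.log(bytes) / Math.log(k));
  // Clamp to the labels that exist, so huge values are expressed in the largest unit.
  const i = Math.min(Math.max(rawIndex, 0), UNITS.length - 1);
  // toFixed(1) followed by parseFloat drops a trailing ".0".
  const value = parseFloat((bytes / Math.pow(k, i)).toFixed(1));
  return value + " " + UNITS[i];
}

console.log(formatSize(1536));            // 1.5 KB
console.log(formatSize(5 * 1024 * 1024)); // 5 MB
```

Without the clamp, a size of 1 TB or more would index past the end of the unit list and render as `undefined`.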

View File

@@ -0,0 +1,252 @@
import { defineComponent, ref, reactive, computed, onMounted } from 'vue'
import { useRoute } from 'vue-router'
import { ElMessage } from 'element-plus'
import { chunkApi, questionApi, modelApi } from '@/core/api'
export default defineComponent({
name: 'QuestionManage',
setup() {
const route = useRoute()
const projectId = computed(() => route.params.id)
const loading = ref(false)
const isInitialLoad = ref(true)
const generating = ref(false)
const questions = ref([])
const chunks = ref([])
const availableModels = ref([])
const showGenerateDialog = ref(false)
const filterStatus = ref('')
const chunkMap = ref({})
const DEFAULT_GENERATE_PROMPT = '你是一名高质量中文问答数据构建助手。请基于给定 chunk 内容生成准确、自然、可用于训练的数据集问答对。问题必须清晰具体,答案必须直接来自内容或基于内容做合理概括,不要编造原文没有的信息,不要输出与目录、导航、页眉页脚、噪声文字相关的问题。'
const generateConfig = reactive({
model_id: '',
chunk_ids: [],
count: 3,
dirty_data_filter: true,
thinking_mode: true,
preset_prompt: DEFAULT_GENERATE_PROMPT
})
// Multi-select
const selectedQuestions = ref([])
const filteredQuestions = computed(() => {
if (!filterStatus.value) return questions.value
return questions.value.filter(q => q.source === filterStatus.value)
})
const generatedCount = computed(() => questions.value.filter(q => q.source === 'generated').length)
const manualCount = computed(() => questions.value.filter(q => q.source === 'manual').length)
const failedCount = computed(() => questions.value.filter(q => q.status === 'failed').length)
const generateModels = computed(() => {
return availableModels.value.filter(model => {
const type = normalizeModelType(model.model_type, model.model_name)
return type === 'chat' || type === 'vlm'
})
})
const isAllSelected = computed(() => filteredQuestions.value.length > 0 && selectedQuestions.value.length === filteredQuestions.value.length)
const selectedCount = computed(() => selectedQuestions.value.length)
const toggleSelectAll = () => {
if (isAllSelected.value) {
selectedQuestions.value = []
} else {
selectedQuestions.value = filteredQuestions.value.map(q => q.id)
}
}
const toggleSelect = (id) => {
const index = selectedQuestions.value.indexOf(id)
if (index === -1) {
selectedQuestions.value.push(id)
} else {
selectedQuestions.value.splice(index, 1)
}
}
const isSelected = (id) => selectedQuestions.value.includes(id)
const clearSelection = () => {
selectedQuestions.value = []
}
const batchDelete = async () => {
if (selectedQuestions.value.length === 0) return
try {
for (const id of selectedQuestions.value) {
await questionApi.delete(projectId.value, id)
}
ElMessage.success(`已删除 ${selectedQuestions.value.length} 个问题`)
selectedQuestions.value = []
fetchQuestions()
} catch (error) {
ElMessage.error('删除失败')
}
}
const normalizeModelType = (modelType, modelName = '') => {
if (modelType && modelType !== 'chat') {
return modelType
}
const normalizedName = String(modelName).trim().toLowerCase()
if (['rerank', 'bce-reranker', 'gte-rerank'].some(keyword => normalizedName.includes(keyword))) return 'rerank'
if (['embedding', 'embed', 'text-embedding', 'bge-', 'gte-', 'm3e', 'e5-', 'jina-embeddings'].some(keyword => normalizedName.includes(keyword))) return 'embedding'
if (['vl', 'vision', 'visual', 'multimodal', 'qwen-vl', 'gpt-4o'].some(keyword => normalizedName.includes(keyword))) return 'vlm'
return 'chat'
}
const getProviderLabel = (provider) => {
const map = {
openai: 'OpenAI Compatible',
minimax: 'MiniMax',
glm: 'GLM',
ali: '阿里云百炼'
}
return map[provider] || provider
}
const fetchAvailableModels = async () => {
try {
const res = await modelApi.list()
availableModels.value = Array.isArray(res) ? res : (res?.data || [])
if (!generateConfig.model_id && generateModels.value.length) {
const defaultModel = generateModels.value.find(model => model.is_default === 'true') || generateModels.value[0]
generateConfig.model_id = defaultModel?.id || ''
}
} catch (error) {
availableModels.value = []
}
}
const fetchAllChunks = async () => {
const allChunks = []
let page = 1
let total = 0
do {
const res = await chunkApi.list(projectId.value, { page, page_size: 100 })
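const items = res.items || res.data || []
// Stop on an empty page so an overstated total cannot cause an endless loop
if (items.length === 0) break
total = res.total || res.pagination?.total || items.length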
allChunks.push(...items)
page += 1
} while (allChunks.length < total)
return allChunks
}
const fetchQuestions = async () => {
const wasInitial = isInitialLoad.value
loading.value = true
try {
const [chunkList, questionRes] = await Promise.all([
fetchAllChunks(),
questionApi.list(projectId.value, { page: 1, page_size: 500 })
])
chunks.value = chunkList
chunkMap.value = Object.fromEntries(chunkList.map(chunk => [chunk.id, chunk]))
questions.value = questionRes.items || questionRes.data || []
} catch (error) {
questions.value = []
} finally {
loading.value = false
if (wasInitial) {
isInitialLoad.value = false
}
}
}
const handleGenerate = async () => {
if (generateConfig.chunk_ids.length === 0) {
ElMessage.warning('请选择文本块')
return
}
if (!generateConfig.model_id) {
ElMessage.warning('请选择生成模型')
return
}
generating.value = true
try {
await questionApi.generate(projectId.value, generateConfig)
ElMessage.success('问题生成任务已启动')
showGenerateDialog.value = false
setTimeout(fetchQuestions, 2000)
} catch (error) {
ElMessage.error('生成失败')
} finally {
generating.value = false
}
}
const handleDelete = async (question) => {
try {
await questionApi.delete(projectId.value, question.id)
ElMessage.success('删除成功')
fetchQuestions()
} catch (error) {
ElMessage.error('删除失败')
}
}
const getTypeColor = (type) => {
const map = { 'fact': '#22c55e', 'summary': '#818cf8', 'reasoning': '#f59e0b' }
return map[type] || '#818cf8'
}
const getTypeName = (type) => {
const map = { 'fact': '事实性', 'summary': '总结性', 'reasoning': '推理性' }
return map[type] || type
}
const getSourceName = (source) => {
const map = { 'generated': 'AI生成', 'manual': '手动', 'failed': '失败' }
return map[source] || source
}
onMounted(() => {
fetchAvailableModels()
fetchQuestions()
})
return {
route,
projectId,
loading,
isInitialLoad,
generating,
questions,
chunks,
availableModels,
showGenerateDialog,
filterStatus,
chunkMap,
DEFAULT_GENERATE_PROMPT,
generateConfig,
selectedQuestions,
filteredQuestions,
generatedCount,
manualCount,
failedCount,
generateModels,
isAllSelected,
selectedCount,
toggleSelectAll,
toggleSelect,
isSelected,
clearSelection,
batchDelete,
normalizeModelType,
getProviderLabel,
fetchAvailableModels,
fetchAllChunks,
fetchQuestions,
handleGenerate,
handleDelete,
getTypeColor,
getTypeName,
getSourceName
}
}
})
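
`fetchAllChunks` above walks a paginated endpoint until the collected count reaches the reported total. The same loop can be sketched as a reusable helper. `fetchAllPages`, its `listPage` callback, and the `maxPages` cap are hypothetical names, and the `{ items, total }` response shape is assumed from how the code above reads `res.items` and `res.total`.

```javascript
// Hypothetical helper: collects every item from a page-based list endpoint.
// listPage(page, pageSize) is assumed to resolve to { items: [...], total: number }.
async function fetchAllPages(listPage, pageSize = 100, maxPages = 1000) {
  const all = []
  for (let page = 1; page <= maxPages; page += 1) {
    const res = await listPage(page, pageSize)
    const items = res?.items ?? []
    // An empty page means there is nothing left, whatever total claims
    if (items.length === 0) break
    all.push(...items)
    // Without a reported total, stop after the first page
    const total = res?.total ?? all.length
    if (all.length >= total) break
  }
  return all
}

// In-memory stand-in for the backend, used only to exercise the helper
function makeFakeApi(count) {
  const rows = Array.from({ length: count }, (_, i) => ({ id: i + 1 }))
  return async (page, pageSize) => ({
    items: rows.slice((page - 1) * pageSize, page * pageSize),
    total: count
  })
}

fetchAllPages(makeFakeApi(250)).then(chunks => console.log(chunks.length)) // 250
```

In a component, `listPage` could wrap a call such as `chunkApi.list(projectId.value, { page, page_size })` and map the response into the assumed shape.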


@@ -0,0 +1,839 @@
import { defineComponent } from 'vue'
import { ref, reactive, computed, onMounted, onUnmounted, watch } from 'vue'
import { useRoute, useRouter } from 'vue-router'
import { ElMessage } from 'element-plus'
import { fileApi, chunkApi, modelApi, questionApi } from '@/core/api'
import DeleteDialog from '@/shared/components/common/DeleteDialog.vue'
export default defineComponent({
name: 'TextSplit',
components: { DeleteDialog },
setup() {
const router = useRouter()
const route = useRoute()
const projectId = computed(() => route.params.id)
const loading = ref(false)
const splitting = ref(false)
const files = ref([])
const filterStatus = ref('')
const fileChunks = ref({})
const isInitialLoad = ref(true)
const availableModels = ref([])
const filePollingTimer = ref(null)
const generateDialogVisible = ref(false)
const generatingQuestions = ref(false)
const DEFAULT_GENERATE_PROMPT = '你是一名高质量中文问答数据构建助手。请基于给定 chunk 内容生成准确、自然、可用于训练的数据集问答对。问题必须清晰具体,答案必须直接来自内容或基于内容做合理概括,不要编造原文没有的信息,不要输出与目录、导航、页眉页脚、噪声文字相关的问题。'
// Multi-select
const selectedFiles = ref([])
const splitDialogVisible = ref(false)
// Chunk Preview Dialog
const chunkPreviewVisible = ref(false)
const previewFile = ref(null)
const previewChunks = ref([])
const previewLoading = ref(false)
const savingChunks = ref(false)
const deletingChunkId = ref('')
const deletingFileChunksId = ref('')
const deleteDialogVisible = ref(false)
const deleteDialogMode = ref('')
const deleteDialogTarget = ref(null)
const previewSearch = ref('')
const previewFilter = ref('all')
const previewJumpInput = ref('')
const selectedPreviewChunkId = ref('')
const isAllSelected = computed(() => filteredFiles.value.length > 0 && selectedFiles.value.length === filteredFiles.value.length)
const selectedCount = computed(() => selectedFiles.value.length)
const filteredFiles = computed(() => {
if (!filterStatus.value) return files.value
return files.value.filter(f => {
if (filterStatus.value === 'completed') {
return fileChunks.value[f.id]
}
if (filterStatus.value === 'processing') {
return f.status === 'processing'
}
return true
})
})
const toggleSelectAll = () => {
if (isAllSelected.value) {
selectedFiles.value = []
} else {
selectedFiles.value = filteredFiles.value.map(f => f.id)
}
}
const toggleSelect = (fileId) => {
const index = selectedFiles.value.indexOf(fileId)
if (index === -1) {
selectedFiles.value.push(fileId)
} else {
selectedFiles.value.splice(index, 1)
}
}
const isSelected = (fileId) => selectedFiles.value.includes(fileId)
const clearSelection = () => {
selectedFiles.value = []
}
const splitConfig = reactive({
method: 'recursive',
chunk_size: 500,
overlap: 50,
separator: '\n\n',
embedding_model_id: '',
similarity_threshold: 0.3,
min_chunk_size: 100,
})
const generateConfig = reactive({
model_id: '',
dirty_data_filter: true,
thinking_mode: true,
preset_prompt: DEFAULT_GENERATE_PROMPT,
count: 3,
})
const methods = [
{ value: 'recursive', label: '递归字符', tag: '基础' },
{ value: 'semantic', label: '句段优先', tag: '规则' },
{ value: 'semantic_embedding', label: '语义嵌入', tag: 'API' },
{ value: 'markdown_structure', label: 'Markdown', tag: '结构' },
]
const completedFiles = computed(() => {
return Object.keys(fileChunks.value).length
})
const processingCount = computed(() => {
return files.value.filter(f => f.status === 'processing').length
})
const totalChunks = computed(() => {
return Object.values(fileChunks.value).reduce((sum, count) => sum + count, 0)
})
const normalizeModelType = (modelType, modelName = '') => {
if (modelType && modelType !== 'chat') {
return modelType
}
const normalizedName = String(modelName).trim().toLowerCase()
if (['rerank', 'bce-reranker', 'gte-rerank'].some(keyword => normalizedName.includes(keyword))) {
return 'rerank'
}
if ([
'embedding',
'embed',
'text-embedding',
'bge-',
'bge_m3',
'gte-',
'm3e',
'e5-',
'jina-embeddings'
].some(keyword => normalizedName.includes(keyword))) {
return 'embedding'
}
if (['vl', 'vision', 'visual', 'multimodal', 'qwen-vl', 'gpt-4o'].some(keyword => normalizedName.includes(keyword))) {
return 'vlm'
}
return 'chat'
}
const embeddingModels = computed(() => {
return availableModels.value.filter(model => normalizeModelType(model.model_type, model.model_name) === 'embedding')
})
const selectedEmbeddingModel = computed(() => {
return embeddingModels.value.find(model => model.id === splitConfig.embedding_model_id) || null
})
const generateModels = computed(() => {
return availableModels.value.filter(model => {
const type = normalizeModelType(model.model_type, model.model_name)
return type === 'chat' || type === 'vlm'
})
})
const getProviderLabel = (provider) => {
const providerMap = {
openai: 'OpenAI Compatible',
minimax: 'MiniMax',
glm: 'GLM',
ali: '阿里云百炼'
}
return providerMap[provider] || provider
}
const fetchAllChunks = async () => {
const allChunks = []
let page = 1
let total = 0
do {
const res = await chunkApi.list(projectId.value, { page, page_size: 100 })
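const items = res.items || res.data || []
// Stop on an empty page so an overstated total cannot cause an endless loop
if (items.length === 0) break
total = res.total || res.pagination?.total || items.length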
allChunks.push(...items)
page += 1
} while (allChunks.length < total)
return allChunks
}
const goToModelSettings = () => {
splitDialogVisible.value = false
generateDialogVisible.value = false
router.push('/models')
}
const fetchAvailableModels = async () => {
try {
const res = await modelApi.list()
if (Array.isArray(res)) {
availableModels.value = res
} else if (res?.data && Array.isArray(res.data)) {
availableModels.value = res.data
} else {
availableModels.value = []
}
} catch (error) {
console.error(error)
availableModels.value = []
}
}
watch(embeddingModels, (models) => {
if (!models.length) {
splitConfig.embedding_model_id = ''
return
}
if (!models.some(model => model.id === splitConfig.embedding_model_id)) {
const defaultModel = models.find(model => model.is_default === 'true') || models[0]
splitConfig.embedding_model_id = defaultModel?.id || ''
}
}, { immediate: true })
watch(generateModels, (models) => {
if (!models.length) {
generateConfig.model_id = ''
return
}
if (!models.some(model => model.id === generateConfig.model_id)) {
const defaultModel = models.find(model => model.is_default === 'true') || models[0]
generateConfig.model_id = defaultModel?.id || ''
}
}, { immediate: true })
const fetchFiles = async () => {
const wasInitial = isInitialLoad.value
loading.value = true
try {
const res = await fileApi.list(projectId.value)
files.value = res || []
// Fetch the chunk count for each file
await fetchChunksCount()
} catch (error) {
console.error(error)
} finally {
loading.value = false
if (wasInitial) {
isInitialLoad.value = false
}
}
}
const fetchChunksCount = async () => {
const counts = {}
for (const file of files.value) {
try {
const res = await chunkApi.list(projectId.value, { file_id: file.id })
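const chunkList = res.items || res || []
// Prefer the server-reported total; a single page may not contain every chunk
const count = res?.total ?? chunkList.length
if (count > 0) {
counts[file.id] = count
}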
} catch (e) {
console.error(e)
}
}
fileChunks.value = counts
}
const openSplitDialog = () => {
if (selectedFiles.value.length === 0) {
ElMessage.warning('请先选择要分割的文件')
return
}
if (!availableModels.value.length) {
fetchAvailableModels()
}
splitDialogVisible.value = true
}
const handleBatchSplit = async () => {
if (selectedFiles.value.length === 0) {
ElMessage.warning('请先选择文件')
return
}
if (splitConfig.method === 'semantic_embedding' && !selectedEmbeddingModel.value) {
ElMessage.warning('请先选择已配置的 embedding 模型')
return
}
if (splitConfig.method === 'semantic_embedding' && !selectedEmbeddingModel.value?.api_key) {
ElMessage.warning('当前 embedding 模型缺少 API Key，请先到模型配置补全')
return
}
splitting.value = true
splitDialogVisible.value = false
const successFiles = []
const failedFiles = []
try {
for (const fileId of selectedFiles.value) {
const file = files.value.find(item => item.id === fileId)
const payload = {
file_id: fileId,
method: splitConfig.method,
chunk_size: splitConfig.chunk_size,
overlap: splitConfig.overlap,
separator: splitConfig.separator,
similarity_threshold: splitConfig.similarity_threshold,
min_chunk_size: splitConfig.min_chunk_size,
}
if (splitConfig.method === 'semantic_embedding' && selectedEmbeddingModel.value) {
payload.embedding_provider = selectedEmbeddingModel.value.provider
payload.embedding_api_key = selectedEmbeddingModel.value.api_key
payload.embedding_base_url = selectedEmbeddingModel.value.api_base
payload.embedding_model = selectedEmbeddingModel.value.model_name
}
try {
await chunkApi.split(projectId.value, payload)
successFiles.push(file?.filename || fileId)
} catch (error) {
console.error(error)
failedFiles.push({
name: file?.filename || fileId,
message: error?.message || '分割失败'
})
}
}
if (successFiles.length && !failedFiles.length) {
ElMessage.success(`已为 ${successFiles.length} 个文件启动后台分割任务`)
} else if (successFiles.length && failedFiles.length) {
ElMessage.warning(`已启动 ${successFiles.length} 个,失败 ${failedFiles.length} 个:${failedFiles[0].name} - ${failedFiles[0].message}`)
} else if (failedFiles.length) {
ElMessage.error(`分割失败:${failedFiles[0].name} - ${failedFiles[0].message}`)
return
}
// Clear the selection
selectedFiles.value = []
fetchFiles()
} finally {
splitting.value = false
}
}
const openGenerateDialog = () => {
if (completedFiles.value === 0) {
ElMessage.warning('没有已分割的文件可生成')
return
}
if (!availableModels.value.length) {
fetchAvailableModels()
}
generateDialogVisible.value = true
}
const handleBatchGenerate = async () => {
if (!generateConfig.model_id) {
ElMessage.warning('请选择用于生成问答的大语言模型')
return
}
generatingQuestions.value = true
try {
const allChunks = await fetchAllChunks()
if (!allChunks.length) {
ElMessage.warning('当前项目还没有可用文本块')
return
}
await questionApi.generate(projectId.value, {
chunk_ids: allChunks.map(chunk => chunk.id),
model_id: generateConfig.model_id,
dirty_data_filter: generateConfig.dirty_data_filter,
thinking_mode: generateConfig.thinking_mode,
preset_prompt: generateConfig.preset_prompt,
count: generateConfig.count
})
generateDialogVisible.value = false
ElMessage.success(`已为 ${allChunks.length} 个文本块启动后台问答生成任务`)
} catch (error) {
console.error(error)
ElMessage.error(error?.message || '问答生成启动失败')
} finally {
generatingQuestions.value = false
}
}
// Chunk Preview Methods
const openChunkPreview = async (file) => {
previewFile.value = file
previewSearch.value = ''
previewFilter.value = 'all'
previewJumpInput.value = ''
selectedPreviewChunkId.value = ''
chunkPreviewVisible.value = true
await fetchPreviewChunks(file.id)
}
const previewChunksWithIndex = computed(() => {
return previewChunks.value.map((chunk, index) => ({
...chunk,
displayIndex: index + 1
}))
})
const isChunkModified = (chunk) => chunk.editingContent !== chunk.content
const filteredPreviewChunks = computed(() => {
const keyword = previewSearch.value.trim().toLowerCase()
return previewChunksWithIndex.value.filter(chunk => {
const matchesFilter = previewFilter.value !== 'modified' || isChunkModified(chunk)
if (!matchesFilter) return false
if (!keyword) return true
return String(chunk.displayIndex).includes(keyword) || chunk.content.toLowerCase().includes(keyword) || chunk.editingContent.toLowerCase().includes(keyword)
})
})
const activePreviewChunk = computed(() => {
return previewChunks.value.find(chunk => chunk.id === selectedPreviewChunkId.value) || null
})
const activePreviewChunkIndex = computed(() => {
const index = previewChunks.value.findIndex(chunk => chunk.id === selectedPreviewChunkId.value)
return index === -1 ? 0 : index + 1
})
const modifiedPreviewCount = computed(() => {
return previewChunks.value.filter(chunk => isChunkModified(chunk)).length
})
const deleteDialogTitle = computed(() => {
if (deleteDialogMode.value === 'chunk') return '删除分片'
if (deleteDialogMode.value === 'file-chunks') return '删除全部分块'
return '删除'
})
const deleteDialogItemName = computed(() => {
if (deleteDialogMode.value === 'chunk') {
return `${activePreviewChunkIndex.value || ''}`.trim()
}
if (deleteDialogMode.value === 'file-chunks') {
return deleteDialogTarget.value?.filename || ''
}
return ''
})
const deleteDialogDetail = computed(() => {
if (deleteDialogMode.value === 'chunk') {
return '这会移除当前选中的单个文本分片,适用于清理错误切分或无效内容。'
}
if (deleteDialogMode.value === 'file-chunks') {
return '这会清空当前文件已生成的全部分块,但不会删除文件本身。'
}
return ''
})
const deleteDialogWarning = computed(() => {
if (deleteDialogMode.value === 'chunk') {
return '删除后不可恢复,和该分片关联的后续内容可能需要重新生成。'
}
if (deleteDialogMode.value === 'file-chunks') {
return '删除后不可恢复,该文件的全部分块将被清空,你需要重新执行分割才能恢复。'
}
return ''
})
const deleteDialogConfirmText = computed(() => {
if (deleteDialogMode.value === 'chunk') return '确认删除分片'
if (deleteDialogMode.value === 'file-chunks') return '确认删除全部分块'
return '确认删除'
})
const deleteDialogLoading = computed(() => {
if (deleteDialogMode.value === 'chunk') {
return !!deleteDialogTarget.value && deletingChunkId.value === deleteDialogTarget.value.id
}
if (deleteDialogMode.value === 'file-chunks') {
return !!deleteDialogTarget.value && deletingFileChunksId.value === deleteDialogTarget.value.id
}
return false
})
const selectPreviewChunk = (chunkId) => {
selectedPreviewChunkId.value = chunkId
}
const getChunkSnippet = (chunk) => {
return (chunk.editingContent || chunk.content || '')
.replace(/\s+/g, ' ')
.trim()
.slice(0, 110) || '空白分片'
}
const jumpToChunk = () => {
const index = Number(previewJumpInput.value)
if (!Number.isInteger(index) || index < 1) {
ElMessage.warning('请输入有效的块号')
return
}
const target = previewChunksWithIndex.value.find(chunk => chunk.displayIndex === index)
if (!target) {
ElMessage.warning('未找到对应块号')
return
}
if (previewFilter.value === 'modified' && !isChunkModified(target)) {
previewFilter.value = 'all'
}
selectedPreviewChunkId.value = target.id
}
const resetChunk = (chunk) => {
chunk.editingContent = chunk.content
}
const openDeleteChunkDialog = (chunk) => {
deleteDialogMode.value = 'chunk'
deleteDialogTarget.value = chunk
deleteDialogVisible.value = true
}
const openDeleteFileChunksDialog = (file) => {
deleteDialogMode.value = 'file-chunks'
deleteDialogTarget.value = file
deleteDialogVisible.value = true
}
const fetchPreviewChunks = async (fileId) => {
previewLoading.value = true
try {
const res = await chunkApi.list(projectId.value, { file_id: fileId })
previewChunks.value = (res.items || res || []).map(c => ({
...c,
editingContent: c.content
}))
selectedPreviewChunkId.value = previewChunks.value[0]?.id || ''
} catch (e) {
console.error(e)
ElMessage.error('获取 chunks 失败')
} finally {
previewLoading.value = false
}
}
const saveChunk = async (chunk) => {
savingChunks.value = true
try {
await chunkApi.update(projectId.value, chunk.id, {
content: chunk.editingContent
})
ElMessage.success('保存成功')
// Update local state
chunk.content = chunk.editingContent
} catch (e) {
console.error(e)
ElMessage.error('保存失败')
} finally {
savingChunks.value = false
}
}
const deleteChunk = async (chunk) => {
deletingChunkId.value = chunk.id
try {
await chunkApi.delete(projectId.value, chunk.id)
previewChunks.value = previewChunks.value.filter(item => item.id !== chunk.id)
if (previewFile.value?.id) {
const nextCount = Math.max((fileChunks.value[previewFile.value.id] || 1) - 1, 0)
if (nextCount > 0) {
fileChunks.value = {
...fileChunks.value,
[previewFile.value.id]: nextCount
}
} else {
const { [previewFile.value.id]: _, ...rest } = fileChunks.value
fileChunks.value = rest
}
}
ElMessage.success('删除成功')
} catch (e) {
console.error(e)
ElMessage.error('删除失败')
} finally {
deletingChunkId.value = ''
deleteDialogVisible.value = false
deleteDialogMode.value = ''
deleteDialogTarget.value = null
}
}
const deleteFileChunks = async (file) => {
deletingFileChunksId.value = file.id
try {
const allChunks = []
let page = 1
let total = 0
do {
const res = await chunkApi.list(projectId.value, {
file_id: file.id,
page,
page_size: 100
})
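const items = res.items || []
// Stop on an empty page so an overstated total cannot cause an endless loop
if (items.length === 0) break
total = res.total || items.length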
allChunks.push(...items)
page += 1
} while (allChunks.length < total)
for (const chunk of allChunks) {
await chunkApi.delete(projectId.value, chunk.id)
}
const { [file.id]: _, ...rest } = fileChunks.value
fileChunks.value = rest
if (previewFile.value?.id === file.id) {
previewChunks.value = []
selectedPreviewChunkId.value = ''
}
ElMessage.success(`已删除 ${allChunks.length} 个分块`)
} catch (e) {
console.error(e)
ElMessage.error('删除全部分块失败')
} finally {
deletingFileChunksId.value = ''
deleteDialogVisible.value = false
deleteDialogMode.value = ''
deleteDialogTarget.value = null
}
}
const confirmDeleteAction = async () => {
if (!deleteDialogTarget.value) return
if (deleteDialogMode.value === 'chunk') {
await deleteChunk(deleteDialogTarget.value)
return
}
if (deleteDialogMode.value === 'file-chunks') {
await deleteFileChunks(deleteDialogTarget.value)
}
}
watch(filteredPreviewChunks, (chunks) => {
if (!chunks.length) {
selectedPreviewChunkId.value = ''
return
}
if (!chunks.some(chunk => chunk.id === selectedPreviewChunkId.value)) {
selectedPreviewChunkId.value = chunks[0].id
}
})
const refreshFiles = () => {
fetchFiles()
}
const startFilePolling = () => {
if (filePollingTimer.value) return
filePollingTimer.value = window.setInterval(() => {
fetchFiles()
}, 3000)
}
const stopFilePolling = () => {
if (!filePollingTimer.value) return
window.clearInterval(filePollingTimer.value)
filePollingTimer.value = null
}
const formatSize = (bytes) => {
if (!bytes) return '0 B'
const units = ['B', 'KB', 'MB', 'GB']
let i = 0
while (bytes >= 1024 && i < units.length - 1) {
bytes /= 1024
i++
}
return `${bytes.toFixed(1)} ${units[i]}`
}
const getFileBg = (type) => {
const colors = {
pdf: '#ef4444',
docx: '#3b82f6',
xlsx: '#22c55e',
csv: '#f59e0b',
md: '#8b5cf6',
txt: '#6b7280',
epub: '#ec4899'
}
return colors[type] || '#6b7280'
}
const getFileIcon = (type) => {
const icons = {
pdf: 'Document',
docx: 'Document',
xlsx: 'Grid',
csv: 'Grid',
md: 'Document',
txt: 'Document',
epub: 'Book'
}
return icons[type] || 'Document'
}
onMounted(() => {
fetchAvailableModels()
fetchFiles()
})
watch(processingCount, (count) => {
if (count > 0) {
startFilePolling()
} else {
stopFilePolling()
}
}, { immediate: true })
onUnmounted(() => {
stopFilePolling()
})
return {
router,
route,
projectId,
loading,
splitting,
files,
filterStatus,
fileChunks,
isInitialLoad,
availableModels,
filePollingTimer,
generateDialogVisible,
generatingQuestions,
DEFAULT_GENERATE_PROMPT,
selectedFiles,
splitDialogVisible,
chunkPreviewVisible,
previewFile,
previewChunks,
previewLoading,
savingChunks,
deletingChunkId,
deletingFileChunksId,
deleteDialogVisible,
deleteDialogMode,
deleteDialogTarget,
previewSearch,
previewFilter,
previewJumpInput,
selectedPreviewChunkId,
isAllSelected,
selectedCount,
filteredFiles,
toggleSelectAll,
toggleSelect,
isSelected,
clearSelection,
splitConfig,
generateConfig,
methods,
completedFiles,
processingCount,
totalChunks,
normalizeModelType,
embeddingModels,
selectedEmbeddingModel,
generateModels,
getProviderLabel,
fetchAllChunks,
goToModelSettings,
fetchAvailableModels,
fetchFiles,
fetchChunksCount,
openSplitDialog,
handleBatchSplit,
openGenerateDialog,
handleBatchGenerate,
openChunkPreview,
previewChunksWithIndex,
isChunkModified,
filteredPreviewChunks,
activePreviewChunk,
activePreviewChunkIndex,
modifiedPreviewCount,
deleteDialogTitle,
deleteDialogItemName,
deleteDialogDetail,
deleteDialogWarning,
deleteDialogConfirmText,
deleteDialogLoading,
selectPreviewChunk,
getChunkSnippet,
jumpToChunk,
resetChunk,
openDeleteChunkDialog,
openDeleteFileChunksDialog,
fetchPreviewChunks,
saveChunk,
deleteChunk,
deleteFileChunks,
confirmDeleteAction,
refreshFiles,
startFilePolling,
stopFilePolling,
formatSize,
getFileBg,
getFileIcon
}
}
})
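
`normalizeModelType` is defined in both `QuestionManage` and `TextSplit`, and the two keyword lists differ slightly: only the `TextSplit` copy includes `bge_m3`. One way to keep them aligned is a single shared function. The sketch below uses the `TextSplit` keyword lists; `MODEL_KEYWORDS` is a hypothetical name, and where such a module would live is left open.

```javascript
// Hypothetical shared table; keyword lists copied from the TextSplit version.
// Order matters: 'gte-rerank' must be tested before the embedding keyword 'gte-'.
const MODEL_KEYWORDS = [
  ['rerank', ['rerank', 'bce-reranker', 'gte-rerank']],
  ['embedding', ['embedding', 'embed', 'text-embedding', 'bge-', 'bge_m3', 'gte-', 'm3e', 'e5-', 'jina-embeddings']],
  ['vlm', ['vl', 'vision', 'visual', 'multimodal', 'qwen-vl', 'gpt-4o']]
]

function normalizeModelType(modelType, modelName = '') {
  // An explicit non-chat type from the backend wins over name heuristics
  if (modelType && modelType !== 'chat') return modelType
  const name = String(modelName).trim().toLowerCase()
  for (const [type, keywords] of MODEL_KEYWORDS) {
    if (keywords.some(keyword => name.includes(keyword))) return type
  }
  return 'chat'
}

console.log(normalizeModelType('chat', 'BGE-M3'))      // embedding
console.log(normalizeModelType('', 'gte-rerank-v2'))   // rerank
console.log(normalizeModelType('chat', 'Qwen-VL-Max')) // vlm
```

Keeping the keywords in one ordered table makes the precedence between rerank, embedding, and vision matches explicit.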


@@ -0,0 +1,386 @@
<template>
<div class="crawler-page">
<div class="page-header">
<div class="header-content">
<div class="header-icon">
<svg width="32" height="32" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<circle cx="12" cy="12" r="10"/>
<path d="M2 12h20M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"/>
</svg>
</div>
<div class="header-text">
<h1>数据爬虫</h1>
<p>从网页自动采集数据用于构建训练数据集</p>
</div>
</div>
</div>
<div class="crawler-content">
<!-- Crawler Config Card -->
<div class="config-card">
<h2>爬取配置</h2>
<el-form :model="form" label-position="top">
<el-form-item label="目标网址">
<el-input
v-model="form.url"
placeholder="https://example.com"
:prefix-icon="Link"
>
<template #prepend>
<el-select v-model="form.method" style="width: 100px">
<el-option label="GET" value="GET" />
<el-option label="POST" value="POST" />
</el-select>
</template>
</el-input>
</el-form-item>
<el-form-item label="选择项目">
<el-select v-model="form.projectId" placeholder="选择目标项目" style="width: 100%">
<el-option
v-for="project in projects"
:key="project.id"
:label="project.name"
:value="project.id"
/>
</el-select>
</el-form-item>
<el-form-item label="爬取规则">
<div class="rule-options">
<el-checkbox v-model="form.extractTitle">提取标题</el-checkbox>
<el-checkbox v-model="form.extractContent">提取正文内容</el-checkbox>
<el-checkbox v-model="form.extractLinks">提取所有链接</el-checkbox>
<el-checkbox v-model="form.extractImages">提取图片链接</el-checkbox>
</div>
</el-form-item>
<el-form-item label="CSS 选择器 (可选)">
<el-input
v-model="form.cssSelector"
placeholder="如: article.content, .post-body"
/>
</el-form-item>
<el-form-item label="爬取深度">
<el-slider v-model="form.depth" :min="1" :max="5" show-input />
</el-form-item>
<el-form-item>
<el-button
type="primary"
:loading="crawling"
@click="startCrawl"
class="start-btn"
>
<el-icon><Link /></el-icon>
{{ crawling ? '爬取中...' : '开始爬取' }}
</el-button>
</el-form-item>
</el-form>
</div>
<!-- Results Card -->
<div class="results-card">
<div class="results-header">
<h2>爬取结果</h2>
<span class="result-count" v-if="results.length">{{ results.length }} </span>
</div>
<div class="results-content" v-loading="crawling">
<div v-if="!crawling && results.length === 0" class="empty-results">
<el-icon class="empty-icon"><Link /></el-icon>
<p>配置完成后点击"开始爬取"</p>
</div>
<div v-else class="results-list">
<div
v-for="(item, index) in results"
:key="index"
class="result-item"
>
<div class="result-title">{{ item.title || '无标题' }}</div>
<div class="result-url">{{ item.url }}</div>
<div class="result-preview" v-if="item.content">
{{ item.content.length > 150 ? item.content.substring(0, 150) + '...' : item.content }}
</div>
<div class="result-meta">
<el-tag size="small" v-if="item.images?.length">
{{ item.images.length }} 张图片
</el-tag>
<el-tag size="small" v-if="item.links?.length">
{{ item.links.length }} 个链接
</el-tag>
</div>
</div>
</div>
</div>
<div class="results-actions" v-if="results.length > 0">
<el-button @click="exportResults">
<el-icon><Download /></el-icon>
导出数据
</el-button>
<el-button type="primary" @click="saveToProject">
<el-icon><FolderAdd /></el-icon>
保存到项目
</el-button>
</div>
</div>
</div>
</div>
</template>
<script setup>
import { ref, onMounted } from 'vue'
import { ElMessage } from 'element-plus'
import { Link, Download, FolderAdd } from '@element-plus/icons-vue'
import { projectApi } from '@/core/api'
const projects = ref([])
const crawling = ref(false)
const results = ref([])
const form = ref({
url: '',
method: 'GET',
projectId: '',
extractTitle: true,
extractContent: true,
extractLinks: false,
extractImages: false,
cssSelector: '',
depth: 1
})
const fetchProjects = async () => {
try {
const res = await projectApi.list()
projects.value = res.items || res || []
} catch (error) {
projects.value = []
}
}
const startCrawl = async () => {
if (!form.value.url) {
ElMessage.warning('请输入目标网址')
return
}
if (!form.value.projectId) {
ElMessage.warning('请选择目标项目')
return
}
crawling.value = true
results.value = []
try {
// Simulate crawling - in production this would call the backend API
await new Promise(resolve => setTimeout(resolve, 2000))
// Demo results
results.value = [
{
title: '示例页面标题',
url: form.value.url,
content: '这是从网页中提取的内容示例。爬虫会解析HTML结构提取文本、图片链接和其他有价值的数据。',
images: ['https://example.com/image1.jpg'],
links: ['https://example.com/page1', 'https://example.com/page2']
},
{
title: '子页面标题 1',
url: form.value.url + '/page1',
content: '这是子页面的内容...',
images: [],
links: []
}
]
ElMessage.success('爬取完成')
} catch (error) {
ElMessage.error('爬取失败: ' + error.message)
} finally {
crawling.value = false
}
}
const exportResults = () => {
const data = JSON.stringify(results.value, null, 2)
const blob = new Blob([data], { type: 'application/json' })
const url = URL.createObjectURL(blob)
const a = document.createElement('a')
a.href = url
a.download = 'crawler-results.json'
a.click()
URL.revokeObjectURL(url)
}
const saveToProject = () => {
ElMessage.success('数据已保存到项目')
}
onMounted(() => fetchProjects())
</script>
<style scoped>
.crawler-page {
min-height: 100vh;
padding: 40px;
max-width: 1200px;
margin: 0 auto;
}
.page-header {
margin-bottom: 40px;
}
.header-content {
display: flex;
align-items: center;
gap: 20px;
}
.header-icon {
width: 64px;
height: 64px;
display: flex;
align-items: center;
justify-content: center;
background: var(--accent-primary-muted);
border-radius: var(--radius-lg);
color: var(--accent-primary);
}
.header-text h1 {
font-size: 28px;
font-weight: 600;
margin-bottom: 4px;
}
.header-text p {
color: var(--text-secondary);
}
.crawler-content {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 24px;
}
.config-card,
.results-card {
background: var(--bg-secondary);
border: 1px solid var(--border-subtle);
border-radius: var(--radius-lg);
padding: 24px;
}
.config-card h2,
.results-card h2 {
font-size: 18px;
font-weight: 600;
margin-bottom: 20px;
}
.rule-options {
display: flex;
flex-direction: column;
gap: 8px;
}
.start-btn {
width: 100%;
padding: 14px;
font-size: 15px;
}
.results-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 16px;
}
.result-count {
font-size: 14px;
color: var(--text-secondary);
background: var(--accent-primary-muted);
padding: 4px 12px;
border-radius: 100px;
}
.results-content {
min-height: 300px;
}
.empty-results {
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
height: 300px;
color: var(--text-tertiary);
}
.empty-icon {
font-size: 48px;
margin-bottom: 16px;
opacity: 0.3;
}
.results-list {
display: flex;
flex-direction: column;
gap: 12px;
}
.result-item {
padding: 16px;
background: var(--bg-tertiary);
border-radius: var(--radius-md);
border: 1px solid var(--border-subtle);
}
.result-title {
font-weight: 600;
margin-bottom: 4px;
}
.result-url {
font-size: 12px;
color: var(--accent-primary);
margin-bottom: 8px;
}
.result-preview {
font-size: 13px;
color: var(--text-secondary);
margin-bottom: 8px;
line-height: 1.5;
}
.result-meta {
display: flex;
gap: 8px;
}
.results-actions {
display: flex;
gap: 12px;
margin-top: 16px;
padding-top: 16px;
border-top: 1px solid var(--border-subtle);
}
@media (max-width: 900px) {
.crawler-content {
grid-template-columns: 1fr;
}
}
</style>


@@ -0,0 +1,390 @@
<template>
<div class="home">
<!-- Hero Section -->
<section class="hero">
<div class="hero-content">
<!-- Logo -->
<div class="hero-logo">
<svg width="56" height="56" viewBox="0 0 56 56" fill="none" xmlns="http://www.w3.org/2000/svg">
<defs>
<linearGradient id="logoGradient" x1="0%" y1="0%" x2="100%" y2="100%">
<stop offset="0%" style="stop-color:#00d4ff"/>
<stop offset="100%" style="stop-color:#7c3aed"/>
</linearGradient>
</defs>
<!-- Outer frame: the dataset collection -->
<rect x="4" y="4" width="48" height="48" rx="12" stroke="url(#logoGradient)" stroke-width="2.5" fill="none" opacity="0.3"/>
<!-- Letter Y: data flow / branching -->
<path d="M18 42V22L28 12V18" stroke="url(#logoGradient)" stroke-width="3.5" stroke-linecap="round" stroke-linejoin="round" fill="none"/>
<path d="M28 18L38 28" stroke="url(#logoGradient)" stroke-width="3.5" stroke-linecap="round" fill="none"/>
<!-- Data nodes, neural-network style -->
<circle cx="18" cy="42" r="3" fill="#00d4ff"/>
<circle cx="28" cy="12" r="3" fill="#7c3aed"/>
<circle cx="38" cy="28" r="3" fill="#00d4ff"/>
<circle cx="28" cy="18" r="2.5" fill="#00d4ff" opacity="0.7"/>
<!-- Connector dots: direction of data flow -->
<circle cx="28" cy="32" r="2" fill="#7c3aed" opacity="0.5"/>
<circle cx="20" cy="32" r="1.5" fill="#00d4ff" opacity="0.4"/>
<circle cx="36" cy="38" r="1.5" fill="#7c3aed" opacity="0.4"/>
</svg>
<span class="logo-text">YG<span class="logo-highlight">Datasets</span></span>
</div>
<div class="hero-badge">
<span class="badge-dot"></span>
<span>AI 驱动数据生成</span>
</div>
<h1 class="hero-title">
构建高质量<br />
<span class="glow-text">训练数据集</span>
</h1>
<p class="hero-subtitle">
通过智能分割、AI 生成问答和无缝评估，
将文档转化为结构化数据集
</p>
<div class="hero-actions">
<el-button type="primary" size="large" @click="createProject" class="btn-primary">
<el-icon><Plus /></el-icon>
创建项目
</el-button>
<el-button size="large" @click="goToCrawler" class="btn-secondary">
<el-icon><Connection /></el-icon>
数据爬虫
</el-button>
<el-button size="large" @click="goToModels" class="btn-secondary">
<el-icon><Cpu /></el-icon>
模型管理
</el-button>
</div>
</div>
<!-- Hero Visual - Modern Abstract Composition -->
<div class="hero-visual">
<!-- Galaxy Background -->
<div class="galaxy-bg">
<!-- Nebula clouds -->
<div class="nebula-cloud nebula-1"></div>
<div class="nebula-cloud nebula-2"></div>
<div class="nebula-cloud nebula-3"></div>
<!-- Galaxy core -->
<div class="galaxy-core"></div>
<!-- Spiral arms -->
<div class="galaxy-spiral">
<div class="spiral-arm spiral-arm-1"></div>
<div class="spiral-arm spiral-arm-2"></div>
<div class="spiral-arm spiral-arm-3"></div>
</div>
<!-- Orbit rings with stars -->
<div class="orbit-ring orbit-ring-1">
<span class="orbit-star"></span>
<span class="orbit-star"></span>
<span class="orbit-star"></span>
<span class="orbit-star"></span>
</div>
<div class="orbit-ring orbit-ring-2">
<span class="orbit-star"></span>
<span class="orbit-star"></span>
<span class="orbit-star"></span>
<span class="orbit-star"></span>
</div>
<div class="orbit-ring orbit-ring-3">
<span class="orbit-star"></span>
<span class="orbit-star"></span>
<span class="orbit-star"></span>
</div>
<div class="orbit-ring orbit-ring-4">
<span class="orbit-star"></span>
<span class="orbit-star"></span>
</div>
</div>
<!-- Light rays -->
<div class="light-rays">
<div class="ray"></div>
<div class="ray"></div>
<div class="ray"></div>
<div class="ray"></div>
<div class="ray"></div>
</div>
<!-- Ambient particles -->
<span class="ambient-particle"></span>
<span class="ambient-particle"></span>
<span class="ambient-particle"></span>
<span class="ambient-particle"></span>
<span class="ambient-particle"></span>
<!-- Abstract background orbs -->
<div class="orb orb-1"></div>
<div class="orb orb-2"></div>
<div class="orb orb-3"></div>
<!-- Central floating UI element -->
<div class="floating-ui">
<div class="ui-header">
<div class="ui-dot"></div>
<div class="ui-dot"></div>
<div class="ui-dot"></div>
</div>
<div class="ui-content">
<div class="ui-line"></div>
<div class="ui-line short"></div>
<div class="ui-line"></div>
</div>
<div class="ui-badge">
<el-icon><Check /></el-icon>
<span>处理完成</span>
</div>
</div>
<!-- Floating feature pills - main features -->
<div class="feature-pill pill-1">
<el-icon><Document /></el-icon>
<span>多格式支持</span>
</div>
<div class="feature-pill pill-2">
<el-icon><MagicStick /></el-icon>
<span>AI 生成</span>
</div>
<div class="feature-pill pill-3">
<el-icon><DataAnalysis /></el-icon>
<span>智能评估</span>
</div>
<!-- Additional floating labels -->
<div class="feature-pill pill-4">
<el-icon><Connection /></el-icon>
<span>API 集成</span>
</div>
<div class="feature-pill pill-5">
<el-icon><Clock /></el-icon>
<span>批量处理</span>
</div>
<div class="feature-pill pill-6">
<el-icon><Lock /></el-icon>
<span>数据安全</span>
</div>
<div class="feature-pill pill-7">
<el-icon><TrendCharts /></el-icon>
<span>可视化</span>
</div>
</div>
</section>
<!-- Projects Section -->
<section class="projects-section">
<div class="section-header">
<div class="section-title">
<h2>我的项目</h2>
<p>{{ total }} 个项目</p>
</div>
<el-button type="primary" @click="createProject" class="add-btn">
<el-icon><Plus /></el-icon>
新建
</el-button>
</div>
<!-- Projects Grid -->
<div class="projects-grid" v-loading="loading">
<!-- Empty State -->
<EmptyState
v-if="!loading && projects.length === 0"
:icon="FolderAdd"
title="暂无项目"
description="创建您的第一个项目开始生成数据集"
action-text="创建项目"
@action="createProject"
/>
<!-- Project Cards -->
<template v-else>
<ProjectCard
v-for="(project, index) in projects"
:key="project.id"
:project="project"
:index="index"
@click="openProject"
@delete="confirmDelete"
/>
</template>
</div>
<!-- Pagination -->
<div class="pagination-wrapper" v-if="needPagination">
<div class="pagination-minimal">
<span class="page-info"> {{ currentPage }} / {{ totalPages }} </span>
<div class="page-arrows">
<button
class="arrow-btn"
:disabled="currentPage === 1"
@click="handlePageChange(currentPage - 1)"
>
<el-icon><ArrowLeft /></el-icon>
</button>
<button
class="arrow-btn"
:disabled="currentPage === totalPages"
@click="handlePageChange(currentPage + 1)"
>
<el-icon><ArrowRight /></el-icon>
</button>
</div>
</div>
</div>
</section>
<!-- Create Dialog -->
<CreateProjectDialog
v-model:visible="dialogVisible"
:loading="submitting"
@submit="handleCreateSubmit"
/>
<!-- Delete Confirmation Dialog -->
<DeleteDialog
v-model:visible="deleteDialogVisible"
:item-name="projectToDelete?.name"
:loading="deleting"
@confirm="handleDelete"
/>
</div>
</template>
<script setup lang="ts">
import { ref, onMounted, computed } from 'vue'
import { useRouter } from 'vue-router'
import { ElMessage } from 'element-plus'
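// Plus, Cpu, Document, MagicStick and DataAnalysis are used in the template, so import them as well
import {
FolderAdd, Check, Connection, Clock, Lock, TrendCharts, ArrowLeft, ArrowRight,
Plus, Cpu, Document, MagicStick, DataAnalysis
} from '@element-plus/icons-vue'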
import { projectApi } from '@/core/api'
import type { Project, ProjectCreate } from '@/shared/types'
// Components
import EmptyState from '@/shared/components/common/EmptyState.vue'
import ProjectCard from '@/shared/components/common/ProjectCard.vue'
import CreateProjectDialog from '@/shared/components/common/CreateProjectDialog.vue'
import DeleteDialog from '@/shared/components/common/DeleteDialog.vue'
const router = useRouter()
const loading = ref(false)
const projects = ref<Project[]>([])
const dialogVisible = ref(false)
const deleteDialogVisible = ref(false)
const projectToDelete = ref<Project | null>(null)
const submitting = ref(false)
const deleting = ref(false)
// Pagination
const currentPage = ref(1)
const pageSize = ref(9)
const total = ref(0)
const fetchProjects = async () => {
loading.value = true
try {
const res = await projectApi.list({ page: currentPage.value, page_size: pageSize.value })
// API returns: { items: [], total, page, page_size, total_pages }
if (res && typeof res === 'object' && 'items' in res) {
projects.value = res.items || []
total.value = res.total || 0
} else if (Array.isArray(res)) {
projects.value = res
total.value = res.length
} else {
projects.value = []
total.value = 0
}
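} catch (error) {
// Log the failure instead of silently falling back to an empty list
console.error('Fetch projects error:', error)
projects.value = []
total.value = 0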
} finally {
loading.value = false
}
}
const handlePageChange = (page: number) => {
currentPage.value = page
fetchProjects()
}
const needPagination = computed(() => total.value > pageSize.value || projects.value.length === pageSize.value)
const totalPages = computed(() => Math.ceil(total.value / pageSize.value))
const createProject = () => {
dialogVisible.value = true
}
const handleCreateSubmit = async (formData) => {
// Validation - name, description and type are required
if (!formData.name || formData.name.trim() === '') {
ElMessage.warning('请输入项目名称')
return
}
if (!formData.description || formData.description.trim() === '') {
ElMessage.warning('请输入项目描述')
return
}
if (!formData.type) {
ElMessage.warning('请选择项目类型')
return
}
submitting.value = true
try {
// New format: the API returns {id: "..."}
const res = await projectApi.create(formData)
ElMessage.success('项目创建成功')
dialogVisible.value = false
fetchProjects()
router.push(`/project/${res.id}`)
} catch (error) {
console.error('Create project error:', error)
// `error` is `unknown` in TypeScript, so narrow it before reading `.message`
const message = (error as { message?: string })?.message || '未知错误'
ElMessage.error('创建项目失败: ' + message)
} finally {
submitting.value = false
}
}
const openProject = (project) => {
router.push(`/project/${project.id}`)
}
const confirmDelete = (project) => {
projectToDelete.value = project
deleteDialogVisible.value = true
}
const handleDelete = async () => {
if (!projectToDelete.value) return
deleting.value = true
try {
await projectApi.delete(projectToDelete.value.id)
ElMessage.success('项目已删除')
deleteDialogVisible.value = false
projectToDelete.value = null
fetchProjects()
} catch (error) {
ElMessage.error('删除失败')
} finally {
deleting.value = false
}
}
const goToDataSquare = () => router.push('/data-square')
const goToCrawler = () => router.push('/crawler')
const goToModels = () => router.push('/models')
onMounted(() => fetchProjects())
</script>
<style scoped lang="scss">
@import '@/styles/pages/home.scss';
</style>


@@ -0,0 +1,262 @@
<template>
<div class="model-settings">
<!-- Background effects -->
<div class="bg-effects">
<div class="glow-orb glow-1"></div>
<div class="glow-orb glow-2"></div>
</div>
<!-- Page header -->
<header class="page-header">
<div class="header-left">
<el-button text class="back-btn" @click="goHome">
<el-icon><ArrowLeft /></el-icon>
<span>返回</span>
</el-button>
</div>
<div class="header-content">
<h1 class="page-title">
<el-icon class="title-icon"><Cpu /></el-icon>
模型管理
</h1>
<p class="page-subtitle">管理您的 AI 模型 API</p>
</div>
<div class="header-right">
<el-button type="primary" class="add-btn" @click="openAddDialog">
<el-icon><Plus /></el-icon>
<span>添加模型</span>
</el-button>
</div>
</header>
<!-- Main content -->
<main class="page-main">
<!-- Model list -->
<section class="models-section">
<div class="section-header">
<h2 class="section-title">
<span class="title-line"></span>
已配置的模型
</h2>
<span class="count-badge">{{ models.length }} </span>
</div>
<!-- Empty state -->
<div v-if="models.length === 0 && !loading" class="empty-state">
<div class="empty-illustration">
<div class="pulse-ring"></div>
<el-icon size="48"><Setting /></el-icon>
</div>
<h3>暂无模型配置</h3>
<p>添加您的第一个 AI 模型开始使用</p>
<el-button type="primary" @click="openAddDialog">添加模型</el-button>
</div>
<!-- Model cards -->
<div v-else class="models-grid">
<article
v-for="(model, index) in models"
:key="model.id"
class="model-card"
:class="{ 'is-default': model.is_default === 'true' }"
:style="{ '--delay': index * 0.08 + 's' }"
>
<div class="card-glow"></div>
<!-- Default badge -->
<div v-if="model.is_default === 'true'" class="default-badge">
<el-icon><Star /></el-icon>
默认
</div>
<!-- Provider logo -->
<div class="provider-logo" :class="model.provider">
{{ getProviderAbbr(model.provider) }}
</div>
<!-- Model info -->
<div class="model-info">
<div class="model-name-row">
<h3 class="model-name">{{ model.model_name }}</h3>
<span class="model-type-badge" :class="`type-${normalizeModelType(model.model_type, model.model_name)}`">
{{ getModelTypeLabel(model.model_type, model.model_name) }}
</span>
</div>
<p class="model-endpoint">
<el-icon><Link /></el-icon>
{{ model.api_base || '默认端点' }}
</p>
</div>
<!-- Footer actions -->
<div class="card-footer">
<div class="status-badge" :class="model.connection_status">
<span class="status-dot" :class="model.connection_status"></span>
<template v-if="model.connection_status === 'connected'">已联通</template>
<template v-else-if="model.connection_status === 'failed'">连接失败</template>
<template v-else>未测试</template>
</div>
<div class="card-actions">
<el-button text class="action-btn test" @click="testConnection(model)">
测试连接
</el-button>
<el-button text class="action-btn delete" @click="confirmDelete(model)">
删除
</el-button>
</div>
</div>
</article>
</div>
</section>
</main>
<el-dialog
v-model="showAddDialog"
:show-close="false"
width="560px"
class="add-dialog"
:append-to-body="true"
>
<template #header>
<div class="dialog-header">
<div class="dialog-icon">
<el-icon size="20"><Plus /></el-icon>
</div>
<div class="dialog-title">
<h3>添加模型</h3>
<p>配置新的 AI 模型</p>
</div>
<button class="dialog-close" @click="showAddDialog = false">
<el-icon><Close /></el-icon>
</button>
</div>
</template>
<el-form :model="modelForm" label-position="top" class="model-form">
<el-form-item label="选择提供商">
<el-select
v-model="modelForm.provider"
placeholder="选择 AI 服务提供商"
size="large"
class="provider-select"
popper-class="provider-select-dropdown"
>
<el-option
v-for="provider in providers"
:key="provider.value"
:label="provider.label"
:value="provider.value"
>
<div class="provider-option-item">
<span class="provider-icon">{{ provider.abbr }}</span>
<div class="provider-copy">
<span>{{ provider.label }}</span>
<small>{{ provider.desc }}</small>
</div>
</div>
</el-option>
</el-select>
</el-form-item>
<el-form-item label="模型类型">
<el-select
v-model="modelForm.model_type"
placeholder="选择模型类型"
size="large"
class="provider-select"
>
<el-option
v-for="type in modelTypes"
:key="type.value"
:label="type.label"
:value="type.value"
>
<div class="provider-option-item">
<span class="provider-icon">{{ type.abbr }}</span>
<div class="provider-copy">
<span>{{ type.label }}</span>
<small>{{ type.desc }}</small>
</div>
</div>
</el-option>
</el-select>
</el-form-item>
<el-form-item label="模型名称" required>
<el-input
v-model="modelForm.model_name"
placeholder="例如: gpt-4o-mini / text-embedding-3-small"
size="large"
/>
</el-form-item>
<el-form-item label="API Key" required>
<el-input
v-model="modelForm.api_key"
type="password"
placeholder="输入 API Key"
size="large"
show-password
/>
</el-form-item>
<el-form-item label="API 地址">
<el-input
v-model="modelForm.api_base"
placeholder="自定义 API 地址"
size="large"
/>
</el-form-item>
<el-form-item>
<el-checkbox v-model="modelForm.is_default">
设为默认模型
</el-checkbox>
</el-form-item>
</el-form>
<template #footer>
<div class="dialog-footer">
<el-button @click="showAddDialog = false" size="large">取消</el-button>
<el-button type="primary" @click="addModel" :loading="submitting" size="large">
添加模型
</el-button>
</div>
</template>
</el-dialog>
<el-dialog
v-model="deleteDialogVisible"
:show-close="false"
width="400"
class="delete-dialog"
:append-to-body="true"
>
<template #header>
<div class="delete-header">
<div class="delete-icon">
<el-icon size="24"><WarningFilled /></el-icon>
</div>
<h3>确认删除</h3>
</div>
</template>
<div class="delete-content">
<p>确定要删除模型 <strong>{{ modelToDelete?.model_name }}</strong> </p>
<p class="warning-text">此操作不可恢复</p>
</div>
<template #footer>
<div class="delete-footer">
<el-button @click="deleteDialogVisible = false" size="large">取消</el-button>
<el-button type="danger" @click="handleDelete" :loading="deleting" size="large">
确认删除
</el-button>
</div>
</template>
</el-dialog>
</div>
</template>
<script lang="ts" src="../page-logic/ModelSettingsPage.ts"></script>
<style scoped src="../styles/pages/model-settings.css"></style>


@@ -128,7 +128,7 @@
import { ref, reactive, computed, onMounted } from 'vue'
import { useRoute } from 'vue-router'
import { ElMessage, ElMessageBox } from 'element-plus'
-import { datasetApi } from '@/api'
+import { datasetApi } from '@/core/api'
const route = useRoute()
const projectId = computed(() => route.params.id)

File diff suppressed because it is too large


@@ -0,0 +1,319 @@
<template>
<div class="file-manage">
<!-- Header -->
<div class="page-header">
<div class="header-left">
<h2 class="page-title">文件管理</h2>
<p class="page-subtitle">管理您的文档集合</p>
</div>
<div class="header-actions">
<el-button type="primary" @click="handleUpload" class="upload-btn">
<el-icon><Upload /></el-icon>
<span>上传文件</span>
</el-button>
</div>
</div>
<!-- Stats Cards -->
<div class="stats-grid">
<div
class="stat-card stat-total"
:class="{ active: filterStatus === '' }"
@click="filterStatus = ''"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><Document /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ files.length }}</span>
<span class="stat-label">总文件数</span>
</div>
</div>
</div>
<div
class="stat-card stat-completed"
:class="{ active: filterStatus === 'completed' }"
@click="filterStatus = filterStatus === 'completed' ? '' : 'completed'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><CircleCheckFilled /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ completedFiles }}</span>
<span class="stat-label">已完成</span>
</div>
</div>
</div>
<div
class="stat-card stat-processing"
:class="{ active: filterStatus === 'processing' }"
@click="filterStatus = filterStatus === 'processing' ? '' : 'processing'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><Loading /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ processingFiles.length }}</span>
<span class="stat-label">处理中</span>
</div>
</div>
</div>
<div
class="stat-card stat-failed"
:class="{ active: filterStatus === 'failed' }"
@click="filterStatus = filterStatus === 'failed' ? '' : 'failed'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><CircleCloseFilled /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ failedFiles }}</span>
<span class="stat-label">失败</span>
</div>
</div>
</div>
</div>
<!-- File List Container -->
<div class="file-container" v-loading="loading && isInitialLoad">
<!-- Empty State -->
<div v-if="!loading && !isInitialLoad && filteredFiles.length === 0 && !uploading" class="empty-state">
<div class="empty-illustration">
<div class="orbit orbit-1"></div>
<div class="orbit orbit-2"></div>
<div class="orbit orbit-3"></div>
<div class="empty-core">
<el-icon size="40"><FolderOpened /></el-icon>
</div>
</div>
<h3 class="empty-title">暂无文件</h3>
<p class="empty-desc">上传您的第一个文档开启智能处理之旅</p>
</div>
<!-- Files Table -->
<div v-else class="files-table-wrapper">
<!-- Table Header -->
<div class="table-header">
<div class="table-select">
<el-checkbox
:model-value="isAllSelected"
@change="toggleSelectAll"
class="select-all"
>
<span v-if="selectedCount > 0" class="selected-text">已选择 {{ selectedCount }} </span>
<span v-else>全选</span>
</el-checkbox>
</div>
<div class="table-actions" v-if="selectedCount > 0">
<el-button type="danger" size="small" plain @click="clearSelection" class="batch-clear-btn">
<el-icon><Close /></el-icon>
<span>清除选择</span>
</el-button>
<el-button type="danger" size="small" plain @click="batchDelete" class="batch-delete-btn">
<el-icon><Delete /></el-icon>
<span>批量删除</span>
</el-button>
</div>
</div>
<!-- Table Body -->
<div class="files-table">
<div
v-for="(file, index) in filteredFiles"
:key="file.id"
class="file-row"
:class="{
'is-selected': isSelected(file.id),
'is-processing': file.status === 'processing',
'row-animated': isInitialLoad
}"
:style="{ '--delay': index * 0.04 + 's' }"
@click="toggleSelect(file.id)"
>
<!-- Select Checkbox -->
<div class="col-select" @click.stop>
<el-checkbox
:model-value="isSelected(file.id)"
@change="toggleSelect(file.id)"
/>
</div>
<!-- File Icon -->
<div class="col-icon">
<div class="file-type-icon" :style="{ '--type-color': getTypeColor(file.file_type) }">
<el-icon size="18">
<component :is="getFileIcon(file.file_type)" />
</el-icon>
</div>
</div>
<!-- File Name -->
<div class="col-name">
<span class="filename-text">{{ file.filename }}</span>
<span class="file-ext">{{ getFileExt(file.filename) }}</span>
</div>
<!-- Size -->
<div class="col-size">
{{ formatSize(file.size) }}
</div>
<!-- Date -->
<div class="col-date">
{{ formatDate(file.created_at) }}
</div>
<!-- Status -->
<div class="col-status">
<div class="status-pill" :class="'status-' + file.status">
<span class="status-dot"></span>
<span class="status-text">{{ getStatusText(file.status) }}</span>
</div>
</div>
<!-- Actions -->
<div class="col-actions" @click.stop>
<el-tooltip content="预览" placement="top" v-if="file.status === 'completed'">
<el-button text size="small" class="action-btn preview" @click="handlePreview(file)">
<el-icon><View /></el-icon>
</el-button>
</el-tooltip>
<el-button text size="small" class="action-btn delete" @click="openDeleteDialog(file)">
<el-icon><Delete /></el-icon>
</el-button>
</div>
</div>
</div>
</div>
</div>
<!-- Upload Dialog -->
<el-dialog v-model="uploadDialogVisible" title="上传文件" width="520px" class="upload-dialog" :close-on-click-modal="false">
<div class="upload-area" @click="triggerUpload">
<el-upload
ref="uploadRef"
class="upload-component"
:auto-upload="false"
:limit="10"
:on-change="handleChange"
:on-remove="handleRemove"
:file-list="fileList"
drag
multiple
accept=".pdf,.docx,.doc,.xlsx,.xls,.csv,.epub,.md,.txt"
style="display: none;"
/>
<div class="upload-content">
<div class="upload-illustration">
<div class="upload-ring"></div>
<div class="upload-core">
<el-icon size="32"><UploadFilled /></el-icon>
</div>
</div>
<div class="upload-text">
拖拽文件到此处 <em>点击选择</em>
</div>
<div class="upload-hint">
支持 PDF、DOCX、Excel、EPUB、Markdown 等格式
</div>
</div>
</div>
<!-- Selected Files -->
<div v-if="fileList.length > 0" class="selected-area">
<div class="selected-header">
<span>已选择 <strong>{{ fileList.length }}</strong> 个文件</span>
<el-button text size="small" @click="fileList = []">清空</el-button>
</div>
<div class="selected-list">
<div v-for="item in fileList" :key="item.uid" class="selected-item">
<el-icon size="14"><Document /></el-icon>
<span>{{ item.name }}</span>
</div>
</div>
</div>
<template #footer>
<el-button @click="uploadDialogVisible = false">取消</el-button>
<el-button type="primary" @click="submitUpload" :loading="uploading" :disabled="fileList.length === 0">
开始上传
</el-button>
</template>
</el-dialog>
</div>
<Teleport to="body">
<Transition name="fade">
<div
v-if="previewVisible"
class="preview-backdrop"
@click="previewVisible = false"
></div>
</Transition>
<Transition name="slide-right">
<div v-if="previewVisible" class="preview-modal">
<div class="preview-header">
<div class="header-title">
<el-icon class="title-icon"><Document /></el-icon>
<span class="filename">{{ previewFile?.filename }}</span>
</div>
<el-button class="close-btn" text @click="previewVisible = false">
<el-icon><Close /></el-icon>
</el-button>
</div>
<div class="preview-tabs-wrapper">
<div class="preview-tabs">
<button
class="tab-item"
:class="{ active: previewMode === 'source' }"
@click="switchPreviewMode('source')"
>
源文件
</button>
<button
class="tab-item"
:class="{ active: previewMode === 'markdown' }"
@click="switchPreviewMode('markdown')"
>
Markdown
</button>
<div class="tab-indicator" :class="{ 'at-right': previewMode === 'markdown' }"></div>
</div>
</div>
<div class="preview-content" v-loading="previewLoading">
<iframe v-if="isPdfPreview && pdfDataUrl" :src="pdfDataUrl" class="pdf-viewer"></iframe>
<pre v-else-if="previewContent" class="code-content">{{ previewContent }}</pre>
<div v-else-if="!previewLoading && !isPdfPreview" class="preview-empty">
<el-icon size="32"><Document /></el-icon>
<span>暂无内容</span>
</div>
</div>
</div>
</Transition>
</Teleport>
<DeleteDialog
v-model:visible="deleteDialogVisible"
title="删除文件"
:item-name="pendingDeleteFile?.filename || ''"
detail-text="该操作会移除原始文件以及关联的处理结果，请确认当前项目内不再需要它。"
warning-text="删除后不可恢复，文件相关的分割结果和后续数据将一并失效。"
confirm-text="确认删除文件"
:loading="deletingFile"
@confirm="confirmDeleteFile"
/>
</template>
<script lang="ts" src="../page-logic/ProjectFilePage.ts"></script>
<style scoped src="../styles/pages/project-file.css"></style>


@@ -32,13 +32,6 @@
<span class="nav-dot"></span>
</router-link>
</nav>
-<div class="sidebar-footer">
-<router-link to="/" class="home-link">
-<el-icon><HomeFilled /></el-icon>
-<span>返回首页</span>
-</router-link>
-</div>
</aside>
<!-- Main Content -->
@@ -51,7 +44,7 @@
<script setup>
import { ref, computed, onMounted } from 'vue'
import { useRoute, useRouter } from 'vue-router'
-import { projectApi } from '@/api'
+import { projectApi } from '@/core/api'
import { ElMessage } from 'element-plus'
const route = useRoute()
@@ -64,7 +57,7 @@ const project = ref({ name: '加载中...', description: '' })
const navItems = [
{ path: 'files', label: '文件管理', icon: 'Folder' },
-{ path: 'split', label: '文本分割', icon: 'Operation' },
+{ path: 'split', label: '分割生成', icon: 'Operation' },
{ path: 'questions', label: '问答管理', icon: 'ChatDotSquare' },
{ path: 'datasets', label: '数据集', icon: 'Collection' },
{ path: 'eval', label: '评估系统', icon: 'DataAnalysis' },
@@ -76,7 +69,8 @@ const isActive = (path) => route.path.includes(path)
const fetchProject = async () => {
try {
const res = await projectApi.get(id.value)
-project.value = res.data
+// New format: returns project directly
+project.value = res
} catch (error) {
ElMessage.error('加载项目失败')
}
@@ -295,37 +289,6 @@ onMounted(() => fetchProject())
}
-/* Sidebar Footer */
-.sidebar-footer {
-padding: 16px 20px;
-border-top: 1px solid var(--border-subtle);
-}
-.home-link {
-display: flex;
-align-items: center;
-gap: 12px;
-padding: 12px 14px;
-border-radius: var(--radius-md);
-color: var(--text-tertiary);
-font-size: 14px;
-text-decoration: none;
-transition: all var(--transition-fast);
-}
-.home-link:hover {
-background: var(--bg-hover);
-color: var(--text-primary);
-}
-.home-link span {
-transition: opacity 0.2s ease;
-}
-.sidebar.collapsed .home-link span {
-opacity: 0;
-width: 0;
-overflow: hidden;
-}
/* Main Content */
.main-content {
@@ -347,8 +310,7 @@ onMounted(() => fetchProject())
}
.sidebar .project-details,
-.sidebar .nav-label,
-.sidebar .home-link span {
+.sidebar .nav-label {
display: none;
}


@@ -0,0 +1,244 @@
<template>
<div class="question-manage">
<!-- Header -->
<div class="page-header">
<div class="header-left">
<h2 class="page-title">问答管理</h2>
<p class="page-subtitle">管理和生成问答数据</p>
</div>
<div class="header-actions">
<el-button type="primary" @click="showGenerateDialog = true" class="generate-btn">
<el-icon><Plus /></el-icon>
<span>生成问题</span>
</el-button>
</div>
</div>
<!-- Stats Cards -->
<div class="stats-grid">
<div
class="stat-card stat-total"
:class="{ active: filterStatus === '' }"
@click="filterStatus = ''"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><ChatDotSquare /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ questions.length }}</span>
<span class="stat-label">总问题数</span>
</div>
</div>
</div>
<div
class="stat-card stat-completed"
:class="{ active: filterStatus === 'generated' }"
@click="filterStatus = filterStatus === 'generated' ? '' : 'generated'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><MagicStick /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ generatedCount }}</span>
<span class="stat-label">AI 生成</span>
</div>
</div>
</div>
<div
class="stat-card stat-processing"
:class="{ active: filterStatus === 'manual' }"
@click="filterStatus = filterStatus === 'manual' ? '' : 'manual'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><EditPen /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ manualCount }}</span>
<span class="stat-label">手动添加</span>
</div>
</div>
</div>
<div
class="stat-card stat-failed"
:class="{ active: filterStatus === 'failed' }"
@click="filterStatus = filterStatus === 'failed' ? '' : 'failed'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><CircleCloseFilled /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ failedCount }}</span>
<span class="stat-label">失败</span>
</div>
</div>
</div>
</div>
<!-- Question Container -->
<div class="question-container" v-loading="loading && isInitialLoad">
<!-- Empty State -->
<div v-if="!loading && !isInitialLoad && filteredQuestions.length === 0" class="empty-state">
<div class="empty-illustration">
<div class="orbit orbit-1"></div>
<div class="orbit orbit-2"></div>
<div class="orbit orbit-3"></div>
<div class="empty-core">
<el-icon size="40"><ChatDotSquare /></el-icon>
</div>
</div>
<h3 class="empty-title">暂无问答数据</h3>
<p class="empty-desc">生成您的第一个问答数据集</p>
<el-button type="primary" @click="showGenerateDialog = true" class="empty-btn">生成问题</el-button>
</div>
<!-- Question Table -->
<div v-else class="question-table-wrapper">
<!-- Table Header -->
<div class="table-header">
<div class="table-select">
<el-checkbox
:model-value="isAllSelected"
@change="toggleSelectAll"
class="select-all"
>
<span v-if="selectedCount > 0" class="selected-text">已选择 {{ selectedCount }} </span>
<span v-else>全选</span>
</el-checkbox>
</div>
<div class="table-actions" v-if="selectedCount > 0">
<el-button type="danger" size="small" plain @click="clearSelection" class="batch-clear-btn">
<el-icon><Close /></el-icon>
<span>清除选择</span>
</el-button>
<el-button type="danger" size="small" plain @click="batchDelete" class="batch-delete-btn">
<el-icon><Delete /></el-icon>
<span>批量删除</span>
</el-button>
</div>
</div>
<!-- Table Body -->
<div class="question-table">
<div
v-for="(question, index) in filteredQuestions"
:key="question.id"
class="question-row"
:class="{
'is-selected': isSelected(question.id),
'row-animated': isInitialLoad
}"
:style="{ '--delay': index * 0.04 + 's' }"
@click="toggleSelect(question.id)"
>
<!-- Select Checkbox -->
<div class="col-select" @click.stop>
<el-checkbox
:model-value="isSelected(question.id)"
@change="toggleSelect(question.id)"
/>
</div>
<!-- Question Content -->
<div class="col-content">
<div class="question-text">{{ question.content }}</div>
<div class="answer-text" v-if="question.answer">: {{ question.answer }}</div>
</div>
<!-- Type -->
<div class="col-type">
<el-tag size="small" :style="{ '--tag-color': getTypeColor(question.question_type) }" effect="dark">
{{ getTypeName(question.question_type) }}
</el-tag>
</div>
<!-- Source -->
<div class="col-source">
<span class="source-badge" :class="'source-' + question.source">{{ getSourceName(question.source) }}</span>
</div>
<!-- Actions -->
<div class="col-actions" @click.stop>
<el-popconfirm title="确定删除此问题?" @confirm="handleDelete(question)">
<template #reference>
<el-button text size="small" class="action-btn delete">
<el-icon><Delete /></el-icon>
</el-button>
</template>
</el-popconfirm>
</div>
</div>
</div>
</div>
</div>
<el-dialog v-model="showGenerateDialog" title="生成问题" width="640px" class="generate-dialog">
<el-form :model="generateConfig" label-position="top">
<el-form-item label="生成模型">
<el-select
v-model="generateConfig.model_id"
placeholder="选择 chat / vlm 模型"
style="width: 100%"
size="large"
>
<el-option
v-for="model in generateModels"
:key="model.id"
:label="`${model.model_name} · ${getProviderLabel(model.provider)}`"
:value="model.id"
/>
</el-select>
</el-form-item>
<el-form-item label="文本块">
<el-select
v-model="generateConfig.chunk_ids"
multiple
placeholder="选择文本块"
style="width: 100%"
size="large"
>
<el-option
v-for="chunk in chunks"
:key="chunk.id"
:label="chunk.name || chunk.content.slice(0, 50) + '...'"
:value="chunk.id"
/>
</el-select>
</el-form-item>
<div class="form-row">
<el-form-item label="每个块生成数量">
<el-input-number v-model="generateConfig.count" :min="1" :max="8" size="large" style="width: 100%" />
</el-form-item>
</div>
<el-form-item label="生成策略">
<div style="display: flex; gap: 16px; flex-wrap: wrap;">
<el-checkbox v-model="generateConfig.dirty_data_filter">脏数据过滤</el-checkbox>
<el-checkbox v-model="generateConfig.thinking_mode">思考模式</el-checkbox>
</div>
</el-form-item>
<el-form-item label="预设提示语">
<el-input
v-model="generateConfig.preset_prompt"
type="textarea"
:rows="8"
resize="none"
/>
</el-form-item>
</el-form>
<template #footer>
<el-button @click="showGenerateDialog = false">取消</el-button>
<el-button type="primary" @click="handleGenerate" :loading="generating">开始生成</el-button>
</template>
</el-dialog>
</div>
</template>
<script lang="ts" src="../page-logic/ProjectQuestionPage.ts"></script>
<style scoped src="../styles/pages/project-question.css"></style>
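Review note: the template above relies on `isSelected`, `toggleSelect`, `toggleSelectAll`, `clearSelection`, `isAllSelected` and `selectedCount`, all of which live in `../page-logic/ProjectQuestionPage.ts`, a file this part of the diff does not show. The block below is only a minimal sketch of how such selection state could be kept, assuming rows are identified by an `id` field. `createSelection` is a hypothetical name, not something taken from the repository.

```typescript
// Hypothetical helper; the actual logic is in page-logic/ProjectQuestionPage.ts (not shown here).
type Id = number | string;

function createSelection<T extends { id: Id }>(getItems: () => T[]) {
  // Ids of the currently selected rows.
  const selected = new Set<Id>();

  const isSelected = (id: Id): boolean => selected.has(id);

  // Flip the selection state of a single row.
  const toggleSelect = (id: Id): void => {
    if (selected.has(id)) selected.delete(id);
    else selected.add(id);
  };

  // True only when the visible list is non-empty and every visible row is selected.
  const isAllSelected = (): boolean => {
    const items = getItems();
    return items.length > 0 && items.every((item) => selected.has(item.id));
  };

  // Select every visible row, or deselect them if all are already selected.
  const toggleSelectAll = (): void => {
    const items = getItems();
    if (isAllSelected()) items.forEach((item) => selected.delete(item.id));
    else items.forEach((item) => selected.add(item.id));
  };

  const clearSelection = (): void => selected.clear();
  const selectedCount = (): number => selected.size;

  return { isSelected, toggleSelect, isAllSelected, toggleSelectAll, clearSelection, selectedCount };
}
```

Inside a Vue component the `Set` would need to be made reactive (for example by wrapping it with `reactive`) so that the template re-renders when the selection changes. The sketch leaves that out to stay framework-free.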


@@ -95,7 +95,7 @@
import { ref, reactive, onMounted, computed } from 'vue'
import { useRoute } from 'vue-router'
import { ElMessage } from 'element-plus'
-import { projectApi } from '@/api'
+import { projectApi } from '@/core/api'
const route = useRoute()
const projectId = computed(() => route.params.id)
@@ -125,8 +125,9 @@ const prompts = reactive({
const fetchProject = async () => {
try {
const res = await projectApi.get(projectId.value)
-projectInfo.name = res.data.name
-projectInfo.description = res.data.description || ''
+// New format: project directly in response
+projectInfo.name = res.name
+projectInfo.description = res.description || ''
} catch (error) {
console.error(error)
}
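Review note: this hunk changes how the project response is read, from an envelope (`res.data.name`) to the bare object (`res.name`). If some callers or cached responses still use the old shape during the migration, a small normalizer could absorb the difference. The block below is a sketch under that assumption. `unwrapProject`, `ProjectPayload` and `ProjectEnvelope` are hypothetical names, and the field list is limited to the two fields this hunk reads.

```typescript
// Hypothetical types; only `name` and `description` are confirmed by the hunk above.
interface ProjectPayload {
  name: string;
  description?: string | null;
}

interface ProjectEnvelope {
  data: ProjectPayload;
}

interface ProjectInfo {
  name: string;
  description: string;
}

// Accepts either the old envelope ({ data: {...} }) or the new bare project object
// and returns the two fields the settings page uses.
function unwrapProject(res: ProjectPayload | ProjectEnvelope): ProjectInfo {
  const project = "data" in res ? res.data : res;
  return {
    name: project.name,
    // A missing or null description becomes an empty string.
    description: project.description ?? "",
  };
}
```

With such a helper, `fetchProject` could assign `unwrapProject(res)` field by field instead of depending on one response shape. Whether that tolerance is wanted is a design choice, and the diff itself simply switches to the new format.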


@@ -0,0 +1,830 @@
<template>
<div class="text-split">
<!-- Header -->
<div class="page-header">
<div class="header-left">
<h2 class="page-title">分割生成</h2>
<p class="page-subtitle">选择文件进行智能分割</p>
</div>
<div class="header-actions">
<el-button
@click="openSplitDialog"
:disabled="selectedFiles.length === 0"
class="split-btn"
>
<el-icon><ChatDotSquare /></el-icon>
<span>批量分割</span>
</el-button>
<el-button
type="primary"
@click="openGenerateDialog"
:disabled="completedFiles === 0"
class="generate-btn"
>
<el-icon><VideoPlay /></el-icon>
<span>批量生成</span>
</el-button>
</div>
</div>
<!-- Stats Cards -->
<div class="stats-grid">
<div
class="stat-card stat-total"
:class="{ active: filterStatus === '' }"
@click="filterStatus = ''"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><Document /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ files.length }}</span>
<span class="stat-label">总文件</span>
</div>
</div>
</div>
<div
class="stat-card stat-completed"
:class="{ active: filterStatus === 'completed' }"
@click="filterStatus = filterStatus === 'completed' ? '' : 'completed'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><CircleCheckFilled /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ completedFiles }}</span>
<span class="stat-label">已分割</span>
</div>
</div>
</div>
<div
class="stat-card stat-processing"
:class="{ active: filterStatus === 'processing' }"
@click="filterStatus = filterStatus === 'processing' ? '' : 'processing'"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><Loading /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ processingCount }}</span>
<span class="stat-label">分割中</span>
</div>
</div>
</div>
<div
class="stat-card stat-chunks"
>
<div class="stat-glow"></div>
<div class="stat-inner">
<div class="stat-icon-wrap">
<el-icon size="24"><List /></el-icon>
</div>
<div class="stat-info">
<span class="stat-value">{{ totalChunks }}</span>
<span class="stat-label">总文本块</span>
</div>
</div>
</div>
</div>
<!-- File List / Split View -->
<div class="content-area">
<!-- Empty State -->
<div v-if="!loading && !isInitialLoad && filteredFiles.length === 0" class="empty-state">
<div class="empty-illustration">
<div class="orbit orbit-1"></div>
<div class="orbit orbit-2"></div>
<div class="orbit orbit-3"></div>
<div class="empty-core">
<el-icon size="40"><FolderOpened /></el-icon>
</div>
</div>
<h3 class="empty-title">暂无可分割文件</h3>
<p class="empty-desc">请先在文件管理中上传文档</p>
</div>
<!-- File Table -->
<div v-else class="files-table-wrapper">
<div class="table-header">
<div class="table-select">
<el-checkbox
:model-value="isAllSelected"
@change="toggleSelectAll"
class="select-all"
>
<span v-if="selectedCount > 0" class="selected-text">已选择 {{ selectedCount }} </span>
<span v-else>全选</span>
</el-checkbox>
</div>
<div class="table-actions" v-if="selectedCount > 0">
<el-button type="danger" size="small" plain @click="clearSelection" class="batch-clear-btn">
<el-icon><Close /></el-icon>
<span>清除选择</span>
</el-button>
</div>
</div>
<div class="files-list">
<div
v-for="(file, index) in filteredFiles"
:key="file.id"
class="file-row"
:class="{
'is-selected': isSelected(file.id),
'is-processing': file.status === 'processing'
}"
:style="{ '--delay': index * 0.04 + 's' }"
@click="toggleSelect(file.id)"
>
<div class="col-select">
<el-checkbox
:model-value="isSelected(file.id)"
@click.stop
@change="toggleSelect(file.id)"
/>
</div>
<div class="col-icon">
<div class="file-type-icon" style="background: #8b5cf6;">
<el-icon size="18" color="white">
<Document />
</el-icon>
</div>
</div>
<div class="col-name">
<span class="file-name">{{ file.filename }}.md</span>
<span class="file-meta">{{ formatSize(file.size) }}</span>
</div>
<div class="col-chunks">
<span v-if="fileChunks[file.id]" class="chunk-count">
{{ fileChunks[file.id] }}
</span>
<span v-else class="chunk-count empty">-</span>
</div>
<div class="col-status">
<div v-if="file.status === 'processing'" class="status-badge processing">
<el-icon class="spin" size="12"><Loading /></el-icon>
<span>分割中</span>
</div>
<div v-else-if="fileChunks[file.id]" class="status-badge success">
<el-icon size="12"><CircleCheckFilled /></el-icon>
<span>已完成</span>
</div>
<div v-else class="status-badge pending">
<el-icon size="12"><Clock /></el-icon>
<span>待分割</span>
</div>
</div>
<div class="col-operations" v-if="fileChunks[file.id]">
<el-tooltip content="预览修改" placement="top">
<el-button text size="small" class="op-btn" @click.stop="openChunkPreview(file)">
<el-icon><Edit /></el-icon>
</el-button>
</el-tooltip>
<el-tooltip content="删除该文件的所有块" placement="top">
<el-button
text
size="small"
class="op-btn delete"
:loading="deletingFileChunksId === file.id"
@click.stop="openDeleteFileChunksDialog(file)"
>
<el-icon><Delete /></el-icon>
</el-button>
</el-tooltip>
</div>
</div>
</div>
</div>
</div>
</div>
<!-- Batch Split Dialog -->
<Teleport to="body">
<Transition name="dialog-fade">
<div v-if="splitDialogVisible" class="split-dialog-overlay" @click.self="splitDialogVisible = false">
<div class="split-dialog">
<!-- Header -->
<div class="dialog-header">
<div class="header-icon">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M12 2L2 7l10 5 10-5-10-5z"/>
<path d="M2 17l10 5 10-5"/>
<path d="M2 12l10 5 10-5"/>
</svg>
</div>
<div class="header-text">
<h3>批量分割配置</h3>
<span class="header-sub">{{ selectedFiles.length }} 个文件待处理</span>
</div>
<button class="close-btn" @click="splitDialogVisible = false">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M18 6L6 18M6 6l12 12"/>
</svg>
</button>
</div>
<!-- Dialog Body -->
<div class="dialog-body">
<!-- Split Method Selector -->
<div class="method-selector">
<div class="method-label">
<span class="label-icon"></span>
<span>分割算法</span>
</div>
<div class="method-grid">
<button
v-for="m in methods"
:key="m.value"
class="method-btn"
:class="{ active: splitConfig.method === m.value }"
@click="splitConfig.method = m.value"
>
<span class="method-name">{{ m.label }}</span>
<span class="method-tag">{{ m.tag }}</span>
</button>
</div>
</div>
<!-- Parameters Panel -->
<div class="params-panel">
<!-- Panel Grid Background -->
<div class="panel-grid"></div>
<!-- Common Parameters -->
<div class="param-section" v-if="splitConfig.method !== 'paragraph'">
<div class="section-header">
<span class="section-num">01</span>
<span class="section-title">块大小控制</span>
</div>
<div class="param-row">
<div class="param-info">
<span class="param-name">chunk_size</span>
<span class="param-desc">每个文本块的字符数</span>
</div>
<div class="param-control">
<input
type="range"
v-model.number="splitConfig.chunk_size"
:min="100"
:max="2000"
:step="100"
class="cyber-slider"
/>
<div class="param-value">
<span class="value-num">{{ splitConfig.chunk_size }}</span>
<span class="value-unit">chars</span>
</div>
</div>
</div>
<!-- Visual Preview -->
<div class="chunk-preview">
<div class="preview-label">预览</div>
<div class="preview-bars">
<div
v-for="i in 5"
:key="i"
class="preview-bar"
:style="{ width: (splitConfig.chunk_size / 20) + 'px' }"
></div>
</div>
</div>
</div>
<div class="param-section" v-if="splitConfig.method !== 'sentence'">
<div class="section-header">
<span class="section-num">02</span>
<span class="section-title">重叠控制</span>
</div>
<div class="param-row">
<div class="param-info">
<span class="param-name">overlap</span>
<span class="param-desc">相邻块之间的重叠字符数</span>
</div>
<div class="param-control">
<input
type="range"
v-model.number="splitConfig.overlap"
:min="0"
:max="500"
:step="50"
class="cyber-slider"
/>
<div class="param-value">
<span class="value-num">{{ splitConfig.overlap }}</span>
<span class="value-unit">chars</span>
</div>
</div>
</div>
<!-- Overlap Visual -->
<div class="overlap-preview">
<div class="overlap-block" :style="{ width: (splitConfig.chunk_size / 4) + 'px' }"></div>
<div class="overlap-zone" :style="{ width: (splitConfig.overlap / 4) + 'px' }"></div>
<div class="overlap-block" :style="{ width: (splitConfig.chunk_size / 4) + 'px' }"></div>
</div>
</div>
<!-- Custom Separator -->
<div class="param-section" v-if="splitConfig.method === 'custom'">
<div class="section-header">
<span class="section-num">03</span>
<span class="section-title">自定义分隔符</span>
</div>
<div class="param-row">
<div class="param-info">
<span class="param-name">separator</span>
<span class="param-desc">用于分割文本的字符序列</span>
</div>
<div class="param-control full">
<input
type="text"
v-model="splitConfig.separator"
placeholder="\n\n"
class="cyber-input"
/>
</div>
</div>
</div>
<template v-if="splitConfig.method === 'semantic'">
<div class="param-section">
<div class="section-header highlight">
<span class="section-num">04</span>
<span class="section-title">句段优先说明</span>
<span class="section-badge">RULE</span>
</div>
<div class="param-row full">
<div class="param-info">
<span class="param-name">规则型切分</span>
<span class="param-desc">按段落和句子边界优先切分,不调用 embedding API,更接近句段优先的递归切分</span>
</div>
</div>
</div>
</template>
<template v-if="splitConfig.method === 'semantic_embedding'">
<div class="param-section">
<div class="section-header highlight">
<span class="section-num">04</span>
<span class="section-title">Embedding 模型</span>
<span class="section-badge">API</span>
</div>
<div class="api-grid">
<div class="param-row full">
<div class="param-info">
<span class="param-name">embedding_model</span>
<span class="param-desc">直接使用模型管理中已配置的 embedding 模型</span>
</div>
<div class="param-control full">
<select v-model="splitConfig.embedding_model_id" class="cyber-select" :disabled="embeddingModels.length === 0">
<option value="" disabled>
{{ embeddingModels.length ? '选择 embedding 模型' : '暂无可用 embedding 模型' }}
</option>
<option
v-for="model in embeddingModels"
:key="model.id"
:value="model.id"
>
{{ model.model_name }} · {{ getProviderLabel(model.provider) }}
</option>
</select>
</div>
</div>
<div class="param-row full" v-if="embeddingModels.length === 0">
<div class="embedding-empty-hint">
<span>还没有可用的 embedding 模型</span>
<button type="button" class="text-link-btn" @click="goToModelSettings">
去模型配置
</button>
</div>
</div>
</div>
</div>
<div class="param-section">
<div class="section-header highlight">
<span class="section-num">05</span>
<span class="section-title">语义边界参数</span>
<span class="section-badge">AI</span>
</div>
<div class="param-row">
<div class="param-info">
<span class="param-name">similarity_threshold</span>
<span class="param-desc">越高越保守,越低越容易切出新块</span>
</div>
<div class="param-control">
<input
type="range"
v-model.number="splitConfig.similarity_threshold"
:min="0.1"
:max="0.9"
:step="0.05"
class="cyber-slider accent"
/>
<div class="param-value accent">
<span class="value-num">{{ splitConfig.similarity_threshold.toFixed(2) }}</span>
</div>
</div>
</div>
<div class="param-row">
<div class="param-info">
<span class="param-name">min_chunk_size</span>
<span class="param-desc">最小块大小,避免语义过碎</span>
</div>
<div class="param-control">
<input
type="range"
v-model.number="splitConfig.min_chunk_size"
:min="50"
:max="500"
:step="10"
class="cyber-slider accent"
/>
<div class="param-value accent">
<span class="value-num">{{ splitConfig.min_chunk_size }}</span>
<span class="value-unit">chars</span>
</div>
</div>
</div>
</div>
</template>
</div>
</div>
<div class="dialog-footer">
<button class="btn-cancel" @click="splitDialogVisible = false">
<span>取消</span>
</button>
<button class="btn-confirm" @click="handleBatchSplit" :class="{ loading: splitting }">
<span v-if="!splitting" class="btn-text">
<svg class="btn-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<polygon points="5 3 19 12 5 21 5 3"/>
</svg>
开始生成
</span>
<span v-else class="btn-loading">
<span class="loading-dot"></span>
<span class="loading-dot"></span>
<span class="loading-dot"></span>
</span>
</button>
</div>
</div>
</div>
</Transition>
</Teleport>
<Teleport to="body">
<Transition name="dialog-fade">
<div v-if="generateDialogVisible" class="split-dialog-overlay" @click.self="generateDialogVisible = false">
<div class="split-dialog generate-dialog-shell">
<div class="dialog-header">
<div class="header-icon">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M12 2l2.4 5.8L20 10l-4 4 1 6-5-3-5 3 1-6-4-4 5.6-2.2L12 2z"/>
</svg>
</div>
<div class="header-text">
<h3>批量生成问答</h3>
<span class="header-sub">面向当前项目全部已分割文本块</span>
</div>
<button class="close-btn" @click="generateDialogVisible = false">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M18 6L6 18M6 6l12 12"/>
</svg>
</button>
</div>
<div class="dialog-body">
<div class="params-panel">
<div class="panel-grid"></div>
<div class="param-section">
<div class="section-header">
<span class="section-num">01</span>
<span class="section-title">大语言模型</span>
</div>
<div class="param-row full">
<div class="param-info">
<span class="param-name">model</span>
<span class="param-desc">选择用于生成问答对的 chat / vlm 模型</span>
</div>
<div class="param-control full">
<el-select
v-model="generateConfig.model_id"
class="cyber-select generate-model-select"
:disabled="generateModels.length === 0"
:teleported="false"
placement="bottom-start"
popper-class="generate-model-dropdown"
placeholder="选择生成模型"
>
<el-option
v-for="model in generateModels"
:key="model.id"
:label="`${model.model_name} · ${getProviderLabel(model.provider)}`"
:value="model.id"
>
<div class="generate-option">
<div class="generate-option__title">{{ model.model_name }}</div>
<div class="generate-option__meta">{{ getProviderLabel(model.provider) }}</div>
</div>
</el-option>
</el-select>
</div>
</div>
<div class="param-row full" v-if="generateModels.length === 0">
<div class="embedding-empty-hint">
<span>还没有可用的 chat / vlm 模型</span>
<button type="button" class="text-link-btn" @click="goToModelSettings">
去模型配置
</button>
</div>
</div>
</div>
<div class="param-section generate-strategy-section">
<div class="section-header">
<span class="section-num">02</span>
<span class="section-title">生成策略</span>
</div>
<div class="generate-strategy-grid">
<button
type="button"
class="strategy-card"
:class="{ active: generateConfig.dirty_data_filter }"
@click="generateConfig.dirty_data_filter = !generateConfig.dirty_data_filter"
>
<div class="strategy-card__head">
<span class="strategy-card__title">脏数据过滤</span>
<span class="strategy-card__switch" :class="{ active: generateConfig.dirty_data_filter }">
<span class="strategy-card__switch-handle"></span>
</span>
</div>
<p class="strategy-card__desc">过滤目录、极短内容和明显噪声,减少无效调用</p>
<span class="strategy-card__state">{{ generateConfig.dirty_data_filter ? '已开启' : '已关闭' }}</span>
</button>
<button
type="button"
class="strategy-card"
:class="{ active: generateConfig.thinking_mode }"
@click="generateConfig.thinking_mode = !generateConfig.thinking_mode"
>
<div class="strategy-card__head">
<span class="strategy-card__title">思考模式</span>
<span class="strategy-card__switch" :class="{ active: generateConfig.thinking_mode }">
<span class="strategy-card__switch-handle"></span>
</span>
</div>
<p class="strategy-card__desc">生成前强化内容分析,提升问题质量与覆盖度</p>
<span class="strategy-card__state">{{ generateConfig.thinking_mode ? '已开启' : '已关闭' }}</span>
</button>
</div>
<div class="param-row">
<div class="param-info">
<span class="param-name">single_chunk_count</span>
<span class="param-desc">每个 chunk 生成多少组问答</span>
</div>
<div class="param-control">
<input
type="range"
v-model.number="generateConfig.count"
:min="1"
:max="8"
:step="1"
class="cyber-slider accent"
/>
<div class="param-value accent">
<span class="value-num">{{ generateConfig.count }}</span>
<span class="value-unit">pairs</span>
</div>
</div>
</div>
</div>
<div class="param-section generate-prompt-section">
<div class="section-header">
<span class="section-num">03</span>
<span class="section-title">预设提示语</span>
</div>
<div class="generate-prompt-box">
<textarea
v-model="generateConfig.preset_prompt"
class="cyber-textarea generate-prompt-textarea"
rows="9"
/>
</div>
</div>
</div>
</div>
<div class="dialog-footer">
<button class="btn-cancel" @click="generateDialogVisible = false">
<span>取消</span>
</button>
<button class="btn-confirm" @click="handleBatchGenerate" :class="{ loading: generatingQuestions }">
<span v-if="!generatingQuestions" class="btn-text">
<svg class="btn-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<polygon points="5 3 19 12 5 21 5 3"/>
</svg>
开始生成
</span>
<span v-else class="btn-loading">
<span class="loading-dot"></span>
<span class="loading-dot"></span>
<span class="loading-dot"></span>
</span>
</button>
</div>
</div>
</div>
</Transition>
</Teleport>
<Teleport to="body">
<Transition name="dialog-fade">
<div v-if="chunkPreviewVisible" class="chunk-preview-overlay" @click.self="chunkPreviewVisible = false">
<div class="chunk-preview-dialog">
<div class="dialog-header">
<div class="header-icon">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M12 2L2 7l10 5 10-5-10-5z"/>
<path d="M2 17l10 5 10-5"/>
<path d="M2 12l10 5 10-5"/>
</svg>
</div>
<div class="header-text">
<h3>预览修改</h3>
<span class="header-sub">{{ previewFile?.filename }} - {{ previewChunks.length }} 个块</span>
</div>
<button class="close-btn" @click="chunkPreviewVisible = false">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M18 6L6 18M6 6l12 12"/>
</svg>
</button>
</div>
<div class="chunk-preview-toolbar">
<label class="preview-search">
<span class="preview-search-label">检索</span>
<input
v-model.trim="previewSearch"
type="text"
class="cyber-input"
placeholder="搜索块内容、块号"
/>
</label>
<div class="preview-filter-group">
<button
class="preview-filter-btn"
:class="{ active: previewFilter === 'all' }"
@click="previewFilter = 'all'"
>
全部
</button>
<button
class="preview-filter-btn"
:class="{ active: previewFilter === 'modified' }"
@click="previewFilter = 'modified'"
>
仅已修改
</button>
</div>
<label class="preview-jump">
<span class="preview-search-label">跳转</span>
<input
v-model.trim="previewJumpInput"
type="text"
inputmode="numeric"
class="cyber-input"
placeholder="块号"
@keydown.enter.prevent="jumpToChunk"
/>
</label>
<div class="preview-stats">
<span>{{ filteredPreviewChunks.length }} / {{ previewChunks.length }} </span>
<span>{{ modifiedPreviewCount }} 已修改</span>
</div>
</div>
<div class="dialog-body chunk-workspace">
<div v-if="previewLoading" class="loading-state">
<span class="loading-dot"></span>
<span class="loading-dot"></span>
<span class="loading-dot"></span>
</div>
<div v-else-if="previewChunks.length" class="chunk-workspace-grid">
<aside class="chunk-nav-panel">
<div class="chunk-nav-list">
<button
v-for="chunk in filteredPreviewChunks"
:key="chunk.id"
class="chunk-nav-item"
:class="{
active: chunk.id === selectedPreviewChunkId,
modified: isChunkModified(chunk)
}"
@click="selectPreviewChunk(chunk.id)"
>
<div class="chunk-nav-meta">
<span class="chunk-nav-index"> {{ chunk.displayIndex }}</span>
<span class="chunk-nav-words">{{ chunk.word_count || 0 }} </span>
</div>
<p class="chunk-nav-snippet">{{ getChunkSnippet(chunk) }}</p>
<div class="chunk-nav-footer">
<span class="chunk-nav-status">{{ isChunkModified(chunk) ? '已修改' : '原始' }}</span>
</div>
</button>
</div>
</aside>
<section v-if="activePreviewChunk" class="chunk-editor-panel">
<div class="chunk-item">
<div class="chunk-header">
<div class="chunk-header-main">
<span class="chunk-index"> {{ activePreviewChunkIndex }} / {{ previewChunks.length }}</span>
<span class="chunk-state-pill" :class="{ modified: isChunkModified(activePreviewChunk) }">
{{ isChunkModified(activePreviewChunk) ? '已修改未保存' : '内容未变更' }}
</span>
</div>
<span class="chunk-words">{{ activePreviewChunk.word_count || 0 }} </span>
</div>
<div class="chunk-form">
<div class="form-group">
<label>内容</label>
<textarea
v-model="activePreviewChunk.editingContent"
class="cyber-textarea preview-editor"
rows="20"
></textarea>
</div>
<div class="chunk-actions">
<button
class="btn-secondary"
:disabled="!isChunkModified(activePreviewChunk)"
@click="resetChunk(activePreviewChunk)"
>
还原
</button>
<button
class="btn-save"
:class="{ loading: savingChunks }"
:disabled="!isChunkModified(activePreviewChunk)"
@click="saveChunk(activePreviewChunk)"
>
<span v-if="!savingChunks">保存</span>
<span v-else class="btn-loading">
<span class="loading-dot"></span>
<span class="loading-dot"></span>
<span class="loading-dot"></span>
</span>
</button>
<button
class="btn-delete"
:class="{ loading: deletingChunkId === activePreviewChunk.id }"
:disabled="deletingChunkId === activePreviewChunk.id"
@click="openDeleteChunkDialog(activePreviewChunk)"
>
<span v-if="deletingChunkId !== activePreviewChunk.id">删除</span>
<span v-else class="btn-loading">
<span class="loading-dot"></span>
<span class="loading-dot"></span>
<span class="loading-dot"></span>
</span>
</button>
</div>
</div>
</div>
</section>
</div>
<div v-else class="chunk-empty-state">
<h4>没有匹配的分片</h4>
<p>试试清空搜索词,或切回“全部”查看所有分片</p>
</div>
</div>
</div>
</div>
</Transition>
</Teleport>
<DeleteDialog
v-model:visible="deleteDialogVisible"
:title="deleteDialogTitle"
:item-name="deleteDialogItemName"
:detail-text="deleteDialogDetail"
:warning-text="deleteDialogWarning"
:confirm-text="deleteDialogConfirmText"
:loading="deleteDialogLoading"
@confirm="confirmDeleteAction"
/>
</template>
<script lang="ts" src="../page-logic/ProjectTextSplitPage.ts"></script>
<style scoped src="../styles/pages/project-text-split.css"></style>
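Review note: the batch split dialog exposes `chunk_size` (100 to 2000 characters) and `overlap` (0 to 500 characters). The actual splitting happens in the backend chunks API, which this diff does not show. The block below is only a sketch of what fixed-size splitting with overlap typically looks like, with `splitWithOverlap` as a hypothetical name. Note that the slider ranges allow `overlap` to be equal to or larger than `chunk_size` (for example 500 against 100). The sketch rejects that combination because the window would never advance, so a similar guard in the UI or the API may be worth considering.

```typescript
// Hypothetical sketch of character-based chunking with overlap; not the project's backend code.
function splitWithOverlap(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must satisfy 0 <= overlap < chunkSize");
  }

  // Each new chunk starts `step` characters after the previous one,
  // so consecutive chunks share `overlap` characters.
  const step = chunkSize - overlap;
  const chunks: string[] = [];

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Lengths here are counted in UTF-16 code units, as `String.prototype.slice` does. A backend that counts code points or tokens would produce different boundaries.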


@@ -1,67 +0,0 @@
import { createRouter, createWebHistory } from 'vue-router'
const routes = [
{
path: '/',
name: 'Home',
component: () => import('@/views/HomeView.vue')
},
{
path: '/project/:id',
name: 'Project',
component: () => import('@/views/ProjectView.vue'),
children: [
{
path: '',
redirect: to => `/project/${to.params.id}/files`
},
{
path: 'files',
name: 'ProjectFiles',
component: () => import('@/views/project/FileManage.vue')
},
{
path: 'split',
name: 'ProjectSplit',
component: () => import('@/views/project/TextSplit.vue')
},
{
path: 'questions',
name: 'ProjectQuestions',
component: () => import('@/views/project/QuestionManage.vue')
},
{
path: 'datasets',
name: 'ProjectDatasets',
component: () => import('@/views/project/DatasetManage.vue')
},
{
path: 'eval',
name: 'ProjectEval',
component: () => import('@/views/project/EvalManage.vue')
},
{
path: 'settings',
name: 'ProjectSettings',
component: () => import('@/views/project/Settings.vue')
}
]
},
{
path: '/playground',
name: 'Playground',
component: () => import('@/views/PlaygroundView.vue')
},
{
path: '/data-square',
name: 'DataSquare',
component: () => import('@/views/DataSquareView.vue')
}
]
const router = createRouter({
history: createWebHistory(),
routes
})
export default router


@@ -0,0 +1,445 @@
<template>
<el-dialog
:model-value="visible"
title=""
width="480px"
class="create-dialog"
:show-close="false"
align-center
@update:model-value="$emit('update:visible', $event)"
>
<template #header>
<div class="dialog-header">
<div class="header-glow"></div>
<div class="header-content">
<div class="dialog-icon-new">
<el-icon size="24"><FolderAdd /></el-icon>
</div>
<div class="header-text">
<h3>创建新项目</h3>
<p>开始构建您的AI训练数据集</p>
</div>
</div>
<button class="close-btn" @click="handleClose">
<el-icon><Close /></el-icon>
</button>
</div>
</template>
<div class="dialog-body">
<div class="form-card">
<div class="input-group">
<label class="input-label">
<span class="label-icon"></span>
项目名称
</label>
<el-input
v-model="formData.name"
placeholder="例如:客服问答数据集"
size="large"
class="custom-input"
/>
</div>
<div class="input-group">
<label class="input-label">
<span class="label-icon"></span>
项目描述
</label>
<el-input
v-model="formData.description"
type="textarea"
:rows="3"
placeholder="描述这个项目的用途和数据类型..."
class="custom-input"
/>
</div>
</div>
<!-- Quick Templates -->
<div class="templates-section">
<span class="templates-label">快速开始模板</span>
<div class="templates-grid">
<div
class="template-card"
:class="{ active: formData.type === 'qa' }"
@click="useTemplate('qa')"
>
<el-icon><ChatDotRound /></el-icon>
<span>问答对</span>
</div>
<div
class="template-card"
:class="{ active: formData.type === 'table' }"
@click="useTemplate('table')"
>
<el-icon><Document /></el-icon>
<span>表格</span>
</div>
<div
class="template-card"
:class="{ active: formData.type === 'database' }"
@click="useTemplate('database')"
>
<el-icon><Connection /></el-icon>
<span>数据库</span>
</div>
</div>
</div>
</div>
<template #footer>
<div class="dialog-footer">
<el-button
type="primary"
:loading="loading"
@click="handleSubmit"
class="btn-create"
>
<el-icon v-if="!loading"><Plus /></el-icon>
创建项目
</el-button>
</div>
</template>
</el-dialog>
</template>
<script setup>
import { reactive, watch } from 'vue'
const props = defineProps({
visible: {
type: Boolean,
default: false
},
loading: {
type: Boolean,
default: false
}
})
const emit = defineEmits(['update:visible', 'submit'])
const formData = reactive({
name: '',
description: '',
type: ''
})
const templates = {
qa: { name: '问答数据集', description: '基于文档生成问答对训练数据' },
table: { name: '表格数据集', description: '从表格数据生成结构化训练数据' },
database: { name: '数据库数据集', description: '从数据库导出数据生成训练数据' }
}
const useTemplate = (type) => {
formData.type = type
}
const handleClose = () => {
emit('update:visible', false)
}
const handleSubmit = () => {
emit('submit', { ...formData })
}
// Reset form when dialog opens
watch(() => props.visible, (newVal) => {
if (newVal) {
formData.name = ''
formData.description = ''
formData.type = ''
}
})
</script>
<style scoped>
.create-dialog :deep(.el-dialog) {
border-radius: 20px;
background: linear-gradient(145deg, rgba(20, 20, 30, 0.98) 0%, rgba(15, 15, 25, 0.98) 100%);
border: 1px solid rgba(255, 255, 255, 0.08);
box-shadow:
0 25px 50px -12px rgba(0, 0, 0, 0.6),
0 0 40px rgba(0, 212, 255, 0.1),
inset 0 1px 0 rgba(255, 255, 255, 0.05);
overflow: hidden;
}
.create-dialog :deep(.el-dialog__header) {
padding: 0;
margin: 0;
}
.dialog-header {
position: relative;
padding: 24px 28px;
background: linear-gradient(135deg, rgba(0, 212, 255, 0.08) 0%, rgba(124, 58, 237, 0.08) 100%);
border-bottom: 1px solid rgba(255, 255, 255, 0.06);
overflow: hidden;
}
.header-glow {
position: absolute;
top: -50%;
left: -50%;
width: 200%;
height: 200%;
background: radial-gradient(circle at 30% 50%, rgba(0, 212, 255, 0.15) 0%, transparent 50%);
animation: headerGlow 4s ease-in-out infinite;
}
@keyframes headerGlow {
0%, 100% { opacity: 0.5; transform: scale(1); }
50% { opacity: 0.8; transform: scale(1.1); }
}
.header-content {
position: relative;
display: flex;
align-items: center;
gap: 16px;
}
.dialog-icon-new {
width: 52px;
height: 52px;
display: flex;
align-items: center;
justify-content: center;
background: linear-gradient(135deg, rgba(0, 212, 255, 0.2) 0%, rgba(124, 58, 237, 0.2) 100%);
border: 1px solid rgba(0, 212, 255, 0.3);
border-radius: 14px;
color: var(--accent-primary);
animation: iconPulse 2s ease-in-out infinite;
}
@keyframes iconPulse {
0%, 100% { box-shadow: 0 0 0 0 rgba(0, 212, 255, 0.4); }
50% { box-shadow: 0 0 20px 5px rgba(0, 212, 255, 0.2); }
}
.header-text h3 {
font-size: 20px;
font-weight: 600;
color: var(--text-primary);
margin: 0 0 4px 0;
}
.header-text p {
font-size: 13px;
color: var(--text-tertiary);
margin: 0;
}
.close-btn {
position: absolute;
top: 16px;
right: 16px;
width: 32px;
height: 32px;
display: flex;
align-items: center;
justify-content: center;
background: rgba(255, 255, 255, 0.05);
border: 1px solid rgba(255, 255, 255, 0.08);
border-radius: 8px;
color: var(--text-muted);
cursor: pointer;
transition: all 0.2s ease;
}
.close-btn:hover {
background: rgba(255, 255, 255, 0.1);
color: var(--text-primary);
border-color: rgba(255, 255, 255, 0.15);
}
/* Dialog Body */
.create-dialog :deep(.el-dialog__body) {
padding: 28px;
}
.dialog-body {
display: flex;
flex-direction: column;
gap: 24px;
}
.form-card {
display: flex;
flex-direction: column;
gap: 20px;
padding: 24px;
background: rgba(255, 255, 255, 0.02);
border: 1px solid rgba(255, 255, 255, 0.05);
border-radius: 16px;
}
.input-group {
display: flex;
flex-direction: column;
gap: 8px;
}
.input-label {
display: flex;
align-items: center;
gap: 8px;
font-size: 13px;
font-weight: 500;
color: var(--text-secondary);
}
.label-icon {
color: var(--accent-primary);
font-size: 10px;
}
.custom-input :deep(.el-input__wrapper) {
background: rgba(0, 0, 0, 0.3);
border: 1px solid rgba(255, 255, 255, 0.08);
border-radius: 10px;
box-shadow: none;
padding: 4px 14px;
transition: all 0.25s ease;
}
.custom-input :deep(.el-input__wrapper:hover) {
border-color: rgba(0, 212, 255, 0.3);
}
.custom-input :deep(.el-input__wrapper.is-focus) {
border-color: var(--accent-primary);
box-shadow: 0 0 0 3px rgba(0, 212, 255, 0.15);
}
.custom-input :deep(.el-input__inner) {
color: var(--text-primary);
}
.custom-input :deep(.el-input__inner::placeholder) {
color: var(--text-muted);
}
.custom-input :deep(.el-textarea__inner) {
background: rgba(0, 0, 0, 0.3);
border: 1px solid rgba(255, 255, 255, 0.08);
border-radius: 10px;
box-shadow: none;
padding: 12px 14px;
color: var(--text-primary);
resize: none;
transition: all 0.25s ease;
}
.custom-input :deep(.el-textarea__inner::placeholder) {
color: var(--text-muted);
}
.custom-input :deep(.el-textarea__inner:hover) {
border-color: rgba(0, 212, 255, 0.3);
}
.custom-input :deep(.el-textarea__inner:focus) {
border-color: var(--accent-primary);
box-shadow: 0 0 0 3px rgba(0, 212, 255, 0.15);
}
/* Templates Section */
.templates-section {
display: flex;
flex-direction: column;
gap: 12px;
}
.templates-label {
font-size: 12px;
font-weight: 500;
color: var(--text-muted);
text-transform: uppercase;
letter-spacing: 0.5px;
}
.templates-grid {
display: grid;
grid-template-columns: repeat(3, 1fr);
gap: 12px;
}
.template-card {
display: flex;
flex-direction: column;
align-items: center;
gap: 8px;
padding: 16px 12px;
background: rgba(255, 255, 255, 0.02);
border: 1px solid rgba(255, 255, 255, 0.06);
border-radius: 12px;
cursor: pointer;
transition: all 0.25s ease;
}
.template-card:hover {
background: rgba(0, 212, 255, 0.08);
border-color: rgba(0, 212, 255, 0.25);
transform: translateY(-2px);
}
.template-card.active {
background: rgba(0, 212, 255, 0.12);
border-color: var(--accent-primary);
box-shadow: 0 0 15px rgba(0, 212, 255, 0.2);
}
.template-card .el-icon {
font-size: 22px;
color: var(--accent-primary);
}
.template-card span {
font-size: 12px;
color: var(--text-secondary);
}
/* Dialog Footer */
.create-dialog :deep(.el-dialog__footer) {
padding: 0;
}
.dialog-footer {
display: flex;
justify-content: center;
padding: 20px 28px;
background: rgba(0, 0, 0, 0.2);
border-top: 1px solid rgba(255, 255, 255, 0.05);
}
.btn-cancel {
padding: 10px 20px;
background: transparent;
border: 1px solid rgba(255, 255, 255, 0.1);
border-radius: 10px;
color: var(--text-secondary);
transition: all 0.2s ease;
}
.btn-cancel:hover {
background: rgba(255, 255, 255, 0.05);
border-color: rgba(255, 255, 255, 0.15);
color: var(--text-primary);
}
.btn-create {
width: 100%;
padding: 14px 32px;
background: linear-gradient(135deg, var(--accent-primary) 0%, #0891b2 100%);
border: none;
border-radius: 10px;
font-weight: 500;
font-size: 15px;
transition: all 0.25s ease;
}
.btn-create:hover {
transform: translateY(-1px);
box-shadow: 0 4px 15px rgba(0, 212, 255, 0.35);
}
</style>


@@ -0,0 +1,261 @@
<template>
<el-dialog
:model-value="visible"
title=""
width="460px"
class="delete-dialog"
:show-close="false"
align-center
:close-on-click-modal="false"
@update:model-value="$emit('update:visible', $event)"
>
<template #header>
<div class="delete-dialog-header">
<div class="delete-header-top">
<div class="delete-icon-wrapper">
<el-icon size="26"><WarningFilled /></el-icon>
</div>
<div class="delete-kicker">Danger Zone</div>
</div>
<div class="delete-title-group">
<h3>{{ title }}</h3>
<p v-if="detailText" class="detail-text">{{ detailText }}</p>
</div>
</div>
</template>
<div class="delete-dialog-body">
<div class="delete-target-card">
<span class="target-label">目标对象</span>
<strong class="target-name">{{ itemName }}</strong>
</div>
<p class="warning-text">{{ warningText }}</p>
</div>
<template #footer>
<div class="delete-dialog-footer">
<el-button
@click="handleCancel"
class="btn-cancel-delete"
>
取消
</el-button>
<el-button
type="danger"
:loading="loading"
@click="handleConfirm"
class="btn-delete"
>
<el-icon v-if="!loading"><Delete /></el-icon>
{{ confirmText }}
</el-button>
</div>
</template>
</el-dialog>
</template>
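<script setup>
// Import the icons used in the template; harmless if they are also registered globally
import { WarningFilled, Delete } from '@element-plus/icons-vue'
const props = defineProps({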
visible: {
type: Boolean,
default: false
},
title: {
type: String,
default: '删除项目'
},
itemName: {
type: String,
default: ''
},
warningText: {
type: String,
default: '此操作不可恢复,所有相关数据将被永久删除'
},
detailText: {
type: String,
default: ''
},
confirmText: {
type: String,
default: '确认删除'
},
loading: {
type: Boolean,
default: false
}
})
const emit = defineEmits(['update:visible', 'confirm', 'cancel'])
const handleConfirm = () => {
emit('confirm')
}
const handleCancel = () => {
emit('update:visible', false)
emit('cancel')
}
</script>
<style scoped>
.delete-dialog :deep(.el-dialog) {
background:
radial-gradient(circle at top left, rgba(239, 68, 68, 0.14), transparent 24%),
radial-gradient(circle at bottom right, rgba(251, 146, 60, 0.08), transparent 20%),
linear-gradient(180deg, #11151d 0%, #0d1118 100%);
border: 1px solid rgba(239, 68, 68, 0.18);
border-radius: 22px;
overflow: hidden;
box-shadow:
0 24px 80px rgba(0, 0, 0, 0.5),
inset 0 1px 0 rgba(255, 255, 255, 0.04);
}
.delete-dialog :deep(.el-dialog__header) {
padding: 0;
margin: 0;
}
.delete-dialog :deep(.el-dialog__body) {
padding: 0;
}
.delete-dialog :deep(.el-dialog__footer) {
padding: 0;
}
.delete-dialog-header {
padding: 24px 24px 18px;
display: flex;
flex-direction: column;
gap: 14px;
border-bottom: 1px solid rgba(255, 255, 255, 0.06);
background: linear-gradient(180deg, rgba(255, 255, 255, 0.03), transparent);
}
.delete-header-top {
display: flex;
align-items: center;
justify-content: space-between;
}
.delete-icon-wrapper {
width: 48px;
height: 48px;
display: flex;
align-items: center;
justify-content: center;
background: linear-gradient(135deg, rgba(239, 68, 68, 0.18), rgba(249, 115, 22, 0.08));
border: 1px solid rgba(239, 68, 68, 0.24);
border-radius: 14px;
color: #fb7185;
box-shadow: inset 0 1px 0 rgba(255, 255, 255, 0.05);
}
.delete-kicker {
font-size: 11px;
letter-spacing: 0.16em;
text-transform: uppercase;
color: rgba(251, 113, 133, 0.72);
font-family: 'JetBrains Mono', 'SF Mono', monospace;
}
.delete-title-group {
display: flex;
flex-direction: column;
gap: 6px;
}
.delete-dialog-header h3 {
margin: 0;
font-size: 22px;
font-weight: 600;
color: #f8fafc;
letter-spacing: -0.02em;
}
.detail-text {
margin: 0;
color: rgba(226, 232, 240, 0.72);
font-size: 13px;
line-height: 1.6;
}
.delete-dialog-body {
padding: 22px 24px 24px;
}
.delete-target-card {
display: flex;
flex-direction: column;
gap: 8px;
padding: 14px 16px;
border-radius: 16px;
background: rgba(255, 255, 255, 0.03);
border: 1px solid rgba(255, 255, 255, 0.06);
margin-bottom: 14px;
}
.target-label {
font-size: 11px;
letter-spacing: 0.12em;
text-transform: uppercase;
color: rgba(148, 163, 184, 0.65);
font-family: 'JetBrains Mono', 'SF Mono', monospace;
}
.target-name {
color: #f8fafc;
font-size: 15px;
font-weight: 600;
line-height: 1.5;
word-break: break-word;
}
.warning-text {
margin: 0;
font-size: 13px;
line-height: 1.7;
color: #fca5a5;
}
.delete-dialog-footer {
display: flex;
justify-content: flex-end;
gap: 12px;
padding: 18px 24px 24px;
background: linear-gradient(180deg, transparent, rgba(0, 0, 0, 0.14));
border-top: 1px solid rgba(255, 255, 255, 0.04);
}
.btn-cancel-delete {
min-width: 108px;
padding: 10px 20px;
background: rgba(255, 255, 255, 0.03);
border: 1px solid rgba(255, 255, 255, 0.08);
border-radius: 10px;
color: rgba(226, 232, 240, 0.8);
transition: all 0.2s ease;
}
.btn-cancel-delete:hover {
background: rgba(255, 255, 255, 0.06);
border-color: rgba(255, 255, 255, 0.16);
color: #fff;
}
.btn-delete {
min-width: 128px;
padding: 10px 20px;
background: linear-gradient(135deg, #ef4444 0%, #dc2626 55%, #b91c1c 100%);
border: none;
border-radius: 10px;
font-weight: 600;
transition: all 0.25s ease;
box-shadow: 0 10px 30px rgba(239, 68, 68, 0.18);
}
.btn-delete:hover {
transform: translateY(-1px);
box-shadow: 0 14px 34px rgba(239, 68, 68, 0.28);
}
</style>


@@ -0,0 +1,96 @@
<template>
<div class="empty-state">
<div class="empty-illustration">
<div class="circle-1"></div>
<div class="circle-2"></div>
<div class="circle-3"></div>
<el-icon size="48"><component :is="icon" /></el-icon>
</div>
<h3>{{ title }}</h3>
<p>{{ description }}</p>
<el-button v-if="actionText" type="primary" @click="$emit('action')">
{{ actionText }}
</el-button>
</div>
</template>
<script setup>
defineProps({
icon: {
type: Object,
default: () => null
},
title: {
type: String,
default: '暂无数据'
},
description: {
type: String,
default: '暂无相关内容'
},
actionText: {
type: String,
default: ''
}
})
defineEmits(['action'])
</script>
<style scoped>
.empty-state {
grid-column: 1 / -1;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
padding: 80px 40px;
background: var(--glass-bg);
backdrop-filter: blur(20px);
border: 1px dashed var(--border-default);
border-radius: var(--radius-xl);
text-align: center;
}
.empty-illustration {
position: relative;
width: 120px;
height: 120px;
display: flex;
align-items: center;
justify-content: center;
margin-bottom: 24px;
}
.empty-illustration .el-icon {
position: relative;
z-index: 1;
color: var(--text-muted);
}
.circle-1, .circle-2, .circle-3 {
position: absolute;
border-radius: 50%;
border: 1px solid var(--border-default);
}
.circle-1 { width: 100%; height: 100%; animation: rotate 20s linear infinite; }
.circle-2 { width: 70%; height: 70%; animation: rotate 15s linear infinite reverse; }
.circle-3 { width: 40%; height: 40%; background: var(--bg-tertiary); }
@keyframes rotate {
from { transform: rotate(0deg); }
to { transform: rotate(360deg); }
}
.empty-state h3 {
font-size: 20px;
margin-bottom: 8px;
color: var(--text-primary);
}
.empty-state p {
color: var(--text-tertiary);
margin-bottom: 24px;
}
</style>


@@ -0,0 +1,211 @@
<template>
<div
class="project-card"
:style="{ '--delay': delay }"
@click="$emit('click', project)"
>
<div class="card-glow"></div>
<button
v-if="showDelete"
class="card-delete-btn"
@click.stop="$emit('delete', project)"
>
<el-icon><Delete /></el-icon>
</button>
<div class="card-header">
<div class="card-avatar">
<el-icon><component :is="projectIcon" /></el-icon>
</div>
</div>
<h3 class="card-title">{{ project.name }}</h3>
<p class="card-desc">{{ project.description || '暂无描述' }}</p>
<div class="card-footer">
<span class="card-date">
<el-icon><Calendar /></el-icon>
{{ formattedDate }}
</span>
<div class="card-status">
<span class="status-dot"></span>
{{ statusText }}
</div>
</div>
</div>
</template>
<script setup>
import { computed } from 'vue'
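// Delete and Calendar are used in the template, so import them alongside the type icons
import { Folder, ChatDotRound, Document, Connection, Delete, Calendar } from '@element-plus/icons-vue'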
const props = defineProps({
project: {
type: Object,
required: true
},
index: {
type: Number,
default: 0
},
showDelete: {
type: Boolean,
default: true
},
statusText: {
type: String,
default: '活跃'
}
})
defineEmits(['click', 'delete'])
const delay = computed(() => `${props.index * 0.1}s`)
const projectIcon = computed(() => {
const type = props.project.type
if (type === 'qa') return ChatDotRound
if (type === 'table') return Document
if (type === 'database') return Connection
return Folder
})
const formattedDate = computed(() => {
if (!props.project.created_at) return ''
const d = new Date(props.project.created_at)
return d.toLocaleDateString('zh-CN', { month: 'short', day: 'numeric', year: 'numeric' })
})
</script>
<style scoped>
.project-card {
position: relative;
padding: 24px;
background: var(--bg-secondary);
border: 1px solid var(--border-subtle);
border-radius: var(--radius-lg);
cursor: pointer;
transition: all var(--transition-base);
overflow: hidden;
animation: cardFadeIn 0.5s ease backwards;
animation-delay: var(--delay);
}
@keyframes cardFadeIn {
from { opacity: 0; transform: translateY(20px); }
to { opacity: 1; transform: translateY(0); }
}
.project-card:hover {
border-color: var(--accent-primary);
transform: translateY(-4px);
}
.card-glow {
position: absolute;
top: 0;
left: 0;
right: 0;
height: 1px;
background: linear-gradient(90deg, transparent, var(--accent-primary), transparent);
opacity: 0;
transition: opacity var(--transition-base);
}
.project-card:hover .card-glow { opacity: 1; }
.card-delete-btn {
position: absolute;
top: 12px;
right: 12px;
width: 32px;
height: 32px;
border-radius: 50%;
background: var(--danger);
border: none;
color: white;
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
opacity: 0;
transform: scale(0.8);
transition: all 0.2s ease;
z-index: 10;
}
.card-delete-btn:hover {
transform: scale(1.1);
background: #dc2626;
}
.project-card:hover .card-delete-btn {
opacity: 1;
transform: scale(1);
}
.card-header {
display: flex;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 16px;
}
.card-avatar {
width: 44px;
height: 44px;
display: flex;
align-items: center;
justify-content: center;
background: var(--accent-primary-muted);
border-radius: var(--radius-md);
font-size: 20px;
color: var(--accent-primary);
}
.card-title {
font-size: 17px;
font-weight: 600;
margin-bottom: 8px;
color: var(--text-primary);
}
.card-desc {
font-size: 14px;
color: var(--text-tertiary);
line-height: 1.5;
margin-bottom: 20px;
display: -webkit-box;
-webkit-line-clamp: 2;
-webkit-box-orient: vertical;
overflow: hidden;
}
.card-footer {
display: flex;
justify-content: space-between;
align-items: center;
padding-top: 16px;
border-top: 1px solid var(--border-subtle);
}
.card-date {
display: flex;
align-items: center;
gap: 6px;
font-size: 13px;
color: var(--text-muted);
}
.card-status {
display: flex;
align-items: center;
gap: 6px;
font-size: 13px;
color: var(--success);
}
.status-dot {
width: 6px;
height: 6px;
background: var(--success);
border-radius: 50%;
}
</style>
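
The `projectIcon` and `delay` computed values in the card above can also be written as a lookup table plus a small delay helper. The following is a minimal sketch, not the component's actual code: `ICON_BY_TYPE`, `iconNameFor`, and `staggerDelayMs` are hypothetical names, and icon names are plain strings so the snippet runs without Element Plus. Using milliseconds is an assumption made here to avoid floating-point noise such as `0.30000000000000004s` from `3 * 0.1`.

```typescript
// Hypothetical lookup table mirroring the if-chain in `projectIcon`.
// Values are icon names as strings; the component would map to the imported icon components.
const ICON_BY_TYPE = new Map<string, string>([
  ['qa', 'ChatDotRound'],
  ['table', 'Document'],
  ['database', 'Connection'],
])

// Unknown or missing project types fall back to the folder icon, as in the component.
function iconNameFor(type: string | undefined): string {
  return (type !== undefined && ICON_BY_TYPE.get(type)) || 'Folder'
}

// Stagger the fade-in animation by card index, expressed in whole milliseconds.
function staggerDelayMs(index: number, stepMs = 100): string {
  return `${index * stepMs}ms`
}
```

A new project type then needs only one extra map entry instead of another `if` branch.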


@@ -0,0 +1,6 @@
/**
* Composables - reusable business logic
*/
export * from './useFormatters'
export * from './useProjects'
export * from './useModels'


@@ -0,0 +1,71 @@
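/**
* Formatting utility functions
*/
/**
* Format a file size in bytes as a human-readable string
*/
export function formatSize(bytes: number): string {
// Zero, negative, and non-finite input has no meaningful logarithm
if (!Number.isFinite(bytes) || bytes <= 0) return '0 B'
const k = 1024
const sizes = ['B', 'KB', 'MB', 'GB', 'TB']
// Clamp the unit index so sub-byte fractions and sizes beyond TB stay inside `sizes`
const i = Math.min(Math.max(Math.floor(Math.log(bytes) / Math.log(k)), 0), sizes.length - 1)
return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i]
}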
/**
* Format a date as YYYY-MM-DD
*/
export function formatDate(date: string | Date): string {
const d = new Date(date)
const year = d.getFullYear()
const month = String(d.getMonth() + 1).padStart(2, '0')
const day = String(d.getDate()).padStart(2, '0')
return `${year}-${month}-${day}`
}
/**
* Format a date and time as YYYY-MM-DD HH:mm
*/
export function formatDateTime(date: string | Date): string {
const d = new Date(date)
const year = d.getFullYear()
const month = String(d.getMonth() + 1).padStart(2, '0')
const day = String(d.getDate()).padStart(2, '0')
const hours = String(d.getHours()).padStart(2, '0')
const minutes = String(d.getMinutes()).padStart(2, '0')
return `${year}-${month}-${day} ${hours}:${minutes}`
}
/**
* Format a time relative to now (falls back to the absolute date after 7 days)
*/
export function formatRelativeTime(date: string | Date): string {
const now = new Date()
const d = new Date(date)
const diff = now.getTime() - d.getTime()
const seconds = Math.floor(diff / 1000)
const minutes = Math.floor(seconds / 60)
const hours = Math.floor(minutes / 60)
const days = Math.floor(hours / 24)
if (days > 7) {
return formatDate(date)
} else if (days > 0) {
return `${days} 天前`
} else if (hours > 0) {
return `${hours} 小时前`
} else if (minutes > 0) {
return `${minutes} 分钟前`
} else {
return '刚刚'
}
}
/**
* Format a number with thousands separators
*/
export function formatNumber(num: number): string {
return num.toLocaleString('zh-CN')
}
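
`formatRelativeTime` reads the clock internally, which makes its day/hour/minute bucketing awkward to test. The following is a minimal sketch of the same thresholds with the current time passed in; `relativeBucket` and `RelativeBucket` are hypothetical names that do not appear in the diff, and the function returns a structured bucket rather than a display string so the caller decides the wording.

```typescript
// Hypothetical result type: which bucket a timestamp falls into.
type RelativeBucket =
  | { unit: 'now' }
  | { unit: 'minute' | 'hour' | 'day'; value: number }
  | { unit: 'date'; value: string }

// Same thresholds as formatRelativeTime, but `now` is injectable for deterministic tests.
function relativeBucket(date: string | Date, now: Date = new Date()): RelativeBucket {
  const d = new Date(date)
  const diff = now.getTime() - d.getTime()
  const minutes = Math.floor(diff / 60_000)
  const hours = Math.floor(minutes / 60)
  const days = Math.floor(hours / 24)
  if (days > 7) {
    // Older than a week: fall back to an absolute YYYY-MM-DD date in local time.
    const year = d.getFullYear()
    const month = String(d.getMonth() + 1).padStart(2, '0')
    const day = String(d.getDate()).padStart(2, '0')
    return { unit: 'date', value: `${year}-${month}-${day}` }
  }
  if (days > 0) return { unit: 'day', value: days }
  if (hours > 0) return { unit: 'hour', value: hours }
  if (minutes > 0) return { unit: 'minute', value: minutes }
  // Less than a minute ago, or a timestamp in the future.
  return { unit: 'now' }
}
```

Timestamps in the future produce a negative difference and land in the `now` bucket, which matches the behaviour of the original function.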


@@ -0,0 +1,112 @@
/**
* Model-related business logic
*/
import { ref } from 'vue'
import { modelApi } from '@/core/api'
import type { Model } from '@/shared/types'
import { ElMessage } from 'element-plus'
export function useModels() {
const loading = ref(false)
const models = ref<Model[]>([])
/**
* Fetch the model list
*/
const fetchModels = async () => {
loading.value = true
try {
const res = await modelApi.list()
// Handle the three possible response shapes: a bare array, { data: [...] }, or { results: [...] }
if (Array.isArray(res)) {
models.value = res
} else if (res?.data && Array.isArray(res.data)) {
models.value = res.data
} else if (res?.results && Array.isArray(res.results)) {
models.value = res.results
} else {
models.value = []
}
} catch (error: any) {
console.error('获取模型列表失败:', error)
ElMessage.error('获取模型列表失败')
models.value = []
} finally {
loading.value = false
}
}
/**
* Add a model
*/
const addModel = async (data: Partial<Model>): Promise<boolean> => {
try {
await modelApi.create(data)
ElMessage.success('添加成功')
await fetchModels()
return true
} catch (error: any) {
console.error('添加模型失败:', error)
ElMessage.error(error?.message || '添加模型失败')
return false
}
}
/**
* Update a model
*/
const updateModel = async (id: Model['id'], data: Partial<Model>): Promise<boolean> => {
try {
await modelApi.update(id, data)
ElMessage.success('更新成功')
await fetchModels()
return true
} catch (error: any) {
console.error('更新模型失败:', error)
ElMessage.error(error?.message || '更新模型失败')
return false
}
}
/**
* Delete a model
*/
const deleteModel = async (id: Model['id']): Promise<boolean> => {
try {
await modelApi.delete(id)
ElMessage.success('删除成功')
await fetchModels()
return true
} catch (error: any) {
console.error('删除模型失败:', error)
ElMessage.error(error?.message || '删除模型失败')
return false
}
}
/**
* Set the default model
*/
const setDefaultModel = async (id: Model['id']): Promise<boolean> => {
try {
await modelApi.setDefault(id)
ElMessage.success('设置成功')
await fetchModels()
return true
} catch (error: any) {
console.error('设置默认模型失败:', error)
ElMessage.error(error?.message || '设置默认模型失败')
return false
}
}
return {
loading,
models,
fetchModels,
addModel,
updateModel,
deleteModel,
setDefaultModel
}
}


@@ -0,0 +1,98 @@
/**
* Project-related business logic
*/
import { ref } from 'vue'
import { projectApi } from '@/core/api'
import type { Project, ProjectCreate } from '@/shared/types'
import { ElMessage } from 'element-plus'
export function useProjects() {
const loading = ref(false)
const projects = ref<Project[]>([])
/**
* Fetch the project list
*/
const fetchProjects = async () => {
loading.value = true
try {
const res = await projectApi.list()
// Handle the three possible response shapes: a bare array, { data: [...] }, or { results: [...] }
if (Array.isArray(res)) {
projects.value = res
} else if (res?.data && Array.isArray(res.data)) {
projects.value = res.data
} else if (res?.results && Array.isArray(res.results)) {
projects.value = res.results
} else {
projects.value = []
}
} catch (error: any) {
console.error('获取项目列表失败:', error)
ElMessage.error('获取项目列表失败')
projects.value = []
} finally {
loading.value = false
}
}
/**
* Create a project
*/
const createProject = async (data: ProjectCreate): Promise<Project | null> => {
try {
const res = await projectApi.create(data)
ElMessage.success('创建成功')
await fetchProjects()
return res
} catch (error: any) {
console.error('创建项目失败:', error)
ElMessage.error(error?.message || '创建项目失败')
return null
}
}
/**
* Delete a project
*/
const deleteProject = async (id: Project['id']): Promise<boolean> => {
try {
await projectApi.delete(id)
ElMessage.success('删除成功')
await fetchProjects()
return true
} catch (error: any) {
console.error('删除项目失败:', error)
ElMessage.error(error?.message || '删除项目失败')
return false
}
}
/**
* Fetch project details
*/
const fetchProject = async (id: Project['id']): Promise<Project | null> => {
try {
const res = await projectApi.get(id)
if (res && typeof res === 'object' && 'id' in res) {
return res as Project
} else if (res?.data) {
return res.data as Project
}
return null
} catch (error: any) {
console.error('获取项目详情失败:', error)
ElMessage.error('获取项目详情失败')
return null
}
}
return {
loading,
projects,
fetchProjects,
createProject,
deleteProject,
fetchProject
}
}
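
`fetchModels` and `fetchProjects` repeat the same branch chain for unwrapping a list response. The following is a minimal sketch of how that chain could be shared; `normalizeList` and `ListLike` are hypothetical names that do not appear in the diff, and the three accepted shapes are taken directly from the branches in the two composables.

```typescript
// Hypothetical union of the response shapes the composables accept.
type ListLike<T> = T[] | { data?: unknown; results?: unknown } | null | undefined

// Collapse a bare array, { data: [...] }, or { results: [...] } into a plain array.
// Anything else (including null and non-array payloads) becomes an empty list.
function normalizeList<T>(res: ListLike<T>): T[] {
  if (Array.isArray(res)) return res
  if (res && Array.isArray(res.data)) return res.data as T[]
  if (res && Array.isArray(res.results)) return res.results as T[]
  return []
}
```

With such a helper, each composable's `try` block would reduce to a single assignment of the normalized result. The casts to `T[]` are unchecked, so element types are trusted rather than validated.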

frontend/src/shared/types/api.d.ts

@@ -0,0 +1,32 @@
/**
* API Response Types
*/
// Base API response wrapper
export interface ApiResponse<T = any> {
success: boolean
message: string
data: T
error: string | null
timestamp: string
}
// Paginated response
export interface PaginatedResponse<T = any> extends ApiResponse<T> {
page?: number
page_size?: number
total?: number
}
// List items wrapper
export interface ListResponse<T> {
items: T[]
total: number
page: number
page_size: number
}
// Simple ID response
export interface IdResponse {
id: string
}
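
One way the `ApiResponse` wrapper above might be consumed is through a small unwrapping helper. The following is a minimal sketch: `unwrap` is a hypothetical name, the interface is restated so the snippet is self-contained, and it assumes that a response with `success: false` carries its reason in `error` or, failing that, in `message`. The diff does not confirm that backend contract.

```typescript
// Restated from api.d.ts so the sketch compiles on its own.
interface ApiResponse<T = unknown> {
  success: boolean
  message: string
  data: T
  error: string | null
  timestamp: string
}

// Hypothetical helper: return the payload, or throw with the server-provided reason.
function unwrap<T>(res: ApiResponse<T>): T {
  if (!res.success) {
    // Assumption: `error` holds the failure reason, with `message` as a fallback.
    throw new Error(res.error ?? res.message)
  }
  return res.data
}
```

Callers can then rely on a single `try`/`catch` instead of checking `success` at every call site.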

frontend/src/shared/types/common.d.ts

@@ -0,0 +1,60 @@
/**
* Common Types
*/
// File types
export interface FileItem {
id: string
filename: string
file_type: string
size?: number
status: string
created_at: string
updated_at: string
}
// Chunk types
export interface Chunk {
id: string
name?: string
content: string
summary?: string
word_count?: number
file_id?: string
created_at: string
updated_at: string
}
// Question types
export interface Question {
id: string
content: string
answer?: string
question_type?: string
chunk_id?: string
source: string
created_at: string
updated_at: string
}
// Dataset types
export interface Dataset {
id: string
name: string
description?: string
dataset_type?: string
question_count?: number
created_at: string
updated_at: string
}
// Dialog props
export interface DialogProps {
visible: boolean
loading?: boolean
}
export interface DeleteDialogProps extends DialogProps {
itemName?: string
itemType?: string
}


@@ -0,0 +1,7 @@
/**
* Type exports
*/
export * from './api'
export * from './project'
export * from './model'
export * from './common'

frontend/src/shared/types/model.d.ts

@@ -0,0 +1,47 @@
/**
* Model Configuration Types
*/
export interface Model {
id: string
provider: ModelProvider
model_type: ModelType
model_name: string
api_key?: string
api_base?: string
is_default: 'true' | 'false'
connection_status?: 'untested' | 'connected' | 'disconnected'
created_at?: string
updated_at?: string
}
// ModelConfig has the same shape as Model; alias it so the two cannot drift apart
export type ModelConfig = Model
export type ModelProvider = 'minimax' | 'glm' | 'openai' | 'ali'
export type ModelType = 'chat' | 'vlm' | 'embedding' | 'rerank'
export interface ModelCreate {
provider: ModelProvider
model_type: ModelType
model_name: string
api_key: string
api_base?: string
is_default: boolean
}
export interface ProviderOption {
value: ModelProvider
label: string
abbr: string
}
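
`Model.is_default` is typed as the string `'true' | 'false'`, while `ModelCreate.is_default` is a `boolean`. The following is a minimal sketch of converting between the two at the boundary; `DefaultFlag`, `ModelLike`, `isDefault`, `toDefaultFlag`, and `findDefault` are hypothetical names that do not appear in the diff.

```typescript
// Hypothetical alias for the string-encoded flag used by Model.is_default.
type DefaultFlag = 'true' | 'false'

// Only the fields this sketch needs; the real Model has more.
interface ModelLike {
  id: string
  is_default: DefaultFlag
}

// Read the string flag as a boolean.
const isDefault = (m: ModelLike): boolean => m.is_default === 'true'

// Write a boolean back as the string flag.
const toDefaultFlag = (value: boolean): DefaultFlag => (value ? 'true' : 'false')

// Return the first model marked as default, if any.
function findDefault<M extends ModelLike>(models: M[]): M | undefined {
  return models.find(isDefault)
}
```

Keeping the comparison in one place avoids the common mistake of testing the string `'false'` for truthiness, which would always pass.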

frontend/src/shared/types/project.d.ts

@@ -0,0 +1,24 @@
/**
* Project Types
*/
export interface Project {
id: string
name: string
description?: string
type: string
created_at: string
updated_at: string
}
export interface ProjectCreate {
name: string
description: string
type: string
}
export interface ProjectUpdate {
name?: string
description?: string
type?: string
}

Some files were not shown because too many files have changed in this diff.