# Knowledge Base Creation API

## Basic Information

| Item | Description |
|------|------|
| Base URL | `http://localhost:8082` |
| Frontend page | Knowledge Base creation dialog |

## Endpoints

### 1. Create a Knowledge Base

**Request**

```
POST /api/knowledge/create
Content-Type: application/json
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| name | String | Yes | Knowledge base name |
| description | String | No | Knowledge base description |
| llm_model_id | String | Yes | LLM model ID (from the model table) |
| embedding_model_id | String | Yes | Embedding model ID (from the model table) |
| parsing_config | Object | Yes | Parsing configuration |
| - engine | String | Yes | Parsing engine: markitdown / docling |
| - docling_url | String | Conditional | Docling service URL (required when engine=docling) |
| - enable_pdf | Boolean | No | Whether to enable PDF parsing (default true) |
| - pandoc | Boolean | No | Whether to enable Pandoc (default true) |
| storage_config | Object | No | Storage configuration (defaults to local) |
| - type | String | Yes | Storage type: local / minio / s3 |
| - endpoint | String | No | MinIO endpoint (e.g. minio:9000) |
| - access_key_id | String | No | MinIO Access Key ID |
| - secret_access_key | String | No | MinIO Secret Access Key |
| - bucket | String | No | MinIO bucket name |

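The conditional rule in the table above (`docling_url` is required only when `engine` is `docling`) is easy to miss on the client side. The following is a minimal sketch of a helper that assembles the request body and checks that rule before anything is sent. The function name `build_create_payload` and the decision to raise `ValueError` are illustrative assumptions, not part of the API.

```python
ALLOWED_ENGINES = {"markitdown", "docling"}   # engine values listed in the table
ALLOWED_STORAGE = {"local", "minio", "s3"}    # storage types listed in the table


def build_create_payload(name, llm_model_id, embedding_model_id, engine,
                         docling_url=None, description=None, storage_config=None):
    """Build the JSON body for POST /api/knowledge/create (hypothetical helper)."""
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"unsupported engine: {engine}")
    # docling_url is conditionally required: only when engine is "docling".
    if engine == "docling" and not docling_url:
        raise ValueError("docling_url is required when engine is 'docling'")

    parsing_config = {"engine": engine}
    if docling_url:
        parsing_config["docling_url"] = docling_url

    payload = {
        "name": name,
        "llm_model_id": llm_model_id,
        "embedding_model_id": embedding_model_id,
        "parsing_config": parsing_config,
    }
    if description is not None:
        payload["description"] = description
    # storage_config is optional; the table says the server defaults to local.
    if storage_config is not None:
        if storage_config.get("type") not in ALLOWED_STORAGE:
            raise ValueError("storage_config.type must be local, minio, or s3")
        payload["storage_config"] = storage_config
    return payload
```

Calling `build_create_payload("Product docs", "model_001", "model_002", "markitdown")` yields a body without `storage_config`, leaving the storage default to the server.
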
**Request examples**

Local storage:
```json
{
  "name": "Product Documentation Knowledge Base",
  "description": "Stores product manuals and documentation",
  "llm_model_id": "model_001",
  "embedding_model_id": "model_002",
  "parsing_config": {
    "engine": "markitdown",
    "enable_pdf": true,
    "pandoc": true
  },
  "storage_config": {
    "type": "local"
  }
}
```

Using Docling + MinIO:
```json
{
  "name": "Product Documentation Knowledge Base",
  "description": "Stores product manuals and documentation",
  "llm_model_id": "model_001",
  "embedding_model_id": "model_002",
  "parsing_config": {
    "engine": "docling",
    "docling_url": "http://localhost:8501",
    "enable_pdf": true,
    "pandoc": true
  },
  "storage_config": {
    "type": "minio",
    "endpoint": "localhost:9000",
    "access_key_id": "minioadmin",
    "secret_access_key": "minioadmin",
    "bucket": "x-agents"
  }
}
```

**Success response**

```json
{
  "success": true,
  "id": "kb_abc123",
  "message": "Knowledge base created successfully"
}
```

**Error response**

```json
{
  "success": false,
  "message": "LLM model not found"
}
```

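Both response shapes above carry a `success` flag, so a client can funnel every response through one check. The sketch below uses only the Python standard library. The names `ApiError`, `unwrap`, and `create_knowledge_base` are hypothetical, and because the document does not say which HTTP status code accompanies an error, the sketch assumes the JSON body may arrive with either a 2xx or a non-2xx status.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8082"  # base URL from the Basic Information table


class ApiError(Exception):
    """Raised when a response carries success=false (hypothetical name)."""


def unwrap(body: str) -> dict:
    """Parse a response body and raise ApiError unless success is true."""
    data = json.loads(body)
    if not data.get("success"):
        raise ApiError(data.get("message", "unknown error"))
    return data


def create_knowledge_base(payload: dict) -> str:
    """POST the payload and return the ID of the new knowledge base."""
    req = urllib.request.Request(
        BASE_URL + "/api/knowledge/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            body = resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Assumption: an error status still carries the JSON body shown above.
        body = err.read().decode("utf-8")
    return unwrap(body)["id"]
```
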
---

### 2. List Knowledge Bases

**Request**

```
GET /api/knowledge/list
```

**Response**

```json
{
  "success": true,
  "data": [
    {
      "id": "kb_001",
      "name": "Product Documentation Knowledge Base",
      "description": "Stores product manuals",
      "llm_model_id": "model_001",
      "embedding_model_id": "model_002",
      "status": "active",
      "document_count": 15,
      "chunk_count": 156,
      "created_at": "2024-01-15T10:30:00Z",
      "updated_at": "2024-01-15T10:30:00Z"
    }
  ]
}
```

---

### 3. Get Knowledge Base Details

**Request**

```
GET /api/knowledge/:id
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |

**Response**

```json
{
  "success": true,
  "data": {
    "id": "kb_001",
    "name": "Product Documentation Knowledge Base",
    "description": "Stores product manuals",
    "llm_model_id": "model_001",
    "embedding_model_id": "model_002",
    "parsing_config": {
      "engine": "markitdown",
      "enable_pdf": true,
      "pandoc": true
    },
    "status": "active",
    "document_count": 15,
    "chunk_count": 156,
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:30:00Z"
  }
}
```

---

### 4. Delete a Knowledge Base

**Request**

```
DELETE /api/knowledge/:id
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |

**Response**

```json
{
  "success": true,
  "message": "Knowledge base deleted"
}
```

---

### 5. List Documents in a Knowledge Base

**Request**

```
GET /api/knowledge/:id/documents
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |

**Query parameters**

| Parameter | Type | Required | Description |
|------|------|------|------|
| status | String | No | Status filter: all / parsed / parsing / failed |

**Response**

```json
{
  "success": true,
  "data": [
    {
      "id": "doc_001",
      "knowledge_base_id": "kb_001",
      "name": "product_manual_v2.0.pdf",
      "file_key": "abc123.pdf",
      "file_url": "http://localhost:8082/files/abc123.pdf",
      "file_size": 2516582,
      "status": "parsed",
      "chunk_count": 156,
      "uploaded_at": "2024-01-15T10:30:00Z"
    }
  ]
}
```

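The path parameter and the optional `status` filter above combine into a single URL. As a rough sketch, the helper below builds that URL with the standard library and rejects filter values that the table does not list. The names `documents_url` and `total_chunks` are hypothetical, and client-side validation of `status` is an assumption about what is convenient, not something the API requires.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8082"  # base URL from the Basic Information table
VALID_STATUSES = {"all", "parsed", "parsing", "failed"}  # values from the table


def documents_url(kb_id: str, status: Optional[str] = None) -> str:
    """Build the URL for GET /api/knowledge/:id/documents."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status filter: {status}")
    # Percent-encode the ID so unusual characters cannot alter the path.
    url = f"{BASE_URL}/api/knowledge/{quote(kb_id, safe='')}/documents"
    if status is not None:
        url += "?" + urlencode({"status": status})
    return url


def total_chunks(response: dict) -> int:
    """Sum chunk_count over every document in a list response."""
    return sum(doc["chunk_count"] for doc in response["data"])
```
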
---

### 6. Upload a Document to a Knowledge Base

**Request**

```
POST /api/knowledge/:id/documents
Content-Type: multipart/form-data
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |
| file | File | Yes | File to upload |

**Response**

```json
{
  "success": true,
  "data": {
    "id": "doc_001",
    "name": "product_manual_v2.0.pdf",
    "status": "parsing"
  }
}
```

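Since the upload endpoint expects `multipart/form-data`, the request body has to be framed with a boundary rather than sent as JSON. The sketch below encodes a single file part by hand using only the standard library. The form field name `file` comes from the parameter table; the helper names `encode_multipart` and `upload_request` are hypothetical, and the sketch assumes file names contain no double quotes or line breaks.

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:8082"  # base URL from the Basic Information table


def encode_multipart(field_name: str, filename: str, content: bytes,
                     content_type: str = "application/octet-stream"):
    """Return (body, Content-Type header value) for a one-file form upload."""
    boundary = uuid.uuid4().hex  # random marker that separates the parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload_request(kb_id: str, filename: str, content: bytes) -> urllib.request.Request:
    """Prepare (but do not send) POST /api/knowledge/:id/documents."""
    body, header = encode_multipart("file", filename, content)
    return urllib.request.Request(
        f"{BASE_URL}/api/knowledge/{kb_id}/documents",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; the response is expected to report `"status": "parsing"` as in the example above.
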
---

### 7. Delete a Knowledge Base Document

**Request**

```
DELETE /api/knowledge/:id/documents/:doc_id
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |
| doc_id | String | Yes | Document ID |

**Response**

```json
{
  "success": true,
  "message": "Document deleted"
}
```

---

### 8. Reparse a Document

**Request**

```
POST /api/knowledge/:id/documents/:doc_id/reparse
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |
| doc_id | String | Yes | Document ID |

**Response**

```json
{
  "success": true,
  "message": "Document reparse started"
}
```

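The message "Document reparse started" suggests that parsing continues after the response returns, so a client probably has to poll for the outcome. The sketch below is one possible approach: it repeatedly reads the document list and stops once the document leaves the `parsing` state. The function name `wait_until_parsed`, the polling interval, and the attempt limit are all assumptions; `fetch_documents` is a caller-supplied function that returns the `data` array of the document list endpoint.

```python
import time
from typing import Callable, List


def wait_until_parsed(fetch_documents: Callable[[], List[dict]], doc_id: str,
                      interval: float = 2.0, max_attempts: int = 30) -> str:
    """Poll until the document is no longer 'parsing'; return its final status."""
    for _ in range(max_attempts):
        for doc in fetch_documents():
            # 'parsed' and 'failed' are the two terminal states in this document.
            if doc["id"] == doc_id and doc["status"] != "parsing":
                return doc["status"]
        time.sleep(interval)
    raise TimeoutError(f"document {doc_id} still parsing after {max_attempts} polls")
```

Injecting `fetch_documents` keeps the polling logic independent of any HTTP library and makes it straightforward to exercise with canned data.
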
---

### 9. Get Document Preview Content

**Request**

```
GET /api/knowledge/:id/documents/:doc_id/preview
```

| Parameter | Type | Required | Description |
|------|------|------|------|
| id | String | Yes | Knowledge base ID |
| doc_id | String | Yes | Document ID |
| page | Number | No | Page number (default 1) |

**Response**

```json
{
  "success": true,
  "data": {
    "total_pages": 3,
    "current_page": 1,
    "content": "Chapter 1: Product Introduction\n\nWelcome to our product manual..."
  }
}
```

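Because the preview response reports `total_pages`, a client can read a whole document by requesting page 1 and then walking the remaining pages. The following sketch assumes pages are numbered from 1 (consistent with the default above) and that `total_pages` does not change between requests. `iter_preview_pages` is a hypothetical name, and `fetch_page` is a caller-supplied function that returns the parsed JSON response for a given page number.

```python
from typing import Callable, Iterator


def iter_preview_pages(fetch_page: Callable[[int], dict]) -> Iterator[str]:
    """Yield the content of every preview page, in order."""
    first = fetch_page(1)["data"]
    yield first["content"]
    # total_pages from the first response bounds the remaining requests.
    for page in range(2, first["total_pages"] + 1):
        yield fetch_page(page)["data"]["content"]
```
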
---

## Database Table Design (Reference)

### knowledge_base table

| Field | Type | Description |
|------|------|------|
| id | String | Primary key |
| name | String | Knowledge base name |
| description | Text | Description |
| llm_model_id | String | LLM model ID |
| embedding_model_id | String | Embedding model ID |
| parsing_config | JSON | Parsing configuration |
| storage_config | JSON | Storage configuration (contains type, endpoint, access_key_id, secret_access_key, bucket) |
| status | String | Status: active / inactive |
| document_count | Integer | Number of documents |
| chunk_count | Integer | Number of chunks |
| created_at | Timestamp | Creation time |
| updated_at | Timestamp | Last update time |

### knowledge_document table

| Field | Type | Description |
|------|------|------|
| id | String | Primary key |
| knowledge_base_id | String | Knowledge base ID |
| name | String | Document name |
| file_key | String | File storage key |
| file_url | String | File access URL (local path or MinIO presigned URL) |
| file_size | BigInteger | File size |
| status | String | Status: parsing / parsed / failed |
| chunk_count | Integer | Number of chunks |
| uploaded_at | Timestamp | Upload time |

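To make the two reference tables concrete, here is a minimal sketch that creates them in SQLite through Python's built-in `sqlite3` module. The choice of SQLite, the `NOT NULL` constraints, the default values, the foreign key with `ON DELETE CASCADE`, and storing the JSON columns as text are all assumptions made for illustration; the document does not specify a database engine or any constraints.

```python
import sqlite3

# Column names follow the tables above; constraints and defaults are assumed.
SCHEMA = """
CREATE TABLE knowledge_base (
    id                 TEXT PRIMARY KEY,
    name               TEXT NOT NULL,
    description        TEXT,
    llm_model_id       TEXT NOT NULL,
    embedding_model_id TEXT NOT NULL,
    parsing_config     TEXT NOT NULL,          -- JSON stored as text
    storage_config     TEXT,                   -- JSON stored as text
    status             TEXT NOT NULL DEFAULT 'active',
    document_count     INTEGER NOT NULL DEFAULT 0,
    chunk_count        INTEGER NOT NULL DEFAULT 0,
    created_at         TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at         TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE knowledge_document (
    id                TEXT PRIMARY KEY,
    knowledge_base_id TEXT NOT NULL
        REFERENCES knowledge_base(id) ON DELETE CASCADE,
    name              TEXT NOT NULL,
    file_key          TEXT NOT NULL,
    file_url          TEXT,
    file_size         INTEGER,
    status            TEXT NOT NULL DEFAULT 'parsing',
    chunk_count       INTEGER NOT NULL DEFAULT 0,
    uploaded_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create both tables."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With the cascade in place, deleting a row from `knowledge_base` also removes its documents, which is one plausible way to back the delete endpoint.
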