feat: add account and plan directories
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,6 @@
{
  "zh": {
    "name": "B站/YouTube 字幕提取",
    "description": "从 Bilibili(B站)和 YouTube 视频中提取字幕和转录文本,支持大会员内容、多语言字幕、内容摘要和视频问答。"
  }
}
484
account/admin/skills/bilibili-watcher/SKILL.md
Normal file
@@ -0,0 +1,484 @@
---
name: openakita/skills@bilibili-watcher
description: Extract subtitles and transcripts from Bilibili and YouTube videos. Use when the user wants to get subtitles from B站 (Bilibili) or YouTube, extract Chinese/Japanese video transcripts, watch member-only Bilibili content, or perform Q&A on video content. Supports dual-platform subtitle extraction with yt-dlp.
license: MIT
metadata:
  author: openakita
  version: "1.0.0"
  based_on: openclaw/skills/bilibili-youtube-watcher
---

# Bilibili & YouTube Watcher: Dual-Platform Subtitle Extraction

Extract subtitles and transcripts from Bilibili (B站) and YouTube videos, with support for multiple languages, member-only videos, and content Q&A.

## When to Use This Skill

- User shares a Bilibili link and wants subtitles or a summary
- User shares a YouTube link and wants transcript extraction via yt-dlp
- User needs subtitles from member-only (大会员) Bilibili videos
- User wants to search or query content within a video's transcript
- User wants to compare subtitles across languages
- User asks about a video with hardcoded (burned-in) subtitles: only soft subtitle tracks can be extracted, because OCR is not included
- User wants batch subtitle extraction from a playlist or series

## Prerequisites

### Install yt-dlp

yt-dlp is a feature-rich command-line audio/video downloader that also extracts subtitles.

**Via pip (recommended):**
```bash
pip install yt-dlp
```

**Via package manager:**
```bash
# macOS
brew install yt-dlp

# Ubuntu/Debian
sudo apt install yt-dlp

# Windows (scoop)
scoop install yt-dlp
```

**Verify installation:**
```bash
yt-dlp --version
```

### Optional: ffmpeg

Some subtitle formats require ffmpeg for conversion:

```bash
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

# Windows (scoop)
scoop install ffmpeg
```

### Cookie Setup for Bilibili Member Videos

Bilibili member-only (大会员) content requires authentication cookies.

**Method 1: Export cookies from browser**

Install a browser extension like "Get cookies.txt LOCALLY" and export cookies for `bilibili.com`:

```bash
# Use the exported cookies file
yt-dlp --cookies cookies.txt "https://www.bilibili.com/video/BV..."
```

**Method 2: Use browser cookies directly**

```bash
# yt-dlp can read cookies from your browser
yt-dlp --cookies-from-browser chrome "https://www.bilibili.com/video/BV..."
yt-dlp --cookies-from-browser firefox "https://www.bilibili.com/video/BV..."
yt-dlp --cookies-from-browser edge "https://www.bilibili.com/video/BV..."
```

> **Security note**: Cookie files contain your login session. Do not share them or commit them to version control.

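
Before passing an exported file to `--cookies`, it can help to confirm that the file parses and actually holds a Bilibili session. The sketch below reads a Netscape-format `cookies.txt` with the standard library's `http.cookiejar.MozillaCookieJar`. The helper name `has_bilibili_session` is hypothetical, and the default cookie name `SESSDATA` is an assumption about what the login cookie is called in your export, so adjust it if needed.

```python
from http.cookiejar import LoadError, MozillaCookieJar


def has_bilibili_session(path: str, cookie_name: str = "SESSDATA") -> bool:
    """Return True if the cookies file holds a bilibili.com cookie named cookie_name.

    The default name "SESSDATA" is an assumption, not something yt-dlp requires.
    """
    jar = MozillaCookieJar()
    try:
        # Keep session and expired cookies so the check reflects the file's contents.
        jar.load(path, ignore_discard=True, ignore_expires=True)
    except (LoadError, OSError):
        # Missing file or not a Netscape-format cookies file.
        return False
    return any(
        c.name == cookie_name and c.domain.endswith("bilibili.com")
        for c in jar
    )
```

A `False` result only says the file is missing, malformed, or lacks that cookie; it does not prove the session is still valid on the server side.
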
---

## Instructions

### Step 1: Identify the Platform and URL Format

#### Bilibili URL Formats

| Format | Example |
|---|---|
| Standard BV | `https://www.bilibili.com/video/BV1xx411c7mD` |
| With page | `https://www.bilibili.com/video/BV1xx411c7mD?p=2` |
| Short link | `https://b23.tv/aBcDeFg` |
| Bangumi | `https://www.bilibili.com/bangumi/play/ep12345` |
| Old AV format | `https://www.bilibili.com/video/av12345` |
| Mobile | `https://m.bilibili.com/video/BV1xx411c7mD` |

#### YouTube URL Formats

| Format | Example |
|---|---|
| Standard | `https://www.youtube.com/watch?v=VIDEO_ID` |
| Short | `https://youtu.be/VIDEO_ID` |
| Embed | `https://www.youtube.com/embed/VIDEO_ID` |
| Shorts | `https://www.youtube.com/shorts/VIDEO_ID` |

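
The URL formats listed for both platforms can be told apart by hostname alone. The following is a minimal sketch of that check; `detect_platform` and the two host sets are hypothetical names, and the bare domains (`bilibili.com`, `youtube.com`) are an added assumption beyond the hosts shown in the tables.

```python
from urllib.parse import urlparse

# Hostnames from the URL tables, plus the bare domains as an assumption.
BILIBILI_HOSTS = {"bilibili.com", "www.bilibili.com", "m.bilibili.com", "b23.tv"}
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "youtu.be"}


def detect_platform(url: str) -> str:
    """Classify a video URL as 'bilibili', 'youtube', or 'unknown' by hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host in BILIBILI_HOSTS:
        return "bilibili"
    if host in YOUTUBE_HOSTS:
        return "youtube"
    return "unknown"
```

Matching on the exact hostname rather than a substring avoids misclassifying unrelated domains that merely contain "bilibili" or "youtube" somewhere in the URL.
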
### Step 2: Extract Subtitles

#### Bilibili Subtitle Extraction

**List available subtitles:**
```bash
yt-dlp --list-subs "https://www.bilibili.com/video/BV..."
```

**Download subtitles only (no video):**
```bash
# Download all available subtitles
yt-dlp --write-sub --sub-langs all --skip-download "https://www.bilibili.com/video/BV..."

# Download auto-generated subtitles as well
yt-dlp --write-sub --write-auto-sub --skip-download "https://www.bilibili.com/video/BV..."

# Download specific language (Chinese)
yt-dlp --write-sub --sub-lang zh-CN --skip-download "https://www.bilibili.com/video/BV..."

# Convert to SRT format
yt-dlp --write-sub --sub-lang zh-CN --convert-subs srt --skip-download "https://www.bilibili.com/video/BV..."
```

**Member-only videos (require cookies):**
```bash
yt-dlp --cookies-from-browser chrome --write-sub --skip-download "https://www.bilibili.com/video/BV..."
```

#### YouTube Subtitle Extraction

**List available subtitles:**
```bash
yt-dlp --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"
```

**Download subtitles:**
```bash
# English subtitles
yt-dlp --write-sub --sub-lang en --skip-download "https://www.youtube.com/watch?v=VIDEO_ID"

# Auto-generated subtitles
yt-dlp --write-auto-sub --sub-lang en --skip-download "https://www.youtube.com/watch?v=VIDEO_ID"

# Multiple languages
yt-dlp --write-sub --sub-lang "en,zh-Hans,ja" --skip-download "URL"

# Convert auto-generated subtitles to SRT
yt-dlp --write-auto-sub --sub-lang en --convert-subs srt --skip-download "URL"
```

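
When the commands above are driven from a script, assembling the arguments as a list avoids shell-quoting problems with URLs. This is a minimal sketch, not a definitive wrapper: `build_subtitle_command` and `run_subtitle_command` are hypothetical helper names, and only flags already shown in this step are used.

```python
import subprocess
from typing import Optional, Sequence


def build_subtitle_command(
    url: str,
    langs: Optional[Sequence[str]] = None,
    auto: bool = False,
    browser: Optional[str] = None,
    to_srt: bool = True,
) -> list[str]:
    """Build a yt-dlp argument list that fetches subtitles without the video."""
    cmd = ["yt-dlp", "--skip-download", "--write-sub"]
    if auto:
        cmd.append("--write-auto-sub")
    if langs:
        cmd += ["--sub-lang", ",".join(langs)]
    if to_srt:
        cmd += ["--convert-subs", "srt"]
    if browser:
        # Needed for member-only content; reads cookies from the named browser.
        cmd += ["--cookies-from-browser", browser]
    cmd.append(url)
    return cmd


def run_subtitle_command(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run the command and capture its output; requires yt-dlp on PATH."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

For example, `build_subtitle_command(url, langs=["zh-CN"], browser="chrome")` corresponds to the member-only command with a Chinese track converted to SRT.
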
### Step 3: Parse Subtitle Files

yt-dlp downloads subtitles in various formats. Here's how to parse common ones:

```python
import json
import re


def parse_srt(filepath: str) -> list[dict]:
    """Parse SRT subtitle file into structured segments."""
    # utf-8-sig also strips a leading byte-order mark if one is present
    with open(filepath, 'r', encoding='utf-8-sig') as f:
        content = f.read()

    segments = []
    # Normalize Windows line endings, then split on blank lines
    blocks = re.split(r'\n{2,}', content.replace('\r\n', '\n').strip())

    for block in blocks:
        lines = block.strip().split('\n')
        if len(lines) >= 3:
            time_match = re.match(
                r'(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})',
                lines[1]
            )
            if time_match:
                h, m, s = int(time_match[1]), int(time_match[2]), int(time_match[3])
                ms = int(time_match[4])
                # Keep the millisecond part so close cues stay distinct
                start_sec = h * 3600 + m * 60 + s + ms / 1000
                text = ' '.join(lines[2:]).strip()
                # Remove HTML tags from auto-generated subs
                text = re.sub(r'<[^>]+>', '', text)
                if text:
                    segments.append({
                        'start': start_sec,
                        'text': text
                    })

    return segments


def parse_json3(filepath: str) -> list[dict]:
    """Parse YouTube JSON3 subtitle format."""
    with open(filepath, 'r', encoding='utf-8') as f:
        data = json.load(f)

    segments = []
    for event in data.get('events', []):
        start_ms = event.get('tStartMs', 0)
        segs = event.get('segs', [])
        text = ''.join(s.get('utf8', '') for s in segs).strip()
        # Events that hold only a newline are empty after strip()
        if text:
            segments.append({
                'start': start_ms / 1000,
                'text': text
            })

    return segments


def segments_to_text(segments: list[dict]) -> str:
    """Convert segments to plain text with [MM:SS] timestamps."""
    lines = []
    for seg in segments:
        minutes = int(seg['start'] // 60)
        seconds = int(seg['start'] % 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text']}")
    return '\n'.join(lines)
```

### Step 4: Summarize or Query the Content

Once you have the transcript text, generate summaries or answer questions about the content.

**For summarization**, combine all subtitle text and apply a structured prompt:

```
Based on the following video transcript, provide:

1. **概要** (Executive Summary): 2-3 sentences in the video's language
2. **要点** (Key Points): Bulleted list with timestamps [MM:SS]
3. **详细笔记** (Detailed Notes): Organized by topic sections
4. **问答** (Q&A): Answer any specific questions the user has

Transcript:
{full_transcript_text}
```

**For Q&A**, search the transcript for relevant segments first, then answer based on context.

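
The search step for Q&A can be as simple as a case-insensitive keyword match over the parsed segments. The sketch below assumes segments shaped like those produced in Step 3 (dicts with `start` and `text`); `search_segments` and its `context` parameter are hypothetical names introduced for illustration.

```python
def search_segments(segments: list[dict], keywords: list[str], context: int = 1) -> list[dict]:
    """Return segments whose text contains any keyword, plus neighbours for context.

    Matching is a case-insensitive substring test, so it also works for
    languages without word boundaries such as Chinese or Japanese.
    """
    lowered = [k.lower() for k in keywords if k]
    hits = set()
    for i, seg in enumerate(segments):
        text = seg["text"].lower()
        if any(k in text for k in lowered):
            # Include `context` segments before and after each match.
            first = max(0, i - context)
            last = min(len(segments), i + context + 1)
            hits.update(range(first, last))
    return [segments[i] for i in sorted(hits)]
```

Returning neighbouring segments gives the answer step enough surrounding text to interpret a match, while keeping the result in timestamp order.
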
---

## Workflows

### Workflow 1: Quick Bilibili Subtitle Extraction

User says: "提取这个B站视频的字幕: https://www.bilibili.com/video/BV..." ("Extract the subtitles from this Bilibili video")

1. Run `yt-dlp --list-subs` to check available subtitles
2. Download Chinese subtitles: `yt-dlp --write-sub --sub-lang zh-CN --convert-subs srt --skip-download URL`
3. Parse the SRT file
4. Present the clean transcript to the user

### Workflow 2: Bilibili Member Video

User says: "这是大会员视频,帮我提取字幕" ("This is a member-only video, please extract the subtitles")

1. Inform user that cookies are needed
2. Use `--cookies-from-browser chrome` (or user's preferred browser)
3. Extract subtitles with authentication
4. If cookies fail, guide user to export cookies.txt manually

### Workflow 3: YouTube Multi-Language

User says: "Get both English and Chinese subtitles from this YouTube video"

1. List available subtitle languages
2. Download both `en` and `zh-Hans` subtitles
3. Parse both files
4. Present side-by-side or merged view

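
For the side-by-side view in Workflow 3, the two tracks rarely share identical cue times, so one option is to pair each cue with the nearest cue in the other language. This is a sketch under that assumption; `merge_bilingual` and the `tolerance` threshold of 2 seconds are illustrative choices, and both inputs are assumed to be sorted by `start`.

```python
from bisect import bisect_left


def merge_bilingual(primary: list[dict], secondary: list[dict], tolerance: float = 2.0) -> list[dict]:
    """Pair each primary segment with the secondary segment closest in start time.

    A secondary segment is used only if it starts within `tolerance` seconds;
    otherwise the pair carries None for the secondary text.
    """
    starts = [seg["start"] for seg in secondary]
    merged = []
    for seg in primary:
        i = bisect_left(starts, seg["start"])
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(secondary)]
        best = min(candidates, key=lambda j: abs(starts[j] - seg["start"]), default=None)
        match = None
        if best is not None and abs(starts[best] - seg["start"]) <= tolerance:
            match = secondary[best]["text"]
        merged.append({"start": seg["start"], "primary": seg["text"], "secondary": match})
    return merged
```

Nearest-start matching is crude: translated tracks may split sentences differently, so some rows will come back with `None` and are better shown as unmatched than forced into a pair.
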
### Workflow 4: Video Content Q&A

User says: "视频里有没有提到关于 X 的内容?" ("Does the video mention anything about X?")

1. Extract full transcript
2. Search for keywords related to X
3. Return matching segments with timestamps
4. Provide a concise answer based on the matching content

### Workflow 5: Batch Playlist Extraction

User provides a playlist or series URL:

1. Use `yt-dlp --flat-playlist` to list all videos
2. Extract subtitles from each video sequentially
3. Save each transcript as a separate file
4. Generate a combined index with video titles and file paths

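
The index in step 4 can be built from the JSON that `yt-dlp --flat-playlist -J URL` prints (`-J` dumps a single JSON document for the whole playlist). The sketch below is hedged: `build_playlist_index` and the `transcripts` output directory are hypothetical, and it assumes each entry carries `id`, `title`, and `url` keys, which may vary by extractor.

```python
import json
import re


def build_playlist_index(playlist_json: str, out_dir: str = "transcripts") -> list[dict]:
    """Turn `yt-dlp --flat-playlist -J` output into an index of titles, URLs and paths.

    Assumes entries expose 'id', 'title' and 'url'; entries without a URL are skipped.
    """
    data = json.loads(playlist_json)
    index = []
    for n, entry in enumerate(data.get("entries") or [], start=1):
        url = entry.get("url")
        if not url:
            continue
        title = entry.get("title") or entry.get("id") or f"video-{n}"
        # Replace whitespace and characters that are unsafe in file names.
        safe = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
        index.append({
            "title": title,
            "url": url,
            "path": f"{out_dir}/{n:03d}_{safe}.txt",
        })
    return index
```

Numbering the file names by playlist position keeps transcripts in viewing order even when titles collide or sort unpredictably.
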
### Workflow 6: Bilibili Bangumi (番剧) Subtitles

User shares a bangumi URL:

1. Bangumi often has official multi-language subtitles
2. Use `--list-subs` to show all available languages
3. Download preferred language(s)
4. Note: Some bangumi require 大会员 cookies

---

## Output Format

### Transcript Output

```markdown
# 📝 Subtitles: [Video Title]

**Platform**: Bilibili / YouTube
**Language**: 中文 (zh-CN)
**Duration**: ~XX minutes
**Subtitle Type**: Manual / Auto-generated

---

[00:00] 大家好,欢迎来到今天的视频
[00:05] 今天我们要讨论的话题是...
[00:12] 首先我们来看一下背景
...
```

### Summary Output

```markdown
# 📋 Video Summary: [Title]

## 概要
[2-3 sentence summary in the video's language]

## 要点
- **[00:00]** 开场介绍和主题说明
- **[02:15]** 第一个核心观点
- **[08:30]** 关键论据和数据
- **[15:00]** 实际演示
- **[22:45]** 总结与下一步

## 详细笔记

### 第一部分: [主题] (00:00 - 05:30)
[详细内容笔记]

### 第二部分: [主题] (05:30 - 12:00)
[详细内容笔记]
```

---

## Common Pitfalls

### 1. Bilibili Geo-Restrictions

**Problem**: Some Bilibili content is restricted to mainland China.

**Solutions**:
- Use a proxy or VPN with a Chinese IP: `yt-dlp --proxy socks5://127.0.0.1:1080 URL`
- Set the `--geo-bypass` flag: `yt-dlp --geo-bypass URL`
- For persistent issues, use `--geo-bypass-country CN`

```bash
yt-dlp --geo-bypass-country CN --write-sub --skip-download "URL"
```

### 2. Member-Only Content Without Cookies

**Problem**: `yt-dlp` returns an error or empty subtitles for 大会员 videos.

**Solution**: Always check if the video requires 大会员 access. If so, cookies are mandatory:

```bash
# If this fails:
yt-dlp --list-subs "URL"
# ERROR: This video requires premium membership

# Try with cookies:
yt-dlp --cookies-from-browser chrome --list-subs "URL"
```

If browser cookie extraction fails (common on Linux), export cookies manually to a `cookies.txt` file.

### 3. No Subtitles Available

**Problem**: Many Bilibili videos, especially older ones or user-generated content, have no subtitles at all.

**Solution**: Inform the user clearly. Unlike YouTube, Bilibili does not always generate auto-subtitles. The video may only have hardcoded (burned-in) subtitles, which require OCR and are beyond the scope of this skill.

### 4. yt-dlp Version Issues

**Problem**: Bilibili frequently changes its API, causing older yt-dlp versions to fail.

**Solution**: Always ensure yt-dlp is up to date:

```bash
pip install -U yt-dlp
# or
yt-dlp -U
```

### 5. Subtitle Format Inconsistencies

**Problem**: Different videos return subtitles in different formats (SRT, VTT, JSON3, ASS).

**Solution**: Use `--convert-subs srt` to normalize all subtitles to SRT format (the conversion relies on ffmpeg):

```bash
yt-dlp --write-sub --convert-subs srt --skip-download "URL"
```

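
If ffmpeg is not installed and the conversion cannot run, a downloaded `.vtt` file can still be parsed directly. The following is a minimal sketch of a WebVTT reader that yields the same segment shape as the SRT parser in Step 3; `parse_vtt` and `VTT_CUE` are hypothetical names, and the parser deliberately ignores cue settings, `NOTE` blocks, and styling.

```python
import re

# Start time with optional hours, then the arrow and an end time we do not capture.
VTT_CUE = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)


def parse_vtt(content: str) -> list[dict]:
    """Parse WebVTT text into segments with a start time in seconds and text."""
    segments = []
    blocks = re.split(r"\n{2,}", content.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        # Find the timing line; an optional cue identifier may precede it.
        for i, line in enumerate(lines):
            m = VTT_CUE.match(line)
            if m:
                break
        else:
            continue  # header, NOTE or STYLE block without a timing line
        h, minutes, s, ms = (int(g) if g else 0 for g in m.groups())
        text = " ".join(lines[i + 1:])
        # Drop inline tags such as <c> or per-word timestamps.
        text = re.sub(r"<[^>]+>", "", text).strip()
        if text:
            segments.append({
                "start": h * 3600 + minutes * 60 + s + ms / 1000,
                "text": text,
            })
    return segments
```

Because the function takes the file contents as a string, it can be fed from `open(path, encoding="utf-8").read()` or from text already held in memory.
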
### 6. Rate Limiting on Bilibili

**Problem**: Rapid successive requests may get temporarily blocked.

**Solutions**:
- Add a delay between requests during extraction: `--sleep-requests 2`
- Wait before each download in a playlist: `--sleep-interval 5`
- Cap the total number of videos processed in one run: `--max-downloads 10` (this limits the count, not concurrency)

### 7. Short Link Resolution

**Problem**: Bilibili short links (`b23.tv`) may not resolve properly.

**Solution**: yt-dlp handles most redirects automatically. If it fails, resolve the short link manually and pass the resulting URL to yt-dlp:

```bash
# Manually resolve the short link first
curl -sI "https://b23.tv/aBcDeFg" | grep -i location
```

---

## Multi-Language Subtitle Support

### Common Language Codes

| Platform | Language | Code |
|---|---|---|
| Bilibili | 中文 | `zh-CN`, `zh` |
| Bilibili | English | `en` |
| Bilibili | 日本語 | `ja` |
| YouTube | English | `en` |
| YouTube | 中文(简体) | `zh-Hans` |
| YouTube | 中文(繁体) | `zh-Hant` |
| YouTube | 日本語 | `ja` |
| YouTube | 한국어 | `ko` |

### Checking Available Languages

```bash
# Bilibili
yt-dlp --list-subs "https://www.bilibili.com/video/BV..."

# YouTube
yt-dlp --list-subs "https://www.youtube.com/watch?v=..."
```

The output shows both manual and auto-generated subtitle tracks with their language codes.

---

## Platform Comparison

| Feature | Bilibili | YouTube |
|---|---|---|
| Auto-generated subtitles | Rare | Common |
| Manual subtitles | Common (CC) | Common |
| Multi-language subs | Some (bangumi) | Many |
| Cookie auth needed | 大会员 content | Age-restricted |
| Geo-restrictions | Some content CN-only | Varies |
| Subtitle formats | SRT, JSON | SRT, VTT, JSON3 |
| Playlist support | Yes (multi-page) | Yes |
| Rate limiting | Moderate | Moderate |