
web-scraper

Name: web-scraper
Author: guia-matthieu

Extract structured data from websites. Use when: collecting competitor pricing; scraping product listings; extracting contact information; gathering research data; monitoring website changes.

SKILL.md body

Web Scraper

Extract structured data from websites using BeautifulSoup and requests, turning any web page into usable data.

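When to use this skill

Competitive research - scrape pricing, features, and positioning
Lead generation - extract contact information from directories
Content audit - capture headings, links, and metadata
Price monitoring - track competitors' price changes
Data collection - gather research data from multiple sources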

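What Claude does vs. what you decide

What Claude does	What you decide
Builds the analysis framework	Strategic priorities
Synthesizes market data	Competitive positioning
Identifies opportunities	Resource allocation
Drafts strategic options	Final strategy selection
Suggests implementation approaches	Execution decisions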

Dependencies

pip install beautifulsoup4 requests pandas click lxml

Commands

Scrape elements

python scripts/main.py scrape https://example.com --selector "h1,h2,p"
python scripts/main.py scrape https://example.com --selector ".product-price"
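The source does not show what `scripts/main.py` does internally. As a rough sketch of the underlying technique, assuming the command boils down to a BeautifulSoup `select()` call, the following extracts the text of every element that matches a CSS selector. The function name `scrape_elements` and the sample HTML are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup


def scrape_elements(html: str, selector: str) -> list[str]:
    """Return the stripped text of every element matching a CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    # select() accepts comma-separated selector groups such as "h1,p"
    return [el.get_text(strip=True) for el in soup.select(selector)]


# Hypothetical page content, used instead of a live request
SAMPLE_HTML = """
<html><body>
  <h1>Pricing</h1>
  <p>Choose a plan.</p>
  <span class="product-price">$29</span>
  <span class="product-price">$99</span>
</body></html>
"""

headings_and_text = scrape_elements(SAMPLE_HTML, "h1,p")
prices = scrape_elements(SAMPLE_HTML, ".product-price")
print(headings_and_text)  # ['Pricing', 'Choose a plan.']
print(prices)             # ['$29', '$99']
```

For a live page, the HTML string could come from `requests.get(url, timeout=10).text` instead of the inline sample.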

Extract links

python scripts/main.py links https://example.com
python scripts/main.py links https://example.com --internal-only
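One way link extraction with an internal-only filter could work is sketched below. It is an assumption about the approach, not the script's actual code: relative links are resolved against the page URL, and "internal" is taken to mean "same host as the page". The helper `extract_links` and the sample HTML are hypothetical.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def extract_links(html: str, base_url: str, internal_only: bool = False) -> list[str]:
    """Collect absolute http(s) links, optionally keeping only same-host ones."""
    soup = BeautifulSoup(html, "html.parser")
    base_host = urlparse(base_url).netloc
    links: list[str] = []
    for anchor in soup.select("a[href]"):
        url = urljoin(base_url, anchor["href"])  # resolve relative hrefs
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if internal_only and parsed.netloc != base_host:
            continue
        if url not in links:  # de-duplicate while preserving order
            links.append(url)
    return links


# Hypothetical page content
SAMPLE_HTML = """
<a href="/about">About</a>
<a href="https://example.com/pricing">Pricing</a>
<a href="https://other.example.org/">Partner</a>
<a href="mailto:hi@example.com">Mail</a>
"""

all_links = extract_links(SAMPLE_HTML, "https://example.com")
internal_links = extract_links(SAMPLE_HTML, "https://example.com", internal_only=True)
print(all_links)
print(internal_links)
```

Comparing hosts exactly treats `www.example.com` and `example.com` as different sites; whether that is desirable depends on the target.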

Extract emails

python scripts/main.py emails https://example.com
python scripts/main.py emails https://example.com --depth 2
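Email extraction is commonly done with a regular expression over the page source. The sketch below assumes that approach; the pattern is a deliberately simple approximation rather than a full address validator, and the crawling implied by `--depth` is not shown. The name `extract_emails` and the sample text are hypothetical.

```python
import re

# Simplified pattern: local part, "@", domain, and a TLD of 2+ letters
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list[str]:
    """Return unique, lowercased email addresses in order of first appearance."""
    seen: set[str] = set()
    result: list[str] = []
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result


# Hypothetical page fragment
sample = (
    'Contact <a href="mailto:Sales@example.com">sales@example.com</a> '
    "or support@example.org."
)
emails = extract_emails(sample)
print(emails)  # ['sales@example.com', 'support@example.org']
```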

Extract structured data

python scripts/main.py structured https://example.com/article --schema article
python scripts/main.py structured https://example.com/product --schema product

Examples

Example 1: Scrape competitor pricing

python scripts/main.py scrape https://competitor.com/pricing --selector ".price,.plan-name"

# Output:
# Extracted 6 elements
# 1. Starter - $29/mo
# 2. Pro - $99/mo
# 3. Enterprise - Contact us
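The selector `.price,.plan-name` matches plan names and prices as separate elements, so turning them into "name - price" rows needs a pairing step. The sketch below shows one way to do that, assuming each plan sits in its own container. The `.plan` wrapper class and the sample markup are hypothetical; a real pricing page will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical pricing markup: one ".plan" card per plan
SAMPLE_HTML = """
<div class="plan"><h3 class="plan-name">Starter</h3><span class="price">$29/mo</span></div>
<div class="plan"><h3 class="plan-name">Pro</h3><span class="price">$99/mo</span></div>
<div class="plan"><h3 class="plan-name">Enterprise</h3><span class="price">Contact us</span></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The flat selector matches names and prices individually
matched = soup.select(".price,.plan-name")
print(f"Extracted {len(matched)} elements")

# Pair each name with the price found in the same card
rows = []
for card in soup.select(".plan"):
    name = card.select_one(".plan-name")
    price = card.select_one(".price")
    if name and price:
        rows.append(f"{name.get_text(strip=True)} - {price.get_text(strip=True)}")

for number, row in enumerate(rows, start=1):
    print(f"{number}. {row}")
```

Pairing inside each card is more robust than zipping two flat lists, because a plan with a missing price cannot shift the remaining rows.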

Example 2: Extract article content

python scripts/main.py structured https://blog.example.com/post --schema article

# Output: article_data.json
# {
#   "title": "How to Scale Your Startup",
#   "author": "Jane Doe",
#   "date": "2024-01-15",
#   "content": "...",
#   "word_count": 1523
# }
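The fields in this output could be filled from common HTML conventions. The following is a minimal sketch under assumptions the source does not confirm: the title comes from the first `h1`, the author from a `meta name="author"` tag, the date from a `time` element's `datetime` attribute, and the content from paragraphs inside `article`. The function `extract_article` and the sample page are hypothetical.

```python
import json

from bs4 import BeautifulSoup


def extract_article(html: str) -> dict:
    """Build an article record from conventional HTML locations (assumed)."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.select_one("h1")
    author = soup.select_one('meta[name="author"]')
    time_tag = soup.select_one("time[datetime]")
    paragraphs = [p.get_text(" ", strip=True) for p in soup.select("article p")]
    content = "\n\n".join(paragraphs)
    return {
        "title": heading.get_text(strip=True) if heading else None,
        "author": author.get("content") if author else None,
        "date": time_tag["datetime"] if time_tag else None,
        "content": content,
        "word_count": len(content.split()),  # whitespace-delimited words
    }


# Hypothetical article page
SAMPLE_HTML = """
<html><head><meta name="author" content="Jane Doe"></head>
<body><article>
  <h1>How to Scale Your Startup</h1>
  <time datetime="2024-01-15">January 15, 2024</time>
  <p>Start small.</p>
  <p>Then grow carefully.</p>
</article></body></html>
"""

article = extract_article(SAMPLE_HTML)
print(json.dumps(article, indent=2))
```

Each lookup falls back to `None` when the element is absent, so pages that do not follow these conventions yield partial records instead of raising.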

CSS selector reference

Selector	Description	Example
`tag`	Element type	`h1`, `p`, `div`
`.class`	Class name	`.price`, `.title`
`#id`	Element ID	`#main-content`
`tag.class`	Tag with a class	`div.product`
`tag[attr]`	Tag that has the attribute	`a[href]`
`parent > child`	Direct child element	`ul > li`
`tag1, tag2`	Multiple selectors	`h1, h2, h3`
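To make the table concrete, the sketch below runs each selector form through BeautifulSoup's `select()` against a small hypothetical document and counts the matches. The markup is invented so that each selector behaves differently from its neighbors.

```python
from bs4 import BeautifulSoup

# Hypothetical document designed to exercise each selector form
HTML = """
<div id="main-content">
  <h1 class="title">Catalog</h1>
  <div class="product"><a href="/item-a">Item A</a> <span class="price">$5</span></div>
  <p class="product">Not a div</p>
  <ul><li>one</li><li>two</li></ul>
  <ol><li>three</li></ol>
  <a>No href</a>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

selectors = [
    "h1",             # element type
    ".product",       # class name, any tag
    "#main-content",  # element ID
    "div.product",    # class restricted to div elements
    "a[href]",        # anchors that carry an href attribute
    "ul > li",        # li elements directly under a ul
    "h1, a",          # union of two selectors
]

counts = {selector: len(soup.select(selector)) for selector in selectors}
for selector, count in counts.items():
    print(f"{selector!r}: {count} match(es)")
```

Note how `.product` matches both the `div` and the `p`, while `div.product` matches only the `div`, and `a[href]` skips the anchor without an `href`.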

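Ethical scraping guidelines

Check robots.txt - respect the site's scraping policy
Rate limit - don't overload the server (1-2 requests/second)
Identify yourself - use a descriptive User-Agent
Cache requests - don't re-scrape pages that haven't changed
Review the terms of service - confirm that scraping is permitted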
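Three of the guidelines above (robots.txt, rate limiting, self-identification) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the skill's own code: the `USER_AGENT` string, the `RateLimiter` class, and the inline robots.txt content are all hypothetical, and the actual HTTP request is left as a comment so the example runs offline.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical identifying User-Agent with a contact URL
USER_AGENT = "example-scraper/0.1 (+https://example.com/contact)"

# Hypothetical robots.txt content; a live run would download it from the site
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


class RateLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last: float | None = None

    def wait(self) -> None:
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())
limiter = RateLimiter(min_interval=0.5)  # roughly 2 requests per second at most

allowed = []
for url in ["https://example.com/pricing", "https://example.com/private/report"]:
    if not robots.can_fetch(USER_AGENT, url):
        continue  # robots.txt disallows this path
    limiter.wait()
    # A real fetch would go here, for example:
    # requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=10)
    allowed.append(url)

print(allowed)  # ['https://example.com/pricing']
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` would load the real robots.txt in place of the inline string.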

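Skill scope

What this skill does well

Structuring strategic analysis
Identifying market opportunities
Creating strategic frameworks
Synthesizing competitive data

What this skill can't do

Replace market research
Guarantee strategic success
Know competitors' non-public information
Make business decisions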

Skill metadata

Mode: centaur

category: automation
subcategory: data-extraction
dependencies: [beautifulsoup4, requests, pandas]
difficulty: intermediate
time_saved: 5+ hours/week

License: MIT (quoted in full because the license is permissive)

Details

Author: guia-matthieu
Repository: guia-matthieu/clawfu-skills
License: MIT
Last updated: unknown


Source: https://github.com/guia-matthieu/clawfu-skills / License: MIT
