汎用音声・動画・メディア⭐ リポ 2品質スコア 59/100

markitdown

Name: markitdown
Author: luokai0

ファイルとOfficeドキュメントをMarkdownに変換できます。PDF、DOCX、PPTX、XLSX、画像（OCR対応）、音声（文字起こし対応）、HTML、CSV、JSON、XML、ZIP、YouTubeのURL、EPubなど、多くの形式に対応しています。

description の原文を見る

Convert files and office documents to Markdown. Supports PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs and more.

SKILL.md 本文

MarkItDown - ファイルからMarkdownへの変換

概要

MarkItDownはMicrosoftが開発したPythonツールで、様々なファイル形式をMarkdownに変換します。Markdownはトークン効率が良く、現代の言語モデルに適した形式であるため、ドキュメントをLLMフレンドリーなテキスト形式に変換するのに特に有用です。

主な利点：

ドキュメントを整理されたMarkdownに変換
LLM処理に最適化されたトークン効率的な形式
15以上のファイル形式に対応
AIによる画像説明の拡張オプション
画像とスキャン文書向けOCR
オーディオファイルの音声文字起こし

科学図表によるビジュアル強化

このスキルでドキュメントを作成する場合、常にビジュアルコミュニケーションを強化するための科学図表の追加を検討してください。

ドキュメントに図表やダイアグラムが含まれていない場合：

scientific-schematicsスキルを使用してAI駆動の出版品質ダイアグラムを生成してください
希望するダイアグラムを自然言語で説明するだけです
Nano Banana Proが自動的にスケマティック図を生成、確認、洗練します

**新しいドキュメントの場合：**科学図表はデフォルトで生成され、テキストで説明されている重要な概念、ワークフロー、アーキテクチャ、または関係を視覚的に表現します。

スケマティック図の生成方法：

python scripts/generate_schematic.py "your diagram description" -o figures/output.png

AIは自動的に以下を実行します：

適切なフォーマットで出版品質の画像を作成
複数の繰り返しで確認と洗練
アクセシビリティを確保（色覚異常対応、高コントラスト）
figures/ディレクトリに出力を保存

スケマティック図を追加するタイミング：

ドキュメント変換ワークフローダイアグラム
ファイル形式アーキテクチャ図
OCR処理パイプラインダイアグラム
統合ワークフロー可視化
システムアーキテクチャダイアグラム
データフロー図
ビジュアル化から利益を得られる複雑な概念

スケマティック図の作成に関する詳細なガイダンスについては、scientific-schematicsスキルのドキュメントを参照してください。

対応フォーマット

フォーマット	説明	備考
PDF	ポータブルドキュメント形式	完全テキスト抽出
DOCX	Microsoft Word	テーブル、フォーマット保持
PPTX	PowerPoint	スライドとノート
XLSX	Excelスプレッドシート	テーブルとデータ
Images	JPEG、PNG、GIF、WebP	EXIFメタデータ + OCR
Audio	WAV、MP3	メタデータ + 文字起こし
HTML	Webページ	クリーンな変換
CSV	カンマ区切り値	テーブル形式
JSON	JSONデータ	構造化表現
XML	XMLドキュメント	構造化形式
ZIP	アーカイブファイル	内容を反復処理
EPUB	電子書籍	完全テキスト抽出
YouTube	ビデオURL	文字起こしを取得

クイックスタート

インストール

# 全機能を含めてインストール
pip install 'markitdown[all]'

# またはソースから
git clone https://github.com/microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'

コマンドラインの使用

# 基本的な変換
markitdown document.pdf > output.md

# 出力ファイルを指定
markitdown document.pdf -o output.md

# パイプで内容を渡す
cat document.pdf | markitdown > output.md

# プラグインを有効化
markitdown --list-plugins  # 利用可能なプラグインをリスト表示
markitdown --use-plugins document.pdf -o output.md

Python API

from markitdown import MarkItDown

# 基本的な使用法
md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)

# ストリームから変換
with open("document.pdf", "rb") as f:
    result = md.convert_stream(f, file_extension=".pdf")
    print(result.text_content)

高度な機能

1. AIによる画像説明の拡張

OpenRouterを経由してLLMを使用して詳細な画像説明を生成します（PPTXと画像ファイル用）：

from markitdown import MarkItDown
from openai import OpenAI

# OpenRouterクライアントを初期化（OpenAI互換API）
client = OpenAI(
    api_key="your-openrouter-api-key",
    base_url="https://openrouter.ai/api/v1"
)

md = MarkItDown(
    llm_client=client,
    llm_model="anthropic/claude-opus-4.5",  # 科学ビジョンに推奨
    llm_prompt="Describe this image in detail for scientific documentation"
)

result = md.convert("presentation.pptx")
print(result.text_content)

2. Azure Document Intelligence

Microsoft Document Intelligenceによる高度なPDF変換：

# コマンドライン
markitdown document.pdf -o output.md -d -e "<document_intelligence_endpoint>"

# Python API
from markitdown import MarkItDown

md = MarkItDown(docintel_endpoint="<document_intelligence_endpoint>")
result = md.convert("complex_document.pdf")
print(result.text_content)

3. プラグインシステム

MarkItDownは機能を拡張するためのサードパーティプラグインに対応しています：

# インストール済みプラグインをリスト表示
markitdown --list-plugins

# プラグインを有効化
markitdown --use-plugins file.pdf -o output.md

GitHubでハッシュタグ #markitdown-plugin を使用してプラグインを検索してください。

オプション依存関係

サポートするファイル形式を制御します：

# 特定の形式をインストール
pip install 'markitdown[pdf, docx, pptx]'

# 利用可能なすべてのオプション：
# [all]                  - すべてのオプション依存関係
# [pptx]                 - PowerPointファイル
# [docx]                 - Wordドキュメント
# [xlsx]                 - Excelスプレッドシート
# [xls]                  - 以前のExcelファイル
# [pdf]                  - PDFドキュメント
# [outlook]              - Outlookメッセージ
# [az-doc-intel]         - Azure Document Intelligence
# [audio-transcription]  - WAVおよびMP3の文字起こし
# [youtube-transcription] - YouTubeビデオの文字起こし

一般的なユースケース

1. 科学論文をMarkdownに変換

from markitdown import MarkItDown

md = MarkItDown()

# PDFペーパーを変換
result = md.convert("research_paper.pdf")
with open("paper.md", "w") as f:
    f.write(result.text_content)

2. Excelからデータを抽出して分析

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("data.xlsx")

# 結果はMarkdownテーブル形式になります
print(result.text_content)

3. 複数のドキュメントを処理

from markitdown import MarkItDown
import os
from pathlib import Path

md = MarkItDown()

# ディレクトリ内のすべてのPDFを処理
pdf_dir = Path("papers/")
output_dir = Path("markdown_output/")
output_dir.mkdir(exist_ok=True)

for pdf_file in pdf_dir.glob("*.pdf"):
    result = md.convert(str(pdf_file))
    output_file = output_dir / f"{pdf_file.stem}.md"
    output_file.write_text(result.text_content)
    print(f"Converted: {pdf_file.name}")

4. AI説明付きでPowerPointを変換

from markitdown import MarkItDown
from openai import OpenAI

# OpenRouterを使用して複数のAIモデルにアクセス
client = OpenAI(
    api_key="your-openrouter-api-key",
    base_url="https://openrouter.ai/api/v1"
)

md = MarkItDown(
    llm_client=client,
    llm_model="anthropic/claude-opus-4.5",  # プレゼンテーション向けに推奨
    llm_prompt="Describe this slide image in detail, focusing on key visual elements and data"
)

result = md.convert("presentation.pptx")
with open("presentation.md", "w") as f:
    f.write(result.text_content)

5. 異なる形式でバッチ変換

from markitdown import MarkItDown
from pathlib import Path

md = MarkItDown()

# 変換するファイル
files = [
    "document.pdf",
    "spreadsheet.xlsx",
    "presentation.pptx",
    "notes.docx"
]

for file in files:
    try:
        result = md.convert(file)
        output = Path(file).stem + ".md"
        with open(output, "w") as f:
            f.write(result.text_content)
        print(f"✓ Converted {file}")
    except Exception as e:
        print(f"✗ Error converting {file}: {e}")

6. YouTubeビデオの文字起こしを抽出

from markitdown import MarkItDown

md = MarkItDown()

# YouTubeビデオを文字起こしに変換
result = md.convert("https://www.youtube.com/watch?v=VIDEO_ID")
print(result.text_content)

Dockerの使用

# イメージをビルド
docker build -t markitdown:latest .

# 変換を実行
docker run --rm -i markitdown:latest < ~/document.pdf > output.md

ベストプラクティス

1. 正しい変換方法を選択

シンプルなドキュメント：基本的な MarkItDown() を使用
複雑なPDF：Azure Document Intelligenceを使用
ビジュアルコンテンツ：AI画像説明を有効化
スキャン文書：OCR依存関係がインストールされていることを確認

2. エラーを適切に処理

from markitdown import MarkItDown

md = MarkItDown()

try:
    result = md.convert("document.pdf")
    print(result.text_content)
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Conversion error: {e}")

3. 大きなファイルを効率的に処理

from markitdown import MarkItDown

md = MarkItDown()

# 大きなファイルの場合、ストリーミングを使用
with open("large_file.pdf", "rb") as f:
    result = md.convert_stream(f, file_extension=".pdf")
    
    # チャンク単位で処理または直接保存
    with open("output.md", "w") as out:
        out.write(result.text_content)

4. トークン効率に最適化

Markdownの出力はすでにトークン効率が良いですが、以下の方法で最適化できます：

過度な空白を削除
類似セクションを統合
不要なメタデータを削除

from markitdown import MarkItDown
import re

md = MarkItDown()
result = md.convert("document.pdf")

# 余分な空白をクリーンアップ
clean_text = re.sub(r'\n{3,}', '\n\n', result.text_content)
clean_text = clean_text.strip()

print(clean_text)

科学ワークフローとの統合

レビュー用に文献を変換

from markitdown import MarkItDown
from pathlib import Path

md = MarkItDown()

# 文献フォルダ内のすべての論文を変換
papers_dir = Path("literature/pdfs")
output_dir = Path("literature/markdown")
output_dir.mkdir(exist_ok=True)

for paper in papers_dir.glob("*.pdf"):
    result = md.convert(str(paper))
    
    # メタデータ付きで保存
    output_file = output_dir / f"{paper.stem}.md"
    content = f"# {paper.stem}\n\n"
    content += f"**Source**: {paper.name}\n\n"
    content += "---\n\n"
    content += result.text_content
    
    output_file.write_text(content)

# AI拡張変換と図表用
from openai import OpenAI

client = OpenAI(
    api_key="your-openrouter-api-key",
    base_url="https://openrouter.ai/api/v1"
)

md_ai = MarkItDown(
    llm_client=client,
    llm_model="anthropic/claude-opus-4.5",
    llm_prompt="Describe scientific figures with technical precision"
)

テーブルを抽出して分析

from markitdown import MarkItDown
import re

md = MarkItDown()
result = md.convert("data_tables.xlsx")

# Markdownテーブルは解析または直接使用できます
print(result.text_content)

トラブルシューティング

よくある問題

不足している依存関係：機能固有のパッケージをインストール
```
pip install 'markitdown[pdf]'  # PDF対応用
```

バイナリファイルエラー：ファイルをバイナリモードで開くことを確認

with open("file.pdf", "rb") as f:  # 「rb」に注意
    result = md.convert_stream(f, file_extension=".pdf")

OCRが動作しない：tesseractをインストール

# macOS
brew install tesseract

# Ubuntu
sudo apt-get install tesseract-ocr

パフォーマンスの考慮事項

PDFファイル：大きなPDFは時間がかかる場合があります。サポートされている場合はページ範囲を検討してください
画像OCR：OCR処理はCPU集約的です
オーディオ文字起こし：追加のコンピュートリソースが必要です
AI画像説明：APIコールが必要です（コストが発生する場合があります）

次のステップ

完全なAPI仕様については references/api_reference.md を参照してください
フォーマット固有の詳細については references/file_formats.md を確認してください
オートメーション例については scripts/batch_convert.py を確認してください
AI拡張変換については scripts/convert_with_ai.py を探索してください

リソース

MarkItDown GitHub: https://github.com/microsoft/markitdown
PyPI: https://pypi.org/project/markitdown/
OpenRouter: https://openrouter.ai (AI拡張変換用)
OpenRouterAPIキー: https://openrouter.ai/keys
OpenRouterモデル: https://openrouter.ai/models
MCPサーバー: markitdown-mcp (Claude Desktop統合用)
プラグイン開発: packages/markitdown-sample-plugin を参照

ライセンス: MIT(寛容ライセンスのため全文を引用しています) · 原本リポジトリ

詳細情報

作者: luokai0
リポジトリ: luokai0/ai-agent-skills-by-luo-kai
ライセンス: MIT
最終更新: 2026/5/5

GitHubで原本を見る →フィードバックを送る

Source: https://github.com/luokai0/ai-agent-skills-by-luo-kai / ライセンス: MIT