
python-resilience

Name: python-resilience
Author: wshobson

Python resilience patterns including automatic retries, exponential backoff, timeouts, and fault-tolerant decorators. Use when adding retry logic, implementing timeouts, building fault-tolerant services, or handling transient failures.

SKILL.md contents

Python Resilience Patterns

Build fault-tolerant Python applications that gracefully handle transient failures, network issues, and service outages. Resilience patterns keep a system running even when its dependencies are unreliable.

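When to Use This Skill

Adding retry logic to external service calls
Implementing timeouts for network operations
Building fault-tolerant microservices
Handling rate limiting and backpressure
Creating infrastructure decorators
Designing circuit breakers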

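Core Concepts

1. Transient vs Permanent Failures

Retry transient errors (network timeouts, temporary service issues). Do not retry permanent errors (invalid credentials, bad requests).

2. Exponential Backoff

Increase the wait time between retries so that a recovering service is not overwhelmed.

3. Jitter

Add randomness to the backoff to prevent a "thundering herd" when many clients retry at the same moment.

4. Bounded Retries

Cap both the number of attempts and the total duration to prevent infinite retry loops.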
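
Concepts 2 to 4 can be combined in a few lines of standard-library Python. The sketch below is illustrative only: `backoff_delays` and its parameters are hypothetical names, and the formula (double the base each attempt, add uniform jitter, then cap) is one common choice rather than the only one.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    initial: float = 1.0,
    maximum: float = 10.0,
    jitter: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the wait before each retry: exponential growth, jitter, hard cap."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):          # bounded: a fixed number of retries
        base = initial * (2 ** attempt)      # exponential backoff: 1, 2, 4, 8, ...
        delay = base + rng.uniform(0, jitter)  # jitter spreads clients apart
        delays.append(min(delay, maximum))   # never wait longer than the cap
    return delays
```

Calling `backoff_delays(5)` yields five waits that grow roughly as 1, 2, 4, 8 seconds plus jitter, with the last one held at the 10 second cap.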

Quick Start

import httpx
from tenacity import retry, stop_after_attempt, wait_exponential_jitter

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=1, max=10),
)
def call_external_service(request: dict) -> dict:
    return httpx.post("https://api.example.com", json=request).json()

Basic Patterns

Pattern 1: Basic Retry with Tenacity

Use the tenacity library for production-grade retry logic. For simpler cases, consider built-in retry functionality or a lightweight custom implementation.

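import httpx
from tenacity import (
    retry,
    stop_after_attempt,
    stop_after_delay,
    wait_exponential_jitter,
    retry_if_exception_type,
)

# httpx exceptions do not subclass the built-in ConnectionError/TimeoutError,
# so httpx.TransportError is listed to cover httpx network and timeout failures.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError, OSError, httpx.TransportError)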

@retry(
    retry=retry_if_exception_type(TRANSIENT_ERRORS),
    stop=stop_after_attempt(5) | stop_after_delay(60),
    wait=wait_exponential_jitter(initial=1, max=30),
)
def fetch_data(url: str) -> dict:
    """Fetch data with automatic retry on transient failures."""
    response = httpx.get(url, timeout=30)
    response.raise_for_status()
    return response.json()
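
For the "lightweight custom implementation" mentioned above, a standard-library decorator can be enough. This is a minimal sketch, not a replacement for tenacity: `simple_retry` and its parameters are hypothetical names, and the injectable `sleep` argument is an assumption added so the decorator can be tested without real waiting.

```python
import random
import time
from functools import wraps
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def simple_retry(
    exceptions: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    initial: float = 1.0,
    maximum: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry on the given exceptions with capped exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        def wrapper(*args, **kwargs) -> T:
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    delay = min(initial * 2 ** (attempt - 1), maximum)
                    sleep(delay + random.uniform(0, delay / 2))
            raise AssertionError("unreachable")  # loop always returns or raises
        return wrapper
    return decorator


calls = {"count": 0}


@simple_retry((ConnectionError,), max_attempts=3, sleep=lambda _: None)
def flaky() -> str:
    """Fails twice, then succeeds; stands in for an unreliable dependency."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"
```

Exceptions outside the `exceptions` tuple propagate immediately, which keeps bugs such as `ValueError` from being retried.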

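Pattern 2: Retry Only Appropriate Errors

Allowlist specific transient exceptions. Never retry the following:

ValueError, TypeError - these are bugs, not transient issues
AuthenticationError - invalid credentials will not become valid
HTTP 4xx errors (except 429) - client errors are permanent

from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential_jitter,
)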
import httpx

# Define what's retryable
RETRYABLE_EXCEPTIONS = (
    ConnectionError,
    TimeoutError,
    httpx.ConnectTimeout,
    httpx.ReadTimeout,
)

@retry(
    retry=retry_if_exception_type(RETRYABLE_EXCEPTIONS),
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=1, max=10),
)
def resilient_api_call(endpoint: str) -> dict:
    """Make API call with retry on network issues."""
    return httpx.get(endpoint, timeout=10).json()
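
The "never retry" rules above can also be written as a single predicate. The sketch below uses only the standard library; `AuthenticationError` and `HTTPStatusFailure` are hypothetical stand-ins for whatever a real client library raises, and treating 500, 502, 503, and 504 as retryable is an assumption, since the list above only speaks about 4xx.

```python
class AuthenticationError(Exception):
    """Hypothetical stand-in for an auth failure raised by a client library."""


class HTTPStatusFailure(Exception):
    """Hypothetical error carrying the HTTP status code of a failed call."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


# Assumption: these server-side codes are worth retrying alongside 429.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def is_retryable(exc: BaseException) -> bool:
    """Return True only for failures that a later attempt could plausibly fix."""
    if isinstance(exc, (ValueError, TypeError, AuthenticationError)):
        return False  # bugs and bad credentials stay broken
    if isinstance(exc, HTTPStatusFailure):
        return exc.status_code in RETRYABLE_STATUS  # other 4xx are permanent
    return isinstance(exc, (ConnectionError, TimeoutError))
```

A predicate like this can be passed to tenacity's `retry_if_exception` in place of a fixed exception tuple.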

Pattern 3: HTTP Status Code Retries

Retry specific HTTP status codes that indicate a transient problem.

from tenacity import retry, retry_if_result, stop_after_attempt, wait_exponential_jitter
import httpx

RETRY_STATUS_CODES = {429, 502, 503, 504}

def should_retry_response(response: httpx.Response) -> bool:
    """Check if response indicates a retryable error."""
    return response.status_code in RETRY_STATUS_CODES

@retry(
    retry=retry_if_result(should_retry_response),
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=1, max=10),
)
def http_request(method: str, url: str, **kwargs) -> httpx.Response:
    """Make HTTP request with retry on transient status codes."""
    return httpx.request(method, url, timeout=30, **kwargs)

Pattern 4: Combined Exception and Status Retry

Handle both network exceptions and HTTP status codes.

from tenacity import (
    retry,
    retry_if_exception_type,
    retry_if_result,
    stop_after_attempt,
    wait_exponential_jitter,
    before_sleep_log,
)
import logging
import httpx

logger = logging.getLogger(__name__)

TRANSIENT_EXCEPTIONS = (
    ConnectionError,
    TimeoutError,
    httpx.ConnectError,
    httpx.ReadTimeout,
)
RETRY_STATUS_CODES = {429, 500, 502, 503, 504}

def is_retryable_response(response: httpx.Response) -> bool:
    return response.status_code in RETRY_STATUS_CODES

@retry(
    retry=(
        retry_if_exception_type(TRANSIENT_EXCEPTIONS) |
        retry_if_result(is_retryable_response)
    ),
    stop=stop_after_attempt(5),
    wait=wait_exponential_jitter(initial=1, max=30),
    before_sleep=before_sleep_log(logger, logging.WARNING),
)
def robust_http_call(
    method: str,
    url: str,
    **kwargs,
) -> httpx.Response:
    """HTTP call with comprehensive retry handling."""
    return httpx.request(method, url, timeout=30, **kwargs)

Advanced Patterns

Pattern 5: Logging Retry Attempts

Track retry behavior for debugging and alerting.

from tenacity import retry, stop_after_attempt, wait_exponential
import structlog

logger = structlog.get_logger()

def log_retry_attempt(retry_state):
    """Log detailed retry information."""
    exception = retry_state.outcome.exception()
    logger.warning(
        "Retrying operation",
        attempt=retry_state.attempt_number,
        exception_type=type(exception).__name__,
        exception_message=str(exception),
        next_wait_seconds=retry_state.next_action.sleep if retry_state.next_action else None,
    )

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, max=10),
    before_sleep=log_retry_attempt,
)
def call_with_logging(request: dict) -> dict:
    """External call with retry logging."""
    ...

Pattern 6: Timeout Decorator

Create a reusable timeout decorator for consistent timeout handling.

import asyncio
from functools import wraps
from typing import TypeVar, Callable

import httpx

T = TypeVar("T")

def with_timeout(seconds: float):
    """Decorator to add timeout to async functions."""
    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        async def wrapper(*args, **kwargs) -> T:
            return await asyncio.wait_for(
                func(*args, **kwargs),
                timeout=seconds,
            )
        return wrapper
    return decorator

@with_timeout(30)
async def fetch_with_timeout(url: str) -> dict:
    """Fetch URL with 30 second timeout."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.json()
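
To see the decorator fire without a network, the sketch below repeats it so the block runs on its own and replaces the HTTP call with `asyncio.sleep`. `slow_lookup` and `lookup_or_none` are hypothetical names, and returning `None` on timeout is just one possible fallback.

```python
import asyncio
from functools import wraps
from typing import Any, Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


def with_timeout(seconds: float):
    """Same decorator as above, repeated so this block is self-contained."""
    def decorator(func: Callable[..., Awaitable[T]]) -> Callable[..., Awaitable[T]]:
        @wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> T:
            return await asyncio.wait_for(func(*args, **kwargs), timeout=seconds)
        return wrapper
    return decorator


@with_timeout(0.05)
async def slow_lookup(delay: float) -> str:
    """Stand-in for a network call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return "done"


async def lookup_or_none(delay: float) -> Optional[str]:
    """Turn a timeout into None so the caller can choose a fallback."""
    try:
        return await slow_lookup(delay)
    except asyncio.TimeoutError:
        return None
```

`asyncio.wait_for` cancels the wrapped coroutine when the deadline passes, so the slow call does not keep running in the background.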

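Pattern 7: Cross-Cutting Concerns via Decorators

Stack decorators to keep infrastructure separate from business logic.

from functools import wraps
from typing import TypeVar, Callable

import structlog
from tenacity import retry, stop_after_attempt, wait_exponential_jitter

# with_timeout is the decorator defined in Pattern 6.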

logger = structlog.get_logger()
T = TypeVar("T")

def traced(name: str | None = None):
    """Add tracing to function calls."""
    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        span_name = name or func.__name__

        @wraps(func)
        async def wrapper(*args, **kwargs) -> T:
            logger.info("Operation started", operation=span_name)
            try:
                result = await func(*args, **kwargs)
                logger.info("Operation completed", operation=span_name)
                return result
            except Exception as e:
                logger.error("Operation failed", operation=span_name, error=str(e))
                raise
        return wrapper
    return decorator

# Stack multiple concerns
@traced("fetch_user_data")
@with_timeout(30)
@retry(stop=stop_after_attempt(3), wait=wait_exponential_jitter())
async def fetch_user_data(user_id: str) -> dict:
    """Fetch user with tracing, timeout, and retry."""
    ...

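Pattern 8: Dependency Injection for Testability

Pass infrastructure components through the constructor so they are easy to replace in tests.

import time
from dataclasses import dataclass
from typing import Protocol

# UserRepository, User, and the Fake* classes used below are assumed to be
# defined elsewhere in the application and its test suite.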

class Logger(Protocol):
    def info(self, msg: str, **kwargs) -> None: ...
    def error(self, msg: str, **kwargs) -> None: ...

class MetricsClient(Protocol):
    def increment(self, metric: str, tags: dict | None = None) -> None: ...
    def timing(self, metric: str, value: float) -> None: ...

@dataclass
class UserService:
    """Service with injected infrastructure."""

    repository: UserRepository
    logger: Logger
    metrics: MetricsClient

    async def get_user(self, user_id: str) -> User:
        self.logger.info("Fetching user", user_id=user_id)
        start = time.perf_counter()

        try:
            user = await self.repository.get(user_id)
            self.metrics.increment("user.fetch.success")
            return user
        except Exception as e:
            self.metrics.increment("user.fetch.error")
            self.logger.error("Failed to fetch user", user_id=user_id, error=str(e))
            raise
        finally:
            elapsed = time.perf_counter() - start
            self.metrics.timing("user.fetch.duration", elapsed)

# Easy to test with fakes
service = UserService(
    repository=FakeRepository(),
    logger=FakeLogger(),
    metrics=FakeMetrics(),
)
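
The fakes referenced above are not defined in this document. One possible shape is sketched below, together with a trimmed copy of the service so the block runs on its own. Every class body here is an assumption for illustration: users are plain dicts, and a missing user raises `KeyError`.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeLogger:
    """Records log calls instead of emitting them."""
    records: list = field(default_factory=list)

    def info(self, msg: str, **kwargs: Any) -> None:
        self.records.append(("info", msg, kwargs))

    def error(self, msg: str, **kwargs: Any) -> None:
        self.records.append(("error", msg, kwargs))


@dataclass
class FakeMetrics:
    """Counts increments and stores timings in memory."""
    counters: dict = field(default_factory=dict)
    timings: list = field(default_factory=list)

    def increment(self, metric: str, tags: Any = None) -> None:
        self.counters[metric] = self.counters.get(metric, 0) + 1

    def timing(self, metric: str, value: float) -> None:
        self.timings.append((metric, value))


@dataclass
class FakeRepository:
    """In-memory user store; raises KeyError for unknown ids."""
    users: dict = field(default_factory=dict)

    async def get(self, user_id: str) -> dict:
        if user_id not in self.users:
            raise KeyError(user_id)
        return self.users[user_id]


@dataclass
class UserService:
    """Trimmed copy of the service above, typed against the fakes."""
    repository: FakeRepository
    logger: FakeLogger
    metrics: FakeMetrics

    async def get_user(self, user_id: str) -> dict:
        self.logger.info("Fetching user", user_id=user_id)
        start = time.perf_counter()
        try:
            user = await self.repository.get(user_id)
            self.metrics.increment("user.fetch.success")
            return user
        except Exception as e:
            self.metrics.increment("user.fetch.error")
            self.logger.error("Failed to fetch user", user_id=user_id, error=str(e))
            raise
        finally:
            self.metrics.timing("user.fetch.duration", time.perf_counter() - start)


def run_failure_case() -> FakeMetrics:
    """Exercise the error path and return the metrics the service recorded."""
    metrics = FakeMetrics()
    service = UserService(FakeRepository(), FakeLogger(), metrics)
    try:
        asyncio.run(service.get_user("missing"))
    except KeyError:
        pass  # expected: the fake repository has no such user
    return metrics
```

Because the fakes keep everything in memory, a test can assert on the recorded counters and log records directly, with no patching or network access.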

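Pattern 9: Fail-Safe Defaults

Degrade gracefully when a non-critical operation fails.

from collections.abc import Callable
from functools import wraps
from typing import TypeVar

import structlog

logger = structlog.get_logger()
T = TypeVar("T")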

def fail_safe(default: T, log_failure: bool = True):
    """Return default value on failure instead of raising."""
    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        async def wrapper(*args, **kwargs) -> T:
            try:
                return await func(*args, **kwargs)
            except Exception as e:
                if log_failure:
                    logger.warning(
                        "Operation failed, using default",
                        function=func.__name__,
                        error=str(e),
                    )
                return default
        return wrapper
    return decorator

@fail_safe(default=[])
async def get_recommendations(user_id: str) -> list[str]:
    """Get recommendations, return empty list on failure."""
    ...
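
The list of use cases near the top mentions circuit breakers, which none of the patterns above demonstrate. The following is a minimal, single-threaded sketch under stated assumptions, not a production implementation: `CircuitBreaker`, `CircuitOpenError`, and the injectable `clock` are hypothetical names, and the policy (open after N consecutive failures, allow one trial call after a cool-down) is one simple variant.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Open after consecutive failures; permit a trial call after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[..., T], *args, **kwargs) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            # Cool-down elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail fast with `CircuitOpenError` instead of waiting on a dependency that is known to be down, which pairs naturally with the fail-safe default from Pattern 9. This sketch is not thread-safe; concurrent use would need a lock around the state changes.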

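Best Practices Summary

Retry only transient errors - do not retry bugs or authentication failures
Use exponential backoff - give the service time to recover
Add jitter - prevent a thundering herd caused by synchronized retries
Cap the total duration - stop_after_attempt(5) | stop_after_delay(60)
Log every retry - silent retries hide systemic problems
Use decorators - keep retry logic separate from business logic
Inject dependencies - make infrastructure testable
Set timeouts everywhere - a network call without a timeout can hang indefinitely
Fail gracefully - return cached or default values on non-critical paths
Monitor retry rates - a high retry rate points to an underlying problem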

Source: https://github.com/wshobson/agents / License: MIT