Anthropic Claude音声・動画・メディア⭐ リポ 1品質スコア 58/100

deepgram-core-workflow-b

Name: deepgram-core-workflow-b
Author: Brmbobo

Deepgramを使ったリアルタイムストリーミング文字起こしを実装できます。ライブ文字起こし、音声インターフェース、リアルタイムオーディオ処理アプリケーションの構築時に活用できます。「deepgramストリーミング」「リアルタイム文字起こし」「ライブ文字起こし」「websocket文字起こし」「音声ストリーミング」といったフレーズでトリガーされます。

description の原文を見る

Implement real-time streaming transcription with Deepgram. Use when building live transcription, voice interfaces, or real-time audio processing applications. Trigger with phrases like "deepgram streaming", "real-time transcription", "live transcription", "websocket transcription", "voice streaming".

SKILL.md 本文

Deepgram Core Workflow B: リアルタイムストリーミング

概要

Deepgram の WebSocket API を使用してリアルタイムストリーミング文字起こしを実装し、ライブオーディオ処理を行います。

前提条件

deepgram-install-auth セットアップの完了
WebSocket パターンの理解
オーディオ入力ソース（マイクロフォンまたはストリーム）

手順

ステップ 1: WebSocket 接続の設定

Deepgram でライブ文字起こし接続を初期化します。

ステップ 2: ストリームオプションの設定

暫定結果、エンドポイント、言語オプションを設定します。

ステップ 3: イベントの処理

トランスクリプトイベントと接続ライフサイクルのハンドラを実装します。

ステップ 4: オーディオデータのストリーミング

WebSocket 接続にオーディオチャンクを送信します。

出力

ライブ文字起こし WebSocket クライアント
リアルタイム結果用のイベントハンドラ
オーディオストリーミングパイプライン
グレースフルな接続管理

エラーハンドリング

エラー	原因	解決方法
Connection Closed	ネットワーク中断	自動再接続を実装
Buffer Overflow	過度なオーディオデータ	サンプルレートまたはチャンクサイズを削減
No Transcripts	無音オーディオ	オーディオレベルとフォーマットを確認
High Latency	ネットワーク/処理遅延	暫定結果を使用

例

TypeScript WebSocket クライアント

// services/live-transcription.ts
import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk';

export interface LiveTranscriptionOptions {
  model?: 'nova-2' | 'nova' | 'enhanced' | 'base';
  language?: string;
  punctuate?: boolean;
  interimResults?: boolean;
  endpointing?: number;
  vadEvents?: boolean;
}

export class LiveTranscriptionService {
  private client;
  private connection: any = null;

  constructor(apiKey: string) {
    this.client = createClient(apiKey);
  }

  async start(
    options: LiveTranscriptionOptions = {},
    handlers: {
      onTranscript?: (transcript: string, isFinal: boolean) => void;
      onError?: (error: Error) => void;
      onClose?: () => void;
    } = {}
  ): Promise<void> {
    this.connection = this.client.listen.live({
      model: options.model || 'nova-2',
      language: options.language || 'en',
      punctuate: options.punctuate ?? true,
      interim_results: options.interimResults ?? true,
      endpointing: options.endpointing ?? 300,
      vad_events: options.vadEvents ?? true,
    });

    this.connection.on(LiveTranscriptionEvents.Open, () => {
      console.log('Deepgram connection opened');
    });

    this.connection.on(LiveTranscriptionEvents.Transcript, (data: any) => {
      const transcript = data.channel.alternatives[0].transcript;
      const isFinal = data.is_final;

      if (transcript && handlers.onTranscript) {
        handlers.onTranscript(transcript, isFinal);
      }
    });

    this.connection.on(LiveTranscriptionEvents.Error, (error: Error) => {
      console.error('Deepgram error:', error);
      handlers.onError?.(error);
    });

    this.connection.on(LiveTranscriptionEvents.Close, () => {
      console.log('Deepgram connection closed');
      handlers.onClose?.();
    });
  }

  send(audioData: Buffer): void {
    if (this.connection) {
      this.connection.send(audioData);
    }
  }

  async stop(): Promise<void> {
    if (this.connection) {
      this.connection.finish();
      this.connection = null;
    }
  }
}

ブラウザマイク統合

// services/microphone.ts
import { LiveTranscriptionService } from './live-transcription';

export async function startMicrophoneTranscription(
  onTranscript: (text: string, isFinal: boolean) => void
): Promise<{ stop: () => void }> {
  const service = new LiveTranscriptionService(process.env.DEEPGRAM_API_KEY!);

  // Get microphone access
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

  // Create audio context and processor
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(4096, 1, 1);

  // Start transcription
  await service.start({
    model: 'nova-2',
    interimResults: true,
    punctuate: true,
  }, {
    onTranscript,
    onError: console.error,
  });

  // Process audio
  processor.onaudioprocess = (event) => {
    const inputData = event.inputBuffer.getChannelData(0);
    const pcmData = new Int16Array(inputData.length);

    for (let i = 0; i < inputData.length; i++) {
      pcmData[i] = Math.max(-32768, Math.min(32767, inputData[i] * 32768));
    }

    service.send(Buffer.from(pcmData.buffer));
  };

  source.connect(processor);
  processor.connect(audioContext.destination);

  return {
    stop: () => {
      processor.disconnect();
      source.disconnect();
      stream.getTracks().forEach(track => track.stop());
      service.stop();
    }
  };
}

Node.js によるオーディオファイルストリーミング

// stream-file.ts
import { createReadStream } from 'fs';
import { LiveTranscriptionService } from './services/live-transcription';

async function streamAudioFile(filePath: string) {
  const service = new LiveTranscriptionService(process.env.DEEPGRAM_API_KEY!);

  const finalTranscripts: string[] = [];

  await service.start({
    model: 'nova-2',
    punctuate: true,
    interimResults: false,
  }, {
    onTranscript: (transcript, isFinal) => {
      if (isFinal) {
        finalTranscripts.push(transcript);
        console.log('Final:', transcript);
      }
    },
    onClose: () => {
      console.log('Complete transcript:', finalTranscripts.join(' '));
    },
  });

  // Stream file in chunks
  const stream = createReadStream(filePath, { highWaterMark: 4096 });

  for await (const chunk of stream) {
    service.send(chunk);
    // Pace the streaming to match real-time
    await new Promise(resolve => setTimeout(resolve, 100));
  }

  await service.stop();
}

Python ストリーミング例

# services/live_transcription.py
from deepgram import DeepgramClient, LiveTranscriptionEvents, LiveOptions
import asyncio

class LiveTranscriptionService:
    def __init__(self, api_key: str):
        self.client = DeepgramClient(api_key)
        self.connection = None
        self.transcripts = []

    async def start(self, on_transcript=None):
        self.connection = self.client.listen.asynclive.v("1")

        async def on_message(self, result, **kwargs):
            transcript = result.channel.alternatives[0].transcript
            if transcript and on_transcript:
                on_transcript(transcript, result.is_final)

        async def on_error(self, error, **kwargs):
            print(f"Error: {error}")

        self.connection.on(LiveTranscriptionEvents.Transcript, on_message)
        self.connection.on(LiveTranscriptionEvents.Error, on_error)

        options = LiveOptions(
            model="nova-2",
            language="en",
            punctuate=True,
            interim_results=True,
        )

        await self.connection.start(options)

    def send(self, audio_data: bytes):
        if self.connection:
            self.connection.send(audio_data)

    async def stop(self):
        if self.connection:
            await self.connection.finish()

自動再接続パターン

// services/resilient-live.ts
export class ResilientLiveTranscription {
  private service: LiveTranscriptionService;
  private reconnectAttempts = 0;
  private maxReconnectAttempts = 5;

  async connect(options: LiveTranscriptionOptions) {
    try {
      await this.service.start(options, {
        onClose: () => this.handleDisconnect(options),
        onError: (err) => console.error('Stream error:', err),
      });
      this.reconnectAttempts = 0;
    } catch (error) {
      await this.handleDisconnect(options);
    }
  }

  private async handleDisconnect(options: LiveTranscriptionOptions) {
    if (this.reconnectAttempts < this.maxReconnectAttempts) {
      this.reconnectAttempts++;
      const delay = Math.pow(2, this.reconnectAttempts) * 1000;
      console.log(`Reconnecting in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
      await this.connect(options);
    }
  }
}

リソース

次のステップ

deepgram-common-errors に進んでエラーハンドリングパターンを確認します。

ライセンス: MIT(寛容ライセンスのため全文を引用しています) · 原本リポジトリ

詳細情報

作者: Brmbobo
リポジトリ: Brmbobo/Web2podcast
ライセンス: MIT
最終更新: 2026/1/26

GitHubで原本を見る →フィードバックを送る

Source: https://github.com/Brmbobo/Web2podcast / ライセンス: MIT