
faiss

Name: faiss
Author: davila7

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications.

SKILL.md body

FAISS - Efficient Similarity Search

A library from Meta (Facebook AI Research) for large-scale vector similarity search.

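When to use FAISS

Use FAISS when:

You need fast similarity search over large vector datasets (millions to billions of vectors)
You need GPU acceleration
You need pure vector similarity search with no metadata filtering
High throughput and low latency matter
You process embeddings offline or in batches

Metrics:

31,700+ GitHub stars
Developed by Meta (Facebook AI Research)
Scales to billions of vectors
C++ with Python bindings

Use an alternative instead when:

Chroma/Pinecone: you need metadata filtering
Weaviate: you need full database features
Annoy: a simpler library with fewer features is enough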

Quick start

Installation

# CPU only
pip install faiss-cpu

# GPU support (requires a CUDA-capable GPU; the FAISS project recommends installing via conda)
pip install faiss-gpu

Basic usage

import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions); FAISS expects float32
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Build the index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)             # add the vectors

# Search
k = 5  # return the 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")
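To make the returned values concrete: IndexFlatL2 performs a brute-force scan and reports squared Euclidean distances (no square root), sorted in ascending order. The sketch below reproduces that computation in plain NumPy. The helper name `brute_force_l2` is illustrative and not part of FAISS.

```python
import numpy as np

def brute_force_l2(database, queries, k):
    """Return (squared L2 distances, indices) of the k nearest database rows per query."""
    # ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2, computed for all pairs at once
    q_norms = (queries ** 2).sum(axis=1, keepdims=True)      # shape (nq, 1)
    x_norms = (database ** 2).sum(axis=1)                    # shape (nb,)
    dists = q_norms - 2.0 * queries @ database.T + x_norms   # shape (nq, nb)
    dists = np.maximum(dists, 0.0)                           # clamp tiny negative rounding errors
    idx = np.argsort(dists, axis=1)[:, :k]                   # k smallest per row, ascending
    return np.take_along_axis(dists, idx, axis=1), idx

rng = np.random.default_rng(0)
vectors = rng.random((1000, 128), dtype=np.float32)
query = rng.random((1, 128), dtype=np.float32)

distances, indices = brute_force_l2(vectors, query, k=5)
print(indices.shape, distances.shape)
```

This is only practical for small datasets, but it is a handy reference when checking the results of an approximate index.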

Index types

1. Flat (exact search)

# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (equals cosine similarity when vectors are normalized)
index = faiss.IndexFlatIP(d)

# Slowest, but exact (brute-force scan)
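The cosine remark can be checked without FAISS: once every vector has unit L2 norm, the inner product is the cosine similarity. A minimal NumPy sketch follows. The helper `l2_normalize` is illustrative; FAISS also provides `faiss.normalize_L2`, which normalizes a float32 array in place.

```python
import numpy as np

def l2_normalize(x):
    """Return a float32 copy of x with each row scaled to unit L2 norm."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)   # guard against zero vectors

rng = np.random.default_rng(0)
vectors = l2_normalize(rng.random((1000, 128)))
query = l2_normalize(rng.random((1, 128)))

# On unit vectors the inner product is the cosine similarity,
# so these are the scores an inner-product index would rank by (largest first).
cosine = query @ vectors.T
top5 = np.argsort(-cosine, axis=1)[:, :5]
print(top5)
```

Normalize both the stored vectors and the queries. Normalizing only one side gives scores that are no longer cosine similarities.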

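2. IVF (Inverted File) - fast approximate search

# Create the coarse quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on the data; must happen before add()
# (FAISS warns when there are fewer than about 39 training vectors per cluster)
index.train(vectors)

# Add the vectors
index.add(vectors)

# Search (nprobe = number of clusters visited per query)
index.nprobe = 10
distances, indices = index.search(query, k)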

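3. HNSW (Hierarchical Navigable Small World) - best quality/speed trade-off

# HNSW index
M = 32  # number of neighbor links per node in the graph
index = faiss.IndexHNSWFlat(d, M)

# No training required
index.add(vectors)

# Search
distances, indices = index.search(query, k)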

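4. Product Quantization - memory efficient

# PQ typically cuts memory 16-32x; the exact ratio depends on d, m, and nbits
m = 8   # number of sub-quantizers (d must be divisible by m)
nbits = 8   # bits per sub-quantizer code
index = faiss.IndexPQ(d, m, nbits)

# Train, then add
index.train(vectors)
index.add(vectors)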
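The compression ratio follows directly from the parameters: a raw float32 vector takes d * 4 bytes, while a PQ code takes m * nbits bits, rounded up to whole bytes. A small arithmetic sketch for the configuration above follows. The helper names are illustrative, and the one-off codebook storage is ignored.

```python
def pq_code_bytes(m, nbits):
    """Bytes used to store one PQ-encoded vector (m codes of nbits each, rounded up)."""
    return (m * nbits + 7) // 8

def raw_bytes(d):
    """Bytes used to store one uncompressed float32 vector."""
    return d * 4

d, m, nbits = 128, 8, 8
ratio = raw_bytes(d) / pq_code_bytes(m, nbits)
print(f"{raw_bytes(d)} B -> {pq_code_bytes(m, nbits)} B per vector ({ratio:.0f}x smaller)")
```

Increasing m stores more bytes per vector and usually improves accuracy. Decreasing it saves memory at the cost of coarser distance estimates.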

Saving and loading

# Save the index
faiss.write_index(index, "large.index")

# Load the index
index = faiss.read_index("large.index")

# Continue using it
distances, indices = index.search(query, k)

GPU acceleration

# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multiple GPUs
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# Often 10-100x faster than CPU; the gain depends on index type, batch size, and hardware

LangChain integration

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create a FAISS vector store
# (docs is assumed to be a list of LangChain Document objects prepared elsewhere)
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
# (loading unpickles data from disk, so only enable this flag for files you trust)
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True
)

# Search
results = vectorstore.similarity_search("query", k=5)

LlamaIndex integration

from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create the FAISS index
d = 1536  # must match the embedding model's output dimension
faiss_index = faiss.IndexFlatL2(d)

vector_store = FaissVectorStore(faiss_index=faiss_index)

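Best practices

Choose the right index type - Flat below 10K vectors, IVF for 10K-1M, HNSW when quality matters most
Normalize for cosine search - use IndexFlatIP with normalized vectors
Use a GPU for large datasets - often 10-100x faster, depending on the workload
Save trained indexes - training is computationally expensive
Tune nprobe / efSearch - balance speed against accuracy
Monitor memory usage - use PQ for large datasets
Run queries in batches - improves GPU utilization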
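The first rule of thumb above can be written down as a small helper that returns a description string for `faiss.index_factory`. This is a sketch under stated assumptions: the function name is hypothetical, the thresholds are the ones listed above, and the choice of nlist as roughly 4 * sqrt(n) is an assumed starting heuristic that should be tuned on real data.

```python
import math

def suggest_index_factory(n_vectors, prefer_recall=False):
    """Map the rule of thumb above to a faiss.index_factory description string."""
    if n_vectors < 10_000:
        return "Flat"                                   # exact search is cheap at this size
    if prefer_recall:
        return "HNSW32"                                 # graph index with M = 32
    nlist = max(1, int(4 * math.sqrt(n_vectors)))       # assumed heuristic, tune for your data
    return f"IVF{nlist},Flat"

for n in (5_000, 200_000):
    print(n, suggest_index_factory(n), suggest_index_factory(n, prefer_recall=True))
```

The returned string can be passed to `faiss.index_factory(d, description)` to build the index. IVF variants still need `train()` before `add()`.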

Performance

Index type	Build time	Search time	Memory	Accuracy (typical)
Flat	Fast	Slow	High	100%
IVF	Medium	Fast	Medium	95-99%
HNSW	Slow	Fastest	High	99%
PQ	Medium	Fast	Low	90-95%

Resources

GitHub: https://github.com/facebookresearch/faiss ⭐ 31,700+
Wiki: https://github.com/facebookresearch/faiss/wiki
License: MIT

License: MIT (reproduced in full because the license is permissive)

Details

Author: davila7
Repository: davila7/claude-code-templates
License: MIT
Last updated: unknown


Source: https://github.com/davila7/claude-code-templates / License: MIT