
ce-gemini-imagegen

Name: ce-gemini-imagegen
Author: everyinc

This skill should be used when generating and editing images using the Gemini API (Nano Banana Pro). It applies when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.

SKILL.md contents

Gemini Image Generation (Nano Banana Pro)

Generate and edit images using Google's Gemini API. The GEMINI_API_KEY environment variable must be set.
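As a minimal sketch, a script could check for the key up front and fail with a readable message instead of a bare KeyError. The helper name `require_api_key` is hypothetical; it is not part of the skill or of the google-genai SDK.

```python
import os


def require_api_key(var_name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment or raise a readable error.

    Illustrative helper only; not an SDK function.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell before running."
        )
    return key
```

The returned string can then be passed as `api_key=` when constructing `genai.Client`, as in the core API pattern.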

Default model

Model	Resolution	Best for
`gemini-3-pro-image-preview`	1K-4K	All image generation (default)

Note: Always use this Pro model. Use a different model only when one is explicitly requested.

Quick reference

Default settings

Model: gemini-3-pro-image-preview
Resolution: 1K (default; options: 1K, 2K, 4K)
Aspect ratio: 1:1 (default)

Available aspect ratios

1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

Available resolutions

1K (default), 2K, 4K
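The two lists above can be turned into a small validation step so that a typo such as "16:10" is caught before a request is sent. This is a sketch only: `image_settings` is a hypothetical helper, and the allowed values are copied from the lists above rather than queried from the API.

```python
# Allowed values copied from the quick reference above (not fetched from the API).
ASPECT_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
IMAGE_SIZES = {"1K", "2K", "4K"}


def image_settings(aspect_ratio: str = "1:1", image_size: str = "1K") -> dict:
    """Validate the settings and return them as keyword arguments.

    Hypothetical helper for illustration; defaults mirror the documented defaults.
    """
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {aspect_ratio!r}")
    size = image_size.upper()  # accept "2k" as well as "2K"
    if size not in IMAGE_SIZES:
        raise ValueError(f"Unsupported image size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": size}
```

The resulting dict can be unpacked into the config object, for example `types.ImageConfig(**image_settings("16:9", "2K"))`.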

Core API pattern

import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Basic generation (1K, 1:1 - the defaults)
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Your prompt here"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        # Gemini returns JPEG by default, so save with a .jpg extension
        image.save("output.jpg")

Custom resolution and aspect ratio

from google.genai import types

# Reuses `client` from the core API pattern; `prompt` is your own text prompt string

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[prompt],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # Wide format
            image_size="2K"       # Higher resolution
        ),
    )
)

Resolution examples

# 1K (default) - fast, good for previews
image_config=types.ImageConfig(image_size="1K")

# 2K - balance of quality and speed
image_config=types.ImageConfig(image_size="2K")

# 4K - highest quality, slowest
image_config=types.ImageConfig(image_size="4K")

Aspect ratio examples

# Square (default)
image_config=types.ImageConfig(aspect_ratio="1:1")

# Wide landscape
image_config=types.ImageConfig(aspect_ratio="16:9")

# Ultra-wide panorama
image_config=types.ImageConfig(aspect_ratio="21:9")

# Portrait
image_config=types.ImageConfig(aspect_ratio="9:16")

# Standard photo
image_config=types.ImageConfig(aspect_ratio="4:3")

Editing images

Pass an existing image together with a text prompt:

from PIL import Image

img = Image.open("input.png")
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Add a sunset to this scene", img],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

Multi-turn refinement

Use a chat session to edit iteratively:

from google.genai import types

chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(response_modalities=['TEXT', 'IMAGE'])
)

response = chat.send_message("Create a logo for 'Acme Corp'")
# Save the first image...

response = chat.send_message("Make the text bolder and add a blue gradient")
# Save the refined image...
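The "save" steps in the chat example can be sketched as a helper that writes each turn's images under a numbered name, so a later refinement never overwrites an earlier one. `save_turn_images` is a hypothetical name. The sketch assumes each image part exposes raw bytes as `inline_data.data`, and it uses a .jpg extension on the assumption that Gemini returned JPEG.

```python
from pathlib import Path


def save_turn_images(parts, out_dir: str, turn: int) -> list[Path]:
    """Write every image part of one chat response into `out_dir`.

    Files are named turn-<NN>-<index>.jpg. Hypothetical helper; assumes
    JPEG bytes in `part.inline_data.data`.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved: list[Path] = []
    for part in parts:
        blob = getattr(part, "inline_data", None)
        if blob is None or not blob.data:
            continue  # text-only part, nothing to write
        path = out / f"turn-{turn:02d}-{len(saved)}.jpg"
        path.write_bytes(blob.data)
        saved.append(path)
    return saved
```

After each `chat.send_message(...)` call, something like `save_turn_images(response.parts, "logo_drafts", turn=1)` would keep every revision on disk for comparison.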

Prompting best practices

Photorealistic scenes

Include camera details: lens type, lighting, angle, and mood.

"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"

Stylized art

Specify the style explicitly:

"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"

Text in images

Be explicit about font style and placement:

"Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"

Product mockups

Describe the lighting setup and the surface:

"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"

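The four patterns above share one shape: a subject followed by explicit detail phrases. A minimal sketch of assembling such prompts programmatically, using a hypothetical `build_prompt` helper that is not part of the skill:

```python
def build_prompt(subject: str, *details: str) -> str:
    """Join a subject and its detail phrases into one comma-separated prompt.

    Hypothetical helper; blank detail strings are skipped.
    """
    pieces = [subject.strip()] + [d.strip() for d in details if d.strip()]
    return ", ".join(pieces)


# Rebuilds the photorealistic example from this section.
photo_prompt = build_prompt(
    "A photorealistic close-up portrait",
    "85mm lens",
    "soft golden hour light",
    "shallow depth of field",
)
```

Keeping the details as separate arguments makes it easy to swap one element, such as the lens or the lighting, between iterations.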
Advanced features

Google Search Grounding

Generate images grounded in real-time data:

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Visualize today's weather in Tokyo as an infographic"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

Multiple reference images (up to 14)

Combine elements from multiple sources:

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        "Create a group photo of these people in an office",
        Image.open("person1.png"),
        Image.open("person2.png"),
        Image.open("person3.png"),
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)
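Because the heading above states a limit of 14 reference images, a request builder can enforce it locally. This is a sketch; `build_contents` and `MAX_REFERENCE_IMAGES` are hypothetical names, and the limit is taken from this document rather than read from the API.

```python
MAX_REFERENCE_IMAGES = 14  # limit stated in this document, not queried from the API


def build_contents(prompt: str, images: list) -> list:
    """Return a `contents` list: the prompt first, then the reference images.

    Hypothetical helper; `images` can hold any objects the SDK accepts,
    such as PIL images.
    """
    if not images:
        raise ValueError("At least one reference image is required")
    if len(images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"Got {len(images)} reference images; the limit is {MAX_REFERENCE_IMAGES}"
        )
    return [prompt, *images]
```

The result can be passed directly as `contents=` in the `generate_content` call shown above.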

Important: file format and media type

Important: The Gemini API returns images in JPEG format by default. When saving, always use a .jpg extension to avoid a media type mismatch.

# Correct - use a .jpg extension (Gemini returns JPEG)
image.save("output.jpg")

# Wrong - leads to an "Image does not match media type" error
image.save("output.png")  # Writes JPEG data under a .png extension!
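Rather than hard-coding the extension, one option is to derive it from the media type reported with the image data. The sketch below assumes the part exposes `inline_data.mime_type` and `inline_data.data`; `save_with_matching_extension` and the `EXTENSIONS` table are hypothetical, and the table covers only a few common types.

```python
from pathlib import Path

# Illustrative mapping; extend it if other media types show up.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png", "image/webp": ".webp"}


def save_with_matching_extension(part, stem: str) -> Path:
    """Write the image bytes to `<stem><ext>`, where ext follows the media type.

    Hypothetical helper; raises ValueError for media types not in EXTENSIONS.
    """
    blob = part.inline_data
    ext = EXTENSIONS.get(blob.mime_type)
    if ext is None:
        raise ValueError(f"Unexpected media type: {blob.mime_type!r}")
    path = Path(f"{stem}{ext}")
    path.write_bytes(blob.data)
    return path
```

With this approach the file name always agrees with the reported media type, so the mismatch described above cannot be introduced by a hard-coded extension.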

Converting to PNG (if needed)

If you specifically need PNG format:

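import io
from PIL import Image

# Generate with Gemini, then re-encode each returned image as PNG
for part in response.parts:
    if part.inline_data:
        # Decode the raw bytes with Pillow, whatever format Gemini returned
        img = Image.open(io.BytesIO(part.inline_data.data))
        # Pass the format explicitly so the file really contains PNG data
        img.save("output.png", format="PNG")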

Checking the image format

Use the file command to compare the actual format with the extension:

file image.png
# If the output shows "JPEG image data", rename the file to .jpg!
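The same check can be done from Python by inspecting the file signature (the leading "magic bytes"), which is useful where the `file` command is unavailable. A minimal sketch covering only JPEG and PNG; `sniff_image_format` is a hypothetical helper name.

```python
from typing import Optional

JPEG_MAGIC = b"\xff\xd8\xff"       # JPEG files start with these bytes
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # 8-byte PNG signature


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return "jpeg" or "png" based on the leading bytes, else None."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None
```

For a file on disk, reading the first few bytes is enough, for example `sniff_image_format(open("image.png", "rb").read(8))`.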

Notes

All generated images include a SynthID watermark
Gemini returns JPEG format by default - always use a .jpg extension
Image-only mode (responseModalities: ["IMAGE"]) does not work with Google Search grounding
For edits, describe the changes conversationally. The model understands semantic masking
Use 1K resolution by default for speed. Use 2K/4K when quality matters
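The note about image-only mode can be encoded as a small rule when choosing response modalities. This is a sketch under the constraint stated above; `choose_modalities` is a hypothetical helper, not an SDK function.

```python
def choose_modalities(use_search: bool, want_text: bool = True) -> list[str]:
    """Pick response modalities that respect the grounding constraint.

    Per the note above, image-only mode does not work with Google Search
    grounding, so TEXT is always included when search is enabled.
    """
    if use_search or want_text:
        return ["TEXT", "IMAGE"]
    return ["IMAGE"]
```

The result would be passed as `response_modalities=` in `types.GenerateContentConfig`, alongside `tools=[{"google_search": {}}]` when grounding is enabled.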

License: MIT (quoted in full because the license is permissive)


Source: https://github.com/everyinc/compound-engineering-plugin / License: MIT