vertex-ai-api-dev

Repository: google-gemini/gemini-skills

google-gemini
Imported Mar 5, 2026

Description

Guides the usage of Gemini API on Google Cloud Vertex AI with the Gen AI SDK. Use when the user asks about using Gemini in an enterprise environment or explicitly mentions Vertex AI. Covers SDK usage (Python, JS/TS, Go, Java, C#), capabilities like Live API, tools, multimedia generation, caching, and batch prediction.

Details

Compatibility

Requires active Google Cloud credentials and Vertex AI API enabled.

Skill Files

SKILL.md
# Gemini API in Vertex AI

Access Google's most advanced AI models built for enterprise use cases using the Gemini API in Vertex AI.

Key capabilities:

- **Text generation** - Chat, completion, summarization
- **Multimodal understanding** - Process images, audio, video, and documents
- **Function calling** - Let the model invoke your functions
- **Structured output** - Generate valid JSON matching your schema
- **Context caching** - Cache large contexts for efficiency
- **Embeddings** - Generate text embeddings for semantic search
- **Live Realtime API** - Bidirectional streaming for low latency Voice and Video interactions
- **Batch Prediction** - Handle massive async dataset prediction workloads

## Core Directives

- **Unified SDK**: ALWAYS use the Gen AI SDK (`google-genai` for Python, `@google/genai` for JS/TS, `google.golang.org/genai` for Go, `com.google.genai:google-genai` for Java, `Google.GenAI` for C#).
- **Legacy SDKs**: DO NOT use `google-cloud-aiplatform`, `@google-cloud/vertexai`, or `google-generativeai`.

## SDKs

- **Python**: Install `google-genai` with `pip install google-genai`
- **JavaScript/TypeScript**: Install `@google/genai` with `npm install @google/genai`
- **Go**: Install `google.golang.org/genai` with `go get google.golang.org/genai`
- **C#/.NET**: Install `Google.GenAI` with `dotnet add package Google.GenAI`
- **Java**:
  - groupId: `com.google.genai`, artifactId: `google-genai`
  - Find the latest version at https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (referred to below as `LAST_VERSION`)
  - Install in `build.gradle`:

    ```
    implementation("com.google.genai:google-genai:${LAST_VERSION}")
    ```

  - Install Maven dependency in `pom.xml`:

    ```xml
    <dependency>
        <groupId>com.google.genai</groupId>
        <artifactId>google-genai</artifactId>
        <version>${LAST_VERSION}</version>
    </dependency>
    ```

> [!WARNING]
> Legacy SDKs like `google-cloud-aiplatform`, `@google-cloud/vertexai`, and `google-generativeai` are deprecated. Migrate to the Gen AI SDKs above by following the official Migration Guide.

## Authentication & Configuration

Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.

### Application Default Credentials (ADC)
Set these variables for standard [Google Cloud authentication](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/gcp-auth):
```bash
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_VERTEXAI=true
```
- By default, use `location="global"` to access the global endpoint, which provides automatic routing to regions with available capacity.
- If the user explicitly asks for a specific region (e.g., `us-central1`, `europe-west4`), set that region in the `GOOGLE_CLOUD_LOCATION` environment variable instead. Reference the [supported regions documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/learn/locations) if needed.

### Vertex AI in Express Mode
Set these variables when using [Express Mode](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode) with an API key:
```bash
export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
```

### Initialization
Initialize the client without arguments to pick up environment variables:
```python
from google import genai
client = genai.Client()
```

Alternatively, you can hard-code parameters when creating the client.

```python
from google import genai
client = genai.Client(vertexai=True, project="your-project-id", location="global")
```

## Models

- Use `gemini-3.1-pro-preview` for complex reasoning, coding, research (1M tokens)
- Use `gemini-3-flash-preview` for fast, balanced performance, multimodal (1M tokens)
- Use `gemini-3-pro-image-preview` for Nano Banana Pro image generation and editing
- Use `gemini-live-2.5-flash-native-audio` for Live Realtime API including native audio

Use the following models if explicitly requested:

- Use `gemini-2.5-flash-image` for Nano Banana image generation and editing
- Use `gemini-2.5-flash`
- Use `gemini-2.5-flash-lite`
- Use `gemini-2.5-pro`

> [!IMPORTANT]
> Models like `gemini-2.0-*`, `gemini-1.5-*`, `gemini-1.0-*`, and `gemini-pro` are legacy and deprecated. Use the models above even if your training data suggests otherwise.
> For production environments, consult the Vertex AI documentation for stable model versions (e.g. `gemini-3-flash`).

## Quick Start

### Python
```python
from google import genai
client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing"
)
print(response.text)
```

### TypeScript/JavaScript
```typescript
import { GoogleGenAI } from "@google/genai";
const ai = new GoogleGenAI({ vertexai: true, project: "your-project-id", location: "global" });
const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview",
    contents: "Explain quantum computing"
});
console.log(response.text);
```

### Go
```go
package main

import (
	"context"
	"fmt"
	"log"
	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		Backend:  genai.BackendVertexAI,
		Project:  "your-project-id",
		Location: "global",
	})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(resp.Text())
}
```

### Java
```java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = Client.builder().vertexAi(true).project("your-project-id").location("global").build();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3-flash-preview",
            "Explain quantum computing",
            null);

    System.out.println(response.text());
  }
}
```

### C#/.NET
```csharp
using Google.GenAI;

var client = new Client(
    project: "your-project-id",
    location: "global",
    vertexAI: true
);

var response = await client.Models.GenerateContent(
    "gemini-3-flash-preview",
    "Explain quantum computing"
);

Console.WriteLine(response.Text);
```

## API spec & Documentation (source of truth)

When implementing or debugging API integration for Vertex AI, refer to the official Google Cloud Vertex AI documentation:
- **Vertex AI Gemini Documentation**: https://cloud.google.com/vertex-ai/generative-ai/docs/
- **REST API Reference**: https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest

The Gen AI SDK on Vertex AI uses the `v1beta1` or `v1` REST API endpoints (e.g., `https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:generateContent`).

> [!TIP]
> **Use the Developer Knowledge MCP Server**: If the `search_documents` or `get_document` tools are available, use them to find and retrieve official documentation for Google Cloud and Vertex AI directly within the context. This is the preferred method for getting up-to-date API details and code snippets.

## Workflows and Code Samples

Reference the [Python Docs Samples repository](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/genai) for additional code samples and specific usage scenarios.

Depending on the specific user request, refer to the following reference files for detailed code samples and usage patterns (Python examples):

- **Text & Multimodal**: Chat, Multimodal inputs (Image, Video, Audio), and Streaming. See [references/text_and_multimodal.md](references/text_and_multimodal.md)
- **Embeddings**: Generate text embeddings for semantic search. See [references/embeddings.md](references/embeddings.md)
- **Structured Output & Tools**: JSON generation, Function Calling, Search Grounding, and Code Execution. See [references/structured_and_tools.md](references/structured_and_tools.md)
- **Media Generation**: Image generation, Image editing, and Video generation. See [references/media_generation.md](references/media_generation.md)
- **Bounding Box Detection**: Object detection and localization within images and video. See [references/bounding_box.md](references/bounding_box.md)
- **Live API**: Real-time bidirectional streaming for voice, vision, and text. See [references/live_api.md](references/live_api.md)
- **Advanced Features**: Content Caching, Batch Prediction, and Thinking/Reasoning. See [references/advanced_features.md](references/advanced_features.md)
- **Safety**: Adjusting Responsible AI filters and thresholds. See [references/safety.md](references/safety.md)
- **Model Tuning**: Supervised Fine-Tuning and Preference Tuning. See [references/model_tuning.md](references/model_tuning.md)
references/advanced_features.md
# Advanced Features

## Content Caching
Cache large documents or contexts to reduce cost and latency.

```python
from google import genai
from google.genai import types

client = genai.Client()

content_cache = client.caches.create(
    model="gemini-3-flash-preview",
    config=types.CreateCachedContentConfig(
        contents=[
            types.Content(
                role="user",
                parts=[types.Part.from_uri(file_uri="gs://your-bucket/large.pdf", mime_type="application/pdf")]
            )
        ],
        system_instruction="You are an expert researcher.",
        display_name="example-cache",
        ttl="86400s",
    ),
)

# Use the cache
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Summarize the pdf",
    config=types.GenerateContentConfig(
        cached_content=content_cache.name
    ),
)
```

## Batch Prediction
For processing large datasets asynchronously.

```python
import time
from google import genai
from google.genai import types

client = genai.Client()

job = client.batches.create(
    model="gemini-3-flash-preview",
    src="gs://your-bucket/prompts.jsonl",
    config=types.CreateBatchJobConfig(dest="gs://your-bucket/outputs"),
)

completed_states = {types.JobState.JOB_STATE_SUCCEEDED, types.JobState.JOB_STATE_FAILED, types.JobState.JOB_STATE_CANCELLED}
while job.state not in completed_states:
    time.sleep(30)
    job = client.batches.get(name=job.name)
```
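Each line of the input JSONL wraps a single request. A minimal sketch of preparing a local `prompts.jsonl` file before uploading it to Cloud Storage (assuming the documented `"request"` envelope around a `GenerateContentRequest`; check the batch prediction docs for the authoritative schema):

```python
import json

prompts = ["Explain quantum computing", "Summarize the theory of relativity"]

with open("prompts.jsonl", "w") as f:
    for prompt in prompts:
        # One request per line, wrapped under a "request" key.
        line = {"request": {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}}
        f.write(json.dumps(line) + "\n")
```

Upload the resulting file to the `src` bucket (e.g., with `gsutil cp`) before creating the batch job.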

### Thinking (Reasoning)

Thinking is on by default for `gemini-3.1-pro-preview` and `gemini-3-flash-preview`.
It can be adjusted by using the `thinking_level` parameter.

- **`MINIMAL`:** (Gemini 3 Flash Only) Constrains the model to use as few tokens as possible for thinking and is best used for low-complexity tasks that wouldn't benefit from extensive reasoning.
- **`LOW`**: Constrains the model to use fewer tokens for thinking and is suitable for simpler tasks where extensive reasoning is not required.
- **`MEDIUM`**: Offers a balanced approach suitable for tasks of moderate complexity that benefit from reasoning but don't require deep, multi-step planning.
- **`HIGH`**: (Default) Maximizes reasoning depth. The model may take significantly longer to reach a first token, but the output will be more thoroughly vetted.

```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="solve x^2 + 4x + 4 = 0",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_level=types.ThinkingLevel.HIGH
        )
    )
)

# Access thoughts if returned
for part in response.candidates[0].content.parts:
    if part.thought:
        print(f"Thought: {part.text}")
    else:
        print(f"Final Answer: {part.text}")
```

## Model Context Protocol (MCP) support (experimental)

Built-in [MCP](https://modelcontextprotocol.io/introduction) support is an experimental feature. You can pass a local MCP server as a tool directly.

```python
import os
import asyncio
from datetime import datetime
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

from google import genai
from google.genai import types

client = genai.Client()

# Create server parameters for stdio connection
server_params = StdioServerParameters(
    command="npx",  # Executable
    args=["-y", "@philschmid/weather-mcp"],  # MCP Server
    env=None,  # Optional environment variables
)

async def run():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Prompt to get the weather for the current day in London.
            prompt = f"What is the weather in London in {datetime.now().strftime('%Y-%m-%d')}?"

            # Initialize the connection between client and server
            await session.initialize()

            # Send request to the model with MCP function declarations
            response = await client.aio.models.generate_content(
                model="gemini-3-flash-preview",
                contents=prompt,
                config=types.GenerateContentConfig(
                    tools=[session],  # uses the session, will automatically call the tool using automatic function calling
                ),
            )
            print(response.text)

# Start the asyncio event loop and run the main function
asyncio.run(run())
```
references/bounding_box.md
# Bounding Box Detection

Detect and localize objects within images or videos using bounding boxes. The model returns coordinates in the format `[y_min, x_min, y_max, x_max]`, normalized from 0 to 1000.

## Implementation (Python)

To ensure structured output, define a `BoundingBox` class and provide it as the `response_schema`.

```python
from google import genai
from google.genai.types import (
    GenerateContentConfig,
    Part,
)
from pydantic import BaseModel

# Define the schema for the bounding box
class BoundingBox(BaseModel):
    box_2d: list[int]
    label: str

client = genai.Client()

config = GenerateContentConfig(
    system_instruction="""
    Return bounding boxes as an array with labels.
    Never return masks. Limit to 25 objects.
    """,
    response_mime_type="application/json",
    response_schema=list[BoundingBox],
)

image_uri = "gs://cloud-samples-data/generative-ai/image/socks.jpg"

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[
        Part.from_uri(file_uri=image_uri, mime_type="image/jpeg"),
        "Detect the socks in the image and provide bounding boxes.",
    ],
    config=config,
)

# Access the detected boxes
for bbox in response.parsed:
    print(f"Label: {bbox.label}, Box: {bbox.box_2d}")
```

## Coordinate System
- **Format**: `[y_min, x_min, y_max, x_max]`
- **Normalization**: Coordinates are integers from `0` to `1000`.
- **Origin**: `[0, 0]` is the top-left corner of the image.

## Visualization Helper
To visualize the results, scale the normalized coordinates back to the original image dimensions.

```python
def scale_box(box_2d, width, height):
    y_min, x_min, y_max, x_max = box_2d
    return [
        int(y_min / 1000 * height),
        int(x_min / 1000 * width),
        int(y_max / 1000 * height),
        int(x_max / 1000 * width)
    ]
```
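For example, a normalized box covering the central half of a 640x480 image scales as follows (the helper is repeated so the snippet is self-contained):

```python
def scale_box(box_2d, width, height):
    # Same helper as above: convert 0-1000 normalized coords to pixels.
    y_min, x_min, y_max, x_max = box_2d
    return [
        int(y_min / 1000 * height),
        int(x_min / 1000 * width),
        int(y_max / 1000 * height),
        int(x_max / 1000 * width),
    ]

pixel_box = scale_box([250, 250, 750, 750], width=640, height=480)
print(pixel_box)  # [120, 160, 360, 480]
```

Note that height scales the `y` coordinates and width scales the `x` coordinates, matching the `[y_min, x_min, y_max, x_max]` ordering.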
references/embeddings.md
# Text Embeddings

Generate embeddings for text content to perform semantic search, clustering, and other NLP tasks.

## Basic Usage

```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.embed_content(
    model="gemini-embedding-001",
    contents=[
        "How do I get a driver's license/learner's permit?",
        "How long is my driver's license valid for?",
    ],
    # Optional Parameters
    config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT", output_dimensionality=768),
)
print(response.embeddings)
```
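For semantic search, embeddings are typically compared with cosine similarity. A minimal ranking sketch in pure Python (the vectors here are toy values standing in for `response.embeddings[i].values`, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot product over norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "doc_license": [0.9, 0.1, 0.8],
    "doc_weather": [0.0, 1.0, 0.1],
}

# Rank documents by similarity to the query, most similar first.
ranked = sorted(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]), reverse=True)
print(ranked[0])  # doc_license
```

Use `task_type="RETRIEVAL_QUERY"` when embedding the query and `task_type="RETRIEVAL_DOCUMENT"` when embedding the corpus.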
references/live_api.md
# Live API

The Live API provides real-time, low-latency bidirectional streaming via WebSockets. It is ideal for interactive voice and video applications.

```python
import asyncio
from google import genai
from google.genai import types

async def generate_content():
    client = genai.Client()
    model_id = "gemini-live-2.5-flash-native-audio"

    config = types.LiveConnectConfig(
        response_modalities=[types.Modality.TEXT], # Change to AUDIO for voice responses
    )

    async with client.aio.live.connect(model=model_id, config=config) as session:
        text_input = "Hello? Gemini, are you there?"
        await session.send_client_content(
            turns=types.Content(role="user", parts=[types.Part.from_text(text=text_input)])
        )

        async for message in session.receive():
            if message.text:
                print(message.text, end="")

asyncio.run(generate_content())
```

For sending audio:
```python
await session.send_realtime_input(
    media=types.Blob(data=audio_bytes, mime_type="audio/pcm;rate=16000")
)
```
references/media_generation.md
# Media Generation

## Image Generation
Generate images using `gemini-2.5-flash-image`.

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents="A dog reading a newspaper",
)

for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = part.as_image()
        image.save("generated_image.png")
```

For high-resolution images or when using the Search tool, use `gemini-3-pro-image-preview`.

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="A dog reading a newspaper",
    config=types.GenerateContentConfig(
        image_config=types.ImageConfig(
            aspect_ratio="16:9",
            image_size="2K"
        )
    )
)

for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = part.as_image()
        image.save("generated_image.png")
```

## Image Editing
Image editing is best done with the Gemini native image generation model; chat mode is recommended so you can iterate on the edits.

```python
from google import genai
from PIL import Image

client = genai.Client()

prompt = "A small white ceramic bowl with lemons and limes"
image = Image.open('fruit.png')

# Create the chat
chat = client.chats.create(model='gemini-2.5-flash-image')

# Send the image and ask for it to be edited
response = chat.send_message([prompt, image])

# Get the text and the image generated
for i, part in enumerate(response.candidates[0].content.parts):
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = part.as_image()
        image.save(f'generated_image_{i}.png')

# Continue iterating
chat.send_message('Make the bowl blue')
```

## Video Generation
Generate video using the Veo model. Veo usage can be costly, so check Veo pricing before running jobs. Start with the fast model (`veo-3.1-fast-generate-001`), since its output quality is usually sufficient, and switch to the larger model if needed.

```python
import time
from google import genai
from google.genai import types
from PIL import Image

client = genai.Client()

image = Image.open('image.png') # Optional initial image

# Video generation is an async operation
operation = client.models.generate_videos(
    model="veo-3.1-fast-generate-001",
    prompt="a cat reading a book",
    image=image,
    config=types.GenerateVideosConfig(
        person_generation="dont_allow",
        aspect_ratio="16:9",
        number_of_videos=1,
        duration_seconds=5,
        output_gcs_uri="gs://your-bucket/your-prefix",
    ),
)

# Poll for completion
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

if operation.response:
    print(operation.result.generated_videos[0].video.uri)
```
references/model_tuning.md
# Model Tuning

Supervised Fine-Tuning or Preference Tuning using your own datasets.

```python
import time
from google import genai
from google.genai import types

client = genai.Client()

training_dataset = types.TuningDataset(
    gcs_uri="gs://your-bucket/sft_train_data.jsonl",
)

tuning_job = client.tunings.tune(
    base_model="gemini-3-flash-preview",
    training_dataset=training_dataset,
    config=types.CreateTuningJobConfig(
        tuned_model_display_name="Example tuning job",
    ),
)

running_states = {"JOB_STATE_PENDING", "JOB_STATE_RUNNING"}
while tuning_job.state in running_states:
    time.sleep(60)
    tuning_job = client.tunings.get(name=tuning_job.name)

print("Tuned Model Endpoint:", tuning_job.tuned_model.endpoint)

# Predict with the tuned endpoint
response = client.models.generate_content(
    model=tuning_job.tuned_model.endpoint,
    contents="Why is the sky blue?",
)
print(response.text)
```
references/safety.md
# Safety Settings and Responsible AI

You can adjust safety settings to control the thresholds for harmful content generation. By default, Vertex AI applies standard safety filters.

## Adjusting Safety Thresholds

```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Write a list of 5 disrespectful things that I might say to the universe after stubbing my toe in the dark.",
    config=types.GenerateContentConfig(
        system_instruction="Be as mean as possible.",
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
        ]
    ),
)

# Response will be `None` if it is blocked.
if response.text is None:
    print(f"Content Blocked. Finish Reason: {response.candidates[0].finish_reason}")
else:
    print(response.text)

# Inspect safety ratings for each category
for rating in response.candidates[0].safety_ratings:
    print(f'Category: {rating.category}')
    print(f'Is Blocked: {rating.blocked}')
    print(f'Probability: {rating.probability}')
    print(f'Severity: {rating.severity}')
```
references/structured_and_tools.md
# Structured Output and Tools

## Structured Output (JSON Schema)
Enforce a specific JSON schema using standard Python type hints or Pydantic models.

```python
from google import genai
from google.genai import types
from pydantic import BaseModel

class Recipe(BaseModel):
    recipe_name: str
    ingredients: list[str]

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="List a few popular cookie recipes.",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=list[Recipe],
    ),
)
# response.text is guaranteed to be valid JSON matching the schema
print(response.text)
# response.parsed returns a list of Recipe objects
print(response.parsed)
```

## Function Calling
Let the model output function calls that you can execute.

```python
from google import genai
from google.genai import types

def get_current_weather(location: str) -> str:
    """Example method. Returns the current weather.
    Args: location: The city and state, e.g. San Francisco, CA
    """
    if 'boston' in location.lower():
        return "Snowing"
    return "Sunny"

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="What is the weather like in Boston?",
    config=types.GenerateContentConfig(tools=[get_current_weather]),
)

if response.function_calls:
    print('Function calls requested by the model:')
    for function_call in response.function_calls:
        print(f'- Function: {function_call.name}')
        print(f'- Args: {dict(function_call.args)}')
else:
    print('The model responded directly:')
    print(response.text)
```
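When you handle calls manually, your code executes the matching function and, in a multi-turn flow, returns the result to the model as a function response. A minimal local dispatch sketch (the `function_call` dict below is a stand-in for one entry of `response.function_calls`, which exposes `.name` and `.args` attributes):

```python
def get_current_weather(location: str) -> str:
    # Same example function as above.
    if "boston" in location.lower():
        return "Snowing"
    return "Sunny"

# Map function names the model may request to local implementations.
registry = {"get_current_weather": get_current_weather}

# Stand-in for one entry of response.function_calls.
function_call = {"name": "get_current_weather", "args": {"location": "Boston, MA"}}

# Dispatch to the registered implementation with the model-provided args.
result = registry[function_call["name"]](**function_call["args"])
print(result)  # Snowing
```

Note that when you pass Python callables directly in `tools` (as in the example above), the SDK's automatic function calling executes them for you; manual dispatch like this is only needed when you declare functions via schemas instead.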

## Search Grounding
Ground the model's responses in Google Search or your own enterprise data (Vertex AI Search).

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="When is the next total solar eclipse in the US?",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(google_search=types.GoogleSearch())
        ],
    ),
)
print(response.text)
# Search details
print(f'Search Query: {response.candidates[0].grounding_metadata.web_search_queries}')
# Inspect grounding metadata
print(response.candidates[0].grounding_metadata.search_entry_point.rendered_content)
# Urls used for grounding
print(f"Search Pages: {', '.join([site.web.title for site in response.candidates[0].grounding_metadata.grounding_chunks])}")
```

## Code Execution
Allow the model to run Python code to calculate answers precisely.

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Calculate the 20th Fibonacci number.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())],
    ),
)
print(response.executable_code)
print(response.code_execution_result)
```

## URL Context
You can use the URL context tool to provide Gemini with URLs as additional context for your prompt. The model can then retrieve content from the URLs and use that content to inform and shape its response.

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Compare recipes from http://example.com and http://example2.com",
    config=types.GenerateContentConfig(
        tools=[types.Tool(url_context=types.UrlContext())],
    ),
)

print(response.text)
# get URLs retrieved for context
print(response.candidates[0].url_context_metadata)
```
references/text_and_multimodal.md
# Text and Multimodal Generation

## Basic Text Generation
```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="How does AI work?"
)
print(response.text)
```

## Chat (Multi-turn conversations)
```python
from google import genai
from google.genai import types

client = genai.Client()
chat_session = client.chats.create(
    model="gemini-3-flash-preview",
    history=[
        types.UserContent(parts=[types.Part.from_text(text="Hello")]),
        types.ModelContent(parts=[types.Part.from_text(text="Great to meet you. What would you like to know?")]),
    ],
)
response = chat_session.send_message("Tell me a story.")
print(response.text)
```

## Synchronous Streaming

Generate content in streaming mode so the model's output is streamed back to you as it is produced, rather than returned as a single chunk.

```python
from google import genai
from google.genai import types

client = genai.Client()
for chunk in client.models.generate_content_stream(
    model="gemini-3-flash-preview", contents="Tell me a story in 300 words."
):
    print(chunk.text, end='')
```

## Multimodal Inputs (Images, Audio, Video)
You can provide files natively using Google Cloud Storage URIs or local bytes.

```python
from google import genai
from google.genai import types

client = genai.Client()

gcs_image = types.Part.from_uri(file_uri="gs://cloud-samples-data/generative-ai/image/scones.jpg", mime_type="image/jpeg")

with open("local_image.jpg", "rb") as f:
    local_image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[
        "Generate a list of all the objects contained in both images.",
        gcs_image,
        local_image,
    ],
)
print(response.text)
```

### YouTube Videos
```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[
        types.Part.from_uri(
            file_uri="https://www.youtube.com/watch?v=3KtWfp0UopM",
            mime_type="video/mp4",
        ),
        "Write a short and engaging blog post based on this video.",
    ],
)
print(response.text)
```

Version History

v1.0.0 Imported from GitHub