Streaming
Streaming lets you receive Claude's response as it's generated, token by token. Instead of waiting for the full response, you get incremental updates. This is useful for building chat interfaces where you want to show text as it appears.
Basic streaming
Use createStreamed() instead of create(). The parameters are the same:
$stream = $client->messages()->createStreamed([
'model' => 'claude-sonnet-4-6',
'max_tokens' => 1024,
'messages' => [
['role' => 'user', 'content' => 'Hello!'],
],
]);
foreach ($stream as $response) {
echo $response->toArray()['type']; // event type
}
The stream returns a StreamResponse object that you can iterate over. Each iteration yields a CreateStreamedResponse with the event data.
Event types
As you iterate, you'll receive events in this order:
| Event | When it fires | What it contains |
|---|---|---|
message_start |
Once, at the start | Full message envelope (id, model, role) and initial usage |
content_block_start |
Start of each content block | Block type and index (e.g., text, tool_use, thinking) |
content_block_delta |
Multiple times per block | Incremental content (text chunks, JSON fragments, thinking) |
content_block_stop |
End of each content block | Block index |
message_delta |
Once, near the end | Stop reason and final usage |
The API also sends message_stop and ping events, but the client handles those internally. Your foreach loop only receives the five event types above. If an error occurs mid-stream, the client throws an ErrorException (see Error Handling).
Here's a practical example that prints text as it arrives:
$stream = $client->messages()->createStreamed([
'model' => 'claude-sonnet-4-6',
'max_tokens' => 1024,
'messages' => [
['role' => 'user', 'content' => 'Tell me a short story.'],
],
]);
foreach ($stream as $response) {
if ($response->type === 'content_block_delta'
&& $response->delta->type === 'text_delta') {
echo $response->delta->text;
}
}
Full event sequence
Here's what a complete streamed text response looks like when you call toArray() on each event:
// 1. message_start
[
'type' => 'message_start',
'message' => [
'id' => 'msg_01SX1jLtTXgtJwB2EpSRNutG',
'type' => 'message',
'role' => 'assistant',
'content' => [],
'model' => 'claude-sonnet-4-6',
'stop_reason' => null,
'stop_sequence' => null,
],
'usage' => [
'input_tokens' => 9,
'output_tokens' => 1,
'cache_creation_input_tokens' => null,
'cache_read_input_tokens' => null,
],
]
// 2. content_block_start
[
'type' => 'content_block_start',
'index' => 0,
'content_block_start' => [
'type' => 'text',
'text' => '',
],
]
// 3-N. content_block_delta (repeated for each chunk)
[
'type' => 'content_block_delta',
'index' => 0,
'delta' => [
'type' => 'text_delta',
'text' => 'Hello',
],
]
// Final: message_delta
[
'type' => 'message_delta',
'delta' => [
'stop_reason' => 'end_turn',
'stop_sequence' => null,
],
'usage' => [
'input_tokens' => null,
'output_tokens' => 12,
'cache_creation_input_tokens' => null,
'cache_read_input_tokens' => null,
],
]
Streaming with tool use
When Claude calls a tool during streaming, you'll see tool_use content blocks with input_json_delta events. The tool input arrives as partial JSON fragments that you'd concatenate:
$stream = $client->messages()->createStreamed([
'model' => 'claude-sonnet-4-6',
'max_tokens' => 1024,
'tools' => [
[
'name' => 'get_weather',
'description' => 'Get the current weather in a given location',
'input_schema' => [
'type' => 'object',
'properties' => [
'location' => [
'type' => 'string',
'description' => 'The city and state, e.g. San Francisco, CA',
],
],
'required' => ['location'],
],
],
],
'messages' => [
['role' => 'user', 'content' => 'What is the weather like in San Francisco?'],
],
]);
foreach ($stream as $response) {
$response->toArray();
}
The tool use events look like this:
// Tool block start
[
'type' => 'content_block_start',
'index' => 1,
'content_block_start' => [
'id' => 'toolu_01RDFRXpbNUGrZ1xQy443s5Q',
'type' => 'tool_use',
'name' => 'get_weather',
'input' => [],
],
]
// Tool input arrives as JSON fragments
[
'type' => 'content_block_delta',
'index' => 1,
'delta' => [
'type' => 'input_json_delta',
'partial_json' => '{"location": "San Francisco, CA"}',
],
]
See Tool Use for the full tool call workflow.
Streaming with thinking
When extended thinking is enabled, the stream includes thinking content blocks before the text response:
$stream = $client->messages()->createStreamed([
'model' => 'claude-opus-4-6',
'max_tokens' => 16000,
'thinking' => [
'type' => 'adaptive',
],
'messages' => [
['role' => 'user', 'content' => 'What is the greatest common divisor of 1071 and 462?'],
],
]);
foreach ($stream as $response) {
// Thinking block start
$response->content_block_start->type; // 'thinking'
// Thinking content arrives incrementally
$response->delta->type; // 'thinking_delta'
$response->delta->thinking; // 'I need to find the GCD using the Euclidean algorithm...'
// Signature sent before thinking block closes
$response->delta->type; // 'signature_delta'
$response->delta->signature; // 'EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl...'
// Then text content follows
$response->delta->type; // 'text_delta'
$response->delta->text; // 'The greatest common divisor of 1071 and 462 is **21**.'
}
When using 'display' => 'omitted', no thinking_delta events are emitted. You'll only get the signature_delta followed by text deltas, which gives a faster time-to-first-text-token.
Meta information on streams
You can access rate limit headers and request metadata on the stream object itself:
$stream = $client->messages()->createStreamed([
'model' => 'claude-sonnet-4-6',
'max_tokens' => 1024,
'messages' => [
['role' => 'user', 'content' => 'Hello!'],
],
]);
$stream->meta(); // MetaInformation object with rate limits
See Meta Information for details on what's available.
For the full event format and streaming specification, see the Streaming guide on the Anthropic docs.