Kafka BKM by matt-welch · Pull Request #21 · intel/optimization-zone

matt-welch · 2026-02-12T22:40:59Z

I'd like to submit this Kafka optimization guide to the optimization zone.

- Needs review

…ance
- Separate cloud-agnostic topology guidance from AWS-specific examples
- Add "Data Movement in Kafka" section explaining producer write path, consumer read path, and partition leader/follower architecture
- Generalize cloud topology section to use vCPU counts instead of AWS-specific instance types
- Consolidate AWS deployment details into dedicated example section
- Update single-node topology with complete system specifications:
  * 2 sockets, 192 cores/socket (96 physical + HT)
  * 6 NUMA nodes (3 per socket, 32 cores/node, SNC enabled)
  * 3 brokers pinned to dedicated NUMA nodes with 16 logical CPUs each
- Add multi-cloud examples (AWS m8i, GCP C4) for Intel Xeon 6 guidance
- Remove confusing NUMA notation and provide clear CPU pinning examples

- Add disclaimer note about 4.2 RC testing
- Update all version references to specify "release candidate (RC)"
- Remove TODO placeholders and update docs URLs to Kafka 4.1
- Clarify vm.dirty_background_bytes applies system-wide
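
The single-node topology figures in the description can be cross-checked with a little arithmetic. This is only a sketch: it assumes "192 cores/socket (96 physical + HT)" means 192 logical CPUs per socket with two hardware threads per physical core, and the constant names are illustrative rather than taken from the guide.

```python
# Figures from the PR description; how they combine is an assumption.
SOCKETS = 2
LOGICAL_CPUS_PER_SOCKET = 192   # "192 cores/socket (96 physical + HT)"
NUMA_NODES_PER_SOCKET = 3       # with SNC enabled
BROKERS = 3
LOGICAL_CPUS_PER_BROKER = 16

numa_nodes = SOCKETS * NUMA_NODES_PER_SOCKET
logical_per_node = LOGICAL_CPUS_PER_SOCKET // NUMA_NODES_PER_SOCKET
physical_per_node = logical_per_node // 2  # two hardware threads per core

pinned = BROKERS * LOGICAL_CPUS_PER_BROKER
total = SOCKETS * LOGICAL_CPUS_PER_SOCKET

print(f"NUMA nodes: {numa_nodes}")
print(f"Logical CPUs per node: {logical_per_node} ({physical_per_node} physical cores)")
print(f"Logical CPUs pinned to brokers: {pinned} of {total}")
```

Under these assumptions the numbers agree with the description: 6 NUMA nodes with 32 physical cores each, and each broker's 16 logical CPUs fit inside a single node.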

rsiyer-intel · 2026-02-12T23:24:07Z

Need to add a link in the main README under Software https://github.com/intel/optimization-zone/blob/main/README.md

rsiyer-intel · 2026-02-13T19:09:30Z

software/kafka/README.md

+
+## Single-node BIOS Configuration Recommendations
+If the user has access to the BIOS for a system, here are some parameters that can be changed to improve Kafka performance.
+- **Sub-NUMA CLustering (SNC)**: enabls multiple NUMA nodes so each broker can run on its own NUMA node 


Typo: "enabls" should be "enables".

Also, "CLustering" should be "Clustering":
Sub-NUMA Clustering

rsiyer-intel · 2026-02-13T19:14:16Z

software/kafka/README.md

+net.ipv4.tcp_wmem='4096 65536 16777216'
+
+################################################################
+#setting the system to performance mode for best possible perf #


Comment spacing

setting
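
For context on the `net.ipv4.tcp_wmem` line in the hunk above: the Linux setting takes three byte values (minimum, default, and maximum send-buffer size). A minimal sketch that splits the value quoted in the diff into those fields; `parse_tcp_wmem` is a hypothetical helper for illustration, not part of the guide.

```python
def parse_tcp_wmem(value: str) -> dict[str, int]:
    """Split a tcp_wmem string into its min/default/max byte values."""
    fields = value.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 values, got {len(fields)}")
    minimum, default, maximum = (int(f) for f in fields)
    if not minimum <= default <= maximum:
        raise ValueError("expected min <= default <= max")
    return {"min": minimum, "default": default, "max": maximum}


wmem = parse_tcp_wmem("4096 65536 16777216")
for name, size in wmem.items():
    print(f"{name}: {size} bytes ({size / 1024:g} KiB)")
```

With the value from the diff, the maximum send buffer is 16777216 bytes, which is 16 MiB.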

rsiyer-intel · 2026-02-13T19:19:53Z

software/kafka/README.md

+  - Example: Cloud instance with 16 vCPUs
+    - `num.network.threads=6`: should be less than or equal to half the CPU cores assigned to a broker
+    - `num.io.threads=8`: should be less than or equal to the count of CPU cores assigned to a broker
+    - `num.replica.fetchers=2`: increased beyond the default of 2 to improve replication latency


It says increased beyond the default of 2, but it is set to 2.
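
The thread-count bounds stated in the hunk above are simple enough to check mechanically. A minimal sketch, assuming the rules are exactly as the README words them (network threads at most half the broker's CPUs, I/O threads at most the CPU count); `check_thread_settings` is a hypothetical helper, not a Kafka API.

```python
def check_thread_settings(cpus: int, network_threads: int, io_threads: int) -> list[str]:
    """Return a warning per violated bound; an empty list means both hold."""
    warnings = []
    # "less than or equal to half the CPU cores assigned to a broker"
    if network_threads * 2 > cpus:
        warnings.append(
            f"num.network.threads={network_threads} exceeds half of {cpus} CPUs"
        )
    # "less than or equal to the count of CPU cores assigned to a broker"
    if io_threads > cpus:
        warnings.append(f"num.io.threads={io_threads} exceeds {cpus} CPUs")
    return warnings


# The 16-vCPU example from the diff satisfies both bounds.
print(check_thread_settings(cpus=16, network_threads=6, io_threads=8))
```

A check like this covers only the two numeric bounds; it says nothing about `num.replica.fetchers`, whose wording is the subject of the comment above.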

rsiyer-intel · 2026-02-13T19:40:49Z

software/kafka/README.md

+Another potential resource bottleneck in a cloud deployment can be the storage bandwidth of volumes in their default configuration. It's usually possible to increase the I/O operations per second (IOPS) and bandwidth for a volume at creation time. It's recommended that these volumes be configured with high IOPS and throughput where possible. If storage performance of a single volume that's been configured for maximum throughput is still insufficient to meet an SLA, additional volumes may be attached to brokers or the brokers may be moved to instances with direct-attached NVMes. 
+As with other system resources, storage telemetry should be monitored to ensure individual devices are not operating beyond their allotted steady-state performance.
+Scaling storage when hitting instance resource limits is somewhat more flexible than scaling the network because, in addition to the possibility of growing the cluster capacity with scale-out of additional brokers, additional storage volumes can usually be added to brokers to increase their storage capacity.
+An alternative to adding volumes would be to scale up the brokers to systems with direct-attached NVMe's that enable high-performance storage.


minor - NVMes

matt-welch added 4 commits January 27, 2026 09:47

Draft of Kafka BKM guide

6938d20

- Needs review

docs(kafka): Update BKM for Kafka 4.2 RC usage

6a7f7bf

- Add disclaimer note about 4.2 RC testing
- Update all version references to specify "release candidate (RC)"
- Remove TODO placeholders and update docs URLs to Kafka 4.1
- Clarify vm.dirty_background_bytes applies system-wide

Merge branch 'main' of github.com:intel/optimization-zone

3fdd455

Add link to Kafka under main README

9d5e204

rsiyer-intel reviewed Feb 13, 2026

Kafka BKM #21
matt-welch wants to merge 5 commits into intel:main from matt-welch:kafka-bkm

2 participants