Spring Boot with Reactor Kafka: increasing message consumption throughput on a 2-CPU pod (answers)

【Question title】: Spring Boot with Reactor Kafka: increase message consumption throughput on a 2-CPU pod
【Posted】: 2023-02-02 14:56:10
【Question】:

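A small question about a Spring Boot 3 application that uses Reactor Kafka.

I have a small reactive Kafka consumer application that consumes messages from Kafka and processes them.

The application consumes one topic, the-topic, which has three partitions.

The application is dockerized and, because of resource limits, it can use only 2 CPUs (please bear with me on this). To make things harder, I can only have one single instance of this application running.

The application is very simple: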

     <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-webflux</artifactId>
        </dependency>
        <dependency>
            <groupId>io.projectreactor.kafka</groupId>
            <artifactId>reactor-kafka</artifactId>
        </dependency>
    </dependencies>

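import java.util.Collections;
import java.util.List;

import org.springframework.boot.autoconfigure.kafka.KafkaProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import reactor.kafka.receiver.KafkaReceiver;
import reactor.kafka.receiver.ReceiverOptions;

@Configuration
public class MyKafkaConfiguration {

    @Bean
    public KafkaReceiver<String, String> reactiveKafkaConsumerTemplate(KafkaProperties kafkaProperties) {
        kafkaProperties.setBootstrapServers(List.of("my-kafka.com:9092"));
        kafkaProperties.getConsumer().setGroupId("should-i-do-something-here");
        // In reactor-kafka 1.3+, subscription(...) returns a new ReceiverOptions instance, so keep the result.
        final ReceiverOptions<String, String> receiverOptions = ReceiverOptions
                .<String, String>create(kafkaProperties.buildConsumerProperties())
                .subscription(Collections.singletonList("the-topic"));
        return KafkaReceiver.create(receiverOptions);
    }

}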

@Service
public class MyConsumer implements CommandLineRunner {

    @Autowired
    private KafkaReceiver<String, String> kafkaReceiver;


    @Override
    public void run(String... args) {
        myConsumer().subscribe();
    }

    public Flux<String> myConsumer() {
        return kafkaReceiver.receive()
                .flatMap(oneMessage -> consume(oneMessage))
                .doOnNext(abc -> System.out.println("successfully consumed: " + abc))
                .doOnError(throwable -> System.out.println("something bad happened while consuming: " + throwable.getMessage()));
    }

    private Mono<String> consume(ConsumerRecord<String, String> oneMessage) {
        // This first line is a heavy in-memory computation that transforms the incoming message into the data to be saved.
        // It is very CPU-intensive, but has been verified as NON-BLOCKING with several tools, and takes 1 second :D
        String transformedStringCPUIntensiveNonButNonBLocking = transformDataNonBlockingWithIntensiveOperation(oneMessage);
        // Then just save the transformed data into any REACTIVE repository :)
        return myReactiveRepository.save(transformedStringCPUIntensiveNonButNonBLocking);
    }

}

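If I understand Project Reactor correctly, and given my resource limits, I will have at most 2 Reactor cores.

The consume method here has been tested to be non-blocking, but it takes one second to process a message.

Does that mean I can consume only 2 messages per second? (Hopefully not.)

The messages can be consumed in any order, and I would like to maximize throughput with this single application.

How can I maximize the parallelism/throughput of this application under these constraints?

Thank you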

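【Comments】:

So, is your consume method CPU-intensive? If it is, there is not much you can do, because it needs the CPU full-time to do its work. However, if you see that your CPU is not fully used, your consume function may be blocking in some way. Can you provide some information about what consume does? For it to be non-blocking, it would have to perform only in-memory computation. Otherwise, if it sends data to a database or a network service, it blocks.

Tags: java spring-boot apache-kafka spring-kafka reactor-kafka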

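【Solution 1】:

If you want to process the messages from a Flux publisher in parallel, you have to use the flatMap operator, because the map operator runs synchronously, handling items one at a time.

When you use flatMap, you can rely on Reactor and let it control the concurrency, or you can specify the concurrency you want through the concurrency parameter, i.e. flatMap(it -> consume(it), YOUR_CONCURRENCY_VALUE).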

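If your consume() method is not a publisher:

You can wrap it in a Mono with Mono.fromCallable() and subscribe it on a scheduler designed for blocking tasks (subscribeOn, rather than publishOn, is what moves the callable itself off the calling thread):

.subscribeOn(Schedulers.boundedElastic())

It is better, though, to rewrite all of the consumer code with reactive types; otherwise you lose the benefits of using Reactor.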
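
As an illustration of the two points above (a concurrency argument on flatMap, and wrapping a non-publisher method), here is a minimal sketch that needs only reactor-core on the classpath. The class name and the `consumeSync` method are hypothetical stand-ins, not the asker's real code, and the input is a plain `Flux.just(...)` instead of a Kafka receiver. It uses `subscribeOn` so that the wrapped callable itself runs on the scheduler; for work that is CPU-bound rather than blocking, `Schedulers.parallel()` may be a better fit than `boundedElastic()`.

```java
import java.time.Duration;
import java.util.List;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class FlatMapConcurrencySketch {

    // Hypothetical stand-in for a synchronous consume(...) that does not return a Publisher.
    static String consumeSync(String message) {
        return message.toUpperCase();
    }

    // Wraps each synchronous call in a Mono and keeps at most `concurrency` of them in flight.
    static Flux<String> process(Flux<String> messages, int concurrency) {
        return messages.flatMap(
                message -> Mono.fromCallable(() -> consumeSync(message))
                        .subscribeOn(Schedulers.boundedElastic()),
                concurrency);
    }

    public static void main(String[] args) {
        List<String> results = process(Flux.just("a", "b", "c"), 2)
                .collectList()
                .block(Duration.ofSeconds(5));
        // flatMap does not preserve the source order, so the three results may arrive in any order.
        System.out.println(results);
    }
}
```

In the question's code, the same second argument would be added to the existing `flatMap(oneMessage -> consume(oneMessage))` call.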

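【Comments】:

Thanks @Vladen for the answer! The consume method is already non-blocking (tested with BlockHound). What is the magic value of YOUR_CONCURRENCY_VALUE on 2 CPUs to maximize throughput?
256 is the default flatMap concurrency, but you can find the right value for your use case through testing.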

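【Solution 2】:

We can apply Little's Law to calculate the concurrency needed to handle the required throughput.

workers >= throughput x latency, where in our case workers is the number of messages processed in parallel.

For example, to process 100 messages per second with a latency of 60 seconds, we need a concurrency of 100 x 60 = 6000. In a "traditional" blocking application we would need the same number of threads. In a reactive application, the same workload can be handled by only a few threads, and therefore with far less memory. Even if a message takes 30-60 seconds to process, the threads are not blocked, because all I/O operations are asynchronous. To scale processing, you need to either reduce latency or increase concurrency.

In our example we need to process 6000 messages in parallel. With 3 partitions, you could have 3 consumers, each processing 2000 messages in parallel.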
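
The arithmetic above can be written down as a small JDK-only sketch. The class and method names are made up for illustration. The last method is an added assumption, not part of the answer: if the one-second transformation from the question really keeps a core busy for the whole second, then the number of cores divided by the CPU time per message gives a rough ceiling that a higher flatMap concurrency alone would not lift.

```java
public class LittlesLawSketch {

    // Little's Law: messages in flight = throughput (messages/s) x latency (s).
    static long requiredConcurrency(double throughputPerSecond, double latencySeconds) {
        return (long) Math.ceil(throughputPerSecond * latencySeconds);
    }

    // Even split of the in-flight messages across consumers, rounded up.
    static long perConsumer(long totalConcurrency, int consumers) {
        return (totalConcurrency + consumers - 1) / consumers;
    }

    // Assumption: each message needs `cpuSecondsPerMessage` of pure CPU time on one core.
    static double cpuBoundCeiling(int cores, double cpuSecondsPerMessage) {
        return cores / cpuSecondsPerMessage;
    }

    public static void main(String[] args) {
        long total = requiredConcurrency(100, 60);
        System.out.println(total);                   // 6000
        System.out.println(perConsumer(total, 3));   // 2000
        System.out.println(cpuBoundCeiling(2, 1.0)); // 2.0 messages per second
    }
}
```

The first two numbers reproduce the 6000 and 2000 from the example; the third is only meaningful when the work is CPU-bound rather than waiting on I/O.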

By default, flatMap processes Queues.SMALL_BUFFER_SIZE = 256 messages in parallel, but you can make this value configurable.

kafkaReceiver.receive()
    .flatMap(oneMessage -> consume(oneMessage), concurrency)
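
One possible way to make `concurrency` configurable, sketched with the JDK only: read it from an environment variable and fall back to the default of 256. The variable name `CONSUMER_CONCURRENCY` and the class name are hypothetical; in a Spring Boot application an `@Value` or `@ConfigurationProperties` binding would be the more usual route.

```java
public class ConcurrencySetting {

    // Same value as Reactor's default flatMap concurrency (Queues.SMALL_BUFFER_SIZE).
    static final int DEFAULT_CONCURRENCY = 256;

    // Parses the raw setting; a missing or blank value falls back to the default.
    static int resolve(String rawValue) {
        if (rawValue == null || rawValue.isBlank()) {
            return DEFAULT_CONCURRENCY;
        }
        int value = Integer.parseInt(rawValue.trim());
        if (value < 1) {
            throw new IllegalArgumentException("concurrency must be >= 1, got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Hypothetical environment variable name, chosen for this sketch only.
        int concurrency = resolve(System.getenv("CONSUMER_CONCURRENCY"));
        // The resolved value would then be passed as the second argument:
        // kafkaReceiver.receive().flatMap(oneMessage -> consume(oneMessage), concurrency)
        System.out.println("flatMap concurrency = " + concurrency);
    }
}
```

Keeping the value outside the code makes it possible to try several settings during load tests without rebuilding the image.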

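It is hard to say how many messages one application can handle; you need to run load tests to find the maximum throughput. Try to push this number as high as you can and watch your metrics to find the limit. If the application cannot handle the load, you will need to increase the number of partitions and deploy more consumers.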

【Comments】: