Translation: Kafka consumer groups
2015-04-10 11:28
Messaging traditionally has two models: queuing and publish-subscribe.
In a queue, a pool of consumers may read from a server and each message goes to one of them; in publish-subscribe the message is broadcast to all consumers.
Kafka offers a single consumer abstraction that generalizes both of these: the consumer group.
Consumers label themselves with a consumer group name, and each message published to a topic is delivered to one consumer instance
within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.
If all the consumer instances have the same consumer group, then this works just like a traditional queue balancing load over the consumers.
If all the consumer instances have different consumer groups, then this works like publish-subscribe and all messages are broadcast to all consumers.
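The two cases above can be illustrated with a small simulation. This is only a sketch and not Kafka client code: the `deliver` function, the instance names and the group names are made up for the example, and the per-message round-robin stands in for Kafka's real load balancing, which works per partition.

```python
from collections import defaultdict
from itertools import cycle


def deliver(messages, consumers):
    """Deliver each message to one instance in every consumer group.

    `consumers` maps a (hypothetical) instance name to its group name.
    Returns {instance: [messages received]}.
    """
    # Collect the instances of each group, keeping insertion order.
    groups = defaultdict(list)
    for instance, group in consumers.items():
        groups[group].append(instance)

    received = {instance: [] for instance in consumers}
    # One round-robin iterator per group stands in for load balancing.
    pickers = {group: cycle(members) for group, members in groups.items()}
    for message in messages:
        for group in groups:
            received[next(pickers[group])].append(message)
    return received


messages = ["m0", "m1", "m2", "m3"]

# Same group for every instance: the messages are split, as in a queue.
queue_like = deliver(messages, {"c1": "g", "c2": "g"})

# A different group per instance: every instance sees every message.
pubsub_like = deliver(messages, {"c1": "g1", "c2": "g2"})

print(queue_like)
print(pubsub_like)
```

With one shared group the four messages are divided between `c1` and `c2`; with one group per instance both receive all four.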
More commonly, however, we have found that topics have a small number of consumer groups, one for each “logical subscriber”.
Each group is composed of many consumer instances for scalability and fault tolerance.
This is nothing more than publish-subscribe semantics where the subscriber is a cluster of consumers instead of a single process.
Kafka has stronger ordering guarantees than a traditional messaging system, too.
A traditional queue retains messages in order on the server,
and if multiple consumers consume from the queue then the server hands out messages in the order they are stored.
However, although the server hands out messages in order, the messages are delivered asynchronously to consumers,
so they may arrive out of order on different consumers. This effectively means the ordering of the messages is lost in the presence of parallel consumption.
Messaging systems often work around this by having a notion of “exclusive consumer”
that allows only one process to consume from a queue, but of course this means
that there is no parallelism in processing.
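To see why parallel consumption loses ordering, here is a rough simulation with made-up processing times. The function `completion_order` and the consumer names are hypothetical. Time is simulated as plain integers, so nothing here talks to a real message broker.

```python
def completion_order(messages, processing_time):
    """Hand out messages in order and report the order in which they finish.

    `processing_time` maps a (hypothetical) consumer name to the simulated
    time it needs per message.
    """
    consumers = list(processing_time)
    free_at = {consumer: 0 for consumer in consumers}
    finished = []
    for message in messages:
        # The server hands the next message to whichever consumer is free first.
        consumer = min(consumers, key=lambda c: free_at[c])
        free_at[consumer] += processing_time[consumer]
        finished.append((free_at[consumer], message))
    # Sort by finish time to get the order in which processing completes.
    finished.sort(key=lambda item: item[0])
    return [message for _, message in finished]


messages = ["m0", "m1", "m2", "m3"]

# Two consumers of different speed: messages finish out of order.
parallel = completion_order(messages, {"slow": 5, "fast": 1})

# An "exclusive consumer": order is kept, but nothing runs in parallel.
exclusive = completion_order(messages, {"only": 1})

print(parallel)
print(exclusive)
```

In the parallel run `m0` goes to the slow consumer and finishes last, even though it was handed out first. The exclusive run keeps the original order.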
Kafka does it better. By having a notion of parallelism (the partition) within the topics, Kafka is able to provide both ordering guarantees
and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers
in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer
is the only reader of that partition and consumes the data in order. Since there are many partitions this still balances the load over many consumer instances.
Note, however, that a consumer group cannot usefully have more consumer instances than partitions: any extra instance is assigned no partition and sits idle.
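The assignment rule can be sketched as follows. The round-robin strategy and the `assign_partitions` helper are assumptions made for illustration and are not necessarily how Kafka itself distributes partitions. What matters is that each partition gets exactly one owner within the group.

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer in the group (round-robin)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Fewer consumers than partitions: each consumer owns several partitions.
balanced = assign_partitions(partitions, ["c1", "c2"])

# More consumers than partitions: the surplus consumer owns nothing.
oversized = assign_partitions(partitions, ["c1", "c2", "c3", "c4", "c5"])

print(balanced)
print(oversized)
```

In the second call `c5` ends up with an empty list, which is why adding instances beyond the partition count brings no extra parallelism.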
Kafka only provides a total order over messages within a partition, not between different partitions in a topic.
Per-partition ordering combined with the ability to partition data by key is sufficient for most applications.
However, if you require a total order over messages this can be achieved with a topic that has only one partition,
though this will mean only one consumer process.
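A minimal sketch of partitioning by key, assuming a CRC32 hash as the partitioner. That hash is chosen here only because it is deterministic and in the Python standard library, so it should not be read as Kafka's own partitioner. The helpers `partition_for`, `publish` and `values_for`, and the sample records, are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a key to a partition; the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(records, num_partitions):
    """Append (key, value) records to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for key, value in records:
        logs[partition_for(key, num_partitions)].append((key, value))
    return logs


def values_for(logs, key, num_partitions):
    """Read back the values of one key from the partition that holds it."""
    partition = partition_for(key, num_partitions)
    return [value for k, value in logs[partition] if k == key]


records = [("alice", 1), ("bob", 1), ("alice", 2), ("bob", 2), ("alice", 3)]

# Several partitions: order is kept per key, not across the whole topic.
logs = publish(records, num_partitions=3)
alice_values = values_for(logs, "alice", 3)
bob_values = values_for(logs, "bob", 3)

# A single partition: the whole topic keeps one total order.
single = publish(records, num_partitions=1)

print(alice_values, bob_values)
print(single[0])
```

Because all records for one key share a partition, a consumer reads them in the order they were published. Only the one-partition topic preserves the order across different keys.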