Translating Kafka's Consumer Groups

2015-04-10 11:28
Messaging traditionally has two models: queuing and publish-subscribe.

In a queue, a pool of consumers may read from a server and each message goes to one of them; in publish-subscribe the message is broadcast to all consumers.

Kafka offers a single consumer abstraction that generalizes both of these—the consumer group.

Consumers label themselves with a consumer group name, and each message published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.

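Traditional messaging systems have two models: queuing and publish-subscribe.

With a queue, a pool of consumers reads from a server and each message goes to just one of them; with publish-subscribe, each message is broadcast to all consumers.

Kafka offers a single consumer abstraction, the consumer group, that covers both models.

Each consumer labels itself with a consumer group name. Every message published to a topic is delivered to one consumer instance in each subscribing consumer group. Consumer instances may run in separate processes or on separate machines.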

If all the consumer instances have the same consumer group, then this works just like a traditional queue balancing load over the consumers.

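If all consumer instances belong to the same consumer group, this behaves like a traditional queue that balances load across the consumers.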

If all the consumer instances have different consumer groups, then this works like publish-subscribe and all messages are broadcast to all consumers.

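If every consumer instance belongs to a different consumer group, this behaves like publish-subscribe, and every message is broadcast to all consumers.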
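The two cases above can be played out in a toy model. The sketch below is not Kafka code and talks to no broker: `deliver`, the instance names, and the group names are all made up for illustration, and round-robin within a group is only an assumed stand-in for whatever balancing really happens. It applies the single rule "one instance per subscribing group receives each message" and shows both behaviours falling out of it.

```python
from collections import defaultdict
from itertools import cycle


def deliver(messages, group_of):
    """Toy model: hand each message to one instance in every consumer group.

    group_of maps a (hypothetical) consumer instance name to its group name.
    Round-robin inside a group is an assumption made for this sketch.
    """
    members = defaultdict(list)
    for instance, group in group_of.items():
        members[group].append(instance)

    # One round-robin picker per group.
    pickers = {g: cycle(sorted(m)) for g, m in members.items()}

    received = defaultdict(list)
    for msg in messages:
        for group in sorted(pickers):
            received[next(pickers[group])].append(msg)
    return dict(received)


messages = ["m0", "m1", "m2", "m3"]

# Same group for everyone: the messages are split, as with a queue.
queue_like = deliver(messages, {"c1": "g", "c2": "g"})

# A different group for everyone: each consumer sees every message.
pubsub_like = deliver(messages, {"c1": "g1", "c2": "g2"})

print(queue_like)
print(pubsub_like)
```

With two instances in one group each gets half of the messages; with one instance per group each gets all of them.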

More commonly, however, we have found that topics have a small number of consumer groups, one for each “logical subscriber”.

Each group is composed of many consumer instances for scalability and fault tolerance.

This is nothing more than publish-subscribe semantics where the subscriber is a cluster of consumers instead of a single process.

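More commonly, though, we have found that a topic has a small number of consumer groups, one per logical subscriber.

Each group consists of many consumer instances, for scalability and fault tolerance.

This is simply publish-subscribe semantics in which the subscriber is a cluster of consumers rather than a single process.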

Kafka has stronger ordering guarantees than a traditional messaging system, too.

A traditional queue retains messages in order on the server, and if multiple consumers consume from the queue then the server hands out messages in the order they are stored.

However, although the server hands out messages in order, the messages are delivered asynchronously to consumers, so they may arrive out of order on different consumers. This effectively means the ordering of the messages is lost in the presence of parallel consumption.

Messaging systems often work around this by having a notion of an “exclusive consumer” that allows only one process to consume from a queue, but of course this means that there is no parallelism in processing.

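A traditional queue keeps messages in order on the server; if multiple consumers read from the queue, the server hands messages out in the order they are stored.

But even though the server hands messages out in order, they are delivered to consumers asynchronously, so they may reach different consumers out of order. In effect, message ordering is lost under parallel consumption.

Messaging systems often work around this with the notion of an “exclusive consumer”, which allows only one process to consume from a queue, but that means processing has no parallelism.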

Kafka does it better. By having a notion of parallelism, the partition, within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer is the only reader of that partition and consumes the data in order. Since there are many partitions this still balances the load over many consumer instances.

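Note, however, that a consumer group cannot usefully have more consumer instances than the topic has partitions: each partition goes to exactly one instance, so any extra instances would have nothing to consume.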

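Kafka does this better. With the partition as a built-in unit of parallelism within a topic, Kafka can provide both ordering guarantees and load balancing across a pool of consumers. It does so by assigning the topic's partitions to the consumers in a consumer group, so that each partition is consumed by exactly one consumer instance in the group.

This guarantees that the consumer is the only reader of its partition and consumes the data in order. Because there are many partitions, the load is still balanced across many consumer instances.

Note that the number of consumer instances still cannot exceed the number of partitions.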
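To make the assignment rule concrete, here is a minimal sketch. The text does not say which strategy Kafka uses to hand partitions to consumers, so the round-robin below is purely an assumption, and `assign_partitions` and the consumer names are hypothetical. What it does show is the invariant from the text: every partition ends up with exactly one owner within the group.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer of a single group.

    Round-robin is an assumed strategy for illustration only.
    """
    if not consumers:
        raise ValueError("need at least one consumer")
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions, two consumers: each consumer reads two partitions.
balanced = assign_partitions([0, 1, 2, 3], ["c1", "c2"])

# Two partitions, three consumers: the third consumer gets nothing.
oversubscribed = assign_partitions([0, 1], ["c1", "c2", "c3"])

print(balanced)
print(oversubscribed)
```

The second call illustrates why the number of partitions caps the parallelism of a group: once every partition has an owner, there is nothing left for a further instance to read.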

Kafka only provides a total order over messages within a partition, not between different partitions in a topic.

Per-partition ordering combined with the ability to partition data by key is sufficient for most applications.

However, if you require a total order over messages this can be achieved with a topic that has only one partition, though this will mean only one consumer process.

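Kafka provides a total order over messages only within a partition, not across the different partitions of a topic.

Per-partition ordering, combined with the ability to partition data by key, is enough for most applications. If you do need a total order over all messages, though, the topic must have a single partition, which also means only one consumer process.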
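Why partitioning by key is usually enough can be seen in a small simulation. This is a sketch only: `partition_for` and `publish` are hypothetical helpers, and the CRC32 hash is a stand-in chosen because it is deterministic in Python, not the partitioner Kafka itself uses. The point is that all records sharing a key land in the same partition, so their relative order survives even though the topic as a whole has no total order.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition (CRC32 is an assumed stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(records, num_partitions):
    """Append each (key, value) record to the log of its key's partition."""
    log = {p: [] for p in range(num_partitions)}
    for key, value in records:
        log[partition_for(key, num_partitions)].append((key, value))
    return log


records = [
    ("user-a", 1), ("user-b", 1), ("user-a", 2),
    ("user-b", 2), ("user-a", 3),
]
log = publish(records, 3)

# Each key's values appear in publish order within its single partition.
for key in ("user-a", "user-b"):
    p = partition_for(key, 3)
    print(key, "-> partition", p, [v for k, v in log[p] if k == key])
```

Calling `publish(records, 1)` instead puts everything into one partition and yields a total order, at the price of a single consumer process, as the text notes.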