
[Erlang in Anger](3.3) Discarding Data, Part 1

2014-11-06 20:16
  Original article; when reposting, please credit the source: 服务器非业余研究 http://blog.csdn.net/erlib, author Sunface


Discarding Data

When nothing outside of your Erlang system can be slowed down and the system itself can’t be scaled up, you must either drop data or crash (which in most cases also drops whatever data was in flight, only with more violence).

It’s a sad reality that nobody really wants to deal with. Programmers, software engineers, and computer scientists are trained to purge the useless data and keep everything that’s useful. Success comes through optimization, not giving up.

However, a point can be reached where data comes in faster than it goes out, even if the Erlang system on its own is able to do everything fast enough. In some cases, it’s the component after it that blocks.

If you don’t have the option of limiting how much data you receive, you then have to drop messages to avoid crashing.

Random Drop

Randomly dropping messages is the easiest way to do such a thing, and might also be the most robust implementation, due to its simplicity.
The trick is to define some threshold value between 0.0 and 1.0 (in the code below, the fraction of messages to keep) and to fetch a random number in that range:

-------------------------------------------------------------------—----
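-module(drop).
-export([random/1]).

%% Returns true when the message should be kept (probability Rate).
%% Note: random and erlang:now/0 are deprecated in newer OTP releases.
random(Rate) ->
    maybe_seed(),               % seed the generator if needed
    random:uniform() =< Rate.   % true if the number is at or below the threshold

maybe_seed() ->
    case get(random_seed) of
        undefined -> random:seed(erlang:now());
        {X,X,X} -> random:seed(erlang:now());
        _ -> ok
    end.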

-------------------------------------------------------------------—----

If you aim to keep 95% of the messages you send, the authorization could be written with a case expression, or with a shorter andalso form if you don’t need to do anything specific when dropping a message:
-------------------------------------------------------------------—----
case drop:random(0.95) of
    true -> send();
    false -> drop()
end.

-------------------------------------------------------------------—----
 When nothing specific needs to happen on a drop, the shorter form is:
-------------------------------------------------------------------—----
drop:random(0.95) andalso send().

-------------------------------------------------------------------—----

The maybe_seed() function checks whether a valid seed is already present in the process dictionary and, if so, uses it rather than a crappy one. It only seeds the generator when no valid seed has been defined before, in order to avoid calling now() (a monotonic function that requires a global lock) too often. (Translator's note: see Sunface's post "Erlang取当前时间的瓶颈以及解决方案", on the bottleneck of fetching the current time in Erlang and how to work around it.)

There is one ‘gotcha’ to this method, though: the random drop must ideally be done at the producer level rather than at the queue (the receiver) level.
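On OTP releases that ship the rand module, the seeding dance can likely be skipped, because rand seeds itself per process on first use. The following is only a sketch under that assumption; the module name drop_rand is hypothetical and not part of the original text.

```erlang
-module(drop_rand).
-export([random/1]).

%% Returns true with probability Rate, where Rate is the fraction of
%% messages to keep. rand:uniform/0 yields a float X with 0.0 =< X < 1.0,
%% so Rate = 0.0 never keeps a message and Rate = 1.0 always does.
%% No explicit seeding: rand seeds the calling process on first use.
random(Rate) when is_number(Rate), Rate >= 0, Rate =< 1 ->
    rand:uniform() < Rate.
```

It is used the same way as drop:random/1, for example drop_rand:random(0.95) andalso send().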

The best way to avoid overloading a queue is to not send data its way in the first place. Because there are no bounded mailboxes in Erlang, dropping in the receiving process only guarantees that this process will be spinning wildly, trying to get rid of messages, and fighting the schedulers to do actual work.

On the other hand, dropping at the producer level is guaranteed to distribute the work equally across all processes.
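Dropping at the producer can be as small as a wrapper around the send operator. This is a minimal sketch, assuming the rand module is available; producer_drop and maybe_send/3 are hypothetical names introduced for illustration.

```erlang
-module(producer_drop).
-export([maybe_send/3]).

%% Decide in the sending process whether Msg ever reaches Dest.
%% KeepRate is the fraction of messages to keep (0.0 to 1.0).
%% A dropped message never enters the receiver's mailbox.
maybe_send(Dest, Msg, KeepRate) ->
    case rand:uniform() < KeepRate of
        true ->
            Dest ! Msg,
            sent;
        false ->
            dropped
    end.
```

Because the decision is made before the message is sent, the receiver spends no time discarding anything.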

This can give rise to interesting optimizations where the working process or a given monitor process [15] uses values in an ETS table or application:set_env/3 to dynamically increase and decrease the threshold to be used with the random number.

This allows control over how many messages are dropped based on overload, and the configuration data can be fetched by any process rather efficiently by using application:get_env/2.
Similar techniques could also be used to implement different drop ratios for different message priorities, rather than trying to sort it all out at the consumer level.
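One way this could look is sketched below. The application name my_app, the key keep_rate, and the halve-or-recover policy are all assumptions made up for the example; only application:get_env/2, application:set_env/3 and process_info/2 come from the text.

```erlang
-module(drop_config).
-export([keep_rate/0, adjust/2]).

-define(APP, my_app).   % hypothetical application name

%% Read by producers before each send; defaults to keeping everything.
keep_rate() ->
    case application:get_env(?APP, keep_rate) of
        {ok, Rate} -> Rate;
        undefined -> 1.0
    end.

%% Called periodically by a monitor process watching Pid.
%% The policy is arbitrary: halve the keep rate while the mailbox is
%% longer than MaxLen (never below 0.1), otherwise recover in 0.1 steps.
adjust(Pid, MaxLen) ->
    case process_info(Pid, message_queue_len) of
        {message_queue_len, Len} when Len > MaxLen ->
            application:set_env(?APP, keep_rate, max(0.1, keep_rate() * 0.5));
        {message_queue_len, _} ->
            application:set_env(?APP, keep_rate, min(1.0, keep_rate() + 0.1));
        undefined ->
            ok   % the watched process no longer exists
    end.
```

A producer would then compare a random number against drop_config:keep_rate() instead of a hard-coded threshold.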

[15] Any process tasked with checking the load of specific processes using heuristics such as process_info(Pid, message_queue_len) could be a monitor.

Queue Buffers

Queue buffers are a good alternative when you want more control over the messages you get rid of than with random drops, particularly when you expect overload to come in bursts rather than as a constant stream in need of thinning.

Even though the regular mailbox for a process has the form of a queue, you’ll generally want to pull all the messages out of it as soon as possible. A queue buffer will need two processes to be safe:
• The regular process you’d work with (likely a gen_server);
• A new process that will do nothing but buffer the messages. Messages from the outside should go to this process.

To make things work, the buffer process only has to remove all the messages it can from its mailbox and put them in a queue data structure [16] it manages on its own.

Whenever the server is ready to do more work, it can ask the buffer process to send it a given number of messages that it can work on. The buffer process picks them from its queue, forwards them to the server, and goes back to accumulating data.
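The two-process arrangement described so far can be sketched as follows. This is an illustration only: the module name buffer_proc, the {take, N, From, Ref} request shape and the 5 second timeout are assumptions, and no size limit is applied to the queue here.

```erlang
-module(buffer_proc).
-export([start/0, take/2]).

%% Start a buffer process that owns a queue of pending messages.
start() ->
    spawn(fun() -> loop(queue:new()) end).

%% Called by the server when it is ready for up to N more messages.
take(Buffer, N) ->
    Ref = make_ref(),
    Buffer ! {take, N, self(), Ref},
    receive
        {Ref, Msgs} -> Msgs
    after 5000 ->
        timeout
    end.

loop(Q) ->
    receive
        {take, N, From, Ref} ->
            {Msgs, Rest} = take_n(N, Q, []),
            From ! {Ref, Msgs},
            loop(Rest);
        Msg ->
            %% Anything else is outside traffic: empty the mailbox
            %% quickly and keep the message in our own queue.
            loop(queue:in(Msg, Q))
    end.

%% Take at most N messages, oldest first.
take_n(N, Q, Acc) when N =< 0 ->
    {lists:reverse(Acc), Q};
take_n(N, Q, Acc) ->
    case queue:out(Q) of
        {{value, M}, Q1} -> take_n(N - 1, Q1, [M | Acc]);
        {empty, Q1} -> {lists:reverse(Acc), Q1}
    end.
```

Outside producers send to the pid returned by start/0, and the server calls take/2 whenever it has capacity.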

Whenever the queue grows beyond a certain size [17] and you receive a new message, you can pop the oldest one and push the new one in its place, dropping the oldest elements as you go. [18]

This should keep the total number of buffered messages at a rather stable size and provide a good amount of resistance to overload, somewhat like a functional version of a ring buffer.
The PO Box [19] library implements such a queue buffer.
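The drop-oldest policy with an explicit length counter (as footnote [17] suggests) can be sketched with the queue module. This is not how PO Box is implemented; bounded_buf and its functions are hypothetical names used only for illustration.

```erlang
-module(bounded_buf).
-export([new/1, push/2, pop/1]).

%% State is {Queue, Length, MaxLength}. Length is maintained by hand
%% so the queue never has to be traversed to learn its size.
new(Max) when is_integer(Max), Max > 0 ->
    {queue:new(), 0, Max}.

%% Add Msg. When the buffer is full, the oldest element is dropped
%% first, so the length stays at Max.
push(Msg, {Q, Len, Max}) when Len >= Max ->
    {_Oldest, Q1} = queue:out(Q),
    {queue:in(Msg, Q1), Len, Max};
push(Msg, {Q, Len, Max}) ->
    {queue:in(Msg, Q), Len + 1, Max}.

%% Remove the oldest remaining message, or return empty.
pop({Q, Len, Max}) ->
    case queue:out(Q) of
        {{value, Msg}, Q1} -> {Msg, {Q1, Len - 1, Max}};
        {empty, _} -> empty
    end.
```

With a limit of 2, pushing 1, 2 and 3 leaves 2 and 3 in the buffer: the oldest message was discarded to make room.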

[16] The queue module in Erlang provides a purely functional queue data structure that can work fine for such a buffer.
[17] To calculate the length of a queue, it is preferable to use a counter that gets incremented and decremented on each message sent or received, rather than iterating over the queue every time. It takes slightly more memory, but will tend to distribute the load of counting more evenly, helping predictability and avoiding more sudden build-ups in the buffer’s mailbox.
[18] You can alternatively make a queue that drops the newest message and keeps the oldest ones queued up, if you feel previous data is more important to keep.
[19] Available at: https://github.com/ferd/pobox. The library has been used in production for a long time in large-scale products at Heroku and is considered mature.
Tags: erlang