F5 CMP architecture
2010-08-08 22:23
Most manufacturers would simply use SMP to distribute a TMOS process across multiple processors, all sharing the same memory, network card, and special-purpose processors. Others might run multiple instances of the TMM on different processors, still with the requisite shared memory, network card, and special-purpose processors. Instead, CMP (clustered multiprocessing) load-balances traffic across multiple processing cores, each with its own dedicated memory, network interface, and special-purpose processors. Each core runs its own, completely independent TMM process. By removing the dependencies between the instances, CMP allows more of the traffic management process, virtually the entire process, to be parallelized. This provides a substantial benefit to the overall performance of the system.
The hardware that enables CMP comprises two important, proprietary F5 technologies: the Disaggregator and the High Speed Bridge (HSB).
The Disaggregator acts as a hardware-based load balancer, distributing traffic flows among the independent TMM instances and managing flow affinity when necessary. Not only does this facilitate near 1:1 linear performance growth (doubling the number of processing cores nearly doubles the computing power, with no diminishing returns), but it also isolates each processing core from the system and from the other cores. This provides high availability and reliability in the event that any core becomes non-functional.
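The idea of distributing flows while preserving flow affinity can be sketched in software. The following is a minimal illustration only, not F5's implementation: the Disaggregator is proprietary hardware and its actual algorithm is not described here. The function name `pick_tmm`, the choice of a 5-tuple key, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def pick_tmm(src_ip, src_port, dst_ip, dst_port, proto, num_tmms):
    """Map a flow's 5-tuple to a TMM index.

    Hypothetical scheme: hashing the 5-tuple means every packet of the
    same flow lands on the same TMM instance (flow affinity), while
    different flows spread roughly evenly across instances.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_tmms


# Synthetic flows (made-up addresses and ports) spread across 4 TMM instances.
flows = [
    ("10.0.0.%d" % (i % 250 + 1), 1024 + i, "192.0.2.10", 443, "tcp")
    for i in range(10000)
]
counts = Counter(pick_tmm(*flow, num_tmms=4) for flow in flows)
print(sorted(counts.items()))
```

Because the mapping depends only on the flow's own fields, no shared state between TMM instances is needed to decide where a packet goes, which is the property the paragraph above relies on.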
The HSB delivers direct, non-blocking communication between the TMM instances and the outside world without the loss normally associated with Ethernet interconnects. It also provides the streamlined message-passing interface that enables TMM instances to share information. This gives each processor the full throughput and interconnectivity of its dedicated network interfaces. It also mitigates the performance impact of inter-process communication in the few remaining instances where it takes place.
The rules have been changed by CMP
The performance increase that can be expected from parallelizing a process is a function of how much of the process can truly be parallelized. If a process requiring 10 units of time can only be 50 percent parallelized, the process will never run in less than five units, even if the parallelized portion is processed instantly. As a result, the entire process can never be more than twice as fast.
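This limit is commonly known as Amdahl's law. As a worked form of the example above, let $T_1$ be the single-processor run time, $p$ the fraction that can be parallelized, and $N$ the number of processors (these symbols are introduced here for illustration and do not appear in the original text):

```latex
T(N) = (1 - p)\,T_1 + \frac{p\,T_1}{N},
\qquad
S(N) = \frac{T_1}{T(N)} = \frac{1}{(1 - p) + \dfrac{p}{N}}

% With T_1 = 10 and p = 0.5:
T(N) = 5 + \frac{5}{N} > 5,
\qquad
S(N) = \frac{10}{5 + 5/N} < 2 \quad \text{for every finite } N
```

The serial term $(1 - p)\,T_1$ does not shrink as $N$ grows, so it sets the floor of five time units and the ceiling of a 2x speedup.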
Up until now, the game has been pretty simple and widely understood. First, it was to optimize your code to run on a single processor as best you can and ride the “Intel power-curve.” Then, it was to optimize your code for SMP or AMP and build your platforms with as many processing cores as possible. All the while, performance improvements have slowly dwindled to minuscule amounts.
CMP changes the rules of the game. Instead of working to continually improve the performance of a never-changing proportion of parallelized processes, CMP’s most basic tenet is to change that proportion. Continuing improvements in performance can only be realized by increasing the amount of the application delivery process that can be parallelized. Only parallelizing nearly all of that process can enable near 1:1 linear scaling that fully utilizes all the processing cores.
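To see why the parallelizable proportion matters more than the core count, the sketch below applies Amdahl's law to a few proportions. The fractions 0.50, 0.90, and 0.99 and the core counts are illustrative assumptions, not F5 measurements, and the function names are made up for this example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speedup under Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def scaling_efficiency(parallel_fraction, cores):
    """Speedup divided by core count; 1.0 would be perfect 1:1 scaling."""
    return amdahl_speedup(parallel_fraction, cores) / cores


# Illustrative proportions only: compare how each one scales with cores.
for p in (0.50, 0.90, 0.99):
    row = ", ".join(
        f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (2, 4, 8)
    )
    print(f"parallel fraction {p:.2f} -> {row}")
```

Adding cores to a half-parallel process yields less and less, while the same cores applied to a nearly fully parallel process stay close to 1:1 scaling.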