Virtualized bridged networking with MacVTap
2016-02-03 13:10
http://blog.csdn.net/hshl1214/article/details/50628947
Introduction
A virtual machine typically needs to be connected to a network to be useful. Because a virtual machine runs as an application inside the host computer, connecting it to the outside world needs support from the host operating system. There are a number of options for networking a virtual machine, both on the link layer and on the network layer; refer to the documentation of the virtualization system you are using (e.g. QEMU or KVM). The references list below also contains pointers to additional information.
MacVTap
In this article we’ll focus on a relatively new Linux device driver designed to ease the task of networking virtual machines: Macvtap. Macvtap is essentially a combination of the Macvlan driver and a Tap device. This probably does not say much to the uninitiated, so let’s see what it all means.
The Macvlan driver is a separate Linux kernel driver that the Macvtap driver depends on. Macvlan makes it possible to create virtual network interfaces that “cling on” to a physical network interface. Each virtual interface has its own MAC address, distinct from the physical interface’s MAC address. Frames sent to or from the virtual interfaces are mapped to the physical interface, which is called the lower interface.
Tap interfaces
A Tap interface is a software-only interface. Instead of passing frames to and from a physical Ethernet card, the frames are read and written by a user space program. For the Macvtap interfaces discussed here, the kernel makes the Tap side available via the /dev/tapN device file, where N is the index of the network interface. (Generic Tun/Tap interfaces on Linux are instead opened through the /dev/net/tun clone device.)
A Macvtap interface combines the properties of these two; it is a virtual interface with a tap-like software interface. A Macvtap interface can be created using the ip command:
$ sudo ip link add link eth0 name macvtap0 type macvtap
This adds a new interface called macvtap0 as can be seen in the following listing:
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 00:1f:d0:15:7b:e6 brd ff:ff:ff:ff:ff:ff
3: macvtap0@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 42:96:80:ee:2d:23 brd ff:ff:ff:ff:ff:ff
The device file corresponding to the new macvtap interface with index 3 is /dev/tap3. This device file is created by udev.
$ ls -l /dev/tap3
crw------- 1 root root 252, 1 Oct 21 12:10 /dev/tap3
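The mapping from interface index to device file can be sketched in a few lines of shell. This is only an illustration: the function name macvtap_device_path is hypothetical, and the optional second argument (an alternative sysfs root) exists purely so the function can be tried without a real macvtap interface. It relies on the ifindex file that sysfs provides under /sys/class/net/ for each network interface.

```shell
# Print the /dev/tapN path that belongs to a macvtap interface.
# $1: interface name, e.g. macvtap0
# $2: sysfs root (optional, defaults to /sys; only useful for testing)
macvtap_device_path() {
    ifname=$1
    sysfs_root=${2:-/sys}
    index_file="$sysfs_root/class/net/$ifname/ifindex"

    # Fail early if the interface does not exist.
    if [ ! -r "$index_file" ]; then
        echo "no such interface: $ifname" >&2
        return 1
    fi

    # The device file name is /dev/tap followed by the interface index.
    printf '/dev/tap%s\n' "$(cat "$index_file")"
}
```

For the interface created above, macvtap_device_path macvtap0 would be expected to print the same /dev/tap3 path that appears in the listing.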
A user space program can open this device file and use it to send and receive Ethernet frames. When the kernel transmits a frame via the interface macvtap0, instead of sending it to a physical Ethernet card, it makes it available for reading from this file by the user space program. Correspondingly, when the user space program writes the content of an Ethernet frame to the file /dev/tap3, the kernel’s networking code sees the frame as if it had been received via the device macvtap0.
The user space program is normally an emulator like QEMU, which virtualizes network cards for the guest operating systems. When QEMU reads an Ethernet frame using the file descriptor, it emulates what a real network card would do. Typically it triggers an interrupt in the virtual machine, and the guest operating system can then read the frame from the emulated network card. The exact details of how this is done depend on the emulator and the guest operating system, and are not the focus of this article.
Macvtap is implemented in the Linux kernel and must be enabled when the kernel is configured, either as a module or as a built-in feature. The setting can be found under Device Drivers → Network device support → MAC-VLAN based tap driver. The tap driver depends on ‘MAC-VLAN support’ in the same category, so you need to enable that too.
A Macvtap device can function in one of three modes: Virtual Ethernet Port Aggregator (VEPA) mode, Bridge mode, and Private mode. The modes determine how the tap endpoints communicate with each other.
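Whether a given kernel was built with these options can be checked from its configuration file. The following is a minimal sketch, assuming the two menu entries correspond to the CONFIG_MACVLAN and CONFIG_MACVTAP symbols and that the distribution ships the kernel configuration as a plain text file (often /boot/config-$(uname -r), but the location varies). The function name macvtap_config_status is made up for this example.

```shell
# Report how the Macvlan and Macvtap drivers are configured in a kernel
# config file.
# $1: path to the kernel config file (location is distribution-dependent)
macvtap_config_status() {
    config_file=$1
    for symbol in CONFIG_MACVLAN CONFIG_MACVTAP; do
        # Extract "y" or "m" from a line such as CONFIG_MACVTAP=m.
        value=$(sed -n "s/^$symbol=\([ym]\)\$/\1/p" "$config_file")
        case $value in
            y) echo "$symbol: built in" ;;
            m) echo "$symbol: module" ;;
            *) echo "$symbol: not enabled" ;;
        esac
    done
}
```

If the driver was built as a module, it can be loaded with modprobe macvtap before the first interface is created.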
1. Virtual Ethernet Port Aggregator mode
In this mode, which is the default, data between endpoints on the same lower device are sent via the lower device (Ethernet card) to the physical switch the lower device is connected to. This mode requires that the switch supports ‘Reflective Relay’ mode, also known as ‘Hairpin’ mode. Reflective Relay means the switch can send back a frame on the same port it received it on. Unfortunately, most switches today do not yet support this mode.
[Figure: Hairpin mode]
2. Bridge mode
When the MacVTap device is in Bridge mode, the endpoints can communicate directly without sending the data out via the lower device. When using this mode, there is no need for the physical switch to support Reflective Relay mode.
3. Private mode
In Private mode the nodes on the same MacVTap device can never talk to each other, regardless of whether the physical switch supports Reflective Relay mode or not. Use this mode when you want to isolate the virtual machines connected to the endpoints from each other, but not from the outside network.
At first glance, the VEPA mode seems a bit odd. What makes it a good idea to send out frames on the physical wire, only to have them sent back to the Ethernet card via the same port on the switch? VEPA mode simplifies the task of the host computer by letting the physical switch do the switching, which the switch is very good at. A further advantage is that network administrators can monitor traffic between virtual machines using familiar tools on a managed switch, which would not be possible if the data never entered the switch.
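The mode is chosen when the interface is created, using the mode keyword of ip link. As a rough sketch (the helper name macvtap_add_command is hypothetical, and it only accepts the three modes described in this article), the function below validates the mode and prints the command instead of running it, so the result can be reviewed before it is executed as root.

```shell
# Print the ip command that creates a macvtap interface in a given mode.
# $1: lower interface (e.g. eth0)
# $2: name of the new interface (e.g. macvtap0)
# $3: mode: vepa (default), bridge or private
macvtap_add_command() {
    lower=$1
    name=$2
    mode=${3:-vepa}

    # Reject anything other than the three modes covered here.
    case $mode in
        vepa|bridge|private) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac

    echo "ip link add link $lower name $name type macvtap mode $mode"
}
```

Running sudo $(macvtap_add_command eth0 macvtap1 bridge) would then create a Bridge mode endpoint, and ip -d link show macvtap1 should report the mode the interface ended up in.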
Switches have not traditionally supported Reflective Relay mode, because the Spanning Tree Protocol (STP) has prevented it, and before the advent of virtualization it made no sense for a frame to be passed back through the same port.
Using MacVTap with libvirt
If you are using the libvirt (libvirt.org) toolkit to manage your virtual machines, add a network interface definition like the following in your domain XML file:
<devices>
  <interface type='direct'>
    <mac address='d0:0f:d0:0f:00:01'/>
    <source dev='eth0' mode='vepa'/>
  </interface>
  <!-- More devices here... -->
</devices>
Change the mode to ‘bridge’ if you don’t have a VEPA-capable switch. Also make sure each tap interface has a unique and sensible value for the MAC address.
This directive causes libvirt to create a Macvtap device associated with the specified source device. Libvirt also opens the corresponding device file (as described above) and passes the file descriptor to QEMU. Thus, when using libvirt, there is no need
to create the tap interfaces by hand, as was shown in the example above.
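One way to pick a sensible MAC address for the interface definition is to generate a random address under a locally administered prefix. The sketch below is an assumption-laden illustration rather than anything libvirt requires: it uses the 52:54:00 prefix that is conventionally seen on QEMU/KVM guests, the function name random_kvm_mac is invented here, and random bytes only make collisions unlikely, so you should still check the result against the addresses already in use on your network.

```shell
# Print a random MAC address with the 52:54:00 prefix.
random_kvm_mac() {
    # od prints three random bytes as two-digit hex values separated by
    # spaces; set -- splits them into $1, $2 and $3.
    set -- $(od -An -N3 -tx1 /dev/urandom)
    printf '52:54:00:%s:%s:%s\n' "$1" "$2" "$3"
}
```

The printed value can be used as the address attribute of the mac element in the domain XML.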
Conclusion
Connecting virtual machines to a virtual switch as described above makes them present on the local network just as if they were physical machines connected to the LAN. They belong to the same subnet as the physical machines and their IP addresses can be configured by the same DHCP server as the physical machines. Note that the connection is at the data link layer (L2) and is thus independent of which network layer protocol is used on top of it. The network protocol can be IPv4, IPv6 or even IPX, if you wish.
References
Kernelnewbies.org Linux Virtualization Wiki: MacVTap
Linux information for IBM systems: Virtualization blueprints
Libvirt Domain XML format
Tun/Tap interface tutorial (background information on the tap interface)
Wikibon: Edge Virtual Bridging