Solaris Xen Drop 66 – Xen System Administration
2007-07-24 11:32
Overview
Introduction to the Hypervisor
The virtual machine monitor within the Solaris Operating System can securely execute multiple virtual machines simultaneously, each running its own operating system, on a single physical system. Each virtual machine instance is called a domain. There are two kinds of domains. The control domain is called domain0, or dom0. A guest OS, or unprivileged domain, is called a domainU or domU. Unlike virtualization using zones, each domain runs a full instance of an operating system. A hypervisor is also known as a Virtual Machine Monitor (VMM).
How Hypervisors Work
A hypervisor is a software system that partitions a single physical machine into multiple virtual machines, to provide server consolidation and utility computing. Existing applications and binaries run unmodified. The hypervisor controls the MMU, CPU scheduling, and interrupt controller, presenting a virtual machine to guests.
The hypervisor separates the software from the hardware by forming a layer between the software running in the virtual machine and the hardware. This separation enables the hypervisor to control how guest operating systems running inside a virtual machine use hardware resources. A hypervisor provides a uniform view of underlying hardware. Machines from different vendors with different I/O subsystems appear the same, which means that virtual machines can run on any available computer. Thus, administrators can view hardware as a pool of resources that can run arbitrary services on demand. Because the hypervisor also encapsulates a virtual machine's software state, the hypervisor layer can map and remap virtual machines to available hardware resources at any time and also live migrate virtual machines across computers. These capabilities can also be used for load balancing among a collection of machines, dealing with hardware failures, and scaling systems. When a computer fails and must go offline or when a new machine comes online, the hypervisor layer can simply remap virtual machines accordingly. Virtual machines are also easy to replicate, which lets administrators bring new services online as needed.
Containment means that administrators can suspend virtual machines and resume them at any time, or checkpoint them and roll them back to a previous execution state. With this general-purpose undo capability, systems can more easily recover from crashes or configuration errors. Containment also supports a very general mobility model. Users can copy a suspended virtual machine over a network or store and transport it on removable media. The hypervisor can also provide total mediation of all interactions between the virtual machine and underlying hardware, thus allowing strong isolation between virtual machines and supporting the multiplexing of many virtual machines on a single hardware platform. The hypervisor can then consolidate a collection of virtual machines with low resources onto a single computer, thereby lowering hardware costs and space requirements. Strong isolation is also valuable for reliability and security. Applications that previously ran together on one machine can now be separated on different virtual machines. If one application experiences a fault, the other applications are isolated from this occurrence and will not be affected. Further, if a virtual machine is compromised, the incident is contained to only that compromised virtual machine.
Resource Virtualization
As a key component of virtual machines, the hypervisor provides a layer between software environments and physical hardware that is programmable and transparent to the software above it, while making efficient use of the hardware below it. Virtualization provides a way to bypass interoperability constraints. Virtualizing a system or component such as a processor, memory, or an I/O device at a given abstraction level maps its interface and visible resources onto the interface and resources of an underlying, possibly different, real system. Consequently, the real system appears as a different virtual system or even as multiple virtual systems.
Virtualization Types
There are two basic types of virtualization, full virtualization and paravirtualization. The hypervisor supports both models. In a full virtualization, the operating system is completely unaware that it is running in a virtualized environment. In the more lightweight paravirtualization, the operating system is both aware of the virtualization layer and modified to support it, which results in higher performance.
The paravirtualized domU operating system is ported to run on top of the hypervisor, and uses virtual network, disk, and console devices.
Since dom0 must work closely with the hypervisor layer, dom0 is always paravirtualized. DomUs can be either paravirtualized or fully virtualized, and a system can have both varieties running simultaneously.
A hardware virtual machine (HVM) domU runs an unmodified operating system. These hardware-assisted virtual machines take advantage of Intel-VT or AMD Secure Virtual Machine (SVM) processors.
About Domains
Dom0 and domU are separate entities. Other than by logging in, you cannot access a domU from dom0. A dom0 should be reserved for the system management work associated with running a hypervisor. This means, for example, that users should not have logins on dom0. Dom0 provides shared access to a physical network interface to the guest domains, which have no direct access to physical devices. A Solaris domU works like a normal Solaris Operating System. All of the usual tools are available.
Domain States
A domain can be in one of six states. States are shown in virt-manager screens and in xm list displays:

Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2049     2 r-----   4138.5
sxc18         3  511     1 -b----    765.5
The states are:
r, running
The domain is currently running on a CPU.
b, blocked
The domain is blocked; it is not running and not able to run. This can occur because the domain is waiting on I/O (a traditional wait state) or because it has gone to sleep with nothing left to run.
p, paused
The domain has been paused, usually through the administrator running xm pause. When in a paused state, the domain will still consume allocated resources like memory, but will not be eligible for scheduling by the hypervisor. Run xm unpause to place the domain in the running state.
c, crashed
The domain has crashed. Usually this state can only occur if the domain has been configured not to restart on crash. See xmdomain.cfg(5) for more information.
s, shutdown
The domain is shut down.
d, dying
The domain is in the process of shutting down or crashing.
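Since xend and xm domain configurations are Python-based, a quick Python sketch can decode the single-letter flags shown in the xm list State column into the state names above. The mapping is position-independent, so it works on any flag string of the form "r-----" or "-b----":

```python
# Decode the State column of `xm list` output (e.g. "r-----" or "-b----")
# into the state names described above. '-' marks an unset flag.
STATE_FLAGS = {
    'r': 'running',
    'b': 'blocked',
    'p': 'paused',
    's': 'shutdown',
    'c': 'crashed',
    'd': 'dying',
}

def decode_state(flags):
    """Return the list of state names set in an xm state-flag string."""
    return [STATE_FLAGS[ch] for ch in flags if ch in STATE_FLAGS]

print(decode_state('r-----'))   # a running domain, such as Domain-0
print(decode_state('-b----'))   # a blocked (idle) domain, such as sxc18
```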
SMF Hypervisor Services
In Solaris, all of the properties from xend-config.sxp have been put into SMF xctl/xend (config/*). To modify an existing property:
# svccfg -s xctl/xend listprop
# svccfg -s xctl/xend setprop config/dom0-cpus = 1
# svcadm refresh xctl/xend
To create a new property:
# svccfg -s xctl/xend setprop config/vncpasswd = astring: \"password\"
# svcadm refresh xctl/xend
# svcadm restart xend
# svcprop xctl/xend
Verify That the xctl Hypervisor Services Are Started
Become superuser, or assume the Primary Administrator role. Then verify that the xctl services are running.
# svcs -a | grep xctl
If the system displays the following, the services are not running:
disabled 12:29:34 svc:/system/xctl/console:default
disabled 12:29:34 svc:/system/xctl/xend:default
disabled 12:29:34 svc:/system/xctl/store:default
If the services are not running, verify that you booted an i86xpv kernel.
# uname -i
i86xpv
Reboot if necessary.
If the correct kernel is running, enable the services.
# svcadm enable xctl/store
# svcadm enable xctl/xend
# svcadm enable xctl/console
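As a hedged helper, here is a small Python sketch that picks the still-disabled xctl FMRIs out of svcs-style output, so you can see at a glance which services need enabling (the sample text mirrors the listing above):

```python
# Given `svcs -a | grep xctl` output, list the xctl services that are
# not yet online. The sample lines mirror the output shown above.
sample = """\
disabled 12:29:34 svc:/system/xctl/console:default
disabled 12:29:34 svc:/system/xctl/xend:default
disabled 12:29:34 svc:/system/xctl/store:default"""

def disabled_services(svcs_output):
    """Return FMRIs of services whose state column reads 'disabled'."""
    fmris = []
    for line in svcs_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == 'disabled':
            fmris.append(fields[2])
    return fmris

# Print the svcadm commands that would enable each disabled service.
for fmri in disabled_services(sample):
    print('svcadm enable', fmri)
```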
You are now ready to create guest domains (domUs).
How To Manage Guest (DomU) Domains
Example
Create a domU that uses the following .py file:

: p5b-vm[1]#; cat guest.py
name = "solaris"
vcpus = 2
memory = "512"
extra = "-k"
root = "/dev/dsk/c0d0s0"
disk = ['file:/tank/guests/solaris/disk.img,0,w']
vif = ['']
on_xend_start = "start"
on_xend_stop = "shutdown"
on_shutdown = "destroy"
on_reboot = "restart"
on_crash = "destroy"
Notice the on_xend_start and on_xend_stop entries. Either entry can be defined independently; both default to "invalid" if not defined.
on_xend_start = "start"
on_xend_stop = "shutdown"
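Because xm configuration files are plain Python assignments, they are easy to generate programmatically. Below is a minimal, illustrative sketch that emits a guest.py like the one above; the names and values are examples, not a tested production configuration:

```python
# Build the text of a minimal xm guest configuration like guest.py above.
# The values here are illustrative, not a tested production config.
settings = {
    'name': '"solaris"',
    'vcpus': '2',
    'memory': '"512"',
    'root': '"/dev/dsk/c0d0s0"',
    'disk': "['file:/tank/guests/solaris/disk.img,0,w']",
    'vif': "['']",
    'on_xend_start': '"start"',
    'on_xend_stop': '"shutdown"',
}

# Each setting becomes one `key = value` line, exactly as xm expects.
lines = ['%s = %s' % (key, value) for key, value in settings.items()]
config = '\n'.join(lines)
print(config)
```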
Create the domain, but don't start it.
: p5b-vm[1]#; xm new -f <path to the py file>
: p5b-vm[1]#; xm list
Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2254     2 r-----    113.3
solaris           512    1             0.0
: p5b-vm[1]#;
Now you can start, suspend, and resume the domain. If it is shut down, it will still be in the list. And, it is set to boot automatically when dom0 powers on.
: p5b-vm[1]#; xm start solaris
: p5b-vm[1]#; xm list
Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2254     2 r-----    116.4
solaris       5  512     1 r-----      4.2
: p5b-vm[1]#; xm suspend solaris
: p5b-vm[1]#; xm list
Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2254     2 r-----    129.4
solaris             1    1            31.2
: p5b-vm[1]#; xm resume solaris
: p5b-vm[1]#; xm list
Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2254     2 r-----    132.6
solaris       6  511     2 -b----      0.1
: p5b-vm[1]#;
: p5b-vm[1]#; xm shutdown solaris
: p5b-vm[1]#; xm list
Name         ID  Mem VCPUs State   Time(s)
Domain-0      0 2254     2 r-----    134.2
solaris           511    2             0.5
: p5b-vm[1]#;
When you suspend a domain, the state is saved in /var/lib/xend/domains. This can fill up / quickly. There will be an SMF property in xend to change the base directory where domains live. You might want to create a link for now. You can also still use save/restore to specify where to save the guest image and where to load it from.
If you modify cpus/memory from xm or virsh, these changes will be saved in the configuration file and persist across reboots.
If you want to modify other parameters on a domain that is shut down, you can add the domain's uuid to the original .py file and re-run the xm new command.
: p5b-vm[1]#; echo 'uuid = "6dd59cf5-a17c-f7dc-255e-4efddfffb008"' >> <path to py file>
: p5b-vm[1]#; xm new -f <path to the py file>
Enable Live Migration
By default, xend listens only on the loopback address for requests from the localhost. If you want to allow other machines to live migrate to the machine, you must do the following. First, configure xend to listen on all addresses (or specify a particular interface IP):
# svccfg -s xend setprop config/xend-relocation-address = \"\"
Create a list of hosts from which to accept migrations:
# svccfg -s xend setprop config/xend-relocation-hosts-allow = \"^flax$ ^localhost$\"
Update the config:
# svcadm refresh xend && svcadm restart xend
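xend treats xend-relocation-hosts-allow as a space-separated list of regular expressions matched against the connecting host. A small Python sketch of that matching logic, using the example patterns above (the hostnames are illustrative):

```python
import re

# xend-relocation-hosts-allow is a space-separated list of regular
# expressions; a connecting host is allowed if any pattern matches.
ALLOW = '^flax$ ^localhost$'

def host_allowed(hostname, allow=ALLOW):
    """Return True if hostname matches any pattern in the allow list."""
    return any(re.search(pattern, hostname) for pattern in allow.split())

print(host_allowed('flax'))       # True
print(host_allowed('localhost'))  # True
print(host_allowed('attacker'))   # False
```

Note that the anchored patterns matter: without ^ and $, a pattern like flax would also match flaxy.example.com.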
How to Debug On Xen
Debugging a Hung domU
First, connect to the domU console and verify that the domain is not in the kernel debugger (kmdb) or a similar state. If a domU appears hung, always use xm dump-core to take a dump file. Place this in /net/mdb.eng/cores/ and report it when filing a bug. You can look at this file with mdb.
If you can reproduce the hang, make the following changes in /etc/system of the domU:
set cpr_debug=0x3
set xen_suspend_debug=1
set xdf:xdfdebug=0x40
and reproduce the hang. Some debugging output should go to the dom0 console. This is useful for hangs involving save, restore, migrate, shutdown, and reboot operations. It's a good idea to do all testing with these set.
Try sending the domU an interrupt to get it to drop into kmdb. The gentle method is xm sysrq mydomu b. Alternatively, you can use 'q' on the Xen console as described below.
Xen Console
Currently, the Xen console must be set to a serial port for this to work. Type 3 consecutive Ctrl-A's on the Xen console. You should see the following output on the console:

(XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input to dom0).
To exit the Xen console (and get back to the Solaris console), type 3 more Ctrl-A's.
The following menu.lst example sets both Xen and dom0's console to serial port ttya.
title Solaris dom0
  kernel /boot/$ISADIR/xen.gz com1=9600,8n1 console=com1
  module /platform/i86xpv/kernel/$ISADIR/unix /platform/i86xpv/kernel/$ISADIR/unix -k -B console=ttya
  module /platform/i86pc/$ISADIR/boot_archive
The following commands are supported in the Xen console. Commonly used keys are:
C - force a Solaris dom0 crash dump (/var/crash/...)
q - put Solaris dom0 and all Solaris domUs at the kmdb prompt (assuming you booted with -k)
R - force a reboot of dom0 (for example, when the machine is hung)
(XEN) 'h' pressed -> showing installed handlers
(XEN) key '%' (ascii '25') => Trap to xendbg
(XEN) key 'C' (ascii '43') => trigger a crashdump
(XEN) key 'H' (ascii '48') => dump heap info
(XEN) key 'N' (ascii '4e') => NMI statistics
(XEN) key 'R' (ascii '52') => reboot machine
(XEN) key 'a' (ascii '61') => dump timer queues
(XEN) key 'd' (ascii '64') => dump registers
(XEN) key 'h' (ascii '68') => show this message
(XEN) key 'i' (ascii '69') => dump interrupt bindings
(XEN) key 'm' (ascii '6d') => memory info
(XEN) key 'n' (ascii '6e') => trigger an NMI
(XEN) key 'q' (ascii '71') => dump domain (and guest debug) info
(XEN) key 'r' (ascii '72') => dump run queues
(XEN) key 't' (ascii '74') => display multi-cpu clock info
(XEN) key 'u' (ascii '75') => dump numa info
(XEN) key 'v' (ascii '76') => dump Intel's VMCS
(XEN) key 'z' (ascii '7a') => print ioapic info
Event Channels
To dump out info on the event channels:

> ::evtchns
Type         Evtchn IRQ IPL CPU Masked Pending ISR(s)
ipi               1 256  15   0      0       0 xc_serv
ipi               2 257  13   0      0       0 xc_serv
ipi               3 258  11   0      0       0 poke_cpu
virq:debug        4 259  15   0      0       0 xen_debug_handler
pirq              5   9   9   0      0       0 acpi_wrapper_isr
virq:timer        6 260  14   0      0       0 cbe_fire
ipi               7 261  14   0      0       0 cbe_fire
pirq              8  19   5   0      0       0 ata_intr
pirq              9  16   9   0      0       0 pepb_intx_intr
virq:console     10 262   9   0      0       0 xenconsintr_priv
pirq             11  18   1   0      0       0 uhci_intr
pirq             12  23   1   0      0       0 uhci_intr
pirq             13  17   6   0      0       0 rge_intr
ipi              14 258  11   1      0       0 poke_cpu
ipi              15 257  13   1      0       0 xc_serv
ipi              16 261  14   1      0       0 cbe_fire
ipi              17 256  15   1      0       0 xc_serv
virq:timer       18 260  14   1      0       0 cbe_fire
device           19 263   1   0      0       0 evtchn_device_upcall
evtchn           20 264   1   0      0       0 xenbus_intr
device           21 263   1   0      0       0 evtchn_device_upcall
device           22 263   1   0      0       0 evtchn_device_upcall
pirq             23  22   9   1      0       0 audiohd_intr
device           24 263   1   0      0       0 evtchn_device_upcall
evtchn           25 265   6   0      0       0 intr
evtchn           26 266   5   1      0       0 xdb_intr
evtchn           27 267   5   0      0       0 xdb_intr
>
To get more information for Type=device, pass in the event channel number as the array index. For this example, I'm looking at the following:
Type         Evtchn IRQ IPL CPU Masked Pending ISR(s)
device           19 263   1   0      0       0 evtchn_device_upcall
Using event channel 19 (0t19), dump evtsoftdata:

> *(port_user+(0x8*(0t19)))::print struct evtsoftdata
{
    dip = 0xfffffffec08afd68
    ring = 0xfffffffec5a10000
    ring_cons = 0x185
    ring_prod = 0x185
    ring_overflow = 0
    evtchn_wait = {
        _opaque = 0
    }
    evtchn_lock = {
        _opaque = [ 0 ]
    }
    evtchn_pollhead = {
        bsys_version = 0xc757f840
        boot_mem = 0
        bsys_alloc = 0
        bsys_free = 0x1ec
        bsys_getproplen = 0xfffffffec757f608
        bsys_getprop = 0
        bsys_nextprop = 0xfffffffec08afd68
        bsys_printf = 0
        bsys_doint = 0xfffffffec73b4dc8
        bsys_ealloc = 0xde00000000
    }
    pid = 0x1ec
}
>
Also determine which user process is using this event channel.
> *(port_user+(0x8*(0t19)))::print struct evtsoftdata pid | ::pid2proc | ::print proc_t p_user.u_psargs
p_user.u_psargs = [ "/usr/lib/xenstored --pid-file=/var/run/xenstore.pid" ]
>
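The mdb expression *(port_user+(0x8*(0t19))) is ordinary pointer-array arithmetic: port_user is treated as an array of 8-byte pointers, and 0t19 is mdb's notation for decimal 19. A quick Python sketch of the offset calculation (the base address below is hypothetical, for illustration only):

```python
# The mdb expression *(port_user+(0x8*(0t19))) indexes an array of
# 8-byte pointers: element i lives at base + 8*i. 0t19 is mdb's
# notation for decimal 19.
POINTER_SIZE = 8

def element_address(base, index):
    """Address of port_user[index], given 8-byte pointer elements."""
    return base + POINTER_SIZE * index

# Hypothetical base address for port_user, for illustration only.
base = 0xfffffffec0000000
print(hex(element_address(base, 19)))
```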
Current Issues and Potential Solutions
xend fails to start:

[2007-05-04 14:46:08 100668] ERROR (SrvDaemon:353) Exception starting xend (not well-formed (invalid token): line 19, column 0)
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvDaemon.py", line 345, in run
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvServer.py", line 254, in create
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvRoot.py", line 40, in __init__
  File "/usr/lib/python2.4/site-packages/xen/web/SrvDir.py", line 82, in get
  File "/usr/lib/python2.4/site-packages/xen/web/SrvDir.py", line 52, in getobj
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvNode.py", line 30, in __init__
  File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 658, in instance
  File "/usr/lib/python2.4/site-packages/xen/xend/XendNode.py", line 87, in __init__
  File "/usr/lib/python2.4/site-packages/xen/xend/XendStateStore.py", line 104, in load_state
  File "/var/tmp/pkgbuild-gbuild/SUNWPython-extra-2.4.2-build/usr/lib/python2.4/site-packages/_xmlplus/dom/minidom.py", line 1915, in parse
  File "/var/tmp/pkgbuild-gbuild/SUNWPython-extra-2.4.2-build/usr/lib/python2.4/site-packages/_xmlplus/dom/expatbuilder.py", line 926, in parse
  File "/var/tmp/pkgbuild-gbuild/SUNWPython-extra-2.4.2-build/usr/lib/python2.4/site-packages/_xmlplus/dom/expatbuilder.py", line 207, in parseFile
ExpatError: not well-formed (invalid token): line 19, column 0
[2007-05-04 14:46:09 100676] INFO (SrvDaemon:331) Xend Daemon started
[2007-05-04 14:46:09 100676] INFO (SrvDaemon:335) Xend changeset: Tue May 01 17:12:19 2007 -0700 15014:66538ef9ecc5.
[2007-05-04 14:46:09 100676] INFO (SrvDaemon:342) Xend version: Unknown.
The failure to start is due to xend's state becoming corrupted. The solution is to do the following:
% rm -rf /var/lib/xend/state
% svcadm clear xend
Debugging a Lost Disk Interrupt
In this case, a Linux guest is running. The guest hangs trying to read/write the disk. Nothing looks wrong in ::evtchns, so look at the disk backend driver. You can see below that the frontend's (xdf) producer index is req_prod = 0xb083, and that the backend's (xdb) consumer index is xr_sring.br.req_cons = 0xb063. So, there is work to do, but the backend driver doesn't know about it. Dropping down to kmdb and forcing the backend's interrupt routine to run gets the domU going again:

[0]> xdb_intr::call 0xfffffffed1e52000
# mdb -k
Loading modules: [ unix genunix specfs dtrace xpv_psm scsi_vhci ufs ip hook neti sctp arp usba fctl nca lofs zfs random emlxs md crypto fcp ptm sppp ipc ]
> ::evtchns
Type         Evtchn IRQ IPL CPU Masked Pending ISR(s)
ipi               1 256  15   0      0       0 xc_serv
ipi               2 257  13   0      0       0 xc_serv
ipi               3 258  11   0      0       0 poke_cpu
virq:debug        4 259  15   0      0       0 xen_debug_handler
pirq              5   9   9   0      0       0 acpi_wrapper_isr
virq:timer        6 260  14   0      0       0 cbe_fire
ipi               7 261  14   0      0       0 cbe_fire
pirq              8  16   5   0      0       0 mpt_intr
virq:console      9 262   9   0      0       0 xenconsintr_priv
pirq             10  20   1   0      0       0 ehci_intr
pirq             11  21   1   0      0       0 ohci_intr
ipi              12 258  11   1      0       0 poke_cpu
ipi              13 257  13   1      0       0 xc_serv
ipi              14 261  14   1      0       0 cbe_fire
ipi              15 256  15   1      0       0 xc_serv
virq:timer       16 260  14   1      0       0 cbe_fire
ipi              17 258  11   2      0       0 poke_cpu
ipi              18 257  13   2      0       0 xc_serv
ipi              19 261  14   2      0       0 cbe_fire
ipi              20 256  15   2      0       0 xc_serv
virq:timer       21 260  14   2      0       0 cbe_fire
ipi              22 258  11   3      0       0 poke_cpu
ipi              23 257  13   3      0       0 xc_serv
ipi              24 261  14   3      0       0 cbe_fire
ipi              25 256  15   3      0       0 xc_serv
virq:timer       26 260  14   3      0       0 cbe_fire
ipi              27 258  11   4      0       0 poke_cpu
ipi              28 257  13   4      0       0 xc_serv
ipi              29 261  14   4      0       0 cbe_fire
ipi              30 256  15   4      0       0 xc_serv
virq:timer       31 260  14   4      0       0 cbe_fire
ipi              32 258  11   5      0       0 poke_cpu
ipi              33 257  13   5      0       0 xc_serv
ipi              34 261  14   5      0       0 cbe_fire
ipi              35 256  15   5      0       0 xc_serv
virq:timer       36 260  14   5      0       0 cbe_fire
ipi              37 258  11   6      0       0 poke_cpu
ipi              38 257  13   6      0       0 xc_serv
ipi              39 261  14   6      0       0 cbe_fire
ipi              40 256  15   6      0       0 xc_serv
virq:timer       41 260  14   6      0       0 cbe_fire
ipi              42 258  11   7      0       0 poke_cpu
ipi              43 257  13   7      0       0 xc_serv
ipi              44 261  14   7      0       0 cbe_fire
ipi              45 256  15   7      0       0 xc_serv
virq:timer       46 260  14   7      0       0 cbe_fire
pirq             47  17   6   0      0       0 e1000g_intr_pciexpress
pirq             48  18   6   1      0       0 e1000g_intr_pciexpress
evtchn           49 264   1   3      0       0 xenbus_intr
device           50 263   1   0      0       0 evtchn_device_upcall
device           51 263   1   0      0       0 evtchn_device_upcall
pirq             52  40   1   4      0       0 emlxs_msi_intr
pirq             53  41   1   5      0       0 emlxs_msi_intr
device           54 263   1   0      0       0 evtchn_device_upcall
device           55 263   1   0      0       0 evtchn_device_upcall
evtchn           56 265   5   7      0       0 xdb_intr
evtchn           57 266   6   0      0       0 xnb_intr
> ::prtconf ! grep xdb
fffffffec3955008 xdb, instance #0 (driver name: xdb)
> fffffffec3955008::print struct dev_info devi_driver_data
devi_driver_data = 0xfffffffed1e52000
> 0xfffffffed1e52000::print xdb_t xs_ring | ::print xendev_ring_t xr_sring.br
{
    xr_sring.br.rsp_prod_pvt = 0xb063
    xr_sring.br.req_cons = 0xb063
    xr_sring.br.nr_ents = 0x20
    xr_sring.br.sring = 0xfffffffed0802000
}
> 0xfffffffed1e52000::print xdb_t xs_ring | ::print xendev_ring_t xr_sring.br.sring | ::print comif_sring_t
{
    req_prod = 0xb083
    req_event = 0xb064
    rsp_prod = 0xb063
    rsp_event = 0xb064
    pad = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... ]
    ring = [ '/001' ]
}
>
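The check that drives this diagnosis is simple index arithmetic on the shared ring: if the frontend's producer index is ahead of the backend's consumer index, requests are pending that the backend never saw. A minimal Python sketch of the check, using the values from the dump above:

```python
# A lost disk interrupt shows up as the frontend's producer index
# running ahead of the backend's consumer index: requests are queued
# but the backend never processed them.
def pending_requests(req_prod, req_cons):
    """Number of ring entries the backend has not yet consumed."""
    return req_prod - req_cons

# Values from the comif_sring_t dump above.
req_prod = 0xb083   # frontend (xdf) producer index
req_cons = 0xb063   # backend (xdb) consumer index

if pending_requests(req_prod, req_cons) > 0:
    print('backend is behind by', pending_requests(req_prod, req_cons),
          'requests; force its interrupt routine from kmdb')
```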