Access the Linux kernel using the /proc filesystem
2012-02-29 11:08
This virtual filesystem opens a window of communication between the kernel and user space

M. Tim Jones, Consultant Engineer, Emulex
Summary: The /proc filesystem is a virtual filesystem that permits a novel approach for communication between the Linux® kernel and user space. In the /proc filesystem, virtual files can be read from or written to as a means of communicating
with entities in the kernel, but unlike regular files, the content of these virtual files is dynamically created. This article introduces you to the /proc virtual filesystem and demonstrates its use.
Tags for this article: kernel, os, proc
Date: 14 Mar 2006
Level: Introductory
The /proc filesystem was originally developed to provide information on the processes in a system. But given the filesystem's usefulness, many elements of the kernel use it both to report information and to enable dynamic runtime configuration.
The /proc filesystem contains directories (as a way of organizing information) and virtual files. A virtual file can present information from the kernel to the user and also serve as a means of sending information from the user to the kernel. A file isn't required to do both, but this article shows you how to configure an entry for both input and output.
A short article like this can't detail all the uses of /proc, but it does demonstrate a couple of uses to give you an idea of how powerful /proc can be. Listing 1 is an interactive tour of some of the /proc elements. It shows the root level of the /proc filesystem. Note the series of numbered entries on the left. Each of these is a directory representing a process in the system. Because the first process created in GNU/Linux is the init process, it has a process ID of 1. Next, performing an ls on this directory shows a list of files. Each file provides details on the particular process. For example, to see the command-line entry for init, simply cat the cmdline file.

Some of the other interesting files in /proc are cpuinfo, which identifies the type of processor and its speed; pci, which shows the devices found on the PCI buses; and modules, which identifies the modules that are currently loaded into the kernel.
Listing 1. Interactive tour of /proc

    [root@plato]# ls /proc
    1     2040  2347  2874  474          fb           mdstat      sys
    104   2061  2356  2930  9            filesystems  meminfo     sysrq-trigger
    113   2073  2375  2933  acpi         fs           misc        sysvipc
    1375  21    2409  2934  buddyinfo    ide          modules     tty
    1395  2189  2445  2935  bus          interrupts   mounts      uptime
    1706  2201  2514  2938  cmdline      iomem        mtrr        version
    179   2211  2515  2947  cpuinfo      ioports      net         vmstat
    180   2223  2607  3     crypto       irq          partitions
    181   2278  2608  3004  devices      kallsyms     pci
    182   2291  2609  3008  diskstats    kcore        self
    2     2301  263   3056  dma          kmsg         slabinfo
    2015  2311  2805  394   driver       loadavg      stat
    2019  2337  2821  4     execdomains  locks        swaps
    [root@plato 1]# ls /proc/1
    auxv     cwd      exe  loginuid  mem     oom_adj    root  statm   task
    cmdline  environ  fd   maps      mounts  oom_score  stat  status  wchan
    [root@plato]# cat /proc/1/cmdline
    init [5]
    [root@plato]#
Listing 2. Reading from and writing to /proc (configuring the kernel)

    [root@plato]# cat /proc/sys/net/ipv4/ip_forward
    0
    [root@plato]# echo "1" > /proc/sys/net/ipv4/ip_forward
    [root@plato]# cat /proc/sys/net/ipv4/ip_forward
    1
    [root@plato]#
You can also use sysctl to configure these kernel items. See the Resources section for more information on that.
By the way, the /proc filesystem isn't the only virtual filesystem in GNU/Linux. One such system,
sysfs, is similar to /proc but a bit more organized (having learned lessons from /proc). However, /proc is entrenched and therefore, even though sysfs has some advantages over it, /proc is here to stay. There's also the
debugfs filesystem, but it tends to be (as the name implies) more of a debugging interface. An advantage to debugfs is that it's extremely simple to export a single value to user space (in fact, it's a single call).
Introducing kernel modules
Loadable Kernel Modules (LKM) are an easy way to demonstrate the /proc filesystem, because they're a novel way to dynamically add or remove code from the Linux kernel. LKMs are also a popular mechanism for device drivers and filesystems in the Linux kernel.
If you've ever recompiled the Linux kernel, you probably found that in the kernel configuration process, many device drivers and other kernel elements are compiled as modules. If a driver is compiled directly into the kernel, its code and static data occupy space even if they're not used. But if the driver is compiled as a module, it requires memory only if it's needed and subsequently loaded into the kernel. Interestingly, you won't notice a performance hit for LKMs, so they're a powerful means of creating a lean kernel that adapts to its environment based upon the available hardware and attached devices.
Here's a simple LKM to help you understand how it differs from standard (non-dynamically loadable) code that you'll find in the Linux kernel. Listing 3 presents the simplest LKM. (You can download the sample code for this article from the
Downloads section, below.)
Listing 3 includes the necessary module header (which defines the module APIs, types, and macros). It then defines the license for the module using MODULE_LICENSE. Here, it specifies GPL to avoid tainting the kernel.

Listing 3 then defines the module init and cleanup functions. The my_module_init function is called when the module is loaded and can be used for initialization purposes. The my_module_cleanup function is called when the module is being unloaded and is used to free memory and generally remove traces of the module. Note the use of printk here: this is the kernel's printf function. The KERN_INFO symbol is a string prefix that sets the priority (log level) of the message, which lets you filter messages by severity (much like syslog).

Finally, Listing 3 declares the entry and exit functions using the module_init and module_exit macros. This allows you to name the module init and cleanup functions the way you want but then tell the kernel which functions are the maintenance functions.
Listing 3. A simple but functional LKM (simple-lkm.c)

    #include <linux/module.h>

    /* Defines the license for this LKM */
    MODULE_LICENSE("GPL");

    /* Init function called on module entry */
    int my_module_init( void )
    {
      printk(KERN_INFO "my_module_init called.  Module is now loaded.\n");

      return 0;
    }

    /* Cleanup function called on module exit */
    void my_module_cleanup( void )
    {
      printk(KERN_INFO "my_module_cleanup called.  Module is now unloaded.\n");

      return;
    }

    /* Declare entry and exit functions */
    module_init( my_module_init );
    module_exit( my_module_cleanup );
To build Listing 3, save it as simple-lkm.c and create a makefile whose sole content is:

    obj-m += simple-lkm.o

Then build the module with the make command, as shown in Listing 4.
Listing 4. Building an LKM

    [root@plato]# make -C /usr/src/linux-`uname -r` SUBDIRS=$PWD modules
    make: Entering directory `/usr/src/linux-2.6.11'
      CC [M]  /root/projects/misc/module2.6/simple/simple-lkm.o
      Building modules, stage 2.
      MODPOST
      CC      /root/projects/misc/module2.6/simple/simple-lkm.mod.o
      LD [M]  /root/projects/misc/module2.6/simple/simple-lkm.ko
    make: Leaving directory `/usr/src/linux-2.6.11'
    [root@plato]#
The result is simple-lkm.ko. The new naming convention helps to distinguish kernel objects (LKMs) from standard objects. You can now load and unload the module and then view its output. To load the module, use the insmod command; conversely, to unload the module, use the rmmod command. The lsmod command shows the currently loaded LKMs (see Listing 5).
Listing 5. Inserting, checking, and removing an LKM

    [root@plato]# insmod simple-lkm.ko
    [root@plato]# lsmod
    Module                  Size  Used by
    simple_lkm              1536  0
    autofs4                26244  0
    video                  13956  0
    button                  5264  0
    battery                 7684  0
    ac                      3716  0
    yenta_socket           18952  3
    rsrc_nonstatic          9472  1 yenta_socket
    uhci_hcd               32144  0
    i2c_piix4               7824  0
    dm_mod                 56468  3
    [root@plato]# rmmod simple-lkm
    [root@plato]#
Note that kernel output goes to the kernel ring buffer and not to stdout, because stdout is process specific. To inspect messages on the kernel ring buffer, you can use the dmesg utility (or work through /proc itself with the command cat /proc/kmsg). Listing 6 shows the output of the last few messages from dmesg.
Listing 6. Reviewing the kernel output from the LKM

    [root@plato]# dmesg | tail -5
    cs: IO port probe 0xa00-0xaff: clean.
    eth0: Link is down
    eth0: Link is up, running at 100Mbit half-duplex
    my_module_init called.  Module is now loaded.
    my_module_cleanup called.  Module is now unloaded.
    [root@plato]#
Integrating into the /proc filesystem
The standard APIs that are available to kernel programmers are also available to LKM programmers. It's even possible for an LKM to export new variables and functions that the kernel can use. A complete treatment of the APIs is beyond the scope of this article,
so I simply present some of the elements that I use later to demonstrate a more useful LKM.
Creating and removing a /proc entry
To create a virtual file in the /proc filesystem, use the create_proc_entry function. This function accepts a file name, a set of permissions, and a location in the /proc filesystem in which the file is to reside. The return value of create_proc_entry is a proc_dir_entry pointer (or NULL, indicating an error in creation). You can then use the return pointer to configure other aspects of the virtual file, such as the function to call when a read is performed on the file. The prototype for create_proc_entry and a portion of the proc_dir_entry structure are shown in Listing 7.
Listing 7. Elements for managing a /proc filesystem entry

    struct proc_dir_entry *create_proc_entry( const char *name, mode_t mode,
                                              struct proc_dir_entry *parent );

    struct proc_dir_entry {
        const char *name;                       // virtual file name
        mode_t mode;                            // mode permissions
        uid_t uid;                              // File's user id
        gid_t gid;                              // File's group id
        struct inode_operations *proc_iops;     // Inode operations functions
        struct file_operations *proc_fops;      // File operations functions
        struct proc_dir_entry *parent;          // Parent directory
        ...
        read_proc_t *read_proc;                 // /proc read function
        write_proc_t *write_proc;               // /proc write function
        void *data;                             // Pointer to private data
        atomic_t count;                         // use count
        ...
    };

    void remove_proc_entry( const char *name, struct proc_dir_entry *parent );
Later, you use the read_proc and write_proc fields to plug in functions for reading and writing the virtual file.
To remove a file from /proc, use the remove_proc_entry function. To use this function, provide the file name string as well as the location of the file in the /proc filesystem (its parent). The function prototype is also shown in Listing 7.

The parent argument can be NULL for the /proc root or a number of other values, depending upon where you want the file to be placed. Table 1 lists some of the other parent proc_dir_entry variables that you can use, along with their location in the filesystem.
Table 1. Shortcut proc_dir_entry variables
proc_dir_entry | Filesystem location |
---|---|
proc_root_fs | /proc/fs |
proc_net | /proc/net |
proc_bus | /proc/bus |
proc_root_driver | /proc/driver |
The Write Callback function

You can write to a /proc entry (from the user to the kernel) by using a write_proc function. This function has this prototype:

    int mod_write( struct file *filp, const char __user *buff,
                   unsigned long len, void *data );

The filp argument is essentially an open file structure (we'll ignore this). The buff argument is the string data being passed to you. The buffer address is actually a user-space buffer, so you can't read it directly. The len argument defines how much data in buff is being written. The data argument is a pointer to the private data (see Listing 7). In the module, I declare a function of this type to deal with the incoming data.

Linux provides a set of APIs to move data between user space and kernel space. For the write_proc case, I use the copy_from_user function to bring the user-space data into the kernel.
The Read Callback function
You can read data from a /proc entry (from the kernel to the user) by using the read_proc function. This function has the following prototype:

    int mod_read( char *page, char **start, off_t off,
                  int count, int *eof, void *data );

The page argument is the location into which you write the data intended for the user, where count defines the maximum number of characters that can be written. Use the start and off arguments when returning more than a page of data (typically 4KB). When all the data have been written, set the eof (end-of-file) argument. As with write, data represents private data. The page buffer provided here is in kernel space. Therefore, you can write to it without having to invoke copy_to_user.
Other useful functions
You can also create directories within the /proc filesystem using proc_mkdir as well as symlinks with proc_symlink. For simple /proc entries that require only a read function, use create_proc_read_entry, which creates the /proc entry and initializes the read_proc function in one call. The prototypes for these functions are shown in Listing 8.
Listing 8. Other useful /proc functions

    /* Create a directory in the proc filesystem */
    struct proc_dir_entry *proc_mkdir( const char *name,
                                       struct proc_dir_entry *parent );

    /* Create a symlink in the proc filesystem */
    struct proc_dir_entry *proc_symlink( const char *name,
                                         struct proc_dir_entry *parent,
                                         const char *dest );

    /* Create a proc_dir_entry with a read_proc_t in one call */
    struct proc_dir_entry *create_proc_read_entry( const char *name,
                                                   mode_t mode,
                                                   struct proc_dir_entry *base,
                                                   read_proc_t *read_proc,
                                                   void *data );

    /* Copy buffer to user-space from kernel-space */
    unsigned long copy_to_user( void __user *to,
                                const void *from,
                                unsigned long n );

    /* Copy buffer to kernel-space from user-space */
    unsigned long copy_from_user( void *to,
                                  const void __user *from,
                                  unsigned long n );

    /* Allocate a 'virtually' contiguous block of memory */
    void *vmalloc( unsigned long size );

    /* Free a vmalloc'd block of memory */
    void vfree( void *addr );

    /* Export a symbol to the kernel (make it visible to the kernel) */
    EXPORT_SYMBOL( symbol );

    /* Export all symbols in a file to the kernel (declare before module.h) */
    EXPORT_SYMTAB
Fortune cookies through the /proc filesystem
Here's an LKM that supports both reading and writing. This simple application provides a fortune cookie dispenser. After the module is loaded, the user can load text fortunes into it using the echo command and then read them back out individually using the cat command.

Listing 9 presents the basic module functions and variables. The init function (init_fortune_module) allocates space for the cookie pot with vmalloc and then clears it out with memset. With the cookie_pot allocated and empty, I next create my proc_dir_entry, called fortune, in the /proc root. With proc_entry successfully created, I initialize my local variables and the proc_entry structure. I load my /proc read and write functions (shown in Listings 10 and 11) and identify the owner of the module. The cleanup function simply removes the entry from the /proc filesystem and then frees the memory that cookie_pot occupies.

The cookie_pot is a page in length (4KB) and is managed by two indexes. The first, cookie_index, identifies where the next cookie will be written. The variable next_fortune identifies where the next cookie will be read for output. I simply wrap next_fortune to the beginning when all fortunes have been read.
Listing 9. Module init/cleanup and variables
#include <linux/module.h> |
Listing 10 shows the function that writes a new fortune into the pot. If the cookie_pot doesn't have enough space for the incoming data, I return -ENOSPC, which is communicated to the user process. Otherwise, the space exists, and I use copy_from_user to copy the user buffer directly into the cookie_pot. I then increment the cookie_index (based upon the length of the user buffer) and NUL-terminate the string. Finally, I return the number of characters actually written into the cookie_pot, which is propagated to the user process.
Listing 10. Function to write a fortune
ssize_t fortune_write( struct file *filp, const char __user *buff, |
Listing 11 shows the function that reads a fortune. Because the buffer that I write to (page) is already in kernel space, I can manipulate it directly and use sprintf to write the next fortune. If the next_fortune index is greater than or equal to the cookie_index (the next position to write), I wrap next_fortune back to zero, which is the index of the first fortune. After the fortune is written to the page buffer, I increment the next_fortune index by the length of the last fortune written. This places me at the index of the next available fortune. The length of the fortune is returned and propagated to the user.
Listing 11. Function to read a fortune

    int fortune_read( char *page, char **start, off_t off,
                      int count, int *eof, void *data )
    {
      int len;

      if (off > 0) {
        *eof = 1;
        return 0;
      }

      /* Wrap-around */
      if (next_fortune >= cookie_index) next_fortune = 0;

      len = sprintf(page, "%s\n", &cookie_pot[next_fortune]);

      next_fortune += len;

      return len;
    }
Listing 12. Demonstrating the fortune cookie LKM

    [root@plato]# insmod fortune.ko
    [root@plato]# echo "Success is an individual proposition.  Thomas Watson" > /proc/fortune
    [root@plato]# echo "If a man does his best, what else is there?  Gen. Patton" > /proc/fortune
    [root@plato]# echo "Cats: All your base are belong to us.  Zero Wing" > /proc/fortune
    [root@plato]# cat /proc/fortune
    Success is an individual proposition.  Thomas Watson
    [root@plato]# cat /proc/fortune
    If a man does his best, what else is there?  Gen. Patton
    [root@plato]#