Why ll and du Report Different File Sizes: An Investigation
2018-01-24 11:11
I ran into this problem on a recent project, so here is a closer look at it.
Reposted from: http://blog.csdn.net/loryliu/article/details/25337409
Several times I have checked a file's size with both ls and du and found that the two disagree. For example:

    bl@d3:~/test/sparse_file$ ls -l fs.img
    -rw-r--r-- 1 bl bl 1073741824 2012-02-17 05:09 fs.img
    bl@d3:~/test/sparse_file$ du -sh fs.img
    0       fs.img

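Here ls reports the size of fs.img as 1073741824 bytes (1 GiB), while du reports it as 0.
I had never looked into this properly, so this post makes up for that.
There are two main reasons for the difference:
- Sparse files
- The sizes reported by ls and du mean different things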
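Let us look at sparse files first. A sparse file is a file that contains "holes". For example, the following C program creates a file with a hole:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        /* O_CREAT requires a mode argument; without it the permissions are undefined. */
        int fd = open("sparse.file", O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return 1;

        /* Move the offset 1024 bytes past the end of the (empty) file ... */
        if (lseek(fd, 1024, SEEK_CUR) == (off_t)-1)
            return 1;
        /* ... then write one byte, so bytes 0..1023 become a hole. */
        if (write(fd, "\0", 1) != 1)
            return 1;

        close(fd);
        return 0;
    }
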
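As this program shows, a hole is created by using lseek to move the file offset past the end of the file and then calling write. (Because only 1024 bytes are skipped here, less than a typical 4 KiB filesystem block, du may show no saving for this particular file; a larger offset makes the effect visible.)
A sparse file can also be created from the shell:

    $ dd if=/dev/zero of=sparse_file.img bs=1M seek=1024 count=0
    0+0 records in
    0+0 records out
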
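The dd invocation above can be wrapped in a short script that shows both sizes side by side. This is only a sketch, assuming GNU coreutils (dd, stat, du) on Linux; the temporary directory and the 16 MiB size are arbitrary choices for illustration and do not come from the original post.

```shell
#!/bin/sh
# Sketch: create a file that is one big hole and compare its two sizes.
# Assumes GNU coreutils; the 16 MiB size and the temp path are illustrative.
tmp=$(mktemp -d)
img="$tmp/sparse_file.img"

# count=0 copies no data; seek=16 with bs=1M sets the file length to 16 MiB.
dd if=/dev/zero of="$img" bs=1M seek=16 count=0 2>/dev/null

apparent=$(stat -c %s "$img")   # logical size in bytes, the number ls -l shows
ls -l "$img"
du -h "$img"                    # disk usage; typically 0 where holes are supported
echo "apparent size: $apparent bytes"

rm -rf "$tmp"
```

The logical size is fixed by the seek offset, while the du figure depends on how the underlying filesystem handles holes.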
The advantage of sparse files is described as follows (quoted from Wikipedia):
The advantage of sparse files is that storage is only allocated when actually needed: disk space is saved, and large files can be created even if there is insufficient free space on the file system.
In other words, the holes in a sparse file need not occupy any storage.
Next, consider what the sizes printed by ls and du actually mean (quoted from Wikipedia):
The du command which prints the occupied space, while ls print the apparent size.
Put differently, ls shows a file's logical size, while du shows its physical size, which is computed from the number of blocks the file occupies on disk. For example:

    bl@d3:~/test/sparse_file$ echo -n 1 > 1B.txt
    bl@d3:~/test/sparse_file$ ls -l 1B.txt
    -rw-r--r-- 1 bl bl 1 2012-02-19 05:17 1B.txt
    bl@d3:~/test/sparse_file$ du -h 1B.txt
    4.0K    1B.txt

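Here we first create a file, 1B.txt, that is one byte long. ls reports its size as 1 byte, whereas on disk the file occupies N blocks, and du derives its figure from N multiplied by the block size. I write N rather than a concrete number because many more details are hidden behind the scenes, such as the fragment size, which I will leave for a later discussion.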
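The relationship between the two numbers can be checked with stat. The sketch below assumes GNU stat, where %s is the logical size, %b the number of allocated blocks, and %B the size in bytes of each of those blocks; the temporary path is illustrative.

```shell
#!/bin/sh
# Sketch: logical size versus allocated size for a 1-byte file (GNU stat assumed).
tmp=$(mktemp -d)
printf 1 > "$tmp/1B.txt"

logical=$(stat -c %s "$tmp/1B.txt")   # st_size in bytes
blocks=$(stat -c %b "$tmp/1B.txt")    # number of allocated blocks (the "N" above)
unit=$(stat -c %B "$tmp/1B.txt")      # bytes per block as counted by %b
allocated=$((blocks * unit))          # the quantity du bases its output on

echo "logical=$logical blocks=$blocks unit=$unit allocated=$allocated"
rm -rf "$tmp"
```

The allocated figure varies with the filesystem and its block size, which is why the text leaves N unspecified.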
Of course, everything above describes the default behaviour of ls and du, and each provides options to change it: for example, the -s option of ls (print the allocated size of each file, in blocks) and the --apparent-size option of du (print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like).
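To see these options in action, the following sketch runs them against a file that consists entirely of a hole. It assumes GNU coreutils (truncate, ls, du); the file name and the 8 MiB size are made up for the example.

```shell
#!/bin/sh
# Sketch: make ls show allocation and du show apparent size (GNU coreutils assumed).
tmp=$(mktemp -d)
f="$tmp/sparse.img"
truncate -s 8M "$f"                 # 8 MiB logical size, no data written

ls -ls "$f"                         # first column: allocated size in blocks
apparent=$(du --apparent-size -B1 "$f" | cut -f1)   # logical size in bytes
on_disk=$(du -B1 "$f" | cut -f1)                    # allocated size in bytes
echo "apparent=$apparent on_disk=$on_disk"

rm -rf "$tmp"
```

With these options each tool can report the number that the other one prints by default.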
In addition, when copying a sparse file, cp by default applies an optimization that speeds up the copy. For example:

    strace cp fs.img fs.img.copy >log 2>&1

Opening the log file shows that cp only calls read and lseek and never calls write:

    stat("fs.img.copy", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
    stat("fs.img", {st_mode=S_IFREG|0644, st_size=1073741824, ...}) = 0
    stat("fs.img.copy", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
    open("fs.img", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=1073741824, ...}) = 0
    open("fs.img.copy", O_WRONLY|O_TRUNC) = 4
    fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
    mmap(NULL, 532480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90df965000
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524288) = 524288
    lseek(4, 524288, SEEK_CUR) = 524288
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524288) = 524288
    lseek(4, 524288, SEEK_CUR) = 1048576
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 524288) = 524288
    lseek(4, 524288, SEEK_CUR) = 1572864

This behaviour comes from cp's sparse-related options. From the cp man page:
By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well. That is the behavior selected by --sparse=auto. Specify --sparse=always to create a sparse DEST file whenever the SOURCE file contains a long enough sequence of zero bytes. Use --sparse=never to inhibit creation of sparse files.
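Looking at the cp source code, I found that after each read, cp checks whether the data it just read is all zeros; if so, it only calls lseek and skips the write.
Of course, all of this sparse-file handling is transparent to the user.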
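The effect of the --sparse option can be observed without strace by copying the same file twice. The sketch below assumes GNU cp, truncate, stat and cmp; the file names and the 8 MiB size are illustrative. Newer coreutils releases may detect holes by other means than the read-and-check loop described above, so only the outcome is compared here, not the system calls.

```shell
#!/bin/sh
# Sketch: copy an all-hole file with and without sparse output (GNU cp assumed).
tmp=$(mktemp -d)
src="$tmp/fs.img"
truncate -s 8M "$src"                         # 8 MiB file consisting of one hole

cp --sparse=always "$src" "$tmp/always.img"   # zero runs become holes
cp --sparse=never  "$src" "$tmp/never.img"    # zero bytes are written out

# Both copies carry the same content and the same logical size ...
same=no
cmp -s "$src" "$tmp/always.img" && cmp -s "$src" "$tmp/never.img" && same=yes
size_always=$(stat -c %s "$tmp/always.img")
size_never=$(stat -c %s "$tmp/never.img")

# ... but normally differ in the space they occupy on disk.
du -h "$tmp/always.img" "$tmp/never.img"
echo "same=$same size_always=$size_always size_never=$size_never"

rm -rf "$tmp"
```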
About Sparse Files
Technote (FAQ)

Question
About Sparse Files

Answer
This document describes sparse files, exposure due to sparse files, and the effects of certain commands on sparse files. This document applies to all versions of AIX.

- Overview
- Creating a sparse file
- The effect of certain commands on sparse files

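Overview
Many applications, particularly databases, maintain data in sparse files. A sparse file is a file with empty space, or gaps, left open for future addition of data. If the empty spaces are filled with the ASCII null character and the spaces are large enough, the file will be sparse, and disk blocks will not be allocated to it.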
This creates an exposure: a large file will be created, but the disk blocks will not be allocated. Then, as data is added to the file, the disk blocks will be allocated, but there may not be enough free disk blocks in the file system. The file system will then be full, and writes to any file in the file system will fail.
You can prevent these problems either by ensuring that you have no sparse files on your system or by planning to have enough free space in the file system for the future allocation of the blocks.
You also need to be aware of how you manipulate sparse or potentially sparse files, because you can easily change them from sparse to not sparse or vice versa.
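One way to plan for this exposure is to list the files whose allocated space is smaller than their apparent size. The helper below, report_sparse, is hypothetical and not part of the technote; it is a sketch that assumes GNU stat on Linux rather than the AIX tools described here.

```shell
#!/bin/sh
# report_sparse FILE...: list regular files whose allocated bytes are fewer than
# their apparent size. Hypothetical helper; assumes GNU stat (-c %s, %b, %B).
report_sparse() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        size=$(stat -c %s "$f")
        alloc=$(( $(stat -c %b "$f") * $(stat -c %B "$f") ))
        if [ "$alloc" -lt "$size" ]; then
            # "unbacked" is the space the file could still claim as it fills in.
            printf '%s\tapparent=%s\tallocated=%s\tunbacked=%s\n' \
                "$f" "$size" "$alloc" "$((size - alloc))"
        fi
    done
}

# Usage with an illustrative 4 MiB all-hole file:
tmp=$(mktemp -d)
truncate -s 4M "$tmp/hole.img"
report=$(report_sparse "$tmp/hole.img")
echo "$report"
rm -rf "$tmp"
```

Files on a compressing filesystem can also appear in this list, so the unbacked figure is best read as an upper bound on future allocation.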
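Creating a sparse file
An example sparse file can be created fairly easily. To do this, open the file, seek to a large address, and write some data. This can be demonstrated with the dd command, as follows:
First, create a regular file:

    date > notsparse
    ls -l

The output of the ls command will be similar to:

    total 8
    -rw-r--r-- 1 root sys 29 Dec 21 08:12 notsparse
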
Use the fileplace command to see how many allocated and unallocated blocks are included in the file notsparse.
(NOTE: Performance Analysis and Control Commands [perfagent.tools] must be installed to run the fileplace command at AIX 4.x and 5.x.)

    fileplace notsparse

The output will look similar to:

    File: notsparse  Size: 29 bytes  Vol: /dev/lv03
    Blk Size: 4096  Frag size: 4096  Nfrags: 1  Compress: no

      Logical Fragment
      ----------------
      00716              1 frags    4096 bytes,  100.0%

The du command will also reflect how many 512-byte blocks a file occupies.

    du -rs *

Example output:

    8 notsparse

Now create a sparse file using the regular file notsparse as input:

    touch sparse.1
    dd if=notsparse of=sparse.1 seek=100

Example output:

    dd: 0+1 records in.
    dd: 0+1 records out.

The dd command takes the data from the regular file and places it 100 512-byte blocks into the sparse.1 file.
Note that nothing is written to the initial 100 512-byte blocks (blocks 0 through 99). The following steps show the characteristics of the resulting file.
The ls command reports the distance from block zero to the last block in the file:

    ls -l

Example output:

    total 16
    -rw-r--r-- 1 root sys    29 Dec 21 08:12 notsparse
    -rw-r--r-- 1 root sys 51229 Dec 21 08:13 sparse.1

The fileplace command tells the story accurately - there are 12 unallocated 4K blocks and one allocated 4K block in the file:

    fileplace sparse.1

Example output:

    File: sparse.1  Size: 51229 bytes  Vol: /dev/lv03
    Blk Size: 4096  Frag Size: 4096  Nfrags: 1  Compress: no

      Logical Fragment
      ----------------
      unallocated       12 frags   49152 Bytes,    0.0%
      0000769            1 frags    4096 Bytes,  100.0%

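The figures in this walk-through follow from simple arithmetic, which the sketch below reproduces. It assumes GNU dd and stat on Linux; a fixed 29-byte payload stands in for the output of date, and the 4096-byte block size is taken from the fileplace output above.

```shell
#!/bin/sh
# Sketch: reproduce the 51229-byte size and the 12 + 1 block split.
tmp=$(mktemp -d)
printf '%029d' 0 > "$tmp/notsparse"     # 29-byte stand-in for the output of date
dd if="$tmp/notsparse" of="$tmp/sparse.1" seek=100 2>/dev/null

size=$(stat -c %s "$tmp/sparse.1")
offset=$((100 * 512))                   # seek=100 blocks of the default 512 bytes
expected=$((offset + 29))               # data starts at 51200 and is 29 bytes long
hole_blocks=$((offset / 4096))          # whole 4 KiB blocks that lie before the data
total_blocks=$(( (size + 4095) / 4096 ))

echo "size=$size expected=$expected hole_blocks=$hole_blocks total_blocks=$total_blocks"
# prints: size=51229 expected=51229 hole_blocks=12 total_blocks=13
rm -rf "$tmp"
```

The file spans 13 blocks of 4 KiB, of which the first 12 hold no data and only the last needs to be allocated.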
The du command reports the number of allocated blocks the file takes:

    du -rs *

Example output:

    8 notsparse
    8 sparse.1

The effect of certain commands on sparse files
backup/restore (by name and inode)
The restore command aggressively preserves sparseness. In fact, the restore command will unallocate any blocks filled with zeroes, thus making a file sparse.
cp
The cp command does not preserve the sparseness of a file.
cpio
If you create a backup using the cpio command on sparse files, you will need to use the pax command to restore that data. Using the cpio command to restore the data will not preserve sparseness.
dd
Using the dd command on the file itself does not preserve sparseness. However, using dd on the file system device does preserve the state of the individual files.
Example: Backing up a logical volume:

    dd if=/dev/datalv of=/dev/rmt0 ibs=4096 obs=1024 conv=sync

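The statement above describes dd on AIX. On Linux, GNU dd offers conv=sparse, which seeks over output blocks that are entirely zero instead of writing them. The following sketch compares a plain copy with a conv=sparse copy; it assumes GNU coreutils, and the file names and sizes are illustrative.

```shell
#!/bin/sh
# Sketch: file-level dd copy with and without conv=sparse (GNU dd assumed).
tmp=$(mktemp -d)
truncate -s 4M "$tmp/src.img"           # 4 MiB all-hole source file

dd if="$tmp/src.img" of="$tmp/plain.img"  bs=64K 2>/dev/null
dd if="$tmp/src.img" of="$tmp/sparse.img" bs=64K conv=sparse 2>/dev/null

size_plain=$(stat -c %s "$tmp/plain.img")
size_sparse=$(stat -c %s "$tmp/sparse.img")
identical=no
cmp -s "$tmp/plain.img" "$tmp/sparse.img" && identical=yes

du -h "$tmp/plain.img" "$tmp/sparse.img"   # allocation normally differs
echo "size_plain=$size_plain size_sparse=$size_sparse identical=$identical"
rm -rf "$tmp"
```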
mksysb
The mksysb command uses backup/restore. See the section on backup/restore.
pax
NOTE: The pax command can read tar archives and can read cpio archives if the c flag was used.
The pax command aggressively preserves sparseness. In fact, the pax command will unallocate any blocks filled with zeroes, thus making a file sparse.
sysback
Sysback will use either backup by name or inode to back up the data on the system. See the section on backup.
tar
If you create a backup using the tar command on sparse files, you will have to use the pax command to restore that data. Using the tar command to restore the data will not preserve sparseness.
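The tar behaviour above is specific to AIX. GNU tar, which is common on Linux, has a -S (--sparse) option that records holes when the archive is created. The sketch below archives and extracts an all-hole file with it; GNU tar is assumed, and the paths and sizes are illustrative.

```shell
#!/bin/sh
# Sketch: round-trip a sparse file through GNU tar with -S (--sparse).
tmp=$(mktemp -d)
mkdir "$tmp/in" "$tmp/out"
truncate -s 4M "$tmp/in/data.img"       # 4 MiB all-hole file

tar -C "$tmp/in" -S -cf "$tmp/backup.tar" data.img   # -S stores holes compactly
tar -C "$tmp/out" -xf "$tmp/backup.tar"

restored=$(stat -c %s "$tmp/out/data.img")
intact=no
cmp -s "$tmp/in/data.img" "$tmp/out/data.img" && intact=yes

du -h "$tmp/in/data.img" "$tmp/out/data.img"
echo "restored=$restored intact=$intact"
rm -rf "$tmp"
```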
Segment | Product | Component | Platform | Version | Edition |
---|---|---|---|---|---|
Operating System | AIX | Process and memory management | | | |