Reading notes on Modern Operating Systems (4th edition, English version): Section 4.3, File-System Implementation
2015-01-07 23:40
Probably the most important issue in implementing file storage is keeping track of which disk blocks go with which file.
Note: the central problem in implementing file storage is recording which disk blocks belong to which file.
... contiguous allocation also has a very serious drawback: over the course of time, the disk becomes fragmented.
Note: the serious drawback of contiguous allocation is that the disk becomes heavily fragmented with continued use.
When a file is removed, its blocks are naturally freed, leaving a run of free blocks on the disk. The disk is not compacted on the spot to squeeze out the hole, since that would involve copying all the blocks following the hole, potentially millions of blocks, which would take hours or even days with large disks.
Note: when a file is deleted, its blocks are freed and leave a run of free blocks (a hole) on the disk. The disk is not compacted right away to close the hole, because that would mean copying every block that follows the hole, possibly millions of them, which could take hours or even days on a large disk.
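The fragmentation effect described above can be reproduced with a toy model. This is only a sketch under assumptions that are not in the book: the 12-block disk, the file names, and the first-fit placement policy are all made up for illustration.

```python
def first_fit(disk, length):
    """Return the start index of the first run of `length` free blocks, or None."""
    run_start, run_len = None, 0
    for i, owner in enumerate(disk):
        if owner is None:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0
    return None


def allocate(disk, name, length):
    """Place file `name` contiguously in the first hole that fits."""
    start = first_fit(disk, length)
    if start is None:
        return False
    disk[start:start + length] = [name] * length
    return True


def delete(disk, name):
    """Free every block owned by `name`, leaving a hole in place."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


# Hypothetical 12-block disk; None marks a free block.
disk = [None] * 12
for name, length in [("A", 4), ("B", 3), ("C", 5)]:
    allocate(disk, name, length)   # the disk is now completely full
delete(disk, "A")                  # leaves a 4-block hole at the front
delete(disk, "C")                  # leaves a 5-block hole at the end
free_blocks = disk.count(None)     # 9 blocks are free in total
fits = allocate(disk, "D", 6)      # but no single hole can hold 6 blocks
```

Nine blocks are free, yet the 6-block file cannot be placed, because file B sits between the two holes. Closing the gap would require moving B, which is the copying cost the quoted passage refers to.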
The situation with DVDs is a bit more complicated. In principle, a 90-min movie could be encoded as a single file of length about 4.5 GB, but the file system used, UDF (Universal Disk Format), uses a 30-bit number to represent file length, which limits files to 1 GB. As a consequence, DVD movies are generally stored as three or four 1-GB files, each of which is contiguous. These physical pieces of the single logical file (the movie) are called extents.
Note: the DVD case is a little more complicated. In principle a 90-minute movie could be encoded as one contiguous file of about 4.5 GB, but DVDs use the UDF file system, which stores the file length in a 30-bit number. Since 2^30 bytes is 1 GB, no file can be longer than 1 GB. DVD movies are therefore usually stored as three or four 1-GB files, each of them contiguous. These physical pieces of the single logical file (the movie) are called extents.
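The arithmetic behind the 1-GB limit can be checked with a few lines. This is a rough sketch: it treats the limit as exactly 2^30 bytes, as the note does, and the 3.5-GiB movie size and the helper name `extents_needed` are hypothetical.

```python
LENGTH_BITS = 30                     # width of the file-length field quoted above
MAX_FILE_BYTES = 2 ** LENGTH_BITS    # 1 GiB, taken here as the per-file limit


def extents_needed(total_bytes, max_bytes=MAX_FILE_BYTES):
    """Number of size-limited pieces needed to hold `total_bytes` (ceiling division)."""
    return -(-total_bytes // max_bytes)


# Hypothetical movie size, chosen only to illustrate the split.
movie_bytes = 7 * 2 ** 29            # 3.5 GiB
pieces = extents_needed(movie_bytes) # 4 pieces: three full ones and a half-full one
```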
(Linked-list allocation)
Also, the amount of data storage in a block is no longer a power of two because the pointer takes up a few bytes. While not fatal, having a peculiar size is less efficient because many programs read and write in blocks whose size is a power of two. With the first few bytes of each block occupied by a pointer to the next block, reads of the full block size require acquiring and concatenating information from two disk blocks, which generates extra overhead due to the copying.
Note: the amount of data a block can hold is no longer a power of two, because the pointer uses up a few bytes. This is not fatal, but the odd payload size is less efficient, since many programs read and write in units whose size is a power of two. Because part of every block is taken by the pointer to the next block, reading one block-size worth of data means fetching and joining data from two disk blocks, which adds copying overhead.
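To see why a block-size read spans two blocks, here is a minimal sketch. The 1024-byte block and the 4-byte pointer are assumed values (the book only says "a few bytes"), and `blocks_touched` is a hypothetical helper.

```python
BLOCK_SIZE = 1024                     # assumed physical block size in bytes
POINTER_SIZE = 4                      # assumed size of the next-block pointer
PAYLOAD = BLOCK_SIZE - POINTER_SIZE   # 1020 data bytes per block, not a power of two


def blocks_touched(offset, length):
    """Chain positions (0 = first block of the file) visited by a read of
    `length` bytes starting at file offset `offset`."""
    if length <= 0:
        return []
    first = offset // PAYLOAD
    last = (offset + length - 1) // PAYLOAD
    return list(range(first, last + 1))


one_payload = blocks_touched(0, PAYLOAD)       # fits inside the first block
one_block = blocks_touched(0, BLOCK_SIZE)      # spills 4 bytes into the second block
```

With these numbers, a program that reads 1024 bytes at a time never gets its data from a single block: the last 4 bytes always come from the next block in the chain.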
FAT (file allocation table)
Both disadvantages of the linked-list allocation can be eliminated by taking the pointer word from each disk block and putting it in a table in memory.
Note: both drawbacks of linked-list allocation go away if the pointer is taken out of each disk block and kept in a table in memory instead.
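A minimal sketch of the idea: the next-block pointers live in an in-memory list indexed by block number, so following a file's chain needs no disk reads and the blocks themselves hold only data. The 16-entry table, the chain 4, 7, 2, 10, 12, and the `END_OF_CHAIN` marker are illustrative assumptions, not a real FAT on-disk format.

```python
END_OF_CHAIN = -1   # assumed marker for the last block of a file


def file_blocks(fat, start):
    """Follow the chain in the in-memory table, starting at block `start`."""
    blocks = []
    block = start
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start, n):
    """Disk block holding the n-th block of the file (n counts from 0).
    The chain is still walked, but entirely in memory."""
    block = start
    for _ in range(n):
        block = fat[block]
    return block


# Hypothetical table: entry i holds the number of the block that follows block i.
fat = [None] * 16
for block, nxt in [(4, 7), (7, 2), (2, 10), (10, 12), (12, END_OF_CHAIN)]:
    fat[block] = nxt

chain = file_blocks(fat, 4)    # [4, 7, 2, 10, 12]
third = nth_block(fat, 4, 2)   # 2
```

A directory entry then only needs to store the starting block number (4 here) to locate the whole file.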
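The primary disadvantage of this method is that the entire table must be in memory all the time to make it work. With a 1-TB disk and a 1-KB block size, the table needs 1 billion entries, one for each of the 1 billion disk blocks. Each entry has to be a minimum of 3 bytes. For speed in lookup, they should be 4 bytes. Thus the table will take up 3 GB or 2.4 GB of main memory all the time, depending on whether the system is optimized for space or time. Not wildly practical. Clearly the FAT idea does not scale well to large disks. It was the original MS-DOS file system and is still fully supported by all versions of Windows though.
Note: the main drawback of this method is that the whole table has to stay in memory for it to work. With a 1-TB disk and 1-KB blocks, the table has about 1 billion entries (2^30), one per disk block. Each entry needs at least 3 bytes, and 4 bytes for fast lookup. The book says the table therefore takes 3 GB or 2.4 GB of main memory, depending on whether the system is optimized for space or for time. (The 2.4 GB figure looks wrong: at 4 bytes per entry, 2^30 entries take 4 GB.) Either way this is not very practical, and the FAT approach clearly does not scale to very large disks. FAT was the original MS-DOS file system and is still supported by all versions of Windows.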
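The table size can be worked out directly. The sketch below assumes binary units (1 TB = 2^40 bytes, 1 KB = 2^10 bytes), which is what gives exactly 2^30 entries; the names are illustrative.

```python
DISK_BYTES = 2 ** 40     # 1 TB, assuming binary units
BLOCK_BYTES = 2 ** 10    # 1 KB block size

entries = DISK_BYTES // BLOCK_BYTES   # one table entry per disk block: 2**30


def table_gib(entry_bytes):
    """Size of the whole table in GiB for a given entry width in bytes."""
    return entries * entry_bytes / 2 ** 30


space_optimized = table_gib(3)   # 3.0 GiB with 3-byte entries
time_optimized = table_gib(4)    # 4.0 GiB with 4-byte entries
```

Under these assumptions the two cases come out at 3 GB and 4 GB, so the 4-byte case does not match the 2.4 GB printed in the quoted passage.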
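One problem with i-nodes is that if each one has room for a fixed number of disk addresses, what happens when a file grows beyond this limit? One solution is to reserve the last disk address not for a data block, but instead for the address of a block containing more disk-block addresses, ...
Note: one problem with i-nodes is that each has room for only a fixed number of disk addresses, so what happens when a file outgrows that limit? One solution is to reserve the last address slot not for a data block but for the address of a block that holds more disk-block addresses. (In other words, the last address points to a block whose contents are the addresses of further blocks, and those blocks hold the rest of the file's data.)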
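The indirect-block lookup can be sketched as follows. Everything concrete here is an assumption for illustration: the `Inode` class, the three direct slots, the block numbers, and the dictionary standing in for the disk are not taken from the book or from any real file system.

```python
class Inode:
    """Toy i-node: a few direct addresses plus one optional indirect address."""

    def __init__(self, direct, indirect=None):
        self.direct = direct        # disk block numbers of the first data blocks
        self.indirect = indirect    # number of a block that holds more addresses


def lookup(inode, disk, logical):
    """Map the file's `logical` block index (from 0) to a disk block number."""
    if logical < len(inode.direct):
        return inode.direct[logical]
    if inode.indirect is None:
        raise IndexError("file has no block %d" % logical)
    more_addresses = disk[inode.indirect]   # one extra read to fetch the address block
    return more_addresses[logical - len(inode.direct)]


# Hypothetical layout: block 50 stores addresses instead of file data.
disk = {50: [21, 22]}
inode = Inode(direct=[11, 12, 13], indirect=50)

first = lookup(inode, disk, 0)    # 11, found directly in the i-node
fourth = lookup(inode, disk, 3)   # 21, found through the indirect block
```

The cost of the scheme is visible in `lookup`: blocks beyond the direct slots need one additional block read before the data block itself can be located.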