How File Writes Work on Linux
2015-11-18 15:41
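Background
Anyone who does operations work has probably run into the following problem:
A program starts several threads or processes that all write to the same file, and the file ends up garbled: fragments of one line get inserted into the middle of another.
Many people reach for a file lock to solve this, and that works. But corruption only happens under certain conditions: when each individual write is small enough, the file does not get garbled.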
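The file-lock approach mentioned above might look like the following minimal sketch. It assumes the `flock` utility from util-linux is installed; the helper name `append_locked` and the path `/tmp/locked.out` are made up for illustration and are not part of the test script in this article.

```shell
#!/bin/bash
# Illustrative path, not taken from the article.
OUT=/tmp/locked.out

# Hypothetical helper: append one line to the shared file under an exclusive lock.
append_locked() {
    (
        # fd 9 is opened in append mode by the redirection on the subshell;
        # flock blocks until this process holds the exclusive lock on it.
        flock -x 9
        printf '%s\n' "$1" >&9
    ) 9>>"$OUT"   # the lock is released when the subshell exits and fd 9 closes
}

rm -f "$OUT"

# Four background workers, five lines each, all appending to the same file.
for w in 1 2 3 4; do
    (
        for i in 1 2 3 4 5; do
            append_locked "worker $w line $i"
        done
    ) &
done
wait

# Number of lines written in total.
wc -l < "$OUT"
```

Because every writer takes the lock before writing, only one line is written at a time regardless of its length, at the cost of an extra lock and unlock per line.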
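How it works
The key concept is the smallest unit the operating system writes atomically. On Linux there is in practice a size threshold, 1024 bytes on some systems and 4096 bytes on others; as long as a single write does not exceed it, concurrent appends do not corrupt the file. The 4096-byte figure matches PIPE_BUF on Linux, which POSIX formally guarantees only for pipes and FIFOs, so for regular files this is best treated as observed behavior rather than a documented guarantee.
The shell script below simulates this situation. First a sample run, then the script itself.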
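One concrete limit of this kind that can be queried on a given machine is `PIPE_BUF`, the largest write that POSIX guarantees to be atomic on a pipe or FIFO. The sketch below reads it with `getconf` and compares a line length against it. Using `PIPE_BUF` as the reference value for appends to a regular file is an assumption made for illustration, and the variable names are hypothetical.

```shell
#!/bin/bash
# PIPE_BUF is defined for pipes and FIFOs; treating it as the threshold for
# regular-file appends is an assumption, not a documented guarantee.
pipe_buf=$(getconf PIPE_BUF /)
echo "PIPE_BUF: $pipe_buf bytes"

# Line length to check, including the trailing newline (hypothetical parameter).
buf_len=${1:-4096}

if [ "$buf_len" -le "$pipe_buf" ]; then
    echo "$buf_len bytes: within the limit"
else
    echo "$buf_len bytes: exceeds the limit, the write may be split"
fi
```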
[code]
# ./test_appends.sh 4096
Launching 20 worker processes
Each line will be 4096 characters long
Waiting for processes to exit
Testing output file
.......................[snip]....
All's good! The output file had no corrupted lines.
# ./test_appends.sh 4097
Launching 20 worker processes
Each line will be 4097 characters long
Waiting for processes to exit
Testing output file
.......................[snip]....
Found 27 instances of corrupted lines
[/code]
[code]
#!/bin/bash
#############################################################################
#
# This script aims to test/prove that you can append to a single file from
# multiple processes with buffers up to a certain size, without causing one
# process' output to corrupt the other's.
#
# The script takes one parameter, the length of the buffer. It then creates
# 20 worker processes which each write 50 lines of the specified buffer
# size to the same file. When all processes are done outputting, it tests
# the output file to ensure it is in the correct format.
#
#############################################################################

NUM_WORKERS=20
LINES_PER_WORKER=50
OUTPUT_FILE=/tmp/out.tmp

# each worker will output $LINES_PER_WORKER lines to the output file
run_worker() {
    worker_num=$1
    buf_len=$2

    # Each line will be a specific character, multiplied by the line length.
    # The character changes based on the worker number.
    filler_len=$((${buf_len}-1)) # -1 -> leave room for \n
    filler_char=$(printf \\$(printf '%03o' $(($worker_num+64))))
    line=`for i in $(seq 1 $filler_len);do echo -n $filler_char;done`

    for i in $(seq 1 $LINES_PER_WORKER)
    do
        echo $line >> $OUTPUT_FILE
    done
}

# When re-invoked as "$0 worker <num> <buf_len>", act as a worker and exit
if [ "$1" = "worker" ]; then
    run_worker $2 $3
    exit
fi

buf_len=$1
if [ "$buf_len" = "" ]; then
    echo "Buffer length not specified, defaulting to 4096"
    buf_len=4096
fi

rm -f $OUTPUT_FILE

echo Launching $NUM_WORKERS worker processes
for i in $(seq 1 $NUM_WORKERS)
do
    $0 worker $i $buf_len &
    pids[$i]=${!}
done

echo Each line will be $buf_len characters long
echo Waiting for processes to exit
for i in $(seq 1 $NUM_WORKERS)
do
    wait ${pids[$i]}
done

# Now we want to test the output file. Each line should be the same letter
# repeated buf_len-1 times (remember the \n takes up one byte). If we had
# workers writing over eachother's lines, then there will be mixed characters
# and/or longer/shorter lines.
echo Testing output file

# Make sure the file is the right size (ensures processes didn't write over
# eachother's lines)
expected_file_size=$(($NUM_WORKERS * $LINES_PER_WORKER * $buf_len))
actual_file_size=`cat $OUTPUT_FILE | wc -c`
if [ "$expected_file_size" -ne "$actual_file_size" ]; then
    echo Expected file size of $expected_file_size, but got $actual_file_size
else
    # File size is OK, test the actual content

    # Only use newer versions of grep because older ones are way too slow with
    # backreferences
    [[ $(grep --version) =~ [^[:digit:]]*([[:digit:]]+)\.([[:digit:]]+) ]]
    grep_major=${BASH_REMATCH[1]}
    grep_minor=${BASH_REMATCH[2]}
    # Compare major and minor separately so that e.g. 3.4 counts as newer
    # than 2.16 (concatenating the digits would give 34 < 216)
    if [ "$grep_major" -gt 2 ] || { [ "$grep_major" -eq 2 ] && [ "$grep_minor" -ge 16 ]; }; then
        num_lines=$(grep -v "^\(.\)\1\{$((${buf_len}-2))\}$" $OUTPUT_FILE | wc -l)
    else
        # Scan line by line in bash, which isn't that speedy, but is good enough
        # Note: Doesn't work on cygwin for lines < 255
        line_length=$((${buf_len}-1))
        num_lines=0
        for line in `cat $OUTPUT_FILE`
        do
            if ! [[ $line =~ ^${line:0:1}{$line_length}$ ]]; then
                num_lines=$(($num_lines+1))
            fi;
            echo -n .
        done
        echo
    fi

    if [ "$num_lines" -gt "0" ]; then
        echo "Found $num_lines instances of corrupted lines"
    else
        echo "All's good! The output file had no corrupted lines."
    fi
fi

rm -f $OUTPUT_FILE
[/code]