Bash: Common Command-Line Tools - uniq
2015-04-16 16:55
NAME
       uniq - report or omit repeated lines

SYNOPSIS
       uniq [OPTION]... [INPUT [OUTPUT]]

DESCRIPTION
       Filter adjacent matching lines from INPUT (or standard input),
       writing to OUTPUT (or standard output).

       With no options, matching lines are merged to the first occurrence.

       Mandatory arguments to long options are mandatory for short options too.

       -c, --count
              prefix lines by the number of occurrences

       -d, --repeated
              only print duplicate lines

       -D, --all-repeated[=delimit-method]
              print all duplicate lines
              delimit-method={none(default),prepend,separate}
              Delimiting is done with blank lines

       -f, --skip-fields=N
              avoid comparing the first N fields

       -i, --ignore-case
              ignore differences in case when comparing

       -s, --skip-chars=N
              avoid comparing the first N characters

       -u, --unique
              only print unique lines

       -z, --zero-terminated
              end lines with 0 byte, not newline

       -w, --check-chars=N
              compare no more than N characters in lines

       --help display this help and exit

       --version
              output version information and exit

       A field is a run of blanks (usually spaces and/or TABs), then
       non-blank characters. Fields are skipped before chars.

       Note: 'uniq' does not detect repeated lines unless they are adjacent.
       You may want to sort the input first, or use 'sort -u' without 'uniq'.
       Also, comparisons honor the rules specified by 'LC_COLLATE'.
The above is the output of man uniq.
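The comparison options listed above are easier to follow with a small example. The following is only a sketch: the input lines are made up for illustration and are generated with printf, and -w is a GNU extension that may be missing from other uniq implementations.

```shell
# -i: ignore case when comparing; the first line of each group is the one kept.
ignore_case=$(printf 'Apple\napple\nAPPLE\nbanana\n' | uniq -i)

# -f 1: skip the first field (here a made-up ID column) before comparing.
skip_field=$(printf '1 foo\n2 foo\n3 bar\n' | uniq -f 1)

# -w 3: compare at most the first 3 characters of each line (GNU extension).
check_chars=$(printf 'abc-one\nabc-two\nabd-three\n' | uniq -w 3)

printf '%s\n' "-i:" "$ignore_case" "-f 1:" "$skip_field" "-w 3:" "$check_chars"
```

In all three cases uniq prints the first line of each run of lines that compare equal, so the skipped or ignored parts of later lines are simply dropped from the output.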
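The note at the end tells us that uniq only removes duplicates when the repeated lines are adjacent. The reason is easy to see: if duplicates are guaranteed to be consecutive, uniq only has to compare each line with the previous one, so it does not need a hash map to keep track of every line it has seen. To produce such input, sort the data first, which guarantees that identical lines end up next to each other.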
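The adjacency requirement can be checked with a quick experiment. This is a minimal sketch using a made-up three-line input (not the file used below), where the two copies of "a" are separated by another line:

```shell
# Made-up input: "a" appears twice, but the two copies are not adjacent.
unsorted_result=$(printf 'a\nb\na\n' | uniq)

# Sorting first makes the duplicates adjacent, so uniq can merge them.
sorted_result=$(printf 'a\nb\na\n' | sort | uniq)

printf 'without sort:\n%s\n' "$unsorted_result"
printf 'with sort:\n%s\n' "$sorted_result"
```

Without sort, all three lines come back unchanged; with sort, only "a" and "b" remain.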
Given the following text file, words.txt, with one word per line:
the
day
is
sunny
the
the
sunny
day
is
today
is
sunny
day
USE CASE 1.
First, deduplicate the words (sort first, then remove duplicates):
$ sort words.txt | uniq
day
is
sunny
the
today
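As the man page note says, 'sort -u' can do the same job without uniq. A small sketch comparing the two, using a hypothetical helper function named sample that prints made-up data instead of reading words.txt:

```shell
# Hypothetical helper: prints a few sample words, one per line.
sample() { printf 'the\nday\nis\nsunny\nthe\nday\n'; }

# Two ways to get the distinct lines in sorted order.
via_uniq=$(sample | sort | uniq)
via_sort_u=$(sample | sort -u)

printf '%s\n' "$via_sort_u"
```

For plain deduplication the two pipelines should give the same result; uniq is still needed when options such as -c, -d or -u are wanted.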
USE CASE 2.
Count how many times each line occurs:
$ sort words.txt | uniq -c
      3 day
      3 is
      3 sunny
      3 the
      1 today
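A common follow-up, not part of the use case above, is to rank the lines by how often they occur. The sketch below assumes made-up input with distinct counts so the order is unambiguous, and uses awk only to strip the padding that uniq -c puts in front of each count:

```shell
# Made-up input: "the" x3, "day" x2, "is" x1.
ranked=$(printf 'the\nday\nthe\nis\nday\nthe\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | awk '{print $1, $2}')   # normalize to "count word"

printf '%s\n' "$ranked"
```

The second sort uses -n to compare the counts numerically and -r to put the largest count first.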
USE CASE 3.
Print only the duplicated lines, or only the unique lines:
$ sort words.txt | uniq -d
day
is
sunny
the
$ sort words.txt | uniq -u
today