The openxlsx package: reading and writing Excel data

2017-09-21 16:06

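The package makes it very convenient to read, write, and edit .xlsx files without setting up Java, and the tables it writes look more polished than the output of write.csv.

Excel files in other (non-.xlsx) formats can be saved as .xlsx first and then used with this package.

Description: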

openxlsx simplifies the process of writing and styling Excel xlsx files from R and removes the dependency on Java.

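It simplifies working with xlsx files.

It can read data from xlsx files, write data to xlsx files, and set the styling of xlsx files.

1. Reading data from an xlsx file

(1) The read.xlsx() function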

Read data from an Excel file or Workbook object into a data.frame

Usage:

read.xlsx(xlsxFile, sheet = 1, startRow = 1, colNames = TRUE, rowNames = FALSE,
  detectDates = FALSE, skipEmptyRows = TRUE, skipEmptyCols = TRUE,
  rows = NULL, cols = NULL, check.names = FALSE, namedRegion = NULL,
  na.strings = "NA", fillMergedCells = FALSE)

Common parameters:

xlsxFile: An xlsx file, Workbook object or URL to xlsx file; usually the path to the xlsx file.

sheet: The name or index of the sheet to read data from.

startRow: First row to begin looking for data. Empty rows at the top of a file are always skipped, regardless of the value of startRow.

colNames: If TRUE, the first row of data will be used as column names.

rowNames: If TRUE, the first column of data will be used as row names.

skipEmptyRows: If TRUE, empty rows are skipped; otherwise each empty row after the first row containing data is returned as a row of NAs.

skipEmptyCols: If TRUE, empty columns are skipped.

rows: A numeric vector specifying which rows in the Excel file to read. If NULL (the default), all rows are read.

cols: A numeric vector specifying which columns in the Excel file to read. If NULL (the default), all columns are read.

Return value: a data frame.
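As a minimal sketch of the parameters above, the following writes a small demo file to a temporary path and reads it back, first in full and then restricted by rows and cols. The data frame, the temporary file, and the variable names are illustrative assumptions, not part of the original article.

```r
library(openxlsx)

# Build a small demo file so the example is self-contained (hypothetical data).
demo <- data.frame(
  id = 1:5,
  name = c("a", "b", "c", "d", "e"),
  score = c(90.5, 80, 70.25, 60, 50),
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".xlsx")
write.xlsx(demo, path)

# Read the whole first sheet; the first row is used as column names.
all_rows <- read.xlsx(path, sheet = 1, colNames = TRUE)

# Read only Excel rows 1 to 3 (the header row plus two data rows)
# and Excel columns 1 and 3.
part <- read.xlsx(path, sheet = 1, rows = 1:3, cols = c(1, 3))

print(all_rows)
print(part)
```

Note that rows and cols refer to positions in the Excel sheet, so the header row counts as row 1.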

(2) readWorkbook()

Read data from an Excel file or Workbook object into a data.frame

Usage:

readWorkbook(xlsxFile, sheet = 1, startRow = 1, colNames = TRUE, rowNames = FALSE,
  detectDates = FALSE, skipEmptyRows = TRUE, skipEmptyCols = TRUE,
  rows = NULL, cols = NULL, check.names = FALSE, namedRegion = NULL,
  na.strings = "NA", fillMergedCells = FALSE)

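Parameters: same as read.xlsx

Differences from read.xlsx():

In everyday use there appears to be no difference (I am not sure about other aspects).

Looking at the source code shows that read.xlsx() is a generic function, while readWorkbook() internally just calls read.xlsx().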
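A quick way to check the "no difference in use" observation is to read the same file with both functions and compare the results. This is only a sketch on a hypothetical temporary file; it does not prove equivalence for every argument combination.

```r
library(openxlsx)

# Hypothetical demo file written to a temporary location.
path <- tempfile(fileext = ".xlsx")
write.xlsx(data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE), path)

# Read the same sheet through both entry points.
via_read_xlsx <- read.xlsx(path, sheet = 1)
via_read_workbook <- readWorkbook(path, sheet = 1)

# Compare the two data frames.
same_result <- identical(via_read_xlsx, via_read_workbook)
print(same_result)
```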

2. Writing data to an xlsx file

(1) Option 1: write.xlsx()

Write a data.frame or list of data.frames to an xlsx file

Usage:

write.xlsx(x, file, asTable = FALSE, ...)

Parameters:

x: object or a list of objects that can be handled by writeData to write to file

file: xlsx file name

asTable: write using writeDataTable as opposed to writeData

...: optional parameters to pass to functions

Commonly used optional parameters:

firstRow = TRUE: freeze the first row

colWidths = "auto": set column widths automatically

If you need more optional parameters to control styling, write the data with option 2 instead.
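A minimal sketch of option 1, assuming the built-in mtcars and iris data sets as sample data and a temporary output path: a named list is written as one sheet per element, with the first row frozen and automatic column widths.

```r
library(openxlsx)

# A named list of data frames; each element becomes its own sheet.
sheets <- list(cars = head(mtcars), flowers = head(iris))

out <- tempfile(fileext = ".xlsx")

# firstRow and colWidths are passed on through "...".
write.xlsx(sheets, out, asTable = FALSE, firstRow = TRUE, colWidths = "auto")

# List the sheet names in the file that was just written.
sheet_names <- getSheetNames(out)
print(sheet_names)
```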

(2) Option 2: createWorkbook() + addWorksheet() + writeData() or writeDataTable() + saveWorkbook()

Create a Workbook - add a sheet - write data to the sheet - save the workbook

(2.1) Create a new workbook: createWorkbook(creator = Sys.getenv("USERNAME"))

Parameter: creator: Creator of the workbook (your name). Defaults to login username.

Return value: a Workbook object

(2.2) Add a new sheet to the workbook:

addWorksheet(wb, sheetName, gridLines = TRUE, ...)

(2.3) Write data to the sheet

writeData(wb, sheet, x, startCol = 1, startRow = 1, xy = NULL, colNames = TRUE, rowNames = FALSE, headerStyle = NULL,...)

writeDataTable(wb, sheet, x, startCol = 1, startRow = 1, xy = NULL, colNames = TRUE, rowNames = FALSE, headerStyle = NULL,...)

Write to a worksheet and format as an Excel table.

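(2.4) Set styles

createStyle(): sets borders, font, font size, background colour, and so on. Note that when the rows or columns to style are not contiguous, they need to be styled one by one in a loop.

addStyle(): applies a style to cells on a sheet

freezePane(): freezes panes

setColWidths(): sets column widths

(2.5) Save the workbook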

saveWorkbook(wb, file, overwrite = FALSE)
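The four steps of option 2 can be sketched end to end as follows. The sheet names, the creator string, and the use of the built-in iris and mtcars data sets are assumptions for illustration only.

```r
library(openxlsx)

# Step 1: create an empty workbook.
wb <- createWorkbook(creator = "demo user")

# Step 2: add two sheets.
addWorksheet(wb, "data", gridLines = TRUE)
addWorksheet(wb, "table")

# Step 3: write a plain range to one sheet and an Excel table to the other.
writeData(wb, sheet = "data", x = head(iris), startCol = 1, startRow = 1)
writeDataTable(wb, sheet = "table", x = head(mtcars))

# Step 4: save the workbook; overwrite = TRUE replaces an existing file.
out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out, overwrite = TRUE)

print(getSheetNames(out))
```

Because the Workbook object lives in memory until saveWorkbook() is called, any styling calls belong between step 3 and step 4.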

3. Setting styles globally with options()

options("openxlsx.borderColour" = "black"): table border colour

options("openxlsx.borderStyle" = "thin"): table border style

options("openxlsx.dateFormat" = "mm/dd/yyyy"): date format

options("openxlsx.datetimeFormat" = "yyyy-mm-dd hh:mm:ss"): date-time format

options("openxlsx.numFmt" = NULL): default number format

options("openxlsx.paperSize" = 9) ## A4

options("openxlsx.orientation" = "portrait") ## page orientation
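As a sketch of how the global options might be used, the following sets a few of them, writes a small data frame containing a date column, and then restores the previous settings. The data, the chosen date format, and the borders = "surrounding" argument are illustrative assumptions.

```r
library(openxlsx)

# Set global defaults; options() returns the previous values so they can be restored.
old_opts <- options(
  "openxlsx.borderColour" = "black",
  "openxlsx.borderStyle" = "thin",
  "openxlsx.dateFormat" = "yyyy-mm-dd"
)

# Hypothetical data with a Date column, which picks up the global date format.
df <- data.frame(day = as.Date("2017-09-21") + 0:2, value = 1:3)

out <- tempfile(fileext = ".xlsx")
write.xlsx(df, out, borders = "surrounding")

# Restore whatever was set before.
options(old_opts)
```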

4. Additional styling

(1) Create a style object with createStyle()

createStyle(fontName = NULL, fontSize = NULL, fontColour = NULL,
  numFmt = "GENERAL", border = NULL,
  borderColour = getOption("openxlsx.borderColour", "black"),
  borderStyle = getOption("openxlsx.borderStyle", "thin"), bgFill = NULL,
  fgFill = NULL, halign = NULL, valign = NULL, textDecoration = NULL,
  wrapText = FALSE, textRotation = NULL, indent = NULL)

(2) Apply the style object to specific cells: addStyle(wb, sheet, style, rows, cols, gridExpand = FALSE, stack = FALSE)

(3) openXL(wb) ## opens a temp version; useful for previewing the result after setting styles

(4) Freeze panes: freezePane(wb, sheet, firstActiveRow = NULL, firstActiveCol = NULL, firstRow = FALSE, firstCol = FALSE)

(5) Set column widths: setColWidths(wb, sheet, cols, widths = 8.43, hidden = rep(FALSE, length(cols)), ignoreMergedCells = FALSE)

For automatic column widths, set widths = "auto".
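The styling functions above can be combined roughly as follows. This is a sketch only: the sheet name, colours, and the iris sample data are assumptions, and openXL() is left commented out because it opens a spreadsheet application interactively.

```r
library(openxlsx)

wb <- createWorkbook()
addWorksheet(wb, "styled")
writeData(wb, "styled", head(iris))

# A bold, centred header with white text on a blue fill and a bottom border.
header_style <- createStyle(
  fontSize = 12, fontColour = "#FFFFFF", fgFill = "#4F81BD",
  halign = "center", textDecoration = "bold", border = "bottom"
)

# Apply the style to the header row across all five columns.
addStyle(wb, "styled", header_style, rows = 1, cols = 1:5, gridExpand = TRUE)

# Keep the header visible while scrolling and size columns to their content.
freezePane(wb, "styled", firstRow = TRUE)
setColWidths(wb, "styled", cols = 1:5, widths = "auto")

# openXL(wb)  # uncomment to preview a temporary copy interactively

out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out, overwrite = TRUE)
```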

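Unresolved issue:

With larger data sets, calling write.xlsx() or another Excel-writing function on a data frame succeeds the first time, but after the project is closed and reopened the following error often appears: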

Error in is.nan(tmp) : default method not implemented for type 'list'
