
The openxlsx package: reading and writing Excel data

2017-09-21 16:06
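openxlsx makes it very convenient to read, write, and edit .xlsx files without setting up Java, and its table-styled output is easier on the eyes than what write.csv produces.

Excel files in formats other than .xlsx can be used with this package after being saved as .xlsx.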

Description:

openxlsx simplifies the process of writing and styling Excel xlsx files from R and removes the dependency on Java.

In short, it simplifies working with xlsx files.

It can be used to read xlsx files, write data to xlsx files, and style xlsx files.


1. Reading data from an xlsx file

(1) The read.xlsx() function

Read data from an Excel file or Workbook object into a data.frame

Usage:

read.xlsx(xlsxFile, sheet = 1, startRow = 1, colNames = TRUE, rowNames = FALSE,
  detectDates = FALSE, skipEmptyRows = TRUE, skipEmptyCols = TRUE,
  rows = NULL, cols = NULL, check.names = FALSE, namedRegion = NULL,
  na.strings = "NA", fillMergedCells = FALSE)

Common parameters:

xlsxFile: An xlsx file, Workbook object or URL to xlsx file. Usually this is the path to the xlsx file.

sheet: The name or index of the sheet to read data from.

startRow: First row to begin looking for data. Empty rows at the top of a file are always skipped, regardless of the value of startRow.

colNames: If TRUE, the first row of data will be used as column names.

rowNames: If TRUE, the first column of data will be used as row names.

skipEmptyRows: If TRUE, empty rows are skipped; otherwise, each empty row after the first row containing data is returned as a row of NAs.

skipEmptyCols: If TRUE, empty columns are skipped.

rows: A numeric vector specifying which rows in the Excel file to read. If NULL (the default), all rows are read.

cols: A numeric vector specifying which columns in the Excel file to read. If NULL (the default), all columns are read.

Return value: a data frame
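As a quick illustration of the parameters above, here is a minimal sketch. The sample data frame and the temporary file are made up for this example; only the read.xlsx() and write.xlsx() calls come from the package.

```r
library(openxlsx)

# Hypothetical sample data, written to a temporary file so there is something to read
path <- tempfile(fileext = ".xlsx")
write.xlsx(data.frame(id = 1:3, name = c("a", "b", "c")), path)

# Read the whole first sheet; the first row supplies the column names
df <- read.xlsx(path, sheet = 1, colNames = TRUE)

# Read only Excel rows 1-2 (header plus first record) and the first column
part <- read.xlsx(path, sheet = 1, rows = 1:2, cols = 1)

str(df)
str(part)
```

Note that rows and cols refer to positions in the Excel sheet, so the header row counts as row 1.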

(2) readWorkbook()

Read data from an Excel file or Workbook object into a data.frame

Usage:

readWorkbook(xlsxFile, sheet = 1, startRow = 1, colNames = TRUE, rowNames = FALSE,
  detectDates = FALSE, skipEmptyRows = TRUE, skipEmptyCols = TRUE,
  rows = NULL, cols = NULL, check.names = FALSE, namedRegion = NULL,
  na.strings = "NA", fillMergedCells = FALSE)

Parameters: same as read.xlsx

Difference from read.xlsx():

In everyday use there should be no difference (I am not sure about anything beyond that).

Looking at the source code shows that read.xlsx() is a generic function, while readWorkbook() internally does nothing but call read.xlsx().
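To check the "no difference in use" impression, both functions can be run on the same file and the results compared. This is only a sketch on a made-up two-column data frame, not a proof that they agree for every input.

```r
library(openxlsx)

# Hypothetical sample file
path <- tempfile(fileext = ".xlsx")
write.xlsx(data.frame(x = c(1.5, 2.5), y = c("a", "b")), path)

a <- read.xlsx(path)
b <- readWorkbook(path)

# Compare the two data frames
same <- identical(a, b)
print(same)
```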

2. Writing data to an xlsx file

(1) Option 1: write.xlsx()

Write a data.frame or list of data.frames to an xlsx file

Usage:

write.xlsx(x, file, asTable = FALSE, ...)

Parameters:

x: object or a list of objects that can be handled by writeData to write to file

file: xlsx file name

asTable: write using writeDataTable as opposed to writeData

...: optional parameters to pass to functions

Commonly used optional parameters:

firstRow = TRUE: freeze the first row

colWidths = "auto": set column widths automatically

If you need more of the optional parameters to control styling, write the data with Option 2 instead.
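A minimal sketch of Option 1 with the two optional parameters mentioned above. The built-in iris and mtcars data sets and the temporary output files are just stand-ins for real data.

```r
library(openxlsx)

# One data frame, frozen header row, automatic column widths
out <- tempfile(fileext = ".xlsx")
write.xlsx(head(iris), out, firstRow = TRUE, colWidths = "auto")

# A named list of data frames: one sheet per element, list names become sheet names
out2 <- tempfile(fileext = ".xlsx")
write.xlsx(list(iris = head(iris), cars = head(mtcars)), out2, asTable = TRUE)

sheets <- getSheetNames(out2)
print(sheets)
```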

(2) Option 2: createWorkbook() + addWorksheet() + writeData() or writeDataTable() + saveWorkbook()

Create a Workbook, add a sheet, write data into the sheet, then save the workbook.

(2.1) Create a new workbook: createWorkbook(creator = Sys.getenv("USERNAME"))

Parameter: creator: Creator of the workbook (your name). Defaults to login username

Return value: a Workbook object

(2.2) Add a new sheet to the workbook:

addWorksheet(wb, sheetName, gridLines = TRUE, ...)

(2.3) Write data into the sheet

writeData(wb, sheet, x, startCol = 1, startRow = 1, xy = NULL, colNames = TRUE, rowNames = FALSE, headerStyle = NULL,...)

writeDataTable(wb, sheet, x, startCol = 1, startRow = 1, xy = NULL, colNames = TRUE, rowNames = FALSE, headerStyle = NULL,...)

Write to a worksheet and format as an Excel table.

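(2.4) Set styles

createStyle(): sets borders, font, font size, background colour, and so on. Note that if the rows or columns you want to style are not contiguous, you need a loop to apply the style to each of them in turn.

addStyle(): applies a style to cells on a sheet

freezePane(): freezes panes

setColWidths(): sets column widths

(2.5) Save the workbook

saveWorkbook(wb, file, overwrite = FALSE)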
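Putting the steps of Option 2 together, the workflow might look like the sketch below. The sheet names, the creator string, and the use of the built-in mtcars and iris data sets are assumptions made for illustration.

```r
library(openxlsx)

# 1. Create the workbook (creator name is a placeholder)
wb <- createWorkbook(creator = "me")

# 2. Add sheets
addWorksheet(wb, "data", gridLines = TRUE)
addWorksheet(wb, "table")

# 3. Write a plain range to one sheet and a formatted Excel table to the other
writeData(wb, sheet = "data", x = head(mtcars), startCol = 1, startRow = 1,
          rowNames = TRUE)
writeDataTable(wb, sheet = "table", x = head(iris))

# 4. Save; overwrite = TRUE replaces an existing file of the same name
out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out, overwrite = TRUE)
```

Compared with write.xlsx(), this takes more lines, but the Workbook object stays available for styling before it is saved.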

3. Setting styles globally with options()

options("openxlsx.borderColour" = "black"): table border colour

options("openxlsx.borderStyle" = "thin"): table border style

options("openxlsx.dateFormat" = "mm/dd/yyyy"): date format

options("openxlsx.datetimeFormat" = "yyyy-mm-dd hh:mm:ss"): date-time format

options("openxlsx.numFmt" = NULL): default number format for numeric columns

options("openxlsx.paperSize" = 9) ## A4

options("openxlsx.orientation" = "portrait") ## page orientation
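Because these options are global, it can be useful to set them temporarily and restore the previous values afterwards. A minimal sketch, assuming a made-up data frame with a Date column and a temporary output file:

```r
library(openxlsx)

# options() returns the previous values, so they can be restored later
old <- options("openxlsx.borderStyle" = "thin",
               "openxlsx.dateFormat" = "yyyy-mm-dd")

# Hypothetical data with a Date column
dat <- data.frame(day = as.Date("2017-09-21") + 0:2, value = 1:3)

out <- tempfile(fileext = ".xlsx")
write.xlsx(dat, out, borders = "surrounding")

# Restore the previous settings
options(old)
```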

4. Additional style settings

(1) Create a style object with createStyle()

createStyle(fontName = NULL, fontSize = NULL, fontColour = NULL,
  numFmt = "GENERAL", border = NULL,
  borderColour = getOption("openxlsx.borderColour", "black"),
  borderStyle = getOption("openxlsx.borderStyle", "thin"), bgFill = NULL,
  fgFill = NULL, halign = NULL, valign = NULL, textDecoration = NULL,
  wrapText = FALSE, textRotation = NULL, indent = NULL)

(2) Apply the style object to specific cells: addStyle(wb, sheet, style, rows, cols, gridExpand = FALSE, stack = FALSE)

(3) openXL(wb) ## opens a temp version, handy for previewing the workbook once the styles are set

(4) Freeze panes: freezePane(wb, sheet, firstActiveRow = NULL, firstActiveCol = NULL, firstRow = FALSE, firstCol = FALSE)

(5) Set column widths: setColWidths(wb, sheet, cols, widths = 8.43, hidden = rep(FALSE, length(cols)), ignoreMergedCells = FALSE)

For automatic column widths, set widths = "auto".
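The styling functions in this section can be combined roughly as follows. This is only a sketch: the colours, the sheet name, and the choice of rows to highlight are arbitrary, and mtcars stands in for real data.

```r
library(openxlsx)

wb <- createWorkbook()
addWorksheet(wb, "styled")
writeData(wb, "styled", head(mtcars))
all_cols <- seq_len(ncol(mtcars))

# Header style: bold white text on a blue fill with a bottom border
header <- createStyle(fontSize = 12, fontColour = "#FFFFFF", fgFill = "#4F81BD",
                      halign = "center", textDecoration = "bold",
                      border = "bottom")
addStyle(wb, "styled", header, rows = 1, cols = all_cols, gridExpand = TRUE)

# Non-contiguous rows, styled one at a time in a loop
stripe <- createStyle(fgFill = "#DCE6F1")
for (r in c(3, 5, 7)) {
  addStyle(wb, "styled", stripe, rows = r, cols = all_cols,
           gridExpand = TRUE, stack = TRUE)
}

freezePane(wb, "styled", firstRow = TRUE)
setColWidths(wb, "styled", cols = all_cols, widths = "auto")

# Preview in an interactive session, then save
if (interactive()) openXL(wb)
out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out, overwrite = TRUE)
```

Here gridExpand = TRUE styles every combination of the given rows and cols, and stack = TRUE adds the new style on top of any style already on the cell instead of replacing it.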

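Unresolved problem:

With larger data sets, calling write.xlsx() or another function that writes Excel files on a data frame succeeds the first time, but after closing and reopening the project the following error often appears: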

Error in is.nan(tmp) : default method not implemented for type 'list'
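
One thing that might be worth checking, offered as a guess rather than a confirmed diagnosis: base R's is.nan() raises exactly this message when it is given a list, so a list-type column in the data frame is one possible cause. The sketch below uses a made-up data frame to show how such columns could be found and flattened before writing.

```r
# Hypothetical data frame with one list column
df <- data.frame(id = 1:2)
df$tags <- list(c("a", "b"), "c")

# Find list-type columns
is_list_col <- vapply(df, is.list, logical(1))
list_cols <- names(df)[is_list_col]
print(list_cols)

# Flatten each list element into a single string
df$tags <- vapply(df$tags, paste, character(1), collapse = ";")
str(df)
```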