R Programming Assignment 1
2016-06-11 12:55
Part 1
Write a function named ‘pollutantmean’ that calculates the mean of a pollutant (sulfate or nitrate) across a specified list of monitors. The function ‘pollutantmean’ takes three arguments: ‘directory’, ‘pollutant’, and ‘id’. Given a vector of monitor ID numbers, ‘pollutantmean’ reads those monitors’ particulate matter data from the directory specified in the ‘directory’ argument and returns the mean of the pollutant across all of the monitors, ignoring any missing values coded as NA. A prototype of the function is as follows:

    pollutantmean <- function(directory, pollutant, id = 1:332) {
        ## 'directory' is a character vector of length 1 indicating
        ## the location of the CSV files

        ## 'pollutant' is a character vector of length 1 indicating
        ## the name of the pollutant for which we will calculate the
        ## mean; either "sulfate" or "nitrate".

        ## 'id' is an integer vector indicating the monitor ID numbers
        ## to be used

        ## Return the mean of the pollutant across all monitors listed
        ## in the 'id' vector (ignoring NA values)
        ## NOTE: Do not round the result!
    }
You can see some example output from this function. The function that you write should be able to match this output. Please save your code to a file named pollutantmean.R.
pollutantmean.R:
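    pollutantmean <- function(directory, pollutant, id = 1:332) {
        ## Full paths of all files in 'directory', in sorted order. Indexing by
        ## monitor ID assumes the directory holds only the monitor files and
        ## that their sorted order matches the IDs.
        files_full <- list.files(directory, full.names = TRUE)
        dat <- data.frame()
        ## Stack the selected monitors' data into one data frame
        for (i in id) {
            dat <- rbind(dat, read.csv(files_full[i]))
        }
        ## Mean of the requested column, ignoring NA values
        mean(dat[, pollutant], na.rm = TRUE)
    }

Example output:

    source("pollutantmean.R")
    pollutantmean("specdata", "sulfate", 1:10)
    ## [1] 4.064
    pollutantmean("specdata", "nitrate", 70:72)
    ## [1] 1.706
    pollutantmean("specdata", "nitrate", 23)
    ## [1] 1.281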
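The version above picks files by their position in the directory listing. As a rough alternative sketch, the path can be built from the monitor ID itself. This assumes the files are named "001.csv", "002.csv", and so on, which is an assumption about the data layout and not something stated above. The name `pollutantmean2` and the tiny synthetic data set are made up for illustration only.

```r
# Sketch only: assumes monitor files are named "001.csv", "002.csv", ...
# (an assumption about the data layout, not stated in the assignment text).
pollutantmean2 <- function(directory, pollutant, id = 1:332) {
  # Build each path from the ID itself instead of a position in list.files()
  paths <- file.path(directory, sprintf("%03d.csv", id))
  # Read only the requested column from every file and pool the values
  values <- unlist(lapply(paths, function(p) read.csv(p)[[pollutant]]))
  mean(values, na.rm = TRUE)
}

# Tiny synthetic data set (invented values) so the sketch runs without 'specdata'
demo_dir <- file.path(tempdir(), "demo_specdata")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("2003-01-01", "2003-01-02"),
                     sulfate = c(1, NA), nitrate = c(2, 4), ID = 1),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(Date = c("2003-01-01", "2003-01-02"),
                     sulfate = c(3, 5), nitrate = c(NA, 6), ID = 2),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

pollutantmean2(demo_dir, "sulfate", 1:2)  # mean of 1, 3 and 5
pollutantmean2(demo_dir, "nitrate", 2)    # only one observed value, 6
```

Pooling the values in a list and calling `unlist` once also avoids growing a data frame with `rbind` on every iteration, which can get slow when many monitors are requested.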
Part 2
Write a function that reads a directory full of files and reports the number of completely observed cases in each data file. The function should return a data frame where the first column is the name of the file and the second column is the number of complete cases. A prototype of this function follows:

    complete <- function(directory, id = 1:332) {
        ## 'directory' is a character vector of length 1 indicating
        ## the location of the CSV files

        ## 'id' is an integer vector indicating the monitor ID numbers
        ## to be used

        ## Return a data frame of the form:
        ## id nobs
        ## 1  117
        ## 2  1041
        ## ...
        ## where 'id' is the monitor ID number and 'nobs' is the
        ## number of complete cases
    }
You can see some example output from this function. The function that you write should be able to match this output. Please save your code to a file named complete.R. To run the submit script for this part, make sure your working directory has the file complete.R in it.
complete.R:
    complete <- function(directory, id = 1:332) {
        files_full <- list.files(directory, full.names = TRUE)
        dat <- data.frame()
        for (i in id) {
            moni_i <- read.csv(files_full[i])
            ## Count the rows that have no NA in any column
            nobs <- sum(complete.cases(moni_i))
            ## Append one (id, nobs) row per monitor
            tmp <- data.frame(i, nobs)
            dat <- rbind(dat, tmp)
        }
        colnames(dat) <- c("id", "nobs")
        dat
    }
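To make the counting step concrete, here is a small sketch of what `complete.cases` returns and how `sum` turns it into a count. The data frame `monitor` and the list `monitors` are invented for illustration and only mimic the shape of a monitor file.

```r
# A small data frame shaped like one monitor file (values invented)
monitor <- data.frame(
  Date    = c("2003-01-01", "2003-01-02", "2003-01-03", "2003-01-04"),
  sulfate = c(2.1, NA, 3.4, NA),
  nitrate = c(0.5, 0.7, NA, NA),
  ID      = 1
)

# TRUE only for rows with no NA in any column
ok <- complete.cases(monitor)
ok        # TRUE FALSE FALSE FALSE

# Logical TRUE counts as 1, so sum() gives the number of complete rows
sum(ok)   # 1

# The same count for several monitors at once, without growing a data frame
monitors <- list(monitor, monitor[c(1, 1, 2), ])
nobs <- vapply(monitors, function(m) sum(complete.cases(m)), integer(1))
result <- data.frame(id = seq_along(monitors), nobs = nobs)
result
```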
Part 3
Write a function that takes a directory of data files and a threshold for complete cases and calculates the correlation between sulfate and nitrate for monitor locations where the number of completely observed cases (on all variables) is greater than the threshold. The function should return a vector of correlations for the monitors that meet the threshold requirement. If no monitors meet the threshold requirement, then the function should return a numeric vector of length 0. A prototype of this function follows:

    corr <- function(directory, threshold = 0) {
        ## 'directory' is a character vector of length 1 indicating
        ## the location of the CSV files

        ## 'threshold' is a numeric vector of length 1 indicating the
        ## number of completely observed observations (on all
        ## variables) required to compute the correlation between
        ## nitrate and sulfate; the default is 0

        ## Return a numeric vector of correlations
        ## NOTE: Do not round the result!
    }
For this function you will need to use the ‘cor’ function in R which calculates the correlation between two vectors. Please read the help page for this function via ‘?cor’ and make sure that you know how to use it.
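As a quick illustration of how `cor` treats missing values, here is a minimal sketch with two invented vectors (the numbers are not from the assignment data):

```r
sulfate <- c(1, 2, 3, 4, NA)
nitrate <- c(2, 4, 6, NA, 10)

# With missing values and the default use = "everything", the result is NA
r_default <- cor(sulfate, nitrate)
r_default

# Keep only the pairs where both values are observed: (1,2), (2,4), (3,6).
# These lie on a straight line, so the correlation is 1 (up to rounding error).
r_complete <- cor(sulfate, nitrate, use = "complete.obs")
r_complete
```

Dropping the incomplete rows by hand before calling `cor` gives the same result as `use = "complete.obs"`.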
You can see some example output from this function. The function that you write should be able to match this output. Please save your code to a file named corr.R. To run the submit script for this part, make sure your working directory has the file corr.R in it.
corr.R:
    corr <- function(directory, threshold = 0) {
        files_full <- list.files(directory, full.names = TRUE)
        dat <- vector(mode = "numeric", length = 0)
        ## seq_along() also behaves correctly when the directory is empty
        for (i in seq_along(files_full)) {
            moni_i <- read.csv(files_full[i])
            ## Rows observed on all variables, as the assignment specifies
            ok <- complete.cases(moni_i)
            if (sum(ok) > threshold) {
                submoni_i <- moni_i[ok, ]
                dat <- c(dat, cor(submoni_i$sulfate, submoni_i$nitrate))
            }
        }
        dat
    }
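One detail worth checking in a loop over file indices is what happens when the directory is empty. The following minimal sketch compares the two common ways of building the index. The empty vector stands in for the result of `list.files` on an empty directory, and `visited` is just a counter added for the demonstration.

```r
files_full <- character(0)   # what list.files() returns for an empty directory

idx_colon <- 1:length(files_full)
idx_colon                    # 1 0 : a loop over this would run twice

idx_along <- seq_along(files_full)
idx_along                    # integer(0) : a loop over this does not run

visited <- 0
for (i in seq_along(files_full)) visited <- visited + 1
visited                      # 0
```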
http://xmuxiaomo.github.io/2015/06/10/R-Programming-Assignment-1/