
Airbnb uses R to scale data science

2016-04-06 11:54
(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Airbnb, the property-rental marketplace that helps you find a place to stay when you're travelling, uses R to scale data science. Airbnb is a famously data-driven company, and has recently gone through a period of rapid growth. To accommodate the influx of data scientists (80% of whom are proficient in R, and 64% of whom use R as their primary data analysis language), Airbnb organizes monthly week-long data bootcamps for new hires and current team members.

But just as important as the training program is the engineering process Airbnb uses to scale data science with R. Rather than just have data scientists write R functions independently (which not only is likely to duplicate work, but also inhibits transparency and slows down productivity), Airbnb has invested in building an internal R package called Rbnb that implements collaborative solutions to common problems, standardizes visual presentations, and avoids reinventing the wheel. (Incidentally, the development and use of internal R packages is a common pattern I've seen at many companies with large data science teams.)

The Rbnb package used at Airbnb includes more than 60 functions and is still growing under the guidance of several active developers. It's actively used by Airbnb's engineering, data science, analytics and user experience teams to do things like move aggregated or filtered data from a Hadoop or SQL environment into R, impute missing values, compute year-over-year trends, and perform common data aggregations. It has been used to create more than 500 research reports and to solve problems like automating the detection of host preferences and using guest ratings to predict rebooking rates.
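
Rbnb is internal to Airbnb and its functions are not public, so the following is only a minimal sketch of two of the operations mentioned above: imputing missing values and computing year-over-year change. It is written in Python rather than R, uses only the standard library, and the function names `impute_mean` and `year_over_year` and all of the numbers are hypothetical, chosen purely for illustration.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values.

    Hypothetical helper; mean imputation is one simple strategy among many.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def year_over_year(totals_by_year):
    """Return {year: fractional change relative to the previous year}.

    Years without a usable (present, non-zero) previous-year total are skipped.
    """
    changes = {}
    for year in sorted(totals_by_year):
        previous = totals_by_year.get(year - 1)
        if previous:  # skips a missing or zero baseline
            changes[year] = (totals_by_year[year] - previous) / previous
    return changes


# Made-up numbers, for illustration only.
filled = impute_mean([120.0, None, 180.0])
growth = year_over_year({2014: 100.0, 2015: 125.0})
```

Packaging small helpers like these once, rather than having each analyst rewrite them, is the kind of duplication an internal package is meant to avoid.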

The package is also widely used to visualize data using a standard Airbnb "look". The package includes custom themes, scales, and geoms for ggplot2; CSS templates for htmlwidgets and Shiny; and custom R Markdown templates for different types of reports. You can see several examples in the blog post by Ricardo Bion linked below, including a gorgeous visualization of the 500,000 top Airbnb trips.

Medium (AirbnbEng): Using R packages and education to scale Data Science at Airbnb