您的位置:首页 > 编程语言 > Go语言

Mapreduce and Mobile Algorithms.

2011-10-28 10:41 155 查看
MapReduce Algorithms:

Introductory
slides:

http://code.google.com/edu/submissions/mapreduce-minilecture/lec2-mapred.ppt

Talk videos:

http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html

Other tutorials:

http://www.cloudera.com/wp-content/uploads/2010/01/5-MapReduceAlgorithms.pdf

http://www.cloudera.com/videos/mapreduce_algorithms

http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/session3-slides.pdf

UMD Class
from Spring 2010:

http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/syllabus.html

Accompanying
text:

http://www.umiacs.umd.edu/~jimmylin/book.html

http://www.umiacs.umd.edu/~jimmylin/MapReduce-book-final.pdf



Papers to read/discuss in class:

Models

MapReduce: Simplified
Data Processing on Large Clusters

Jeffrey Dean
and Sanjay Ghemawat

OSDI 2004

http://labs.google.com/papers/mapreduce.html

Communications
of the ACM, 2010

http://cacm.acm.org/magazines/2010/1/55744-mapreduce-a-flexible-data-processing-tool/fulltext

On the Complexity
of Processing Massive, Unordered, Distributed Data.

J. Feldman,
S. Muthukrishnan, A. Sidiropoulos, C. Stein and Z. Svitkina,

SODA 2008

http://arxiv.org/abs/cs/0611108

A Model of
Computation for MapReduce

H. Karloff,
S. Suri, and S. Vassilvitskii

SODA 2010

http://www.sidsuri.com/About_Me_files/mrc2.pdf

Sorting, Searching,
and Simulation in the MapReduce Framework

Michael T.
Goodrich, Nodari Sitchinava, Qin Zhang

Under submission,
2011

http://arxiv.org/abs/1101.1902

Systems on top of MR:

Interpreting the
Data: Parallel Analysis with Sawzall

Scientific
Programming Journal 2005

Rob Pike,
Sean Dorward, Robert Griesemer, Sean Quinlan

http://research.google.com/archive/sawzall.html

Pig latin:
a not-so-foreign language for data processing

SIGMOD 2008

Christopher
Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins

https://portal.acm.org//citation.cfm?id=1376616.1376726&coll=DL&dl=GUIDE&CFID=5697894&CFTOKEN=14842407

Hive: a warehousing
solution over a map-reduce framework

VLDB 2009

Ashish Thusoo,
Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, Raghotham Murthy

http://portal.acm.org/ft_gateway.cfm?id=1687609&type=pdf&coll=DL&dl=GUIDE&CFID=5697894&CFTOKEN=14842407

Hive - A Petabyte
Scale Data Warehouse Using Hadoop

ICDE 2010

http://infolab.stanford.edu/~ragho/hive-icde2010.pdf

Alternatives/Extensions

Dryad: Distributed
data-parallel programs from sequential building blocks.

Eurosys 2007

Michael Isard,
Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly

http://research.microsoft.com/pubs/63785/eurosys07.pdf

MapReduce
Online

Tyson Condie,
Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy and Russell Sears

NSDI 2010,
SIGMOD 2010

neilconway.org/docs/sigmod2010_hop_demo.pdf

http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-136.pdf

Algorithms

Simulating Parallel
Algorithms in the MapReduce Framework with Applications to Parallel Computational Geometry

Michael T.
Goodrich

Massive 2010

http://arxiv.org/abs/1004.4708

A New Computation
Model for Cluster Computing

Foto Afrati,
Jeff Ullman

http://infolab.stanford.edu/%7Eullman/pub/mapred-model-report.pdf

Max-cover
in map-reduce

Flavio Chierichetti,
Ravi Kumar, Andrew Tomkins

WWW 2010

http://portal.acm.org/citation.cfm?id=1772715

Graphs and Matrices

Graph Twiddling in
a MapReduce World

Jonathan Cohen

Computing
in Science and Engineering, vol. 11, no. 4, pp. 29-41, July/August, 2009.

http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.120

Parallelizing
Random Walk with Restart for large-scale query recommendation

Meng-Fen Chiang,
Tsung-Wei Wang, Wen-Chih Peng

2010 Workshop
on Massive Data Analytics on the Cloud
http://portal.acm.org/citation.cfm?id=1779599.1779607

Distributed non-negative matrix factorization for dyadic data analysis on mapreduce

Chao Liu, Hung-chih
Yang, Jinliang Fan, Li-Wei He, Yi-Min Wang WWW 2010

http://research.microsoft.com/pubs/119077/DNMF.pdf

DOULION: Counting
Triangles in Massive Graphs with a Coin

Charalampos
E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos

KDD 2009

Fast counting
of triangles in real-world networks: proofs, algorithms and observations

http://reports-archive.adm.cs.cmu.edu/anon/ml2008/CMU-ML-08-103.pdf

PEGASUS: A
Peta-Scale Graph Mining System - Implementation and Observations

U Kang, Charalampos
E. Tsourakakis, Christos Faloutsos

IEEE International
Conference on Data Mining (ICDM 2009)

http://www.cs.cmu.edu/~ctsourak/pegasusICDM09.pdf

Database

Optimizing Joins in a Map-Reduce Environment

Foto Afrati,
Jeff Ullman

http://infolab.stanford.edu/%7Eullman/pub/join-mr.pdf

Efficient
parallel set-similarity joins using MapReduce

Rares Vernica,
Michael J. Carey, Chen Li

SIGMOD 2010

http://portal.acm.org/citation.cfm?id=1807222

Applications

Scaling Up Classifiers to Cloud Computers

Christopher
Moretti, Karsten Steinhaeuser, Douglas Thain, Nitesh V. Chawla

ICDM 08

http://www.cse.nd.edu/~dthain/papers/classify-icdm08.pdf

Large-Scale
Behavioral Targeting

KDD 09

Ye Chen, Dmitriy
Pavlov, John Canny

http://www.cc.gatech.edu/~zha/CSE8801/ad/p209-chen.pdf

MrsRF: an
efficient MapReduce algorithm for analyzing large collections of evolutionary trees

BMC Bioinformatics
2010

Suzanne J
Matthews email and Tiffani L Williams email

http://www.biomedcentral.com/1471-2105/11/S1/S15

A novel approach
to multiple sequence alignment using hadoop data grids

2010
Workshop on Massive Data Analytics on the Cloud

G. Sudha Sadasivam,
G. Baktavatchalam

http://portal.acm.org/citation.cfm?id=1779599.1779601

Experiences
on Processing Spatial Data with MapReduce

Ariel Cary,
Zhengguo Sun, Vagelis Hristidis, Naphtali Rishe

21st International
Conference on Scientific and Statistical Database Management

http://users.cis.fiu.edu/~vagelis/publications/Spatial-MapReduce-SSDBM2009.pdf

Web-Scale
Distributional Similarity and Entity Set Expansion

Patrick Pantel,
Eric Crestan, Arkady Borkovsky, Ana-Maria Popescu, Vishnu Vyas

2009 Conference
on Empirical Methods in Natural Language Processing

http://www.aclweb.org/anthology/D/D09/D09-1098.pdf

Other lists of papers:

http://atbrox.com/2010/05/08/mapreduce-hadoop-algorithms-in-academic-papers-may-2010-update/

http://www.columbia.edu/~ak2834/mapreduce.html

-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Schedule (Mobile Apps)

Jan 20: First meeting and planning projects.
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: