MapReduce and Mobile Algorithms
2011-10-28 10:41
MapReduce Algorithms:
Introductory slides:
http://code.google.com/edu/submissions/mapreduce-minilecture/lec2-mapred.ppt
Talk videos:
http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html
Other tutorials:
http://www.cloudera.com/wp-content/uploads/2010/01/5-MapReduceAlgorithms.pdf
http://www.cloudera.com/videos/mapreduce_algorithms
http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/session3-slides.pdf
UMD Class from Spring 2010:
http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/syllabus.html
Accompanying text:
http://www.umiacs.umd.edu/~jimmylin/book.html
http://www.umiacs.umd.edu/~jimmylin/MapReduce-book-final.pdf
Papers to read/discuss in class:
Models
MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat
OSDI 2004
http://labs.google.com/papers/mapreduce.html
Communications of the ACM, 2010
http://cacm.acm.org/magazines/2010/1/55744-mapreduce-a-flexible-data-processing-tool/fulltext
On the Complexity of Processing Massive, Unordered, Distributed Data
J. Feldman, S. Muthukrishnan, A. Sidiropoulos, C. Stein, and Z. Svitkina
SODA 2008
http://arxiv.org/abs/cs/0611108
A Model of Computation for MapReduce
H. Karloff, S. Suri, and S. Vassilvitskii
SODA 2010
http://www.sidsuri.com/About_Me_files/mrc2.pdf
Sorting, Searching, and Simulation in the MapReduce Framework
Michael T. Goodrich, Nodari Sitchinava, Qin Zhang
Under submission, 2011
http://arxiv.org/abs/1101.1902
Systems on top of MR:
Interpreting the Data: Parallel Analysis with Sawzall
Scientific Programming Journal 2005
Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan
http://research.google.com/archive/sawzall.html
Pig Latin: a not-so-foreign language for data processing
SIGMOD 2008
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins
https://portal.acm.org//citation.cfm?id=1376616.1376726&coll=DL&dl=GUIDE&CFID=5697894&CFTOKEN=14842407
Hive: a warehousing solution over a map-reduce framework
VLDB 2009
Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, Raghotham Murthy
http://portal.acm.org/ft_gateway.cfm?id=1687609&type=pdf&coll=DL&dl=GUIDE&CFID=5697894&CFTOKEN=14842407
Hive - A Petabyte Scale Data Warehouse Using Hadoop
ICDE 2010
http://infolab.stanford.edu/~ragho/hive-icde2010.pdf
Alternatives/Extensions
Dryad: Distributed data-parallel programs from sequential building blocks
EuroSys 2007
Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly
http://research.microsoft.com/pubs/63785/eurosys07.pdf
MapReduce Online
Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy, and Russell Sears
NSDI 2010, SIGMOD 2010
neilconway.org/docs/sigmod2010_hop_demo.pdf
http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-136.pdf
Algorithms
Simulating Parallel Algorithms in the MapReduce Framework with Applications to Parallel Computational Geometry
Michael T. Goodrich
Massive 2010
http://arxiv.org/abs/1004.4708
A New Computation Model for Cluster Computing
Foto Afrati, Jeff Ullman
http://infolab.stanford.edu/%7Eullman/pub/mapred-model-report.pdf
Max-cover in map-reduce
Flavio Chierichetti, Ravi Kumar, Andrew Tomkins
WWW 2010
http://portal.acm.org/citation.cfm?id=1772715
Graphs and Matrices
Graph Twiddling in a MapReduce World
Jonathan Cohen
Computing in Science and Engineering, vol. 11, no. 4, pp. 29-41, July/August 2009
http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.120
Parallelizing Random Walk with Restart for large-scale query recommendation
Meng-Fen Chiang, Tsung-Wei Wang, Wen-Chih Peng
2010 Workshop on Massive Data Analytics on the Cloud
http://portal.acm.org/citation.cfm?id=1779599.1779607
Distributed non-negative matrix factorization for dyadic data analysis on MapReduce
Chao Liu, Hung-chih Yang, Jinliang Fan, Li-Wei He, Yi-Min Wang
WWW 2010
http://research.microsoft.com/pubs/119077/DNMF.pdf
DOULION: Counting Triangles in Massive Graphs with a Coin
Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos
KDD 2009
Fast counting of triangles in real-world networks: proofs, algorithms and observations
http://reports-archive.adm.cs.cmu.edu/anon/ml2008/CMU-ML-08-103.pdf
PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations
U Kang, Charalampos E. Tsourakakis, Christos Faloutsos
IEEE International Conference on Data Mining (ICDM 2009)
http://www.cs.cmu.edu/~ctsourak/pegasusICDM09.pdf
Database
Optimizing Joins in a Map-Reduce Environment
Foto Afrati, Jeff Ullman
http://infolab.stanford.edu/%7Eullman/pub/join-mr.pdf
Efficient parallel set-similarity joins using MapReduce
Rares Vernica, Michael J. Carey, Chen Li
SIGMOD 2010
http://portal.acm.org/citation.cfm?id=1807222
Applications
Scaling Up Classifiers to Cloud Computers
Christopher Moretti, Karsten Steinhaeuser, Douglas Thain, Nitesh V. Chawla
ICDM 08
http://www.cse.nd.edu/~dthain/papers/classify-icdm08.pdf
Large-Scale Behavioral Targeting
KDD 09
Ye Chen, Dmitriy Pavlov, John Canny
http://www.cc.gatech.edu/~zha/CSE8801/ad/p209-chen.pdf
MrsRF: an efficient MapReduce algorithm for analyzing large collections of evolutionary trees
BMC Bioinformatics 2010
Suzanne J Matthews and Tiffani L Williams
http://www.biomedcentral.com/1471-2105/11/S1/S15
A novel approach to multiple sequence alignment using Hadoop data grids
2010 Workshop on Massive Data Analytics on the Cloud
G. Sudha Sadasivam, G. Baktavatchalam
http://portal.acm.org/citation.cfm?id=1779599.1779601
Experiences on Processing Spatial Data with MapReduce
Ariel Cary, Zhengguo Sun, Vagelis Hristidis, Naphtali Rishe
21st International Conference on Scientific and Statistical Database Management
http://users.cis.fiu.edu/~vagelis/publications/Spatial-MapReduce-SSDBM2009.pdf
Web-Scale Distributional Similarity and Entity Set Expansion
Patrick Pantel, Eric Crestan, Arkady Borkovsky, Ana-Maria Popescu, Vishnu Vyas
2009 Conference on Empirical Methods in Natural Language Processing
http://www.aclweb.org/anthology/D/D09/D09-1098.pdf
Other lists of papers:
http://atbrox.com/2010/05/08/mapreduce-hadoop-algorithms-in-academic-papers-may-2010-update/
http://www.columbia.edu/~ak2834/mapreduce.html
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Schedule (Mobile Apps)
Jan 20: First meeting and planning projects.