Hotspots Of Topics From Time Stamped Document

Delivery time: Available within 14 days

54,90 

The MapReduce Way

ISBN: 3659479101
ISBN 13: 9783659479106
Author: Ashokan, Ashwathy/Chundi, Parvathi
Publisher: LAP LAMBERT Academic Publishing
Extent: 104 pp.
Publication date: 29.11.2013
Edition: 1/2013
Format: 0.7 x 22 x 15
Weight: 173 g
Product form: Paperback
Binding: KT
Item number: 5837502

Description

Hotspots of a word or topic are time periods with a burst of activity in a time-stamped document set. Identifying and analyzing hotspots of topics has been an important area of research. Finding them requires processing the contents of documents, which is often time-consuming. In this thesis, we explore MapReduce-style algorithms for computing hotspots of topics. MapReduce is a distributed parallel programming model, with an associated implementation, for processing and analyzing large datasets. The user specifies a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks are expressible in this model, and this thesis explores the feasibility of implementing the hotspot algorithm using MapReduce. We design map and reduce functions for document preprocessing and for the hotspot computation. We implement the functions in Hadoop (a MapReduce framework from the Apache Software Foundation) and conduct several experiments to assess the benefits of a MapReduce-style implementation over a simple sequential implementation.
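
As a rough illustration of the map/reduce split described above, the following minimal Python sketch counts occurrences of a topic per time period and flags busy periods. It is not the authors' algorithm or their Hadoop code: the function names (`map_doc`, `reduce_counts`, `run_mapreduce`, `hotspots`), the sample documents, and the fixed-threshold burst criterion are all assumptions made for this example, and the shuffle step is simulated in memory rather than run on a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_doc(doc, topic):
    """Map: emit ((topic, period), 1) for each occurrence of topic in one document."""
    period, text = doc
    for word in text.lower().split():
        if word == topic:
            yield (topic, period), 1


def reduce_counts(key, values):
    """Reduce: merge all intermediate values that share the same key."""
    return key, sum(values)


def run_mapreduce(docs, topic):
    # Simulated shuffle: group intermediate values by key, as a framework would.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_doc(d, topic) for d in docs):
        groups[key].append(value)
    return dict(reduce_counts(k, v) for k, v in groups.items())


def hotspots(counts, threshold):
    # Assumed burst criterion: a period is "hot" if its count reaches the threshold.
    return sorted(period for (_, period), n in counts.items() if n >= threshold)


# Hypothetical time-stamped documents: (period, text)
docs = [
    ("2013-01", "hadoop cluster setup"),
    ("2013-02", "hadoop hadoop mapreduce hadoop"),
    ("2013-02", "tuning hadoop jobs"),
    ("2013-03", "unrelated text"),
]
counts = run_mapreduce(docs, "hadoop")
hot = hotspots(counts, threshold=3)
```

Because each document is mapped independently and each key is reduced independently, both phases could in principle be distributed across machines, which is the property the description attributes to the MapReduce model.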

About the author

Ashwathy Ashokan currently works as an Application Developer at Union Pacific Corporation (UPC), one of America's leading transportation companies. She has also interned at Microsoft Corporation and worked for Wipro Technologies, a leading IT firm. Ashwathy holds a master's degree in Computer Science from the University of Nebraska Omaha.
