Tagged Content List
  • Wiki Page: The Hadoop on Azure Pegasus Page Rank Sample

    Overview: This tutorial shows how to deploy Pegasus from the Hadoop on Azure portal to compute the page rank for a simple 16-node graph. The rank calculated for a node is a measure of how well connected it is to the other nodes in the graph structure. A graph is a type of abstract mathematical structure...
  • Wiki Page: Analyzing Twitter Data with Hive in HDInsight and StreamInsight

    In this tutorial you will query, explore, and analyze data from Twitter using Apache™ Hadoop™-based Services for Windows Azure and a Hive query in Excel. Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data...
  • Wiki Page: Simple recommendation engine using Apache Mahout

    Apache Mahout™ is a machine learning library built for use in scalable machine learning applications. Recommender engines are some of the most immediately recognizable machine learning applications in use today. In this tutorial you use the Million Song Dataset to create song recommendations for users...
  • Wiki Page: Working With Data in Windows Azure HDInsight Service

    This tutorial covers several techniques for storing and importing data for use in Hadoop MapReduce jobs run with Windows Azure HDInsight Service (formerly Apache™ Hadoop™-based Services for Windows Azure). Apache Hadoop is a software framework that supports data-intensive distributed applications...
  • Wiki Page: Introduction to HDInsight Services for Windows Azure

    Overview: HDInsight Services for Windows Azure is a service that deploys and provisions Apache™ Hadoop™ clusters in the cloud, providing a software framework designed to manage, analyze and report on big data. Data is described as "big data" to indicate that it is being collected in ever...
  • Wiki Page: Analyzing Twitter Movie Data with Hive in HDInsight

    In this tutorial you will query, explore, and analyze data from Twitter using Apache™ Hadoop™-based Services for Windows Azure and a Hive query in Excel. Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data...
  • Wiki Page: The Hadoop on Azure Pi Estimator Sample Tutorial

    Overview: This tutorial shows how to deploy a MapReduce program that uses a statistical (quasi-Monte Carlo) method to estimate the value of Pi. Points placed at random inside a unit square also fall within a circle inscribed within that square with a probability equal to the area of the circle...
  • Wiki Page: Hadoop on Azure 10 GB GraySort Sample Tutorial

    Overview: This tutorial shows how to run a general-purpose GraySort on a 10 GB file using Hadoop on Azure. A GraySort is a benchmark sort whose metric is the sort rate (TB/minute) that is achieved while sorting a very large amount of data, usually a 100 TB minimum. This sample uses a more modest 10...