Browse by Tags

Tagged Content List
  • Wiki Page: Running HDInsight C# Hadoop Streaming Sample

    MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks. Most MapReduce jobs are written in Java. Hadoop provides a streaming API to MapReduce that enables you to write map and reduce functions in languages...
  • Wiki Page: HDInsight Services For Windows

    This article is the main portal for technical information about HDInsight Services for Windows and related Microsoft technologies. It provides a brief overview of Apache Hadoop, as well as information for the HDInsight Services provided by Microsoft for deployment on both Windows and Windows Azure...
  • Wiki Page: The Hadoop on Azure Pegasus Page Rank Sample

    Overview This tutorial shows how to deploy Pegasus from the Hadoop on Azure portal to compute the page rank for a simple 16-node graph. The rank calculated for a node is a measure of how well connected it is to the other nodes in the graph structure. A graph is a type of abstract mathematical structure...
  • Wiki Page: HDInsight Services for Windows Azure QuickStart: Running Hadoop Jobs

    This tutorial shows two ways in which Hadoop MapReduce programs can be run on a Hadoop Distributed File System (HDFS) using HDInsight Services for Windows Azure: use the Create Job UI to run MapReduce programs written in Java, contained in Hadoop jar files; use the Interactive JavaScript Console...
  • Wiki Page: Analyzing Twitter Data with Hive in HDInsight and StreamInsight

    In this tutorial you will query, explore, and analyze data from Twitter using Apache™ Hadoop™-based Services for Windows Azure and a Hive query in Excel. Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data...
  • Wiki Page: Simple recommendation engine using Apache Mahout

    Apache Mahout™ is a machine learning library built for use in scalable machine learning applications. Recommender engines are some of the most immediately recognizable machine learning applications in use today. In this tutorial you use the Million Song Dataset to create song recommendations for users...
  • Wiki Page: Working With Data in Windows Azure HDInsight Service

    This tutorial covers several techniques for storing and importing data for use in Hadoop MapReduce jobs run with Windows Azure HDInsight Service (formerly Apache™ Hadoop™-based Services for Windows Azure). Apache Hadoop is a software framework that supports data-intensive distributed applications...
  • Wiki Page: Hadoop on Azure WordCount Sample Tutorial

    Overview This tutorial shows two ways to use Hadoop on Azure to run a MapReduce program that counts word occurrences in a text. First, with a Hadoop .jar file by using the Create Job UI. Second, with a query by using the fluent API layered on Pig that is provided by the Interactive Console. The...
  • Wiki Page: Analyzing Twitter Movie Data with Hive in HDInsight

    In this tutorial you will query, explore, and analyze data from Twitter using Apache™ Hadoop™-based Services for Windows Azure and a Hive query in Excel. Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data...
  • Wiki Page: Hadoop

    This article gives a brief overview of Apache Hadoop and directs you to Wiki articles that can give you more in-depth information about specific areas of Hadoop. Overview Apache Hadoop is an open...
  • Wiki Page: Windows Azure HDInsight Service FAQ

    Below are some frequently asked questions about Windows Azure HDInsight Service. Q: How to enable the Windows Azure HDInsight Service preview? A: To use Windows Azure HDInsight Service, you need a Windows Azure account that has the Windows Azure HDInsight feature enabled. If you don't...
  • Wiki Page: Write simple Interactive JavaScript statements

    How to write simple Interactive JavaScript statements. Lets HD-IN Interactive JavaScript. The Interactive JavaScript console is a handy tool to execute Pig parsing, upload files, and run C# streaming Map/Reduce jobs. Here are some basic commands to get going with Interactive JavaScript. Samle...
  • Wiki Page: Deployment of Hadoop-based Services on Windows and on Windows Azure

    Apache™ Hadoop™ is an open source framework from Apache. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. It is very useful for analyzing and developing relationships for large unstructured datasets. Data processing in Hadoop is...
  • Wiki Page: Introduction to the Hadoop Services on Azure Hive Console (video)

    The Microsoft deployment of Apache Hadoop for Windows lets you set up a private Hadoop cluster on Azure. One of the included administration/deployment tools is an Interactive Console for JavaScript and Hive. This video introduces the Interactive Hive console. Developer Lengning Liu demonstrates running...
  • Wiki Page: Run the Pi Estimator Sample on Hadoop Services for Windows Azure (video)

    http://youtu.be//w0BpLawwmKI Hadoop-based Services for Windows Azure includes several samples you can use for learning and testing. In this video, Developer Brad Sarsfield walks you through the Pi Estimator sample. See Also More Videos about Hadoop Services on Windows and Windows Azure ...
  • Wiki Page: How to SFTP Data to Hadoop-based services on Windows Azure

    by Brad Sarsfield. In reference to the How to FTP Data to Hadoop-based services on Windows Azure, the follow-up is how to SFTP. Open the SFTP port to your head node. From your https://hadooponazure.com page, in the “Open Ports Tile” open the FTPS port 2226. This opens port...
  • Wiki Page: Hadoop-Based Services on Windows Azure How To Guide

    This content is a work in progress for the benefit of the Hadoop Community. Please feel free to contribute to this wiki page based on your expertise and experience with Hadoop. If you have any questions, please use the groups DL http://tech.groups.yahoo.com/group/hadooponazurectp/ Table of Contents...
  • Wiki Page: Deployment of Hadoop-based Services on the Windows Azure Portal

    This topic describes using the Hadoop on Windows Azure Portal to provision a new Apache Hadoop cluster. Clusters provisioned on the portal are temporary and have an expiration. These clusters are provisioned to run jobs processing input data that may be on the cluster or located elsewhere. For example...
  • Wiki Page: How to FTP Data to Hadoop-Based Services on Windows Azure

    Hadoop-based services for Windows include an FTP server that operates directly on the Hadoop Distributed File System (HDFS). The FTPS protocol is used for secure transfers. FTP communication is wire efficient and especially suited for transferring large data sets. The steps below describe how to use...
  • Wiki Page: How to Create and Run a Job on the Hadoop on Windows Azure Portal

    This article describes how to create Map Reduce jobs on a cluster that has been provisioned on the Hadoop on Windows Azure Portal. For more information on running Map Reduce Jobs for an on-premise or Windows Azure Hadoop cluster, see the getting started guide for your cluster deployment type. ...
  • Wiki Page: Microsoft Hadoop Distribution Documentation Plan

    Introduction This topic describes the documentation currently being planned for the Microsoft Hadoop Distributions. This is a work in progress. Community participation and influence are encouraged to help meet the needs and expectations of the Apache Hadoop community working with Microsoft platforms...
  • Wiki Page: The Hadoop on Azure Sqoop Import Sample Tutorial

    Overview This tutorial shows how to use Sqoop to import data from a SQL database on Windows Azure to a Hadoop on Azure HDFS cluster. While Hadoop is a natural choice for processing unstructured and semi-structured data, such as logs and files, there may also be a need to process structured data...
  • Wiki Page: The Hadoop on Azure Pi Estimator Sample Tutorial

    Overview This tutorial shows how to deploy a MapReduce program that uses a statistical (quasi-Monte Carlo) method to estimate the value of Pi. Points placed at random inside of a unit square also fall within a circle inscribed within that square with a probability equal to the area of the circle...
  • Wiki Page: The Hadoop on Azure Pegasus Degree Distribution Sample Tutorial

    Overview This tutorial shows how to deploy Pegasus from the Hadoop on Azure portal to compute the degree of each node and the distribution of degrees for a simple 16-node graph. The degree distribution gives the number of nodes in the graph at each degree. The degree of a node in a network (or...
  • Wiki Page: The Hadoop on Azure Mahout Clustering Sample Tutorial

    Overview This tutorial illustrates how to use Hadoop on Azure to do cluster analysis with Mahout. The various forms of cluster analysis attempt to answer the problem: given a collection of objects with values for a set of properties, devise a scheme for grouping them where similar ones are put...