Browse by Tags

Tagged Content List
  • Wiki Page: Mahout on Windows Azure - Machine Learning Using Microsoft HDInsight

    Introduction One of the key components of Microsoft HDInsight is Mahout, a scalable machine learning library that provides a number of algorithms that run on the Hadoop platform. Machine learning supports a wide range of use cases, from email spam filtering to fraud detection to recommending books...
  • Wiki Page: How to Import Data to Hadoop on Windows Azure from Windows Azure Marketplace

    Before you use the Apache Hadoop on Windows Azure portal to import Windows Azure Marketplace data into Hadoop on Windows Azure, you must know the following information: User name: the Live ID used to sign in to the marketplace. PassKey Sign in to http://datamarket.azure.com with your Live ID...
  • Wiki Page: Running HDInsight C# Hadoop Streaming Sample

    MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks. Most MapReduce jobs are written in Java. Hadoop provides a streaming API to MapReduce that enables you to write map and reduce functions in languages...
  • Wiki Page: HDInsight Services For Windows

    This article is the main portal for technical information about HDInsight Services for Windows and related Microsoft technologies. It provides a brief overview of Apache Hadoop, as well as information for the HDInsight Services provided by Microsoft for deployment on both Windows and Windows Azure...
  • Wiki Page: Big Data – Beginning of New Era of Analytics in 2013

    In today’s digital world, data is growing at a very fast pace, which requires organizations to focus on data and business intelligence. There is a drastic increase in technologies that rapidly evaluate massive amounts and varieties of data flowing from different devices, sensors, mobiles,...
  • Wiki Page: Microsoft HDInsight (Big Data) Solution

    This article serves as a single-point repository for all content, resources, links, information, and latest updates about Big Data and Microsoft's activities around it. Contributions from everyone are most welcome. Note 1: Please note that all the articles are listed in proper order. Articles...
  • Wiki Page: A Lap Around HDInsight

    (cross-posted from The Blog @ Graemesplace and the Content Master Technology Blog) I’m currently working with Microsoft’s Patterns and Practices team, researching and documenting best practices guidance for big data analysis with HDInsight. For those of you who may not know, HDInsight is Microsoft...
  • Wiki Page: Power View Report to Hadoop on Azure Hive Sample

    Transcript This screencast shows how to use Power View to connect to your pre-existing PowerPivot workbook (the source of which is a Hive sample table within your Hadoop on Azure cluster). See Also More Videos about Hadoop Services on Windows and Windows Azure Apache Hadoop Services...
  • Wiki Page: Getting Started with the HDInsight Server Developer Preview

    Table of Contents Introduction Installation of Hadoop on Windows The Apache™ Hadoop™-based services on Windows dashboard Getting started with Microsoft Hadoop on Windows Load Some Data Running MapReduce Jobs Running Pig Jobs Running Hive Jobs Additional Resources: Apache Hadoop, Hadoop on Windows, and...
  • Wiki Page: The Hadoop on Azure Pegasus Page Rank Sample

    Overview This tutorial shows how to deploy Pegasus from the Hadoop on Azure portal to compute the page rank for a simple 16-node graph. The rank calculated for a node is a measure of how well connected it is to the other nodes in the graph structure. A graph is a type of abstract mathematical structure...
  • Wiki Page: HDInsight Services for Windows Azure QuickStart: Running Hadoop Jobs

    This tutorial shows two ways in which Hadoop MapReduce programs can be run on a Hadoop Distributed File System (HDFS) using HDInsight Services for Windows Azure. Use the Create Job UI to run MapReduce programs written in Java, contained in Hadoop jar files Use the Interactive JavaScript Console...
  • Wiki Page: Analyzing Twitter Data with Hive in HDInsight and StreamInsight

    In this tutorial you will query, explore, and analyze data from Twitter using Apache™ Hadoop™-based Services for Windows Azure and a Hive query in Excel. Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data...
  • Wiki Page: Simple recommendation engine using Apache Mahout

    Apache Mahout™ is a machine learning library built for use in scalable machine learning applications. Recommender engines are some of the most immediately recognizable machine learning applications in use today. In this tutorial you use the Million Song Dataset to create song recommendations for users...
  • Wiki Page: Working With Data in Windows Azure HDInsight Service

    This tutorial covers several techniques for storing and importing data for use in Hadoop MapReduce jobs run with Windows Azure HDInsight Service (formerly Apache™ Hadoop™-based Services for Windows Azure). Apache Hadoop is a software framework that supports data-intensive distributed applications...
  • Wiki Page: Introduction to HDInsight Services for Windows Azure

    Overview HDInsight Services for Windows Azure is a service that deploys and provisions Apache™ Hadoop™ clusters in the cloud, providing a software framework designed to manage, analyze and report on big data. Data is described as "big data" to indicate that it is being collected in ever...
  • Wiki Page: Hadoop on Azure WordCount Sample Tutorial

    Overview This tutorial shows two ways to use Hadoop on Azure to run a MapReduce program that counts word occurrences in a text. First, with a Hadoop .jar file by using the Create Job UI. Second, with a query by using the fluent API layered on Pig that is provided by the Interactive Console. The...
  • Wiki Page: How to Connect Excel to Hadoop on Azure via HiveODBC

    One key feature of Microsoft’s Big Data Solution is solid integration of Apache Hadoop with the Microsoft Business Intelligence (BI) components. A good example of this is the ability for Excel to connect to the Hive data warehouse framework in the Hadoop cluster. This section walks you through using...
  • Wiki Page: Analyzing Twitter Movie Data with Hive in HDInsight

    In this tutorial you will query, explore, and analyze data from Twitter using Apache™ Hadoop™-based Services for Windows Azure and a Hive query in Excel. Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data...
  • Wiki Page: Hadoop

    This article gives a brief overview of Apache Hadoop and directs you to Wiki articles that can give you more in-depth information about specific areas of Hadoop. Table of Contents Overview WindowsAzure.com TechNet Wiki Articles Community Resources Blog Posts Overview Apache Hadoop is an open...
  • Wiki Page: HDInsight Scenario: Query a Web Log via HiveQL

    The purpose of this wiki post is to provide an example scenario on how to work with Hadoop on Azure, upload a web log sample file via secure FTP, and run some simple HiveQL queries. Important! This wiki topic may be obsolete. The wiki topics on Windows Azure HDInsight Service are no...
  • Wiki Page: Windows Azure HDInsight Service FAQ

    Below are some frequently asked questions about Windows Azure HDInsight Service. Q: How to enable the Windows Azure HDInsight Service preview? A: To use Windows Azure HDInsight Service, you need a Windows Azure account that has the Windows Azure HDInsight feature enabled. If you don't...
  • Wiki Page: How to Connect Excel PowerPivot to Hive on Azure via HiveODBC

    The Hive ODBC Driver enables client applications such as Excel PowerPivot to access a Hive data warehouse running on Windows Azure. This driver requires the ODBC Server Port to be opened on the Hadoop Services on Azure (http://www.hadooponazure.com) portal. This walkthrough provides the following...
  • Wiki Page: Deployment of Hadoop-based Services on Windows and on Windows Azure

    Apache™ Hadoop™ is an open source framework from Apache. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. It is very useful for analyzing and developing relationships for large unstructured datasets. Data processing in Hadoop is...
  • Wiki Page: Introduction to the Hadoop Services on Azure Hive Console (video)

    The Microsoft deployment of Apache Hadoop for Windows lets you set up a private Hadoop cluster on Azure. One of the included administration/deployment tools is an Interactive Console for JavaScript and Hive. This video introduces the Interactive Hive console. Developer Lengning Liu demonstrates running...
  • Wiki Page: Use SQL Azure Database as a Hive Metastore

    When requesting a Hadoop cluster on http://www.hadooponazure.com/, select the option: Use SQL Azure for Hive Metastore to use a SQL Azure database as a Hive metastore. When you select this option, you will need to specify the following parameters: SQL Azure Server. You do not need to...