For some scenarios, MAP requires the collection of performance data. Use the Performance Metrics Wizard to gather information about the CPU, memory, disk, and network utilization of computers for a duration you specify. The minimum time required is 30 minutes of successful collection. If MAP fails to collect at least 30 minutes of performance data, you will not be able to run any of the wizards that require performance data. Be aware that time spent performing inventory data collection of machines may delay the start of performance data collection; it is therefore recommended that you perform an inventory of the target machines prior to running performance collection.
It is recommended that you conduct an initial 30-minute test run to ensure you are collecting data from the target machines. Once you are satisfied that MAP is able to collect data from the target machines, you can conduct a full run. When deciding how long to run the performance data collection, we recommend you consider the following:
Performance counters are collected from each computer in 5-minute intervals. The number of computers from which the MAP Toolkit can collect performance counter data successfully depends upon factors such as network latency and the responsiveness of servers.
Note If you have previously gathered performance data, you will be prompted on subsequent performance counter gathering runs to either delete existing data or to append the newly gathered data to what was collected previously. If you split up your target computers to improve performance, select No in the Performance Data Exists dialog box.
Starting in MAP 6.0, two significant changes were made to how the performance data is used and aggregated in the Server Consolidation scenario. These same comments apply to the Microsoft Private Cloud Fast Track scenario that was also introduced in MAP 6.0. The changes are:

- Performance metrics are aggregated using the 95th percentile rather than the average or maximum.
- Guests are placed on hosts using the Time Series Placement Algorithm, which operates on normalized sequences of performance samples over time rather than on a single aggregate value per machine.
Another significant change in MAP 6.0 is the introduction of the notion of an “infrastructure” in order to support the Microsoft Private Cloud Fast Track scenario. This same idea of an infrastructure was also added as an option to the existing Server Consolidation scenario. The following sections give more details on how these changes work.
When you collect performance data with MAP, a variety of performance metrics are sampled every 5 minutes for the included machines. Consider the metric %CPU utilization for a hypothetical machine Guest1. The sequence of %CPU utilization samples taken from Guest1 over time might look like the following where each pair is the elapsed time expressed as Hours:Minutes:Seconds since data collection began followed by the %CPU utilization:
(00:00:00, 25.5), (00:05:00, 36.2), (00:10:00, 24.4), (00:15:00, 41.33), (00:20:00, 57.41), ..., (47:55:00, 29.6), (48:00:00, 33.7)
When you have a sequence of %CPU utilization samples over time like this, one natural question to ask is “What was the %CPU utilization of Guest1 over the entire time span?” This, in turn, raises the question of how you aggregate this sequence of numbers into a single number representing the %CPU utilization for the entire time span; this is where aggregates like average, max, and 95th percentile come into the picture.
Prior to MAP 6.0, the average or max aggregates were used when reporting performance metrics or when using performance metrics to place guests in the Server Consolidation scenario. However, for capacity planning exercises like the Server Consolidation scenario, a better aggregation method is to use a percentile aggregation with the 95th percentile being the typical choice. The 95th percentile of a sequence of %CPU utilization samples like the above is defined as the minimum sample S for which 95% of the samples in the sequence are less than or equal to S. Typically this will mean that 5% of the samples are greater than S. Why is this a good aggregation choice for capacity planning? If you plan enough capacity for the 95th percentile of a sequence of %CPU utilization samples over time, then this means 95% of the time you will have enough CPU capacity to service the observed load.
Correspondingly, 5% of the time your systems may be overutilized, but this is a reasonable tradeoff between hardware costs and the fraction of time when responsiveness is degraded. A similar observation can be made for other resources, such as disk I/O, memory, and network, whose utilization varies over time.
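The definition above translates directly into code. The following Python sketch (not MAP's actual implementation) computes the 95th percentile of the %CPU sample values from the Guest1 example; note that for such a tiny data set the result is simply the maximum, a point the data quality discussion below returns to.

```python
import math

def percentile_95(samples):
    """Return the minimum sample S such that 95% of the samples
    in the sequence are less than or equal to S."""
    ordered = sorted(samples)
    # Smallest index that covers at least 95% of the samples.
    idx = math.ceil(0.95 * len(ordered)) - 1
    return ordered[idx]

# The %CPU samples shown in the Guest1 example.
cpu_samples = [25.5, 36.2, 24.4, 41.33, 57.41, 29.6, 33.7]
print(percentile_95(cpu_samples))  # 57.41 -- the max, since the data set is tiny
```

With a larger data set (say, hundreds of 5-minute samples over 2 days) the 95th percentile separates from the maximum and starts behaving as described above.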
For reasons that will become clear in the following section on the Time Series Placement Algorithm, we have to normalize the performance data collected from different machines so that sequences of performance metrics collected from different machines can be added together. Suppose we have two machines Guest1 and Guest2 with the following sequences of network utilization values in Mbps (megabits/sec):
Guest1: (00:00:00, 3.5), (00:05:00, 11.1), (00:10:00, 5.4), (00:15:00, ?.??), (00:20:00, 3.71), ..., (47:55:00, 1.19), (48:00:00, 15.0)
Guest2: (00:00:15, ?.?), (00:05:35, ??.?), (00:10:47, 2.7), (00:16:03, 7.12), (00:21:13, 1.04), ..., (47:58:22, 1.19), (48:03:35, ??.?)
The two sequences are lined up such that the samples that are closest in time from Guest1 and Guest2 are stacked one right on top of the other. Notice, however, that the samples are not taken at exactly the same time. Due to all sorts of variables in the environment and the resource limits of the machine running MAP, performance metrics cannot be sampled from all machines at exactly the same time.
Moreover, notice that some of the samples are marked with question marks like “?.??” to indicate that these samples were unavailable because the target machine was offline, or MAP was not collecting performance data for that machine at the time. Clearly, the raw performance data has some rough edges. So what if we want to add these two sequences of network utilization metrics together to get the combined utilization of Guest1 and Guest2? This is where performance data normalization comes in.
Without going into exhaustive detail on how the normalization works, here is the basic idea: the samples from each machine are mapped onto a common set of evenly spaced times starting at Tmin, the beginning of the normalized time span, and values for times at which a machine is missing samples are filled in using aggregates of other samples nearby in time. After normalization, the two sequences might look like this:
Guest1: (Tmin, 9.05), (Tmin+10, 5.4), (Tmin+20, 4.1), (Tmin+30, 13.3), (Tmin+40, 7.50), ..., (Tmin+NNN-10, 11.7), (Tmin+NNN, 8.69)
Guest2: (Tmin, 2.70), (Tmin+10, 6.3), (Tmin+20, 1.8), (Tmin+30, 11.2), (Tmin+40, 10.1), ..., (Tmin+NNN-10, 2.31), (Tmin+NNN, 1.19)
Since Guest1 and Guest2 now have normalized samples for the same set of times and there is a sample for each time, it becomes obvious how to add the two sequences of resource utilization values together; namely, you just add together each pair of numbers at the same normalized times.
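These two steps can be sketched in Python as follows; the 5-minute grid, the nearest-sample matching, and the average-based hole filling are illustrative assumptions, not MAP's documented method:

```python
def normalize(samples, grid, max_gap=150):
    """Snap raw (seconds-since-start, value) samples onto a common time grid.

    Grid points with no raw sample within max_gap seconds are treated as
    holes and filled with the average of the machine's available samples.
    """
    fill_value = round(sum(v for _, v in samples) / len(samples), 2)
    normalized = []
    for t in grid:
        # Find the raw sample closest in time to this grid point.
        nearest = min(samples, key=lambda s: abs(s[0] - t))
        if abs(nearest[0] - t) <= max_gap:
            normalized.append(nearest[1])
        else:
            normalized.append(fill_value)  # fill the hole
    return normalized

# Made-up network utilization samples; Guest1 is missing its 15:00 sample
# and Guest2's samples drift slightly off the 5-minute boundaries.
guest1 = [(0, 3.5), (300, 11.1), (600, 5.4), (1200, 3.71)]
guest2 = [(15, 2.1), (335, 9.9), (647, 2.7), (963, 7.12), (1273, 1.04)]

grid = [0, 300, 600, 900, 1200]  # Tmin, Tmin+5:00, ... in seconds
combined = [a + b for a, b in zip(normalize(guest1, grid),
                                  normalize(guest2, grid))]
```

Once both machines have a value at every grid point, the combined utilization is just the element-wise sum, exactly as described above.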
Given the explanations in the previous two sections on how the 95th percentile aggregate and performance data normalization work, a natural question to ask is how you know that the 95th percentile aggregates of the normalized performance data accurately represent the real world behavior of your systems; this is where data quality considerations come into play.
If you think about the definition of the 95th percentile aggregate (the minimum sample S for which 95% of the samples in the sequence are less than or equal to S), it becomes clear that this aggregate is not very useful for small data sets. For example, what does it mean for 95% of 8 samples to be less than or equal to one of those 8 samples? MAP can still compute the 95th percentile aggregate in this case because it uses a deterministic algorithm to compute the value (the computed value will be, or be close to, the maximum value), but the statistically interesting properties of the 95th percentile only show up for much larger data sets. This means you should plan to collect performance data for at least 2 days before you can expect to get good values from the 95th percentile aggregate. That said, you should not hesitate to collect performance data for shorter periods of time when doing test runs or familiarizing yourself with the MAP tool.
Another data quality issue to consider is what happens when MAP normalizes the performance data and fills in values for times at which machines are missing values by using aggregates of other samples nearby in time. Filling in these “holes” is necessary so that we can add the sequences of performance metrics from different machines together as described in the previous section, but if there is a large percentage of missing values (say more than 5%), then this may significantly distort the statistical properties of the normalized performance data compared to the real world behavior. What does this mean in terms of how you use MAP? Here are some rules of thumb:

- Collect performance data from all of the target machines over the same time period rather than in separate batches.
- Keep the target machines online and reachable for as much of the collection period as possible so that the fraction of missing samples stays small (ideally below 5%).
One consequence of the first rule of thumb is that the results will not be as accurate if you use the functionality in the Performance Metrics wizard that allows you to append the collection of new performance data to existing data. The primary purpose behind this feature is to let you continue collecting performance data from a set of machines if it was interrupted unexpectedly for some reason (for example, the MAP machine collecting the performance data rebooted after applying an update). If, for example, you use this feature to collect performance data from one group of machines on Monday and Tuesday and then another group of machines on Thursday and Friday, then the normalized time period over which data was collected is Monday through Friday. This means that 3 of the 5 days will be missing data for all of the machines and the “holes” in the data that the normalization process fills in will be at least 60% of the normalized data. This obviously is not a desirable state of affairs, so using the append functionality of the Performance Metrics wizard in this fashion is not a recommended practice.
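The 60% figure in the example can be checked with a quick back-of-the-envelope script (day-level granularity is a simplification of the actual 5-minute samples):

```python
# Normalized time span runs Monday through Friday because the two
# collection windows are Mon-Tue and Thu-Fri.
normalized_span = ["Mon", "Tue", "Wed", "Thu", "Fri"]
group_a_days = {"Mon", "Tue"}   # first group of machines
group_b_days = {"Thu", "Fri"}   # second group of machines

def hole_fraction(collected_days):
    """Fraction of the normalized time span with no data for a machine."""
    missing = [d for d in normalized_span if d not in collected_days]
    return len(missing) / len(normalized_span)

print(hole_fraction(group_a_days))  # 0.6 -- 3 of the 5 normalized days are holes
print(hole_fraction(group_b_days))  # 0.6
```

Every machine in either group is missing at least 60% of the normalized time span, which is far above the roughly 5% threshold mentioned earlier.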
In versions of MAP prior to 6.0, a single aggregate (max or average depending on the resource) was computed for the entire time span over which performance data was collected for a machine for each resource type (CPU, memory, etc.). These numbers were used to determine if there was room for the machine on a Hyper-V host in the Server Consolidation scenario. This approach is not ideal because it does not deal well with machines whose resource utilization is uneven over time.
For example, if the machine Guest1 is used by people in North America and Guest2 is used by people in Asia, then these machines are likely to have inverted resource usage profiles over time and would fit well together on the same Hyper-V host. Analogously, if two machines served the same geographic region then their resource usage profiles over time would likely be similar and they may not be a good fit together on the same Hyper-V host. However, using an average metric over the entire time span for which performance data was collected misses these subtleties. In addition to serving different geographic regions, there are myriad other reasons that machines could have usage profiles over time that fit well together or not.
These observations led to the method introduced in MAP 6.0 to determine if there is room for a machine on a Hyper-V host while taking time into account.
The previous section described how MAP normalizes the raw performance data, and how this enables adding together the sequences of normalized resource utilization metrics for two or more machines over time. This ability to add sequences of normalized resource utilization metrics is at the heart of the Time Series Placement Algorithm introduced in MAP 6.0.
While running the algorithm, suppose that MAP has determined that machines Guest1, Guest2, ..., Guest4 will fit on Hyper-V host Host1. How do we know if Guest5 will fit on Host1 as well? What the algorithm does is add together the sequences of normalized resource utilization metrics for Guest1, ..., Guest5 for each resource we care about (CPU, memory, etc.) and determines if Host1 has enough capacity in each resource dimension. What is enough capacity? Host1 has enough capacity if the 95th percentile aggregate of the sum of the normalized sequences for Guest1, ..., Guest5 is less than or equal to the total capacity of Host1 for that resource dimension.
There are numerous other subtleties to the Time Series Placement Algorithm, not the least of which is determining which potential host Host1, ..., HostN provides the “best” fit for the next candidate machine GuestM. That said, the ability to sum up sequences of normalized resource utilization metrics and take the 95th percentile of the summed sequence is the fundamental insight behind how MAP takes time into account when making consolidation suggestions in the Server Consolidation and Microsoft Private Cloud Fast Track scenarios.
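The fit check at the heart of the algorithm can be sketched in Python as follows; the function names, capacity numbers, and data layout are invented for illustration, since MAP's actual implementation is not public:

```python
import math

def p95(sequence):
    """95th percentile: minimum sample S with 95% of samples <= S."""
    ordered = sorted(sequence)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]

def fits_on_host(host_capacity, placed_guests, candidate):
    """Return True if the candidate guest fits alongside the placed guests.

    host_capacity maps resource name -> capacity; each guest maps resource
    name -> its normalized sequence of utilization samples (equal lengths).
    """
    for resource, capacity in host_capacity.items():
        sequences = [g[resource] for g in placed_guests] + [candidate[resource]]
        # Sum the normalized sequences sample by sample...
        summed = [sum(samples) for samples in zip(*sequences)]
        # ...and compare the 95th percentile of the sum against capacity.
        if p95(summed) > capacity:
            return False
    return True

host1 = {"cpu_mhz": 8000}
guest1 = {"cpu_mhz": [1000, 1500, 1200, 900]}
guest2 = {"cpu_mhz": [2000, 1800, 2500, 2200]}
guest5 = {"cpu_mhz": [3000, 3200, 2800, 3100]}
print(fits_on_host(host1, [guest1, guest2], guest5))  # True
```

Because the sequences are summed before the percentile is taken, two guests whose peaks occur at different times can fit together on a host that could not accommodate the sum of their individual peaks.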
MAP 6.0 introduces the notion of an “infrastructure” used in the Microsoft Private Cloud Fast Track scenario, and optionally in the Server Consolidation scenario. Basically an infrastructure is the resource enclosure in which a group of Hyper-V hosts is provisioned; i.e., a server rack with associated disk (SAN) and network resources along with the Hyper-V hosts. This allows you to run consolidation scenarios that are more reflective of how an organization may be buying server resources; namely, in units of pre-provisioned server racks with everything necessary to run a “private cloud” rather than buying the individual components themselves with further assembly required.
So how is the infrastructure level taken into account during the Time Series Placement Algorithm when determining if a candidate machine will fit on a Hyper-V host residing in a particular infrastructure? Basically MAP just does the same thing at the infrastructure level that it does at the host level: it sums up the normalized resource utilization sequences for all the guests in the infrastructure across all hosts and determines if the 95th percentile aggregate of this generated sequence exceeds the SAN or network capacity of the infrastructure.
The previous sections provide the basic information concerning how the raw performance data is normalized and aggregated and how this data is used in the placement algorithm underlying the Server Consolidation and the Microsoft Private Cloud Fast Track scenarios. This, however, does not explain which numbers show up where in the various reports; this section provides those remaining details.
PerfMetricsResults-<date>.xlsx
ServerVirtRecommendation-<date>.xlsx
Important Because the 95th percentile of the sum of sequences of performance metrics from the placed guests is not the same as the sum of the 95th percentiles of each of those sequences, adding up the guest utilization values on this worksheet will not give you the value of the host utilization, except in the case of disk space usage, which does not use the 95th percentile aggregate. A similar observation can be made about the utilization values for the infrastructures.
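This non-additivity is easy to demonstrate with two made-up sequences whose peaks occur at different times:

```python
import math

def p95(sequence):
    """95th percentile: minimum sample S with 95% of samples <= S."""
    ordered = sorted(sequence)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]

guest_a = [10, 90, 10, 10]   # peaks early in the collection window
guest_b = [90, 10, 10, 10]   # peaks at a different time

summed = [a + b for a, b in zip(guest_a, guest_b)]  # [100, 100, 20, 20]
print(p95(guest_a) + p95(guest_b))  # 180: sum of the per-guest percentiles
print(p95(summed))                  # 100: percentile of the summed sequence
```

This is why the host utilization values on the worksheet cannot be reproduced by summing the guest rows.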
Microsoft Private Cloud Fast Track Consolidation Report-<date>.xlsx
Note There are no memory and disk space overhead values specified for the placed guests in this wizard.
InfrastructureProfile – Provides an overview of the Microsoft Private Cloud Fast Track infrastructure hardware profile.