PowerShell: Optimization and Performance Testing

Overview


PowerShell has been around long enough now that many of the basic questions are well addressed. Between Get-Help, the internet, and the community resources available to scripters, most of the what and how required to accomplish ordinary tasks can be answered quickly. Once you get your script working, bug-free, and ready to roll out to production, another set of questions arises for which there is not as much discussion yet: how do I write the best possible script? Transforming a working script into a highly efficient, robust tool requires a different set of questions and thought processes, ones that move the scripter out of the world of end use and administration and into the arena of the developer. While some are reluctant to throw their hat into the development ring, there is really no better tool than PowerShell to open the door into this parallel universe. So, let's take a glance at some basic concepts for maximizing a script's capabilities and then move on to some more in-depth questions to see just how far we can push the envelope.

Two key areas of PowerShell scripting that do not get as much attention as they deserve are optimization and performance testing. Optimization means modifying a script to reduce the time and resources required to run it; in short, reworking code to run faster and more efficiently. Performance testing goes hand in hand with optimization: it is the analysis of a script's results and execution to determine whether optimization is actually effective. This is how you find out whether all those 'changes for the better' really are better. While optimization can be done without performance testing, doing so leaves plenty of open questions. One quote I've seen a few different times comes to mind with regard to optimization without testing:

"Great engineering comes from creating predictable results at predictable costs. In fact, I like to say that if you’re not measuring you’re not engineering." (Performance Testing Guidance for Web Applications)

"If you're not measuring you're not engineering." Granted, this is an old post, but, the point is as relevent today as it was when it was made. The only way to truly validate whether all the work you put into reworking something paid off or not it to give tangible, real world results. And, performance testing is a key way to do just that.

Default Order of Optimal Execution


With terms now laid out, let's cover some basic PowerShell points of knowledge before you start. Knowing PowerShell well is a given: if you do not know the basics, exploring advanced concepts will be of little value. For example, Bruce Payette outlined the order of optimal execution in PowerShell in Action based on which type of command you are running. In section 2.1.2 (Command Categories), pages 30-34, he says the fastest operations will occur in this order:
  1. cmdlets: compiled binaries
  2. functions: memory resident script code
  3. scripts (script command): functions/commands saved in .ps1 files 
  4. native Win32 executables: traditional executables
So, if you are calling a native Win32 executable, porting that functionality to a script, a function, or even a cmdlet may speed things up. This is not always feasible, but there are plenty of situations where porting functionality upstream produces more efficient PowerShell code. The key here is to understand how using different categories of commands affects the degree to which your script can be optimized.
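
As a rough sketch of how you might verify this ordering yourself, the snippet below times a memory-resident function against the same one-liner saved as a script file. The C:\test path is only an example, and absolute numbers will vary by system:

# Category 2: a memory-resident function.
function Get-Answer { (1..1000 | Measure-Object -Sum).Sum }
# Category 3: the same code saved as a .ps1 script file.
Set-Content -Path C:\test\Get-Answer.ps1 -Value '(1..1000 | Measure-Object -Sum).Sum'
$asFunction = (Measure-Command { 1..100 | ForEach-Object { Get-Answer } }).Ticks
$asScript   = (Measure-Command { 1..100 | ForEach-Object { C:\test\Get-Answer.ps1 } }).Ticks
"{0}`t{1}" -f 'function', $asFunction
"{0}`t{1}" -f 'script',   $asScript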

Optimization Considerations


With the basic order of optimal execution covered, we can look at some questions you can ask to begin the process of refining your scripts:

Questions to ask when optimizing

  • Am I using the right construct for this situation?
  • Am I doing too many or unnecessary operations?
  • Am I working with too many objects?
  • Am I using too many shared resources?
  • Am I using collections correctly? Put another way: am I using the right (or best) collection for the task?
  • Is the pipeline the best approach?
  • Is there another command model that may work better?
  • Am I thinking about objects properly?
  • Am I using -Filter (if available)?
  • Am I reinitializing variables unnecessarily?
  • Is the task something for which jobs or runspaces could help?
  • Are my loops designed efficiently?
  • Am I using ForEach-Object when for will work just as well?
  • Am I using begin/process/end blocks with ForEach-Object and in my scripts?
  • Am I using advanced functions?
  • Am I doing parameter checking?
  • Am I using Set-StrictMode?
  • Do I limit the number of objects I connect to remotely, such as in Active Directory?
  • Do I limit the size of result sets retrieved over the network, using filters, Where clauses, and parameters that specify the values returned?
  • Do I retrieve values once, and save in a variable if needed again later?
  • Is there a low-level solution that may be more effective? .NET? P/Invoke?
  • Am I passing large collections to the pipeline instead of storing them in an object?
  • Am I testing a condition unnecessarily?
Below are demonstrations showing why some of the questions above matter:

Am I using -Filter (if available)? 


The snippet below demonstrates how using -Filter with cmdlets that support it can increase efficiency significantly. For this example, I search for a file, C:\test\100Mb.txt, first with Get-ChildItem -Filter and then with Get-ChildItem | Where {$_.Name -eq '100Mb.txt'}.

Clear-Host
1..10 | ForEach-Object {
    "Command: iteration $_"
    "-" * 25
    $test1 = { Get-ChildItem -Path C:\test -Filter 100Mb.txt | Out-Null }
    $test2 = { Get-ChildItem -Path C:\test | Where {$_.Name -eq '100Mb.txt'} | Out-Null }
    $results1 = (Measure-Command -Expression $test1).Ticks
    $results2 = (Measure-Command -Expression $test2).Ticks
    "{0}`t`t{1}" -f '-Filter', $results1
    "{0}`t`t`t{1}" -f 'Where', $results2
    ""
    # How many times faster -Filter ran than piping to Where.
    "{0}`t`t{1:N}" -f 'Difference', ($results2 / $results1)
    ""
}
Results returned from this test indicate, even in such a small case, that the search runs anywhere from 4-8 times faster with -Filter than by piping to Where. The reason is that -Filter is handed to the underlying provider, so non-matching items are discarded before objects are created and sent down the pipeline, while Where discards them after the fact. As always, there are times when it is necessary to use Where, but in simple filtering scenarios the -Filter parameter can significantly improve script speed.

Am I using Foreach-Object when for will work just as well? 


The snippet below examines the latency introduced by using ForEach-Object instead of for to loop through collections.

"Command: output 1"
"-" * 25
$test1 = { 1..1000 | % { 1 } }
$test2 = { for($i = 1; $i -le 1000; $i++) { 1 }}
$results1 = (Measure-Command -Expression $test1).Ticks
$results2 =(Measure-Command -Expression $test2).Ticks
"{0}`t`t{1}" -f 'foreach',$results1
"{0}`t`t`t{1}" -f 'for',$results2
""
"{0}`t`t{1:N}" -f 'Difference',($results1/$results2)
""
"Command: evaluate 1 -eq 1"
"-" * 25
$test3 = { 1..1000 | % { 1 -eq 1 } }
$test4 = { for($i = 1; $i -le 1000; $i++) { 1 -eq 1 }}
$results3 = (Measure-Command -Expression $test3).Ticks
$results4 =(Measure-Command -Expression $test4).Ticks
"{0}`t`t{1}" -f 'foreach',$results3
"{0}`t`t`t{1}" -f 'for',$results4
""
"{0}`t`t{1:N}" -f 'Difference',($results3/$results4)
When this command is run on my machine, I get the following output:

Command: output 1
-------------------------
foreach 903091
for 18212

Difference 49.59


Command: evaluate 1 -eq 1
-------------------------
foreach 907313
for 22254

Difference 40.77
This code demonstrates that, on my machine when I tested, ForEach-Object was anywhere from 40-50 times slower than a straight for loop. The overhead comes from the pipeline itself and from invoking a script block for every element, costs a plain for loop never pays. There are times when ForEach-Object simplifies iteration and control, so it has its place, but to fully optimize scripts this is a major point of consideration.

Am I testing a condition unnecessarily?


Many times, when examining conditions to determine whether something is true or false, there are simpler ways to test than evaluating an explicit comparison. For instance, instead of testing whether something is $true like this

if(1 -eq $true)
{
   Do-Something
}

you can simply let the if perform the test for you and this suffices:

if(1)
{
   Do-Something
}

To test this, I wrote a very small sample set, which demonstrated about a 3% performance improvement when skipping the comparison operator in the evaluation:

$test_001 = { if(1 -eq $true) {1} else {0} }
$test_002 = { if(1) {1} else {0} }
Get-ChildItem variable:\test_* |
ForEach-Object {
    # Capture each script block so it can be invoked inside Measure-Command.
    $testBlock = $_.Value
    Measure-Command -Expression { 1..100000 | % { & $testBlock } } |
    select TotalSeconds
}

TotalSeconds
------------
12.1220562
11.732413
This is a simple example, but if the object you are examining in an if statement can be reduced to a Boolean value on its own, use that instead of an explicit comparison.
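
Do I retrieve values once, and save in a variable if needed again later?


As one more demonstration, here is a sketch in the same style as the tests above. It is illustrative rather than taken from a real workload, and the use of Get-Service is incidental; any repeated provider, WMI, or network call follows the same pattern. The first test re-queries the service list on every iteration; the second retrieves it once into a variable and reuses it:

$test1 = {
    # Re-query the shared resource on every iteration.
    1..50 | % { Get-Service | Where { $_.Status -eq 'Running' } | Out-Null }
}
$test2 = {
    # Retrieve once, cache in a variable, and reuse it.
    $services = Get-Service
    1..50 | % { $services | Where { $_.Status -eq 'Running' } | Out-Null }
}
$results1 = (Measure-Command -Expression $test1).Ticks
$results2 = (Measure-Command -Expression $test2).Ticks
"{0}`t{1}" -f 'Re-query', $results1
"{0}`t`t{1}" -f 'Cached', $results2
"{0}`t{1:N}" -f 'Difference', ($results1 / $results2)

The more expensive the underlying retrieval, such as a remote Active Directory query, the larger the cached version's advantage becomes.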

Performance Testing Considerations


When performance testing there are several things to consider. Below is a list of key points to review in your commands and scripts:
  • Are you testing enough iterations, and alternating the order of the alternate approaches, to even out results? (See the sketch after this list.)
  • Are you testing volatile data, structures, or systems?
  • Are your results reproducible?
  • Can you explain why your test results validate your changes and optimizations?
  • If the first iteration is faster than subsequent iterations, results may be cached. If so, insert a pause between iterations sufficient to flush the cache and get consistent results.
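
A minimal sketch of the first point (both script blocks here are placeholders for whatever alternatives you are comparing): run several passes and alternate which approach is measured first, so that one-time startup costs and caching do not consistently favor either side:

$testA = { Get-ChildItem -Path $env:WINDIR | Out-Null }  # placeholder: approach A
$testB = { Get-ChildItem -Path $env:WINDIR | Out-Null }  # placeholder: approach B
$ticksA = @(); $ticksB = @()
1..10 | ForEach-Object {
    if ($_ % 2) {
        # Odd passes: measure A first.
        $ticksA += (Measure-Command -Expression $testA).Ticks
        $ticksB += (Measure-Command -Expression $testB).Ticks
    }
    else {
        # Even passes: measure B first.
        $ticksB += (Measure-Command -Expression $testB).Ticks
        $ticksA += (Measure-Command -Expression $testA).Ticks
    }
}
"{0}`t{1:N0}" -f 'A (avg)', ($ticksA | Measure-Object -Average).Average
"{0}`t{1:N0}" -f 'B (avg)', ($ticksB | Measure-Object -Average).Average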

Additional Resources


Below are additional resources you can use to explore optimization and performance testing further:

 

Comments


  • Alex Angelopoulos nailed it when he wrote: "Don't try to optimize PowerShell scripts as you write them. You might be optimizing code that either disappears on its own or doesn't have a significant effect on final performance. Scripter time is always more difficult to come by than CPU cycles."

    I would extend this to suggest that care be taken to code in the most general manner possible, by such strategies as naming variables for what they actually mean, avoiding the re-use of variables for purposes other than their initial use, and avoiding complex expressions, among others.

    Collections can be processed in a variety of ways, given the set of available looping and conditional structures. The choice may have performance implications, but it also has ease-of-understanding implications. Saving run time now at the expense of making future maintenance more difficult is, in my mind, a questionable tactic. As well, it can often be the case that improved speed comes at the expense of other resources, such as memory.

    There is another old saw that applies here: optimize only in those cases where the overall improvement will be significant. Tweaking a loop to run in 5% of the original time will be an insignificant gain if only 1% of the total time was spent in the loop. Thus what appeared at first to increase the speed of execution by a factor of 20 would, in that case, reduce overall execution time by less than 1%.
