
Progress bars for lazy PowerShell users

You know you’re guilty of it. We’ve all done it out of laziness: a little Write-Host sprinkled here and there throughout a script to show progress, because using Write-Progress was too much of a burden. Trust me, I feel your pain. I regularly process large data sets of call detail records (CDRs) in PowerShell, and chewing on a file can take a very long time. When I wrote my cmdlets that process CDRs, I made sure to support the integrated progress display mechanism so my end users (aka me) had a general idea of when the damned thing would finish.

But man, it was a lot of work, and very often I would skip it and just resort to Write-Host messages. Big mistake. Aside from cluttering up the UI, Write-Host makes it hard to keep track of percent complete, and you miss out on the better (but still not quite great) progress indicator in PowerShell ISE.
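For reference, the manual approach looks roughly like the sketch below. It uses only the built-in Write-Progress cmdlet; the $items collection and the doubling step are placeholders invented for illustration, not anything from my CDR cmdlets.

```powershell
# Manual progress reporting with the built-in Write-Progress cmdlet.
# $items is a hypothetical in-memory collection standing in for real input.
$items = 1..50
$total = $items.Count
$done  = 0

$results = foreach ($item in $items) {
    $done++
    # The bookkeeping you have to write yourself every time.
    $percent = [int](100 * $done / $total)
    Write-Progress -Activity 'Processing items' -Status "$done of $total" -PercentComplete $percent

    $item * 2   # placeholder for the real per-item work
}

# Remove the progress display once the loop has finished.
Write-Progress -Activity 'Processing items' -Completed
```

None of it is hard, but the counter, the percentage arithmetic and the cleanup call have to be repeated in every loop that wants a progress bar.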

So I am pleased to unveil to the community my general purpose ForEach-Object replacement – Progress-Object. I struggled with the name, and I know I am in violation of every PowerShell naming commandment.

It is very similar to ForEach-Object, though I did take some shortcuts by not implementing the Begin/End script blocks. The key difference is that it first consumes all of the input into a buffer (the input is assumed to be easy to gather), then hands the items to the processing script (or to the pipeline, if no script is given) one by one, updating the progress display as it goes.
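To make the gather-then-distribute idea concrete, here is a minimal sketch of the technique. It is not the actual Progress-Object source: the function name Invoke-WithProgress and its parameter names are assumptions made up for this illustration, and it leaves out features such as script block status messages.

```powershell
# Hypothetical illustration of buffering input and then reporting progress.
# Not the real Progress-Object implementation.
function Invoke-WithProgress {
    [CmdletBinding()]
    param(
        [Parameter(Position = 0)]
        [scriptblock] $ScriptBlock,

        [Parameter(ValueFromPipeline = $true)]
        $InputObject,

        [string] $Activity = 'Processing',
        [int] $Id = 0,
        [int] $ParentId = -1
    )
    begin {
        $buffer = New-Object System.Collections.ArrayList
    }
    process {
        # Gather phase: nothing is emitted until all input has arrived.
        [void] $buffer.Add($InputObject)
    }
    end {
        # The total is only known here, which is why buffering is needed.
        $total = $buffer.Count
        for ($i = 0; $i -lt $total; $i++) {
            $item = $buffer[$i]
            $percent = [int](100 * $i / $total)
            Write-Progress -Activity $Activity -Status "$($i + 1) of $total" `
                -PercentComplete $percent -Id $Id -ParentId $ParentId

            if ($ScriptBlock) {
                # Run the script block with $_ bound to the current item.
                ForEach-Object -InputObject $item -Process $ScriptBlock
            } else {
                # No script block: pass the item down the pipeline unchanged.
                , $item
            }
        }
        Write-Progress -Activity $Activity -Id $Id -Completed
    }
}

# Usage: doubles each number while showing a progress bar.
$doubled = 1..5 | Invoke-WithProgress { $_ * 2 }
```

Because the percentage is computed from the buffer's count, the first item cannot be processed until the upstream command has produced its last one.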

A simple example of a file concatenation follows.

# Assumes Progress-Object is aliased to %%
dir c:\logs\*.txt | %% { gc $_ } | out-file alllogs.txt

# or you can specify the activity message
dir c:\logs\*.txt | %% -activity 'merging logs' { gc $_ } | out-file alllogs.txt

# activity and status message can be string or a scriptblock
dir c:\logs\*.txt | 
    %% -status { "$($_) - $($_.Length) bytes" } | 
    get-content |
    out-file alllogs.txt

# like Write-Progress, you can use nested progress by
# specifying the Id and ParentId parameters
dir c:\logs\*.txt | 
%% -Id 1 { 
    gc $_ -ReadCount 1000 | 
    %% -id 2 -parentid 1 { 
        # do stuff
    } 
}

The key thing to remember is that Progress-Object first *gathers* all of the input before processing any of it. This is required in order to know the total number of items, and therefore how far along the process is. It also makes Progress-Object undesirable in some cases.

When to use Progress-Object:

  • You have a finite set of inputs
  • Your inputs take a relatively short time to produce or are already collected (arrays are a great example)
  • The processing of an individual input may take a long time

When not to use Progress-Object:

  • You have only a few inputs that each take a long time to produce (the progress bar will be too chunky to be useful)
  • Your inputs take a very long time to produce (a recursive scan of the hard drive to sum up file lengths would spend all of its time gathering the inputs and only a tiny amount of time progressing through the results)
  • You have too many inputs to reasonably hold in memory
  • You have a continuous input stream (duh – Get-Content -Wait will never finish producing input so nothing will get processed.)
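For the cases in the second list, one possible fallback is to give up on a percentage and just report a running count as items stream past. The sketch below is only an illustration of that idea using the built-in Write-Progress cmdlet; the function name Write-CountProgress and its -Every parameter are hypothetical and are not part of the Scripting module.

```powershell
# Hypothetical streaming alternative: no buffering, no percentage,
# just a running count that is refreshed every $Every items.
function Write-CountProgress {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject,

        [string] $Activity = 'Processing',
        [int] $Every = 100
    )
    begin {
        $count = 0
    }
    process {
        $count++
        if ($count % $Every -eq 0) {
            # Without -PercentComplete, only the status text is updated.
            Write-Progress -Activity $Activity -Status "$count items so far"
        }
        # Emit each item immediately so downstream commands are not delayed.
        , $InputObject
    }
    end {
        Write-Progress -Activity $Activity -Completed
    }
}

# Usage: items flow straight through while the count is displayed.
$streamed = 1..250 | Write-CountProgress -Activity 'Counting' -Every 100
```

This trades the percent-complete bar for immediate output, so it works with large or slow-to-produce inputs, at the cost of not telling you when the job will finish.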

You can get Progress-Object from my SkyDrive in the Scripting.psm1 module. I’ve done my best to eliminate dependencies in that function, but the Scripting module as a whole may have some dependencies on other modules in my SkyDrive.

If you like it and want to see more, let me know!

http://cid-89e05724af67a39e.skydrive.live.com/embedrowdetail.aspx/PowerShell/Modules/Scripting

Categories: PowerShell