Multithreading In PowerShell - Running A Specific Number Of Threads
Posted on November 28, 2019 and tagged as powershell
In this post we’re aiming to accomplish in PowerShell what the previous post did in Python: create a pool of threads that carries out a set of given tasks concurrently. The closest PowerShell equivalent to what we achieved in Python is a Runspace Pool. PowerShell also has PSJobs functionality via Start-Job and -AsJob, but these don’t allow us to specify the maximum number of threads we want, at least not without a significant amount of scaffolding or non-default modules.
PowerShell Runspace Pools
Let’s use a simple example to illustrate how Runspace Pools are used. We can start by defining our worker function (as a ScriptBlock), which is the function that will be doing whatever work we wish to run in parallel. For this example, let’s create some files where the filename is a parameter we pass in. We’ll also add a Start-Sleep command to simulate some lengthy process that would make this work suitable for multithreading.
$Worker = {
    param($Filename)
    Write-Host "Processing $Filename"
    Start-Sleep -Seconds 5 # Doing some work....
    $Item = New-Item -Name $Filename
    Write-Output $Item.FullName
}
We then define our runspace pool and configure the number of threads we wish to run. An ArrayList for the running jobs is also created, which will help us monitor the batch processing status.
$MaxRunspaces = 5
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxRunspaces)
$RunspacePool.Open()
$Jobs = New-Object System.Collections.ArrayList
The CreateRunspacePool method takes two values: the minimum and maximum number of runspaces in the pool. The maximum is what caps how many tasks run at the same time. Below we define our sample data that will serve as the filenames for the files we’re creating.
$Filenames = @("file1.txt", "file2.txt", "file3.txt", "file4.txt", "file5.txt", "file6.txt", "file7.txt", "file8.txt", "file9.txt", "file10.txt", "file11.txt")
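As a quick sanity check on the two values passed to CreateRunspacePool, the pool itself can be asked which limits it ended up with. This is a minimal sketch: GetMinRunspaces(), GetMaxRunspaces() and GetAvailableRunspaces() are methods of the RunspacePool class, while $DemoPool and the other variable names are made up here so the sketch doesn't interfere with the pool used in the rest of the post.

```powershell
$DemoMax = 5
$DemoPool = [runspacefactory]::CreateRunspacePool(1, $DemoMax)
$DemoPool.Open()

# Report the limits the pool was created with
$Min = $DemoPool.GetMinRunspaces()
$Max = $DemoPool.GetMaxRunspaces()
Write-Host "Pool allows between $Min and $Max runspaces"

# Number of runspaces currently free to pick up work
$Available = $DemoPool.GetAvailableRunspaces()
Write-Host "$Available runspaces available right now"

# Release the pool once we are done with it
$DemoPool.Close()
$DemoPool.Dispose()
```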
Finally we can bring everything together and run our tasks in parallel.
foreach ($File in $Filenames) {
    Write-Host "Creating runspace for $File"
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($Worker).AddArgument($File) | Out-Null
    $JobObj = New-Object -TypeName PSObject -Property @{
        Runspace = $PowerShell.BeginInvoke()
        PowerShell = $PowerShell
    }
    $Jobs.Add($JobObj) | Out-Null
}
while ($Jobs.Runspace.IsCompleted -contains $false) {
    Write-Host (Get-Date).ToString() "Still running..."
    Start-Sleep 1
}
Putting it all together
If we combine the above code snippets we can have a reasonable boilerplate for future runspace pool usage.
$Worker = {
    param($Filename)
    Write-Host "Processing $Filename"
    Start-Sleep -Seconds 5 # Doing some work....
    $Item = New-Item -Name $Filename
    Write-Output $Item.FullName
}
$MaxRunspaces = 5
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxRunspaces)
$RunspacePool.Open()
$Jobs = New-Object System.Collections.ArrayList
$Filenames = @("file1.txt", "file2.txt", "file3.txt", "file4.txt", "file5.txt", "file6.txt", "file7.txt", "file8.txt", "file9.txt", "file10.txt", "file11.txt")
foreach ($File in $Filenames) {
    Write-Host "Creating runspace for $File"
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($Worker).AddArgument($File) | Out-Null
    $JobObj = New-Object -TypeName PSObject -Property @{
        Runspace = $PowerShell.BeginInvoke()
        PowerShell = $PowerShell
    }
    $Jobs.Add($JobObj) | Out-Null
}
while ($Jobs.Runspace.IsCompleted -contains $false) {
    Write-Host (Get-Date).ToString() "Still running..."
    Start-Sleep 1
}
Caveats and Workarounds
There are a few gotchas when using runspaces that will probably cause a few headaches the first time you use them.
Getting return data
Below is the output of the above code:
Creating runspace for file1.txt
Creating runspace for file2.txt
Creating runspace for file3.txt
Creating runspace for file4.txt
Creating runspace for file5.txt
Creating runspace for file6.txt
Creating runspace for file7.txt
Creating runspace for file8.txt
Creating runspace for file9.txt
Creating runspace for file10.txt
Creating runspace for file11.txt
4/12/2019 7:47:22 PM Still running...
4/12/2019 7:47:23 PM Still running...
4/12/2019 7:47:24 PM Still running...
4/12/2019 7:47:25 PM Still running...
4/12/2019 7:47:26 PM Still running...
4/12/2019 7:47:27 PM Still running...
4/12/2019 7:47:28 PM Still running...
4/12/2019 7:47:29 PM Still running...
4/12/2019 7:47:30 PM Still running...
4/12/2019 7:47:31 PM Still running...
The most obvious thing is the lack of output from our Worker function. We have both Write-Host and Write-Output commands inside the function, but neither appears in the output. This is where some of the complexity around runspaces starts to show. To get the output, we need to call the EndInvoke() method of the PowerShell instance for each iteration and pass it the handle that BeginInvoke() returned, which we stored in the Runspace property. Both of these are present in the $Jobs ArrayList.
PS C:\> $Jobs
PowerShell Runspace
---------- --------
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
System.Management.Automation.PowerShell System.Management.Automation.PowerShellAsyncResult
PS C:\> $Jobs[0].PowerShell.EndInvoke($Jobs[0].Runspace)
C:\Users\md\file1.txt
This has some ramifications for error handling: we either need to handle errors completely inside the worker function, or we need to append error data to our return object.
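One way to gather the results and errors of every job in a single pass is sketched below. It builds its own small pool so it can be run on its own, wraps each EndInvoke() call in try/catch to surface terminating errors, and reads non-terminating errors from the instance's Streams.Error collection. The 'bad' task, the $Results variable and the Name/Output/Errors property names are illustrative choices, not something the runspace API prescribes.

```powershell
# Small self-contained setup: three tasks, one of which fails on purpose
$Worker = {
    param($Name)
    if ($Name -eq 'bad') { throw "Could not process $Name" }
    Write-Output "Processed $Name"
}

$RunspacePool = [runspacefactory]::CreateRunspacePool(1, 2)
$RunspacePool.Open()
$Jobs = New-Object System.Collections.ArrayList

foreach ($Name in @('one', 'bad', 'two')) {
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($Worker).AddArgument($Name) | Out-Null
    $JobObj = New-Object -TypeName PSObject -Property @{
        Name       = $Name
        Runspace   = $PowerShell.BeginInvoke()
        PowerShell = $PowerShell
    }
    $Jobs.Add($JobObj) | Out-Null
}

# EndInvoke() blocks until its job has finished, so no separate wait loop is used here
$Results = foreach ($Job in $Jobs) {
    $Output = $null
    $Failure = $null
    try {
        # Copy the output into an array before the instance is disposed
        $Output = @($Job.PowerShell.EndInvoke($Job.Runspace))
    }
    catch {
        # A terminating error thrown by the worker surfaces here
        $Failure = $_.Exception.Message
    }
    # Non-terminating errors (for example from Write-Error) stay on the error stream
    $StreamErrors = @($Job.PowerShell.Streams.Error | ForEach-Object { $_.ToString() })

    New-Object -TypeName PSObject -Property @{
        Name   = $Job.Name
        Output = $Output
        Errors = @(@($Failure) + $StreamErrors | Where-Object { $_ })
    }
    $Job.PowerShell.Dispose()
}

$RunspacePool.Close()
$RunspacePool.Dispose()

$Results | Format-Table Name, Output, Errors
```

Returning one object per job keeps successes and failures side by side, so the caller can decide afterwards which tasks to retry.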
Access to the current PowerShell environment
If you ran the above code from anywhere but the default path your PowerShell is configured for (typically C:\Users\<username>), you would have noticed that the files weren’t created in your working directory, but rather in the default path. This is because each runspace runs in its own environment. However, even this can be a little tricky to understand. What would be the expected behavior if we slightly modified our worker function to this?
$Worker = {
    param($Filename)
    Write-Host "Processing $Filename"
    if ($Filename -eq "file1.txt") {Set-Location C:\Temp}
    Start-Sleep -Seconds 5 # Doing some work....
    $Item = New-Item -Name $Filename
    Write-Output $Item.FullName
}
The initial assumption tends to be that only file1.txt ends up in C:\Temp, whereas what actually happens is that file1.txt, file6.txt, and file11.txt all end up in there. The reason for this is that runspaces get reused. While file1.txt runs in the first runspace, the remaining 4 runspaces are used by file2.txt, file3.txt, file4.txt, and file5.txt. The first runspace then sets its working directory to C:\Temp as the if condition is satisfied. When the first runspace is free, having completed the work for file1.txt, the next job is started, which happens to be file6.txt. The path at this point is still set to C:\Temp and that is where the file is created, and the same applies for file11.txt.
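Because a reused runspace keeps whatever location an earlier task left behind, one defensive pattern is to avoid Set-Location in the worker altogether and build a full path instead. The sketch below assumes a second, hypothetical $Directory parameter that is not part of the original worker, and uses the system temp directory as a stand-in target.

```powershell
$Worker = {
    param($Filename, $Directory)
    # Build an absolute path instead of relying on the runspace's current location
    $FullPath = Join-Path -Path $Directory -ChildPath $Filename
    $Item = New-Item -Path $FullPath -ItemType File -Force
    Write-Output $Item.FullName
}

# Called directly here to keep the example short. In the pool it would be queued
# with .AddScript($Worker).AddArgument($File).AddArgument($TargetDirectory)
$TargetDirectory = [System.IO.Path]::GetTempPath()
$Created = & $Worker 'runspace-demo.txt' $TargetDirectory
Write-Host "Created $Created"

# Clean up the demo file
Remove-Item -Path $Created
```

With the directory passed in explicitly, it no longer matters which runspace picks up the task or what a previous task did to its location.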
Variable access
Runspaces don’t have access to variables defined in the parent PowerShell environment. This means if we define a higher scoped variable for the desired file path where we want to place the files, it will not work.
$Filepath = "C:\Temp\"

$Worker = {
    param($Filename)
    try {
        $Result = New-Item -Name $Filename -Path $Filepath
    }
    catch {
        $Result = $_.Exception.Message
    }
    Write-Output $Result
}
What the New-Item call inside the try block actually executes here is equivalent to the following:
New-Item -Name $Filename -Path $Null
We can confirm this if we look at the output.
PS C:\> $Jobs[0].PowerShell.EndInvoke($Jobs[0].Runspace)
Cannot bind argument to parameter 'Path' because it is null.
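The most direct way around this is to hand the value to the worker explicitly. The sketch below binds by name with AddParameter() rather than by position. The $Directory parameter name is an illustrative choice, the temp directory stands in for C:\Temp\, and the worker only returns the path it would use instead of creating a file.

```powershell
$Filepath = [System.IO.Path]::GetTempPath()

$Worker = {
    param($Filename, $Directory)
    # $Directory arrives as a parameter, so it is defined inside the runspace
    Write-Output (Join-Path -Path $Directory -ChildPath $Filename)
}

$PowerShell = [powershell]::Create()
$PowerShell.AddScript($Worker).AddParameter('Filename', 'file1.txt').AddParameter('Directory', $Filepath) | Out-Null

# Invoke() runs synchronously, which keeps the example short
$Result = @($PowerShell.Invoke())
$PowerShell.Dispose()

Write-Host "Worker would write to $($Result[0])"
```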
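Besides passing individual values in as arguments, a common way to share state with runspaces is a synchronized hashtable, which is itself handed to each worker as an argument. Because every runspace receives a reference to the same object, workers can also write to it, and individual reads and writes on the hashtable are synchronized. Keep in mind that a compound operation such as += on a value stored in the hashtable is a read followed by a separate write, so it is not atomic on its own. To illustrate the approach, we can use the following code.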
$Configuration = [hashtable]::Synchronized(@{})
$Configuration.FilePath = "C:\Temp\"
$Configuration.CreatedFiles = @()
$Worker = {
    Param($Filename, $Configuration)
    Write-Host "Processing $Filename"
    Start-Sleep -Seconds 5 # Doing some work....
    Try {
        $Result = New-Item -Name $Filename -Path $Configuration.FilePath
        $Configuration.CreatedFiles += $Result.FullName
    }
    Catch {
        $Result = $_.Exception.Message
    }
    Write-Output $Result
}
$MaxRunspaces = 5
$SessionState = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, $MaxRunspaces, $SessionState, $Host)
$RunspacePool.Open()
$Jobs = New-Object System.Collections.ArrayList
$Filenames = @("file1.txt", "file2.txt", "file3.txt", "file4.txt", "file5.txt", "file6.txt", "file7.txt", "file8.txt", "file9.txt", "file10.txt", "file11.txt")
foreach ($File in $Filenames) {
    Write-Host "Creating runspace for $File"
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($Worker).AddArgument($File).AddArgument($Configuration) | Out-Null
    $JobObj = New-Object -TypeName PSObject -Property @{
        Runspace = $PowerShell.BeginInvoke()
        PowerShell = $PowerShell
    }
    $Jobs.Add($JobObj) | Out-Null
}
while ($Jobs.Runspace.IsCompleted -contains $false) {
    Write-Host (Get-Date).ToString() "Still running..."
    Start-Sleep 1
}
And our hashtable has been modified as expected.
PS C:\> $Configuration
Name Value
---- -----
FilePath C:\Temp\
CreatedFiles {C:\Temp\file1.txt, C:\Temp\file2.txt, C:\Temp\file3.txt, C:\Temp\file4.txt...}
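One caveat with the code above: the += on CreatedFiles reads the array, builds a new one and writes it back, so two runspaces doing this at the same moment could in principle lose an entry. A sketch of one alternative follows, assuming a synchronized ArrayList is acceptable in place of the plain array. The task count of 20 and the file-like names are made up for the demonstration, and no files are created.

```powershell
$Configuration = [hashtable]::Synchronized(@{})
# A synchronized ArrayList makes each individual Add() call thread-safe
$Configuration.CreatedFiles = [System.Collections.ArrayList]::Synchronized((New-Object System.Collections.ArrayList))

$Worker = {
    param($Filename, $Configuration)
    # Add() returns the new index, which we discard
    $Configuration.CreatedFiles.Add($Filename) | Out-Null
}

$RunspacePool = [runspacefactory]::CreateRunspacePool(1, 5)
$RunspacePool.Open()
$Jobs = New-Object System.Collections.ArrayList

foreach ($Number in 1..20) {
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($Worker).AddArgument("file$Number.txt").AddArgument($Configuration) | Out-Null
    $JobObj = New-Object -TypeName PSObject -Property @{
        Runspace   = $PowerShell.BeginInvoke()
        PowerShell = $PowerShell
    }
    $Jobs.Add($JobObj) | Out-Null
}

# Wait for every job to finish, then release its resources
foreach ($Job in $Jobs) {
    $Job.PowerShell.EndInvoke($Job.Runspace) | Out-Null
    $Job.PowerShell.Dispose()
}
$RunspacePool.Close()
$RunspacePool.Dispose()

Write-Host "Recorded $($Configuration.CreatedFiles.Count) entries"
```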
Lastly, because we passed $Host to CreateRunspacePool in the code above, the Write-Host data from our workers is now sent to our parent console - neat!
Creating runspace for file1.txt
Creating runspace for file2.txt
Processing file1.txt
Creating runspace for file3.txt
Processing file2.txt
Creating runspace for file4.txt
Processing file3.txt
Creating runspace for file5.txt
Processing file4.txt
Creating runspace for file6.txt
Creating runspace for file7.txt
Creating runspace for file8.txt
Processing file5.txt
Creating runspace for file9.txt
Creating runspace for file10.txt
Creating runspace for file11.txt
4/12/2019 7:58:39 PM Still running...
4/12/2019 7:58:40 PM Still running...
4/12/2019 7:58:41 PM Still running...
4/12/2019 7:58:42 PM Still running...
4/12/2019 7:58:43 PM Still running...
Processing file6.txt
Processing file7.txt
Processing file8.txt
Processing file9.txt
Processing file10.txt
4/12/2019 7:58:44 PM Still running...
4/12/2019 7:58:45 PM Still running...
4/12/2019 7:58:46 PM Still running...
4/12/2019 7:58:47 PM Still running...
4/12/2019 7:58:48 PM Still running...
Processing file11.txt
4/12/2019 7:58:49 PM Still running...
4/12/2019 7:58:50 PM Still running...
4/12/2019 7:58:51 PM Still running...
4/12/2019 7:58:52 PM Still running...
4/12/2019 7:58:53 PM Still running...
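The output above shows the tasks starting in waves: five at once, five more about five seconds later, then the last one. A rough back-of-the-envelope estimate of the total run time can be computed as follows, assuming every task takes the same 5 seconds and ignoring scheduling overhead. The variable names are illustrative.

```powershell
$TaskCount = 11         # number of filenames in the example
$MaxRunspaces = 5       # upper limit passed to CreateRunspacePool
$SecondsPerTask = 5     # the Start-Sleep duration in the worker

# Each wave runs at most $MaxRunspaces tasks side by side
$Waves = [math]::Ceiling($TaskCount / $MaxRunspaces)
$EstimatedSeconds = $Waves * $SecondsPerTask

Write-Host "$Waves waves, roughly $EstimatedSeconds seconds in total"
```

That estimate lines up with the fifteen "Still running..." lines printed at one-second intervals in the output.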