Update Excel Spreadsheets from PowerShell

Update Excel Spreadsheets from PowerShell

The database technology that holds the most corporate data is Excel. While it's not relational, the definition of database is "a comprehensive collection of related data organized for convenient access, generally in a computer." Business people who need to make decisions do so based on data they most often find in an Excel spreadsheet.

Related: Is Excel the Next Killer BI App?

While I generally recommend using something like SQL Server Reporting Services (SSRS) to create a report, and then export that report to Excel, there are needs which make this method problematic. In a case I'm currently working with, a rather well designed dashboard-type spreadsheet needs to be updated every day with the new results of the nightly processing. There are a couple of worksheets within the document that need to be refreshed with the current data, and then the calculations on the dashboard will adjust the results based on the new data.

Test Spreadsheet and Added Worksheet

I've set up a test spreadsheet, and added a worksheet that I named "Play." Each time I run the script, I want to delete everything on the sheet, run a query against my SQL Server database, and populate the sheet with the contents of the query results.

First, let's open the Excel document using the COM object (since there's no native PowerShell cmdlets for this yet). We'll also pipe the Worksheets collection from the workbook to a Where-Object cmdlet, selecting just the one where the name is 'Play.'

$Path = 'c:\Work\PowerShellPlayTime.xlsx'
 
# Open the Excel document and pull in the 'Play' worksheet
$Excel = New-Object -Com Excel.Application
$Workbook = $Excel.Workbooks.Open($Path) 
$page = 'Play'
$ws = $Workbook.worksheets | where-object {$_.Name -eq $page}

There are a lot of different ways to delete the existing rows, but in our case we just want to remove what's there, so we'll use the Clear() method. I pipe the output to Out-Null, because the method is chatty, and will return 'True' when it completes.

# Delete the current contents of the page
$ws.Cells.Clear() | Out-Null

Now, I'm going to use ADO.NET to execute my query and return results. If you've loaded the SQLPS module you can use Invoke-Sqlcmd, but with ADO.NET, I don't have to ensure the module is loaded before running my query. The tools I need are already there.

# Connect to the target SQL Server and run the query to refresh data
$cstr = "Server=MyServer;Database=Northwind;User ID=sa;[email protected]"
$cn = new-object system.data.SqlClient.SqlConnection($cstr);
$ds = new-object "System.Data.DataSet" "dsProductData"
$q = @"
SELECT TOP 100 [ProductID]
  ,[ProductName]
  ,[SupplierID]
  ,[CategoryID]
  ,[QuantityPerUnit]
  ,[UnitPrice]
  ,[UnitsInStock]
  ,[UnitsOnOrder]
  ,[ReorderLevel]
  ,[Discontinued]
  FROM [dbo].[Products]
"@
$da = new-object "System.Data.SqlClient.SqlDataAdapter" ($q, $cn)
$da.Fill($ds) | Out-Null
 
$dtProduct = $ds.Tables[0]

The $dtProduct variable is an ADO.NET DataRow collection. The Fill() method will return the number of rows returned from the query, and again, I don't need that information, so I pipe it to Out-Null. If I pipe the $dtProduct variable to Get-Member, it shows me the methods and properties, allowing me to see the data types of each of the columns returned.

DataRow

For ease in readability, I'll set a variable to contain the cells of the worksheet, and also initialize row and column variables for navigation as I populate the worksheet. Then, I'll write the column names into the first row as column headers.

# Set variables for the worksheet cells, and for navigation
$cells=$ws.Cells
$row=1
$col=1
 
# Add the headers to the worksheet
$headers = "ProductID","ProductName","SupplierID","CategoryID","QuantityPerUnit","UnitPrice","UnitsInStock","UnitsOnOrder","ReorderLevel","Discontinued"
$headers | foreach {
    $cells.item($row,$col)=$_
    $col++
}

Next, we'll iterate through the DataRows and insert them into the cells, and if they are numeric columns, we'll set the properties of the cell so they display properly.

# Add the results from the DataTable object to the worksheet
foreach ($prod in $dtProduct) {
    $row++
    $col=1
    $cells.item($Row,$col)=$prod.ProductID
    $col++
    $cells.item($Row,$col)=$prod.ProductName
    $col++
    $cells.item($Row,$col)=$prod.SupplierID
    $col++
    $cells.item($Row,$col)=$prod.CategoryID
    $col++
    $cells.item($Row,$col)=$prod.QuantityPerUnit
    $cells.item($Row,$col).NumberFormat="0.00"
    $col++
    $cells.item($Row,$col)=$prod.UnitPrice
    $cells.item($Row,$col).NumberFormat="0.00"
    $col++
    $cells.item($Row,$col)=$prod.UnitsInStock
    $cells.item($Row,$col).NumberFormat="0"
    $col++
    $cells.item($Row,$col)=$prod.UnitsOnOrder
    $cells.item($Row,$col).NumberFormat="0"
    $col++
    $cells.item($Row,$col)=$prod.ReorderLevel
    $cells.item($Row,$col).NumberFormat="0"
    $col++
    $cells.item($Row,$col)=$prod.Discontinued
}

Readability is Important

Readability is important, so we'll set all the columns to AutoFit based on the column contents.

# Set the width of the columns automatically
$ws.columns.item("A:J").EntireColumn.AutoFit() | out-null

Finally, we'll close the worksheet, saving our changes, and exit Excel.

# Close the workbook and exit Excel
$workbook.Close($true)
$excel.quit()

Rather than using SQL Server Management Studio (SSMS), running the query, and pasting the results into the spreadsheet, I can run this script, and the formulas in the rest of the dashboard will now be based on the new results.

Related: Use PowerShell to Load XML Data into SQL Server

Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish