PowerShell Hashtables – You’re doing them wrong

Update from the PowerShell team:
$h.$key will perform very differently if $key has many unique values

$h["abc"] or $h.abc will perform roughly the same in a loop.

Both versions add the numbers 1 to 1000 as keys to a hashtable.

The difference? The second version is 100 times faster.

In PowerShell you can access a hashtable key with dot notation (the first way) or with indexer notation (the second way).

Slight difference in syntax, significant difference in performance.
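The two versions can be sketched roughly like this. The loop shape and the variable names are assumptions, and timings vary by machine and PowerShell version, so treat it as something to run yourself rather than a benchmark.

```powershell
# Sketch only: loop shape and variable names are assumptions.
$count = 1000

# First way: dot notation with a variable key.
$dot = @{}
$dotTime = Measure-Command {
    foreach ($i in 1..$count) { $dot.$i = $i }
}

# Second way: indexer notation.
$indexed = @{}
$indexTime = Measure-Command {
    foreach ($i in 1..$count) { $indexed[$i] = $i }
}

# Report both timings side by side.
'Dot notation : {0:N2} ms' -f $dotTime.TotalMilliseconds
'Indexer      : {0:N2} ms' -f $indexTime.TotalMilliseconds
```

Both hashtables end up with the same 1000 entries; only the way the key is written differs.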

Note: I found this out when building this Native PowerShell Spelling Corrector – Google Style.

Native PowerShell Spelling Corrector – Google Style

Peter Norvig, Director of Research at Google, posted “How to Write a Spelling Corrector”.

The PowerShell version reads a text file that has ~110K words, creates a lookup dictionary, and produces a list of possible corrections based on the edit distance between two words. See Mr. Norvig’s post for details.

All this in a page of PowerShell script, and it comes back in less than a second.

Invoke-SpellCorrector speling

Ported To PowerShell

I ported this back in 2007, PowerShell v1.0. Grab the script and the homles.txt file from GitHub.

Web Scraping with PowerShell – PDF Files

Oftentimes you want the information that is in a PDF. You want to extract data, munge text for input to another process, or parse and save the results to a database.

Parsing PDF data is a challenge. Fortunately, there is a great library, iTextSharp, that does the job for you. You can download it from the NuGet Gallery, and it’s written for .NET. That means you can use it in PowerShell to automate PDF processing.

Get-PDFContent

Here is a script that lets you read a PDF and extract the content as a string.

Dot source the script and you can start reading local PDF files.
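A minimal sketch of what such a function could look like, assuming iTextSharp 5.x and its PdfReader and PdfTextExtractor classes. The -DllPath parameter and its default value are hypothetical, and this is not necessarily how the script in the repo is written.

```powershell
function Get-PDFContent {
    param(
        [Parameter(Mandatory)]
        [string]$Path,

        # Hypothetical parameter: where the iTextSharp assembly lives.
        [string]$DllPath = '.\itextsharp.dll'
    )

    # Load the iTextSharp assembly (assumes the 5.x API).
    Add-Type -Path (Resolve-Path $DllPath).ProviderPath

    # .NET does not follow the PowerShell location, so resolve to a full path.
    $fullPath = (Resolve-Path $Path).ProviderPath
    $reader = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList $fullPath

    try {
        # Pages are numbered from 1 in iTextSharp.
        $pages = for ($page = 1; $page -le $reader.NumberOfPages; $page++) {
            [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $page)
        }
        $pages -join [Environment]::NewLine
    }
    finally {
        $reader.Close()
    }
}
```

Usage would be along the lines of Get-PDFContent -Path .\report.pdf, where report.pdf is a placeholder name.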

Bonus Points

This same script lets you read a PDF straight from the web.

Grab the PowerShell Script and iTextSharp

The PowerShell script and iTextSharp DLL are here on my GitHub repo.

Web Scraping with PowerShell – CSV Files

Reading CSV Files

PowerShell can work with CSV files from local files or in memory. Sometimes, there are CSV files on the web. You could download the file manually and then use a PowerShell script to process it with Import-Csv. You could also craft a script to download the file and then use Import-Csv.

Or, you could retrieve that file as a string and convert it on the fly with ConvertFrom-Csv. This option is the cleanest: it doesn’t require a local file to be created, and when the script finishes, memory is reclaimed. On GitHub I created a CSV file, Album List, that has a random selection of albums.

You can retrieve and print them like this:

Here is a partial printing:

Invoke-RestMethod retrieves the text from the target URL. It is then piped to ConvertFrom-Csv, which creates and prints an array of objects with the property names Artist and Name.
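Here is a small sketch of that conversion step. The CSV text is stand-in data rather than the Album List file, and the URL in the closing comment is a placeholder.

```powershell
# Stand-in for the text Invoke-RestMethod would return; these rows are illustrative only.
$csvText = @'
Artist,Name
Miles Davis,Kind of Blue
Nina Simone,Pastel Blues
'@

# ConvertFrom-Csv uses the header row for the property names.
$albums = $csvText | ConvertFrom-Csv
$albums | Format-Table Artist, Name

# Against a real file the pipeline has the same shape (placeholder URL):
# Invoke-RestMethod 'https://example.com/albums.csv' | ConvertFrom-Csv
```

Nothing is written to disk at any point, which is the appeal of this approach.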

For bonus points, grab the PowerShell Excel module from the PowerShell Gallery or GitHub and create a spreadsheet whose columns are sized for reading with AutoSize.

PowerShell ISE goes Agile!

The PowerShell Integrated Scripting Environment (ISE) just got posted to the PowerShell Gallery (see the official post).

ISE has shipped in the PowerShell box for years, meaning the only way we got new features or fixes was when a new version of PowerShell shipped.

Starting with the new PowerShellISE-preview, you’ll see new features perhaps on a monthly basis.

Plus, it runs side by side with the built-in ISE, so you won’t be disrupted and will always have a stable release to fall back on.

But wait, there’s more! You get to add to and vote on the backlog. Head over to the PowerShell UserVoice site and have at it.

I did a down and dirty web scraping of the 59 open items for ISE; here are the top 10:

This is great news. We get features and fixes that much sooner, and the add-on model for ISE is the first scheduled set of improvements. That’s terrific news for everyone who wants to customize the editing experience.

Note: This first preview release only works with PowerShell v5, is English-only and existing add-ons could have issues.

So, download it, kick the tires, and head on over to the UserVoice site and vote early and often. Don’t forget to add what you want to see in ISE too!

Visual Studio Code

While we’re on the topic of editors, Microsoft’s new modern editor Visual Studio Code is available and has PowerShell support. Plus, VS Code, PowerShell Editor Services, and the vscode-powershell extension are open source and up on GitHub.

Folks like Keith Hill, Adam Driscoll and myself have already contributed features.

Options are always good. Having multiple editors tuned for different purposes is great.

Convert JSON or CSV to a PowerShell Class

There’s a new keyword in PowerShell v5.0, class, for creating classes directly in your PowerShell scripts. Check out Introduction to PowerShell 5 Classes.

Verify Incoming Data

One application of PowerShell classes is the ability to simplify verification of incoming data. For example, you may have comma separated input that looks like this:

Line 3 clearly has bad data. If you use ConvertFrom-Csv, it will happily create PowerShell objects on the fly, but both the name and age properties will be defined as strings, and 10a is a string. We need a little more safety.

Here is a class Person that strongly types the properties name as string and age as an int.

Putting this together lets you detect data errors early.

Running the above generates this error.

This is a great way to do pre-flight checks on incoming data to determine if there are issues.
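The whole check can be sketched in a few lines. The sample rows are hypothetical (only the 10a value comes from the text above), and the try/catch loop is one possible way to surface the bad row, not necessarily the original code.

```powershell
# Hypothetical sample input; line 3 carries the bad age value.
$data = @'
name,age
Chris,25
Sam,10a
Pat,42
'@

# Strongly typed properties: name is a string, age is an int.
class Person {
    [string]$name
    [int]$age
}

$badRows = @()
$people = foreach ($row in ($data | ConvertFrom-Csv)) {
    try {
        # Casting a hashtable to the class forces the type conversion.
        [Person]@{ name = $row.name; age = $row.age }
    }
    catch {
        $badRows += "Bad row ($($row.name), $($row.age)): $($_.Exception.Message)"
    }
}

# Show which rows failed the conversion.
$badRows
```

Rows that convert cleanly come out as Person objects with a real int for age; the 10a row ends up in $badRows instead.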

Expand the DataSet

What if the input data has more properties:

Working up a PowerShell class by hand for this can be tedious and error prone. So, let’s automate it with ConvertTo-Class.

ConvertTo-Class

ConvertTo-Class is a PowerShell module I published, and it’s on the gallery. It automates creating a PowerShell class from either CSV or JSON text data. ConvertTo-Class determines both the name of the property and the actual data type of the data.

Quick, easy and accurate!

ConvertTo-Class infers the correct data type for the properties age, zip, and rent.
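To give a feel for how this kind of inference can work, here is a rough sketch. Get-InferredTypeName is a hypothetical helper and the sample row is made up; this is not the ConvertTo-Class implementation, and the types the module actually picks may differ.

```powershell
# Hypothetical helper: guess a type name from a single text value.
function Get-InferredTypeName {
    param([string]$Value)

    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $asInt = 0
    $asDouble = 0.0

    # Try the narrowest type first, then widen.
    if ([int]::TryParse($Value, [System.Globalization.NumberStyles]::Integer, $culture, [ref]$asInt)) {
        return 'int'
    }
    if ([double]::TryParse($Value, [System.Globalization.NumberStyles]::Float, $culture, [ref]$asDouble)) {
        return 'double'
    }
    return 'string'
}

# Made-up sample row with the properties mentioned above.
$row = 'name,age,zip,rent', 'Chris,25,10001,1250.50' | ConvertFrom-Csv

# Emit one typed property declaration per column.
foreach ($property in $row.PSObject.Properties) {
    '[{0}]${1}' -f (Get-InferredTypeName $property.Value), $property.Name
}
```

A real tool would want to look at more than one row before committing to a type, since a single value can be misleading.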

The class is ready to go. You can save it to a file and dot source it or you can do an Invoke-Expression on it and use it immediately.

Auto Detects JSON Too

The same ConvertTo-Class function auto-detects JSON and produces the same class as it did with the CSV data.

Here is the same class. Notice you can use the -ClassName parameter to override the default class name of RootObject.

Plus, ConvertTo-Class Handles Multiple Classes

Here is sample JSON that has multiple classes. ConvertTo-Class detects and generates each class with its properties.

Here are the code-generated PowerShell classes, wired up and with the correct data types.

Generate C# or PowerShell Classes

PowerShell class syntax is the default output. You can also generate classes for use in C# with the -CodeGen CSharp parameter.

In an upcoming post, you’ll see how you can easily specify your own code generation “rules” to create constructs for other languages or purposes.

In Action Video

I’ve also published two PowerShell scripts that let you paste JSON or CSV as classes from the clipboard, either in the console using PSReadline or in ISE via an add-on.

Grab the PowerShell

You can get ConvertTo-Class from the PowerShell Gallery or on my GitHub repo.

More Visual Studio Code Extensions – Rendering Markdown through Pandoc

Markdown is a fantastic way to overcome the ceremony of writing a document. It works for everything from blog posts to readme files, and some publishers let authors write their books in markdown. Once you’re done writing, the last step is to render the markdown in a format for the target audience. If it’s a comment on GitHub/Stackoverflow, it’s automatically rendered.

If you are using a text editor, you might drop down to the command line or context switch to another tool.

This Just Got Easier in Visual Studio Code

This Visual Studio Code extension lets you easily render markdown files as a pdf, word document or html file.

You need to install Pandoc – a universal document converter.

There are two ways to run the extension. Either way, you need to have a markdown file open.

  1. press F1 on Windows (cmd+shift+P on Mac), type pandoc, press Enter
  2. Or – press the key chord ctrl+k then p

Then choose from the list what document type you want to render and press enter (you can also type in the box rather than cursor around).

Getting the Extension

The vscode-pandoc extension is published on the gallery. You can install vscode-pandoc from the palette.

New Visual Studio Code Extension – Markdown Links

Markdown Reference Links in Visual Studio Code

This VS Code extension lets you find a reference link already in the file being edited. Often you need a second link to a URL you already linked to earlier in the document.

How it works:

  1. Scan the document and collect all the reference links in it
  2. Present the list of links so you can choose the one you want
  3. Add the Markdown code that refers to the chosen link
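The three steps above can be sketched in PowerShell. The extension itself is not written in PowerShell, and the regular expression, sample document, and variable names here are all assumptions for illustration.

```powershell
# Made-up document with two reference link definitions.
$markdown = @'
See the [docs][ps] and the [gallery][gal].

[ps]: https://example.com/docs
[gal]: https://example.com/gallery
'@

# Step 1: collect every "[label]: url" definition in the document.
$pattern = '(?m)^ {0,3}\[(?<label>[^\]]+)\]:\s*(?<url>\S+)'
$links = foreach ($match in [regex]::Matches($markdown, $pattern)) {
    [pscustomobject]@{
        Label = $match.Groups['label'].Value
        Url   = $match.Groups['url'].Value
    }
}

# Step 2: the extension shows a pick list; here the first entry stands in for the choice.
$chosen = $links[0]

# Step 3: wrap the selected text in a reference to the chosen link.
$selectedText = 'PowerShell docs'
$reference = '[{0}][{1}]' -f $selectedText, $chosen.Label
$reference
```

The selected text becomes the link text and the chosen label becomes the reference, so the URL itself is never typed a second time.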

In Action

There are two ways to make this work. First, select the text you want to become the link. Then:

  • Open the palette, find the extension and press enter.
    • Press F1, type xref, press Enter
  • Or: Press ctrl+shift+L

VS Code Gallery

The extension is up on the gallery. In VS Code, press F1, type ext install, and press Enter. This gets a list of extensions; then type xref to search for this markdown one. From there you can install it.

Thanks to this post for a great idea to make editing even simpler: More Markdown reference links in BBEdit

3 Ways to Speed up Visual Studio Code Extension Development

I published my first Microsoft Visual Studio Code Extension to the gallery. Select text in the editor, press Ctrl+Shift+F1 and it will search Stack Overflow, and specifically entries tagged with PowerShell.

While creating an extension in Visual Studio Code, I found myself interacting with three command line tools and I wanted a more efficient way to work with them from PowerShell.

The Command Line

The three tools are npm, tsd and vsce.

  • npm – is the default Package manager for the JavaScript runtime environment Node.js
  • tsd – is the TypeScript Definition manager for DefinitelyTyped
  • vsce – is the tool you use to publish Visual Studio Code extensions to the Extension Gallery

For example, if you want to open a URL in the user’s browser from your extension, you can leverage the node open package. I used TypeScript to create my extension, so I needed to install both the node package and the TypeScript definition to make this work.

npm install open --save
tsd install open --save

After you have installed the packages you need and finished developing your extension, publishing to the gallery is next. That’s a whole other command line tool, vsce. It too has several parameters and switches to set up a publisher, do the packaging or publishing, and more.

Enter PowerShell

You’ve got enough things to remember. Plus, we’re polyglots. I prefer to let the computer augment my recall capacity, because it’s so much more efficient.

At the command line, if you type npm <space> and then <TAB>, PowerShell will complete it with file names in the current directory. I could type npm and press <Enter>, scroll around looking for the parameter/switch I need and then type install. Lots of typing, lots of chances for spelling mistakes, very inefficient.

Another scenario: let’s say you want to run a script in your package.json. You can type npm run, press <Enter>, list the available scripts, bring the command up again and type the script you want to run.

Compare that to the quick and easy tab completion in PowerShell.

Note: If you don’t have the npm completion module, then from a PowerShell v5 prompt install it with:

Install-Module -Name NPMTabCompletion. After it is installed, do an Import-Module NPMTabCompletion

The Microsoft TabExpansion Module lets you dynamically interact with the environment as you type. When you type npm run, the NPM Completion module cracks open the package.json, converts it to PowerShell objects (ConvertFrom-Json), and enumerates the scripts property to construct the drop-down list. Lastly, each script entry has a name and its associated code. These are transformed into the list item and tooltip.
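The data-shaping part of that can be sketched as follows. The package.json content is made up, and this only shows how the scripts property could be turned into completion results; it is not the module’s actual code and leaves out the registration with the tab expansion machinery.

```powershell
# Made-up package.json content for illustration.
$packageJson = @'
{
  "name": "demo-extension",
  "scripts": {
    "compile": "tsc -p ./",
    "test": "node ./out/test/runTests.js"
  }
}
'@

$package = $packageJson | ConvertFrom-Json

# Each property under "scripts" is one entry: the name is the list item, the code is the tooltip.
$completions = foreach ($script in $package.scripts.PSObject.Properties) {
    [System.Management.Automation.CompletionResult]::new($script.Name, $script.Name, 'ParameterValue', $script.Value)
}

$completions | Select-Object ListItemText, ToolTip
```

In a real completer the JSON would come from Get-Content on the package.json in the current directory instead of a here-string.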

The Other Tab Completion Modules

All of the tab completion modules are on the PowerShell Gallery and can be installed using Install-Module.

Note: You also need the Microsoft TabExpansion Module

Install-Module -Name NPMTabCompletion
Install-Module -Name TSDTabCompletion
Install-Module -Name VSCETabCompletion

If you want to see how these were built, check them out on my GitHub repos: NPM TabCompletion, TSD TabCompletion, VSCE TabCompletion

Summary

These tab completions help you power through what you need to get done when using the command line tools. As an added bonus, you can install each one separately. So if you’re working with just TypeScript or NPM, this will make you more productive.

Visual Studio Code and the PowerShell Extension Hack Week

Starting Sunday, December 6, from 11am to 12pm PST, Microsoft’s David Wilson will host a Crowdcast event giving an overview of the PowerShell Editor Services, the PowerShell extension for VS Code, and other general ideas for contributions that people can make.

Microsoft Visual Studio Code is the new modern editor. It’s open sourced, up on GitHub, and it has an extension model. Plus, the PowerShell team has dedicated resources to create and open source a PowerShell extension. This extension provides PowerShell IntelliSense, debugging and more.

Hope to see you at the Hack Week!

I’m a contributor to the PowerShell Extension, Editor Services and VSCode-PowerShell.

Here are a couple of items going into the core offering.

Launch Online PowerShell Help

Expanding Aliases
