Have you ever wanted to find more information on government spending?
I know I have. Rohit Amarnath posted Extracting Government Spending Data via Talend and Ruby into CouchDB. He is working with Amazon to post this as a public data set via AWS.
I wanted to parse, filter and mine some of it from the original source USAspending.gov. Being built on .Net plus slick Xml capabilities, PowerShell makes this super simple.
This snippet
$url="http://www.usaspending.gov/fpds/fpds.php?datype=X&detail=-1&state=NY"
$wc = New-Object Net.WebClient
$xml = [xml]$wc.DownloadString($url)
$xml.usaspendingSearchResults.data.record.Totals
Retrieve Summary Contracts for All States
Get-SpendingSummary
- Line 17 pulls the Xml from the Web and converts it to an Xml Document
- Line 19 walks the Xml records and extracts the Total information
- Line 20 adds the State field to the record
Enrich-Summary: Adding Calculated Fields
I wanted to see the average number of transactions and average obligated amount per contractor. The Enrich-Summary function extends the downloaded data by adding three fields to the record. Two of them, AvgTransPerContractor and AvgDollarsPerContractor are calculated fields.
- Lines 27-29 use the Add-Member cmdlet. It creates a ScriptProperty and uses the $this variable which refers to the current object being extended.
Download the Script
{ 1 trackback }
{ 3 comments… read them below or add one }
C# code for the above PS script
static void Main(string[] args)
{
string[] States = {”AL”,”AK”,”AS”,”AZ”,”AR”,”CA”,”CO”,”CT”,”DE”,”DC”,”FM”,”FL”,”GA”,”GU”,”HI”,”ID”,”IL”,”IN”,”IA”,”KS”,”KY”,”LA”,”ME”,”MH”,”MD”,”MA”,”MI”,”MN”,”MS”,”MO”,”MT”,”NE”,”NV”,”NH”,”NJ”,”NM”,”NY”,”NC”,”ND”,”MP”,”OH”,”OK”,”OR”,”PW”,”PA”,”PR”,”RI”,”SC”,”SD”,”TN”,”TX”,”UT”,”VT”,”VI”,”VA”,”WA”,”WV”,”WI”,”WY”};
var wc = new System.Net.WebClient();
foreach (string state in States)
{
XmlDocument xd = new XmlDocument();
xd.LoadXml(wc.DownloadString(string.Format(”http://www.usaspending.gov/fpds/fpds.php?datype=X&detail=-1&state={0}”, state)));
XmlNode xnode = xd.SelectSingleNode(”usaspendingSearchResults/data/record/totals”);
Console.WriteLine(@”{0} {1} {2} {3}”,state, xnode.ChildNodes[0].InnerText, xnode.ChildNodes[1].InnerText, xnode.ChildNodes[2].InnerText);
}
Console.Read();
}
Here is a version using Linq to Xml
static void Main(string[] args)
{
var States = new string[] { “AL”, “AK”, “AS”, “AZ”, “AR”, “CA”, “CO”, “CT”, “DE”, “DC”, “FM”, “FL”, “GA”, “GU”, “HI”, “ID”, “IL”, “IA”, “KS”, “KY”, “LA”, “ME”, “MH”, “MD”, “MA”, “MI”, “MN”, “MS”, “MO”, “MT”, “NE”, “NV”, “NH”, “NJ”, “NM”, “NY”, “NC”, “ND”, “MP”, “OH”, “OK”, “OR”, “PW”, “PA”, “PR”, “RI”, “SC”, “SD”, “TN”, “TX”, “UT”, “VT”, “VI”, “VA”, “WA”, “WV”, “WI”, “WY” };
var url = @”http://www.usaspending.gov/fpds/fpds.php?datype=X&detail=-1&state={0}”;
var q = from state in States
from x in XElement.Load(string.Format(url, state)).FirstNode.ElementsAfterSelf(”data”).Elements(”record”).Elements(”totals”)
select new
{
State = state,
Total = x.Element(”total_ObligatedAmount”).Value,
Contractors = x.Element(”number_of_contractors”).Value,
TransActions = x.Element(”number_of_transactions”).Value
};
foreach (var item in q)
{
Console.WriteLine(item);
}
}
Here is the version using F# Why leave the function programming
#light
open System.Net
open System.Xml
let states = ["AL";"AK";"AS";"AZ";"AR";"CA";"CO";"CT";"DE";"DC";"FM";"FL";"GA";"GU";"HI";"ID";"IL";"IN";"IA";"KS";"KY";"LA";"ME";"MH";"MD";"MA";"MI";"MN";"MS";"MO";"MT";"NE";"NV";"NH";"NJ";"NM";"NY";"NC";"ND";"MP";"OH";"OK";"OR";"PW";"PA";"PR";"RI";"SC";"SD";"TN";"TX";"UT";"VT";"VI";"VA";"WA";"WV";"WI";"WY"]
let wc = new WebClient()
let xDoc = new XmlDocument()
let url = “http://www.usaspending.gov/fpds/fpds.php?datype=X&detail=-1&state={0}”
for state in states do
xDoc.LoadXml(wc.DownloadString(string.Format(url, state)));
let xnode = xDoc.SelectSingleNode(”usaspendingSearchResults/data/record/totals”)
System.Console.WriteLine(@”{0} {1} {2} {3}”,state, xnode.ChildNodes.Item(0).InnerText, xnode.ChildNodes.Item(1).InnerText, xnode.ChildNodes.Item(2).InnerText)
System.Console.Read();