Last month, in “Start Harvesting with SOAP and REST Web Services,” I showed you how Web services offer a method to get data from Web servers without the need for screen-scraping, and that the services fall into two camps: the older SOAP-type and the newer, faster-growing RESTful Web services. This month, we’ll get our hands dirty by grabbing some real-time data from a government Web service.
Querying the U.S. Geological Survey’s (USGS’s) databases to find the current water height (in feet) of Florida’s Rainbow River is no harder than clicking a URL (http://waterservices.usgs.gov/nwis/iv/?sites=02313098&parameterCd=00065). Your browser will then show you a page of dense XML containing one little element with our desired data:
<ns1:value qualifiers="P" dateTime="2015-10-12T16:45:00.000-04:00">6.23</ns1:value>
In English, that means that the river’s water was last measured at 4:45 p.m. local time (the -04:00 offset is Eastern Daylight Time), and its height at that moment was 6.23 feet.
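If you'd like PowerShell to do that reading for you, here is a minimal sketch that turns the sample element into a number and a timestamp. The xmlns:ns1 declaration and its URI are placeholders I've added only so the fragment is well-formed on its own; the real response declares its own namespace.

```powershell
# The <ns1:value> element from the text, plus a placeholder namespace
# declaration (an assumption) so the fragment parses by itself.
$fragment = '<ns1:value xmlns:ns1="urn:example:placeholder" qualifiers="P" dateTime="2015-10-12T16:45:00.000-04:00">6.23</ns1:value>'

$element = ([xml]$fragment).DocumentElement

# Parse with the invariant culture so "6.23" reads the same on any machine.
$culture = [System.Globalization.CultureInfo]::InvariantCulture
$height  = [double]::Parse($element.InnerText, $culture)
$stamp   = [System.DateTimeOffset]::Parse($element.GetAttribute('dateTime'), $culture)

# DateTimeOffset keeps the -04:00 offset instead of converting to your local zone.
"{0} feet at {1:HH:mm} (UTC offset {2})" -f $height, $stamp, $stamp.Offset
```

Using DateTimeOffset rather than DateTime is a deliberate choice here: it preserves the offset the sensor reported, so the time doesn't silently shift to whatever zone your own machine is in.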
As a first example of useful data from a Web service, this is pretty good, but it leads to three practical questions: How did I create that URL? How best can I capture that with Invoke-WebRequest? And how can we tell PowerShell to extract 6.23 from all that XML?
Building RESTful URLs
To build that long URL, I just searched the USGS site for “RESTful services” and located a page (http://waterservices.usgs.gov/rest/) that does a terrific job of explaining the kinds of things that their service offers. Not only that, but it provides a tool that prompts you to choose menu items, after which it will generate a working URL for their RESTful service and your particular need. That’s great—easy documentation yielding access to useful information! Understand, however, that some Web services have more complex interfaces (like ones that need authentication, as we’ll see in future columns), and others are just plain poorly documented.
Notice that the entire request has just two parts. First, there’s a URL pointing to a Web address (http://waterservices.usgs.gov/nwis/iv/), and then there’s a string (?sites=02313098&parameterCd=00065). If you’re Web-tech-savvy, you already know that this is a GET-type HTTP request, and that the string contains two name/value pairs: sites=02313098 (identifying which sensor site to query) and parameterCd=00065 (which in this case means “Tell me how deep the river is” rather than “What is its temperature?” or “How quickly is it flowing?” or the like). The site gave me the codes, and I didn’t mess up on the leading lowercase p. (Remember that XML is case-sensitive.)
If you’re a little rusty on passing parameters in a URL, you can see that the first parameter’s name/value pair is prefixed by a question mark (?), and any following parameters are prefixed with an ampersand (&). If there were a third name/value pair, like format=XML, the URL would look like http://waterservices.usgs.gov/nwis/iv/?sites=02313098&parameterCd=00065&format=XML.
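If you'd rather not glue those strings together by hand, here is one way you might assemble the same kind of URL in PowerShell. It's only a sketch: the $base, $params and $pairs names are mine, and format=XML is the hypothetical third parameter from the paragraph above, not something I've checked against the service.

```powershell
# Base address and the two real parameters come from the text;
# format=XML is the hypothetical third pair used for illustration.
$base   = 'http://waterservices.usgs.gov/nwis/iv/'
$params = [ordered]@{
    sites       = '02313098'
    parameterCd = '00065'
    format      = 'XML'
}

# URL-encode each name and value, then join the pairs with ampersands.
$pairs = foreach ($name in $params.Keys) {
    '{0}={1}' -f [uri]::EscapeDataString($name), [uri]::EscapeDataString($params[$name])
}

# One question mark starts the query string; ampersands separate the rest.
$URI = $base + '?' + ($pairs -join '&')
$URI
```

The [ordered] hashtable keeps the parameters in the order you typed them, and EscapeDataString protects you if a value ever contains a space or an ampersand of its own.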
Capture the Web Service Output to a Variable with Invoke-WebRequest
Now, time to put down the browser and pick up the PowerShell! We’ve used Invoke-WebRequest before, so the syntax will be pretty familiar. I’d do the Web service request in two statements:
$URI = "http://waterservices.usgs.gov/nwis/iv/?sites=02313098&parameterCd=00065"
[string]$MyResult=(Invoke-WebRequest $URI)
Notice one new thing I did: I prefixed the $MyResult variable name with [string]. This is a technique called casting—something I’ve done before. PowerShell is smart, and it recognizes that the incoming data looks like XML, so by default it would automatically spend time parsing it into an XML object. There’s no need for it to do that, however, so I cast it to force it to store the data as a string, which takes less time and is a bit more copacetic with the next cmdlet we’ll be using.
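Web services do go down now and then, so you may want to wrap the request in a little error handling. What follows is a rough sketch, not a finished tool: Get-ServiceText is a name I made up, and the function simply reuses the [string] cast from above inside a try/catch.

```powershell
# Get-ServiceText is a hypothetical helper name, not a built-in cmdlet.
function Get-ServiceText {
    param(
        [Parameter(Mandatory)]
        [string]$Uri
    )
    try {
        # -ErrorAction Stop makes HTTP and network failures catchable.
        # -UseBasicParsing skips the Internet Explorer-based HTML parser
        # in Windows PowerShell; newer versions accept and ignore it.
        [string](Invoke-WebRequest -Uri $Uri -UseBasicParsing -ErrorAction Stop)
    }
    catch {
        # Report the problem and return nothing rather than half a document.
        Write-Warning "Request to $Uri failed: $($_.Exception.Message)"
    }
}

# Example call (needs network access), shown as a comment:
# $MyResult = Get-ServiceText -Uri 'http://waterservices.usgs.gov/nwis/iv/?sites=02313098&parameterCd=00065'
```

If the call fails, $MyResult ends up empty, so a quick "if ($MyResult)" check before the XML step saves you from a confusing error later.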
Extract the Data with XPath and Select-XML
A look at $MyResult will show some messy XML, so I always dump it into XML Explorer (a terrific free tool) to get it more readably formatted. Next, we need an easy way to tell PowerShell to pluck out just that one number (I see 6.23, but you’ll probably see a different value). But how? By using a query language designed just for XML called XPath and PowerShell’s XPath query cmdlet, Select-XML.
XPath is useful because in this case (and in many Web service cases) we’ve got some complex XML output and we want to get our data nugget out with a minimum of effort. The key is that we know the name of the element containing our data: value, prefixed in this case with a namespace named ns1. This PowerShell statement will use XPath to find an element value regardless of its namespace, and then extract the text in the element, which is our number:
$WaterHeight = (Select-XML -content $MyResult -XPath "//*[local-name()='value']").node.'#text'
I’ll return to XPath another day, but now, just know that all you’ve got to do to use (or repurpose) this statement is to find the element name and change value above to that name, so if your next Web service were to put your desired result in an element named OutVal, you’d use this statement:
$Webdata = (select-xml -content $MyResult -xpath "//*[local-name()='OutVal']").node.'#text'
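If you want to experiment with that XPath pattern without hitting a live service, here is a small offline sketch. The $sample XML is a stand-in I made up: only the value element mirrors the real output, while the wrapper element names and the namespace URI are placeholders, not the actual USGS schema.

```powershell
# A tiny stand-in for a service response. Everything except the
# <ns1:value> element itself is a placeholder (an assumption).
$sample = @'
<ns1:timeSeries xmlns:ns1="urn:example:placeholder">
  <ns1:values>
    <ns1:value qualifiers="P" dateTime="2015-10-12T16:45:00.000-04:00">6.23</ns1:value>
  </ns1:values>
</ns1:timeSeries>
'@

# local-name() matches on the element name and ignores the namespace prefix.
$match = Select-Xml -Content $sample -XPath "//*[local-name()='value']"

$WaterHeight = $match.Node.'#text'
$MeasuredAt  = $match.Node.GetAttribute('dateTime')
"$WaterHeight feet, measured at $MeasuredAt"

# Alternative: if you know the namespace URI, map your own prefix to it.
$viaNamespace = Select-Xml -Content $sample -XPath '//n:value' -Namespace @{ n = 'urn:example:placeholder' }
$viaNamespace.Node.'#text'
```

The local-name() form is handy when you don't want to look up the namespace URI; the -Namespace form is stricter, because it won't accidentally match a same-named element from some other namespace.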
Try that out on a few Web services, and next month, we’ll dig deeper!