Daniel A. Shockley - POSTing web forms

Here is a recap of some information I posted at the AppleScript BBS at MacScripter.net. One poster wanted to take information from a FileMaker database, do some kind of web search based on that, and return the source to a FileMaker field. This is a somewhat simplistic description of what happens, but it should get you started. I've also got a useful URL Access handler that uses curl if you are on OS X.

An overview of POSTing web forms from FileMaker

First, we'll show a simple case, using the GET method (the search terms data is simply included in the URL), then move on to POST.

If you wanted to search Yahoo using the contents of a FileMaker field called WebSearch, you'd do the following (you don't need the tell and end tell lines if the script runs inside a FileMaker Perform AppleScript step):

tell app "FileMaker Pro"
  set searchBase to "http://search.yahoo.com/bin/search?p=" 
  set theSearchTerms to cell "WebSearch" of current record of document 1 
  set theURL to searchBase & theSearchTerms 

  -- get the source as a string 
  set sourceHTML to (do shell script "curl " & quoted form of theURL & " | vis")
  set cell "Source" of current record of document 1 to sourceHTML
end tell 

A few important points:

  1. You need to find the base URL you need.
  2. Some sites don't allow this (although it is hard to stop). Google recently made it difficult to download search results directly, without a web browser. There may be a way around that, since curl can mimic web browsers in sophisticated ways, but you would be violating their "terms of service", and those are Google's prerogative.
  3. You may need to encode multiple search terms in URL format (e.g. spaces become %20), although it often will work without this.
  4. The search engine you wish to use may use the POST method rather than GET (with GET, the search terms are in the URL; with POST, they are sent separately from the URL). If the engine uses POST, you'll have to put your search terms in an option to curl: --data (read the curl man page for more detail, and look up "form post method" on the web).
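
For point 3, here is a minimal sketch of URL-encoding a search term in plain shell before handing it to curl. The function name urlencode is my own, not part of curl or FileMaker, and I have only reasoned it through for plain ASCII input; how characters outside ASCII come out may vary from shell to shell.

```shell
# urlencode (hypothetical helper name): percent-encode a string so it can be
# placed in a URL query or in a --data string.
# Letters, digits, and - _ . ~ pass through; everything else becomes %XX.
# The body runs in a subshell, so its variables do not leak to the caller.
urlencode() (
  LC_ALL=C                     # work byte by byte where the shell honors this
  input=$1
  output=
  while [ -n "$input" ]; do
    rest=${input#?}            # everything after the first character
    char=${input%"$rest"}      # the first character on its own
    case $char in
      [a-zA-Z0-9._~-]) output=$output$char ;;
      *) output=$output$(printf '%%%02X' "'$char") ;;
    esac
    input=$rest
  done
  printf '%s\n' "$output"
)

# Usage (not run here, since it needs a network connection):
#   curl "http://search.yahoo.com/bin/search?p=$(urlencode 'Neal Stephenson')"
# urlencode 'Neal Stephenson' prints Neal%20Stephenson
```

From AppleScript you could run the same thing through do shell script, but that part is left out of the sketch.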

How do I figure out the search URL I need for a form that uses POST?

This case is more difficult, as it requires the POST method, not just GET (which means you may want to avoid URL Access Scripting on OS X, since URLAS's POST was broken in several versions of 10.1 and 10.2). Here's an overview, with amazon.com as an example, since it uses POST:

HTTP is just a simple way for a computer to say "Hey, give me some information back", as I'm sure you know. One method is to just send a URL to a server, and it then sends a bunch of ASCII (or binary) back to your computer, which is waiting for the response. Many forms work this way. The HTML source code of the page specifies that the form uses the method GET, which means that your web browser will take the info you typed into the fields (and any hidden INPUT fields in the source code), and tack that onto the end of the ACTION URL for that form (ACTION is also specified in the HTML source code of the page). Then, the server gets all the fields you filled out (plus those hidden ones), determines what you want, and sends back a page (or other data in some cases).

However, sometimes there is just too much information to fit in a URL (or the website doesn't want data sent in URLs). So, there is the POST method, which means that your request includes the fields as a separate part of your request to the web server.

The site you're trying to access uses POST. Take a look at the source code of the page where your form is. Find the form tag you want (there can be more than one). See what the ACTION parameter equals (this may be a relative URL, which means you need to resolve it against the URL of the page you are on). Also, see if the method is GET or POST. If it is POST, as in our example, you'll need to attach the data separately from the URL. The command-line utility curl can do this using the --data option. URL Access Scripting is SUPPOSED to be able to attach form POST data, but that is broken in some versions of OS X (it works in OS 9). That is the main reason I've given up on URLAS. A second reason is that curl is much faster and has more options, such as returning the result directly as a string, rather than requiring you to save to a file.

Look for the <INPUT> tags to get the fields' names. You may also need <SELECT> tags, if a popup is used, or <TEXTAREA> tags. Most will be <INPUT> tags, with a type of hidden, text, radio, or checkbox. The --data option needs a string (with single quotes around the string) that looks like this: 'SOMEFIELDNAME=SOMETHING&ANOTHERNAME=SomethingElse' and so on.
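
If you'd rather not read through the page source by eye, a rough way to pull out just the tags mentioned above is to filter the HTML through grep. This is only a sketch: the function name list_form_tags is made up for illustration, it assumes your grep supports the -o option (the GNU and BSD versions do), and because grep works line by line it will miss any tag that is split across several lines.

```shell
# list_form_tags (hypothetical helper name): read HTML on standard input and
# print each opening <form>, <input>, <select>, and <textarea> tag on its own
# line, in any mix of upper and lower case.
list_form_tags() {
  grep -i -o -E '<(form|input|select|textarea)([[:space:]][^>]*)?>'
}

# Usage (not run here, since it needs a network connection):
#   curl --silent 'http://www.SOMESITE.com/search_page.html' | list_form_tags
```

The output shows the ACTION and METHOD of each form, followed by the names of its fields, which is what you need to build the --data string.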

So, your command to curl may look like this:

curl 'http://www.SOMESITE.com/their_cgi' --data \
'firstname=Dan&lastname=Shockley' 
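
When there are more than a couple of fields, it can be easier to assemble the --data string from name/value pairs than to type it by hand. Here is a small sketch; build_form_data is a name I made up, and it assumes the values are already URL-encoded where they need to be.

```shell
# build_form_data (hypothetical helper name): join alternating NAME VALUE
# arguments into a NAME=VALUE&NAME=VALUE string suitable for curl --data.
# Values are used as given; this sketch does not URL-encode them.
build_form_data() (
  data=
  while [ "$#" -ge 2 ]; do
    # Put an & in front of every pair except the first.
    data="${data:+$data&}$1=$2"
    shift 2
  done
  printf '%s\n' "$data"
)

# Usage (not run here, since it needs a network connection):
#   curl 'http://www.SOMESITE.com/their_cgi' \
#     --data "$(build_form_data firstname Dan lastname Shockley)"
# build_form_data firstname Dan lastname Shockley
#   prints firstname=Dan&lastname=Shockley
```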

So, for Amazon.com, to search for Stephenson in All Products, you'd use (notice I did not include the big numeric part in the URL, since that was specific to my visit):

curl 'http://www.amazon.com/exec/obidos/search-handle-form/' \
--data 'url=index=aps&field-keywords=Stephenson' 

Now, some caveats. First, you need to make sure you specify all the required hidden fields. Second, you may get back a response saying that the info you want is at another URL (an HTTP redirect). This can happen when the web server gets your request and, instead of feeding the results back immediately, caches them in a local file that you can retrieve with a simple GET request. When the server responds this way, your web browser just goes there, so it looks to you as if it went there directly. What really happened is that the web browser was smart enough to say "Oh, OK, the web server is telling me that what I want is somewhere else, so I'll just ask for that now instead" and you see the desired information. To tell curl to mimic this behavior, you need to add the --location option. So now your curl command would look like this (Amazon.com does not need this):

curl 'http://www.SOMESITE.com/their_cgi' --data \
'firstname=Dan&lastname=Shockley' --location

Now you'll get the information you want.
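
If you want to see for yourself whether a server is sending you somewhere else, you can ask curl to include the response headers (the --include option) and look for a Location header. The sketch below does that filtering; redirect_target is a hypothetical helper name, and it only looks at the first block of headers in the response.

```shell
# redirect_target (hypothetical helper name): read an HTTP response (headers
# followed by body) on standard input and print the value of the Location
# header, if the header block contains one.
redirect_target() {
  # Strip carriage returns, stop at the first blank line (the end of the
  # headers), and print whatever follows "Location:" on a matching line.
  tr -d '\r' |
    sed -n -e '/^$/q' -e 's/^[Ll][Oo][Cc][Aa][Tt][Ii][Oo][Nn]:[[:space:]]*//p'
}

# Usage (not run here, since it needs a network connection):
#   curl --silent --include 'http://www.SOMESITE.com/their_cgi' \
#     --data 'firstname=Dan&lastname=Shockley' | redirect_target
```

If this prints a URL, the server answered with a redirect, and adding --location tells curl to fetch that URL for you.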

Oh, I should mention that to make it much easier to see what happens when you post a form in your web browser, you can get the utility tcpflow at http://www.circlemud.org/~jelson/software/tcpflow/

To watch your Ethernet connection, use: sudo tcpflow -c -i en0
(you'll need your Admin password)

To watch your AirPort connection, use: sudo tcpflow -c -i en1
(you'll need your Admin password)

