Browsing From The Command Line / Server With cURL

cURL is a tool for browsing the web from the shell or command line. It supports many internet protocols, such as HTTP, FTP, POP3, IMAP, SMTP and more. See the full list here.

With libcurl installed on your system, you can browse from a C program through its C API. You can also install the cURL extension for PHP.

Steps of a cURL Program

  1. Obtain a curl handle
    For example:

    $curl_handle = curl_init();
  2. Set options on the handle.
    Define the behavior of curl when executing the request: the URL, the cookie location, the variables passed to the server, etc.
    In PHP, you can use curl_setopt to set a single option, or curl_setopt_array to pass a PHP array of options.
    For example:

    curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,true);

    will cause curl_exec to return the output as a string; the value ‘false’ will cause cURL to send it to the standard output.

  3. Execute the request.
    $out=curl_exec($curl_handle);
  4. Close the handle
    curl_close($curl_handle);

As long as the handle is open, you can repeat steps 2 and 3 as many times as you need.
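The option-setting step can also be done in one call with curl_setopt_array. The following is a minimal sketch, assuming the cURL extension is loaded; the helper name make_handle and the cookie file path are hypothetical and only serve the illustration. It sets the options but does not send a request.

```php
<?php
// Hypothetical helper: creates a handle and applies several options at once.
function make_handle(string $url, string $cookieFile)
{
    $handle = curl_init();
    // curl_setopt_array returns false as soon as one option cannot be set.
    $ok = curl_setopt_array($handle, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // curl_exec returns the body as a string
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies saved earlier are read from here
    ]);
    if (!$ok) {
        throw new RuntimeException('Could not set the cURL options');
    }
    return $handle;
}

// The file name below is an assumption made for this example.
$handle = make_handle('https://www.example.com', sys_get_temp_dir() . '/cookies-example');
```

With the handle prepared this way, steps 3 and 4 are unchanged: call curl_exec($handle) as often as needed, changing options between calls if necessary.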

Another useful cURL function is curl_getinfo. In the example below, I have used

    $httpCode = curl_getinfo($curl_handle, CURLINFO_HTTP_CODE);

to determine if the login action was successful.

An Example – Sharing a Message in LinkedIn

Publishing a text message in LinkedIn is simple: surf to LinkedIn, find the relevant form in the response and submit it. I found the login form and the publish form by loading the responses into HTML DOM documents. Then I populated the POST variables from the form inputs and sent the request to the URL given in the “action” attribute of the form. The code is run in PHP CLI in Linux (“stty -echo” is a system command that suppresses the echoing of input characters in Linux).

Step I – Surf to LinkedIn

In this step cURL sends a request for the content of linkedin.com. This is the first request, so the user is not logged in and there are no cookies. The option CURLOPT_USERAGENT makes the server believe that the request has been sent from a real browser. The option CURLOPT_COOKIEJAR turns on cookie handling for the handle and names the file to which the cookies are written when the handle is closed; to load cookies saved by an earlier run, also set CURLOPT_COOKIEFILE.

Following is the code:

<?php
error_reporting(E_ERROR | E_PARSE); //Suppress warnings

$curl_handle = curl_init();

// Connect to the site for the first time.
curl_setopt($curl_handle,CURLOPT_URL,"https://www.linkedin.com");
curl_setopt($curl_handle,CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:35.0) Gecko/20100101 Firefox/35.0');
curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,true);
curl_setopt($curl_handle,CURLOPT_COOKIEJAR,'/tmp/cookies');
$out = curl_exec($curl_handle);
if ($out === false){ // curl_exec returns false on failure
  echo "Error: " . curl_error($curl_handle) . "\n";
  die();
}

Step II – Get the Login Form, Populate It, and Submit It

In this step your script will read your e-mail address and password from the standard input (“php://stdin”), then populate the login form and submit it. Using the Firefox extension DOM Inspector, I found that the id of the form element is ‘login’, the username (e-mail) field’s name is “session_key”, and the password field’s name is “session_password”. The script will submit the form with its input fields of type ‘hidden’ and with the entered e-mail and password. If the login is successful, the HTTP status code of the response is 302, which means the browser is redirected to another address.

Following is the code:

$stdin = fopen('php://stdin','r');
echo "Enter e-mail:";
$email = trim(fgets($stdin));
system("stty -echo");
echo "Enter Password:";
$pass = trim(fgets($stdin));
system("stty echo");
echo "\n";
// Get the form inputs.
$doc = new DOMDocument();
$doc->loadHTML($out);

$form = $doc->getElementById('login');
if ($form === null)
  die("Error - login form not found\n");
$inputElements = $form->getElementsByTagName('input');
$length = $inputElements->length;

$inputs = Array();
for ($i=0;$i<$length;$i++){
  $elem=$inputElements->item($i);
  $name = $elem->getAttribute('name');
  if ($name === '')
    continue; // Inputs without a name are not submitted.
  $inputs[$name] = $elem->getAttribute('value');
}
$inputs['session_key']=$email;
$inputs['session_password']=$pass;
$keys = array_keys($inputs);
$postvars = '';

$firstInput=true;
foreach ($keys as $key){
  if (!$firstInput)
    $postvars .= '&';
  $firstInput = false;
  $postvars .= $key . "=" . urlencode($inputs[$key]);
}
$submitUrl = $form->getAttribute('action');

curl_setopt_array($curl_handle, Array(
  CURLOPT_URL=>$submitUrl,
  CURLOPT_POST=>true,
  CURLOPT_POSTFIELDS=>$postvars
));
$out=curl_exec($curl_handle);
$httpCode = curl_getinfo($curl_handle, CURLINFO_HTTP_CODE);

if ($httpCode != 302)
  die("Error - could not connect: $httpCode\n");

Step III – Post the Silly Message

After a successful login, the relevant data is kept in the cookies associated with the cURL handle. This time the script reads the content of the home page with the user logged in. A logged-in user can post status updates. Here the operation is not complete until the “browser” follows the redirect to the new address, so we set the cURL option “CURLOPT_FOLLOWLOCATION” to true. In addition, PHP cURL lets you pass an associative array as the value of the option “CURLOPT_POSTFIELDS”, a more elegant way to send POST data; in that case the data is sent as multipart/form-data rather than as a URL-encoded string.

Following is the code:

// Post the message
curl_setopt($curl_handle, CURLOPT_URL, 'https://www.linkedin.com');
$out = curl_exec($curl_handle);
$doc = new DOMDocument();
$doc->loadHTML($out);
$form=$doc->getElementById('share-form');
$inputElements = $form->getElementsByTagName('input');
$length = $inputElements->length;
$inputs=Array();
for ($i=0;$i<$length;$i++){
  $elem=$inputElements->item($i);
  $name = $elem->getAttribute('name');
  $value = $elem->getAttribute('value');
  $inputs[$name]=$value;
}
$inputs['postText']="Hello! I am a message sent by a PHP script.";
$inputs['postVisibility2']='EVERYONE';

$formAction = $form->getAttribute('action');
// Prefix a relative action with the site address; leave absolute http/https URLs alone.
if (!preg_match('#^https?://#i', $formAction))
  $formAction = 'https://www.linkedin.com' . $formAction;

curl_setopt_array($curl_handle, Array(
  CURLOPT_URL=>$formAction,
  CURLOPT_POST=>true,
  CURLOPT_FOLLOWLOCATION=>true,
  CURLOPT_POSTFIELDS=>$inputs
));
$out = curl_exec($curl_handle);

curl_close($curl_handle);
?>
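The handling of the form’s “action” attribute can be isolated in a small function. This is a rough sketch, not a full URL resolver: the name absolute_action is hypothetical, and an action without a leading slash is simply treated as relative to the site root, which is a simplification.

```php
<?php
// Hypothetical helper: turns a form action into an absolute URL.
function absolute_action(string $action, string $base): string
{
    // Already absolute: http://... or https://...
    if (preg_match('#^https?://#i', $action)) {
        return $action;
    }
    // Protocol-relative: //host/path takes the scheme of the base URL.
    if (strpos($action, '//') === 0) {
        return parse_url($base, PHP_URL_SCHEME) . ':' . $action;
    }
    // Everything else is appended to the site address (simplification).
    return rtrim($base, '/') . '/' . ltrim($action, '/');
}

$base = 'https://www.linkedin.com';
echo absolute_action('/share', $base), "\n";
echo absolute_action('//www.linkedin.com/share', $base), "\n";
echo absolute_action('https://www.linkedin.com/share', $base), "\n";
```

All three calls print the same absolute address, so the result can be used directly as the value of CURLOPT_URL.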

Creating Flash Sites With Ming

You probably know the SWF file format. An SWF file is not just a movie; it can also be an interactive application. SWF files can be created with the Ming PHP extension. You can get information on how to install and use the extension here. The movie format can be extended with a special scripting language named ActionScript. Ming is not well documented, so you can download a little API here; maybe it will help you. There are also classes for creating GUI objects, such as buttons, text fields, etc.

Let’s Discuss Some Classes

SWFMovie – the main class for movies, used for creating movies and writing them to output streams.

Useful functions:

  • The constructor of course.
  • add – to add various objects, such as SWFAction scripts, sprites, shapes, buttons, text, etc.
  • save – to save your work to a file.
  • output – to send the output to the browser. Before you send it, define the MIME type using
    header('Content-type: application/x-shockwave-flash');

Notes:

  • Define the SWF version before you play the movie, or you will not be able to view the clip. Here’s a user-contributed example of a way to determine the version and find more useful details. Set the version with ‘ming_useswfversion’.
  • Use scaling to avoid movies showing at strange sizes and strange screen locations. We’ll discuss it later.
  • If you have created a movie from another movie, the new movie must have access to the original movie.

SWFAction – a class used for creating scripts. The scripts can add functionality to the movie and make it interactive. Its only function is the constructor, which takes the script as its argument. You can use scripts for adding text fields (including input text fields), communicating with other sites (using the LoadVars class, for example), jumping to other frames, defining events, etc. Add the action to your movie clips with the function ‘add’. Read more here.

An example of scaling with this class is:

  Stage.scaleMode='noScale';

Note: Error messages are sent to the log only for syntax errors.

SWFShape – used for creating shapes. A shape can define the outline of a button (which need not be rectangular), and it can also be added to movie clips. With this class you can draw lines, arcs, and quadratic and cubic Bézier curves. You can fill your shape with colors, gradients or bitmaps. If your fill is an image, you can create an object of class SWFFill using ‘addFill ( SWFBitmap $bitmap [, int $flags ] )’. Then you can fill your shape using ‘setRightFill’ or ‘setLeftFill’, passing your fill as the argument.

SWFFill – This class does not have a constructor. An instance of this class is created by the function ‘addFill’ of class SWFShape. It is important to move the fill to the exact location using the function ‘moveTo’ and to scale it using ‘scaleTo’. If you want to use an image at its original dimensions, you will probably have to scale it by 20 in each direction, ‘$fill->scaleTo(20,20)’, because there are 20 twips in a pixel, both horizontally and vertically. In addition, the fill can be rotated and/or skewed.

SWFButton – A button is a GUI element that triggers an action when clicked. You can add actions, sounds and shapes using the functions addAction/setAction, addSound and addShape respectively. You should add a shape to define the shape and location of the button. The prototype of addShape is ‘void addShape ( SWFShape $shape , int $flags )’. The flags are a combination (using bitwise or) of SWFBUTTON_UP, SWFBUTTON_OVER, SWFBUTTON_DOWN and SWFBUTTON_HIT. These flags define the button states in which the shape is used.

An Example

This is an example of a script that plays a movie backwards:

$x = new SWFMovie();
.
.
.

$actionText = <<<'EOT'

this.createEmptyMovieClip("mc",2);

mc.loadMovie("selfie.swf", "GET");
this.gotoAndStop(mc._totalframes - 1);
this.createTextField("myText", this.getNextHighestDepth(), 0, 0, 200,220);
var tf:TextFormat = new TextFormat();
tf.color = 0x0;
tf.size = 30;
tf.font = "Arial";
myText.setTextFormat(tf);
this.addChild(myText);

this.onEnterFrame=function(){
  if (mc._currentFrame <= 1){
    mc.gotoAndPlay(mc._totalframes - 1);
  }
  mc.prevFrame();
}; 

EOT;
$act=new SWFAction($actionText);
//$x->add($text);
$x->add($act);


Adding Java Classes to Rhino JavaScript

In the post LibreOffice Javascript, I wrote about Rhino Javascript, a Javascript interpreter written in Java and developed by Mozilla. With this tool you can make Java classes available through the Javascript commands ‘importClass’ and ‘importPackage’, and then instantiate and use them.

How do you add classes and packages that you can load?

This is not too hard: the ‘rhino’ command is a shell script. In Linux, you can find it using the command:

which rhino

You’ll see the response:

/usr/bin/rhino

Then you can look into the file with a text editor or the ‘more’ command, and see that it is a script that runs a Java class. The jar files and classes that a Java program uses are found through the environment variable ‘CLASSPATH’ or listed after the option ‘-classpath’. In this script you’ll find that the class path is the content of the variable ‘JAVA_CLASSPATH’.

I decided to add the ‘Tidy’ package. On my computer the path of this package is ‘/usr/share/maven-repo/net/sf/jtidy/jtidy/debian/jtidy-debian.jar’, so the script looks like:

#!/bin/sh

JAVA_CMD="/usr/bin/java"
JAVA_OPTS=""
JAVA_CLASSPATH="/usr/share/java/js.jar:/usr/share/java/jline.jar:/usr/share/maven-repo/net/sf/jtidy/jtidy/debian/jtidy-debian.jar"
JAVA_MAIN="org.mozilla.javascript.tools.shell.Main"
export LD_LIBRARY_PATH=/home/amity/myWs
##
## Remove bootclasspath overriding for OpenJDK since
## it now use a mangled version of Rhino (in sun.org.mozilla.rhino package)
##
## References:
## <https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/255149>
## <http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=179>
## <http://www.openoffice.org/issues/show_bug.cgi?id=91641>
##

$JAVA_CMD $JAVA_OPTS -classpath $JAVA_CLASSPATH $JAVA_MAIN "$@"

BTW, in Windows, environment variables are enclosed in ‘%’ signs. For example: ‘%JAVA_CLASSPATH%’.

LibreOffice – Scripting Your Editor

I think using a spreadsheet document for Sudoku and Kakuro puzzles is great because those documents contain cells where you can place your content: numeric value and text. As you probably know, Kakuro puzzles have squares split by diagonal lines into triangles.

A kakuro puzzle copied to a spreadsheet document

The cells split into triangles are not provided by the office suite and should be created by the user. The best way to create them is with a macro. The macro can be written in any language that accesses UNO (Universal Network Objects) components, for example: Javascript, Java, BeanShell, Python and Basic.
To learn how to write a HelloWorld script in each language, click here.

Files And Directories

A macro needs an entry point. If your macro is written in Java, it must have a function with a parameter of type ‘XScriptContext’. The path to the function should be found in a file named ‘parcel-descriptor.xml’. The path includes the package name, the class name, and the function name in the format ‘<package-name>.<class-name>.<function-name>’, for example:

“hello.HelloWorld.printHW”.

The parcel-descriptor should also contain the location of the jar file (a zipped directory containing Java classes).

The parcel descriptor is located in ‘<path>/Scripts/java/<Script Dir>/’.

Path may be one of:

  • a user path, such as ‘${HOME}/.config/libreoffice/3/user/’ – for a specific user.
  • a LibreOffice path, such as ‘/usr/lib/libreoffice/share/’ – for all LibreOffice users.

Example:

The function ‘printHW’ is in the class ‘HelloWorld’ in the package ‘hello’. The package is stored in “${HOME}/.config/libreoffice/3/user/Scripts/java/HelloWorld1/HelloWorld1.jar”.

The file “${HOME}/.config/libreoffice/3/user/Scripts/java/HelloWorld1/parcel-descriptor.xml” will look like:

<parcel language="Java">
   <script language="Java">
      <locale lang="en">
         <displayname value="HelloWorld1"/>
         <description>Prints "Hello World".</description>
      </locale>
      <functionname value="hello.HelloWorld.printHW"/>
      <logicalname value="HelloWorld.printHW"/>
      <languagedepprops>
         <prop name="classpath" value="HelloWorld1.jar"/>
      </languagedepprops>
   </script>
</parcel>

Library Files for the Class Path

Your macro will access classes found in JAR files. Some of the jar files can be found in the Java directory (on my Ubuntu 12.04, it is ‘/usr/share/java’) and some in the ‘libreoffice/program/classes’ directory (‘/usr/lib/libreoffice/program/classes’).

The files are:

  • ridl.jar – in ‘/usr/share/java’
  • unoil.jar – in ‘/usr/lib/libreoffice/program/classes’
  • unoloader.jar – in ‘/usr/share/java’
  • jurt.jar – in ‘/usr/share/java’
  • juh.jar – in ‘/usr/share/java’

Data Types

There are 4 kinds of data types in UNO:

  • Simple and primitive data types, with equivalents in Java, described here.
  • Structures – objects with public attributes, described here.
  • Interfaces containing the functions to be used by the programmer, described here.
  • Services, which are everything the module (Draw, Writer, etc.) provides to the user; a spreadsheet cell, for example.

A service implements interfaces and other services. To call a service’s functions, get the relevant interface using ‘UnoRuntime.queryInterface(InterfaceClass, object)’.

To instantiate (or create a Java object from) a service, use the function ‘createInstance’ of the MultiServiceFactory or MultiComponentFactory.

The next post will describe an example macro in Java: the Kakuro Cell macro.