Browsing From The Command Line / Server With cURL

cURL is a tool for browsing the web from the shell or command line. It supports many internet protocols, such as HTTP, FTP, POP3, IMAP, SMTP and more. The full list is in the cURL documentation.

With libcurl installed on your system, you can browse from a C program through its C API. You can also install the cURL extension for PHP.

Steps of a cURL Program

  1. Obtain a curl handle
    For example:

    $curl_handle = curl_init();
  2. Set options on the handle.
    Define the behavior of curl when executing the request: the URL, the cookie location, the variables passed to the server, etc.
    In PHP, you can use curl_setopt to set a single option, or curl_setopt_array to pass a PHP array of options.
    For example:

    curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,true);

    will cause curl_exec to return the output as a string; the value false will cause cURL to send it to the standard output.

  3. Execute the request.
    $out=curl_exec($curl_handle);
  4. Close the handle
    curl_close($curl_handle);

As long as the handle is open, you can repeat steps 2 and 3 as many times as you need.

Another useful cURL function is curl_getinfo. In the example below, I have used

    $httpCode = curl_getinfo($curl_handle, CURLINFO_HTTP_CODE);

to determine if the login action was successful.

An Example – Sharing a Message in LinkedIn

Publishing a text message on LinkedIn is simple: surf to LinkedIn, find the relevant form in the response, and submit it. I found the login form and the publish form by parsing the returned HTML with PHP's DOMDocument class, then populated the POST variables according to the form fields and sent them to the URL given in the form's “action” attribute. The code runs in PHP CLI on Linux (“stty -echo” is a shell command, invoked here through system(), that suppresses the echoing of input characters in the terminal).

Step I – Surf to LinkedIn

In this step cURL sends a request for the content of linkedin.com. This is the first request, so the user is not logged in and there are no cookies. The option CURLOPT_USERAGENT makes the request look as if it was sent from a real browser. The option CURLOPT_COOKIEJAR names the file in which cookies are saved when the handle is closed; setting it also turns on cookie handling for the session, so cookies received in one request are sent with the following ones. (To load cookies from an existing file, use CURLOPT_COOKIEFILE.)

Following is the code:

<?php
error_reporting(E_ERROR | E_PARSE); //Suppress warnings

$curl_handle = curl_init();

// Connect to the site for the first time.
curl_setopt($curl_handle,CURLOPT_URL,"https://www.linkedin.com");
curl_setopt($curl_handle,CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:35.0) Gecko/20100101 Firefox/35.0');
curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,true);
curl_setopt($curl_handle,CURLOPT_COOKIEJAR,'/tmp/cookies');
$out = curl_exec($curl_handle);
if (!$out){
  echo "Error: " . curl_error($curl_handle) . "\n";
  die();
}

Step II – Get the Login Form, Populate It, and Submit It

In this step your script reads your e-mail address and password from the standard input (“php://stdin”), then populates the login form and submits it. Using the Firefox extension DOM Inspector, I found that the id of the form element is ‘login’, the name of the username (e-mail) field is “session_key”, and the name of the password field is “session_password”. The script submits the form with its input fields of type ‘hidden’ and with the entered e-mail and password. If the login is successful, the HTTP code returned in the header is 302, a redirect: the server sends the browser on to another address.

Following is the code:

$stdin = fopen('php://stdin','r');
echo "Enter e-mail:";
$email = trim(fgets($stdin));
system("stty -echo");
echo "Enter Password:";
$pass = trim(fgets($stdin));
system("stty echo");
echo "\n";
// Get the form inputs.
$doc = new DOMDocument();
$doc->loadHTML($out);

$form = $doc->getElementById('login');
$inputElements = $form->getElementsByTagName('input');
$length = $inputElements->length;

$inputs = Array();
for ($i=0;$i<$length;$i++){
  $elem=$inputElements->item($i);
  $name = $elem->getAttribute('name');
  $value = $elem->getAttribute('value');
  $inputs[$name]=$value;
}
$inputs['session_key']=$email;
$inputs['session_password']=$pass;
// Build the URL-encoded request body; http_build_query encodes both names and values.
$postvars = http_build_query($inputs);
$submitUrl = $form->getAttribute('action');

curl_setopt_array($curl_handle, Array(
  CURLOPT_URL=>$submitUrl,
  CURLOPT_POST=>true,
  CURLOPT_POSTFIELDS=>$postvars
));
$out=curl_exec($curl_handle);
$httpCode = curl_getinfo($curl_handle, CURLINFO_HTTP_CODE);

if ($httpCode != 302)
  die("Error - could not connect: $httpCode\n");

Step III – Post the Silly Message

After a successful login, the session cookies are held by the cURL handle (and written to the cookie jar when the handle is closed). This time the script reads the content of the home page with the user logged in. A logged-in user can post status updates. Here the operation is not complete until the “browser” follows the redirect to the new address, so we set the cURL option “CURLOPT_FOLLOWLOCATION” to true. In addition, PHP cURL lets you pass an associative array as the value of the option “CURLOPT_POSTFIELDS”, a more elegant way to send POST data. Note that an array is sent as multipart/form-data, whereas a URL-encoded string is sent as application/x-www-form-urlencoded.

Following is the code:

// Post the message
curl_setopt($curl_handle, CURLOPT_URL, 'https://www.linkedin.com');
$out = curl_exec($curl_handle);
$doc = new DOMDocument();
$doc->loadHTML($out);
$form=$doc->getElementById('share-form');
$inputElements = $form->getElementsByTagName('input');
$length = $inputElements->length;
$inputs=Array();
for ($i=0;$i<$length;$i++){
  $elem=$inputElements->item($i);
  $name = $elem->getAttribute('name');
  $value = $elem->getAttribute('value');
  $inputs[$name]=$value;
}
$inputs['postText']="Hello! I am a message sent by a PHP script.";
$inputs['postVisibility2']='EVERYONE';

$formAction = $form->getAttribute('action');
// Prefix the site address only when the action is a relative URL.
if (!preg_match('#^https?://#i', $formAction))
  $formAction = 'https://www.linkedin.com' . $formAction;

curl_setopt_array($curl_handle, Array(
  CURLOPT_URL=>$formAction,
  CURLOPT_POST=>true,
  CURLOPT_FOLLOWLOCATION=>true,
  CURLOPT_POSTFIELDS=>$inputs
));
$out = curl_exec($curl_handle);

curl_close($curl_handle);
?>

Communication Between Backbone.js And PHP

Backbone.js is a JavaScript framework of the kind known as MV* (Model, View and stuff). It provides the Backbone.sync function to perform CRUD (Create, Read, Update, Delete) operations. For a model object to perform these operations, it should have a property named ‘url’. The ‘sync’ function receives 3 arguments: method, model and options. Model objects also have functions that call ‘sync’ for you: ‘fetch’, ‘save’ and ‘destroy’. The ‘save’ function handles both Create and Update: if the model already has a value for its id attribute (the attribute named by the ‘idAttribute’ property, ‘id’ by default), ‘save’ sends an update request; otherwise it sends a create request. The content type of the request is “application/json”.
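
The create-or-update decision can be sketched in plain JavaScript, without loading Backbone. The helper names isNew and saveMethod below are made up for this illustration; they only mimic the rule described above and are not Backbone's source code.

```javascript
// Sketch only: mimics how save() chooses between 'create' and 'update'.
// isNew and saveMethod are illustrative helpers working on a plain attributes object.
function isNew(attributes, idAttribute) {
  var key = idAttribute || 'id';   // 'id' is the default id attribute
  return attributes[key] === undefined || attributes[key] === null;
}

function saveMethod(attributes, idAttribute) {
  return isNew(attributes, idAttribute) ? 'create' : 'update';
}

console.log(saveMethod({ title: 'Draft' }));         // no id yet
console.log(saveMethod({ id: 7, title: 'Draft' }));  // id present
console.log(saveMethod({ _id: 'a1' }, '_id'));       // custom id attribute
```

The first call has no id and is treated as a create; the other two carry an id value and are treated as updates.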

How Does The Server Know What Request Type It Receives?

The client-side developer specifies only one ‘url’ per model. The server-side program can determine the request type by checking the value of $_SERVER['REQUEST_METHOD']:

  • ‘GET’ – a read (or fetch) request.
  • ‘POST’ – a create (save a new model) request.
  • ‘PUT’ – an update (save an existing model’s details) request.
  • ‘DELETE’ – a delete (destroy) request.
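
Seen from the client, the four request types above come from a fixed table that translates the CRUD method name into an HTTP verb. The sketch below writes that table out as a plain object; METHOD_TO_VERB and verbFor are names chosen for this example only.

```javascript
// The CRUD method names used by sync, and the HTTP verb the server sees for each.
var METHOD_TO_VERB = {
  create: 'POST',
  read: 'GET',
  update: 'PUT',
  'delete': 'DELETE'
};

// verbFor is an illustrative helper, not a Backbone function.
function verbFor(method) {
  if (!Object.prototype.hasOwnProperty.call(METHOD_TO_VERB, method)) {
    throw new Error('Unknown sync method: ' + method);
  }
  return METHOD_TO_VERB[method];
}

console.log(verbFor('read'));
console.log(verbFor('update'));
```

An unknown name such as 'GET' is rejected here on purpose: the method argument is a CRUD name, not an HTTP verb.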

What To Send and How To Read?

Backbone.sync uses AJAX to send requests to the server. You must pass the method and the model. You had better not skip the third argument, options, which should contain at least two members:

  • ‘success’ – a function called when the server processes the request successfully.
  • ‘error’ – a function called when the server fails to process the request.

For example:

// The method is a CRUD name ('create', 'read', 'update' or 'delete'), not an HTTP verb.
Backbone.sync('read', myModel, {
  success: function(resp){
    // process the response
  },
  error: function(resp){
    // Recover or print an error message.
  }
});

If you call Backbone.sync from an object, don’t use ‘this’ inside the success and error functions as a reference to your object: inside the callbacks, ‘this’ no longer points to it.

For example, if you call ‘sync’ from a view, write ‘var myView = this;’ before your call to the ‘sync’ function, and use ‘myView’ inside the callbacks.
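
The effect can be reproduced without Backbone. In the sketch below, fakeSync is a stand-in written for this example: like the real ‘sync’, it invokes the success callback with a ‘this’ that is not the calling object. The view object and its fields are also hypothetical.

```javascript
// fakeSync stands in for Backbone.sync: it calls options.success
// with a `this` that is NOT the object that started the request.
function fakeSync(method, model, options) {
  options.success.call({}, { status: 'ok' });
}

var view = {
  lastStatus: null,
  load: function () {
    var myView = this;                    // keep a reference to the view
    fakeSync('read', {}, {
      success: function (resp) {
        // `this` is the empty object passed by fakeSync, so use myView instead.
        myView.lastStatus = resp.status;
      }
    });
  }
};

view.load();
console.log(view.lastStatus);
```

Writing this.lastStatus inside the callback would set the field on the wrong object and leave the view unchanged.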

Read Requests

If you send a ‘fetch’ request, specify ‘data’ in the ‘options’ argument, for example:

myModel.fetch({
  data: {'attr1': 'value1', 'attr2': 'value2', … , 'attrN': 'valueN'},
  success: …,
  error: …
});

In the Server Side

The request attributes are taken from $_GET or $_REQUEST, as usual.
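
The attributes show up in $_GET because, on a read request, the ‘data’ object is serialized into the query string of the URL. Backbone leaves that step to its Ajax layer; the sketch below imitates it with the standard URLSearchParams class. The function name buildReadUrl and the URL are placeholders for illustration.

```javascript
// Sketch: turn a base URL and a `data` object into the URL of a GET request.
// buildReadUrl is an illustrative helper; '/items.php' is a made-up address.
function buildReadUrl(baseUrl, data) {
  var query = new URLSearchParams(data).toString();
  return query ? baseUrl + '?' + query : baseUrl;
}

console.log(buildReadUrl('/items.php', { attr1: 'value1', attr2: 'value2' }));
```

On the PHP side, each name in that query string becomes a key in $_GET.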

Save Requests

Send the attributes and the options.

For example:

myModel.save(myModel.attributes, {
  success: …,
  error: …
});

In The Server Side

The data should be read from “php://input”, a stream that gives access to the raw request body. The content is a string that must be parsed; since the incoming request data is JSON encoded, use the PHP function “json_decode”.

For example:

$post_vars = json_decode(file_get_contents('php://input'), true);
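
It helps to know what the request body looks like before PHP decodes it. The sketch below builds the JSON string for a hypothetical set of model attributes (the attribute names and values are invented) and parses it back, which is the JavaScript counterpart of what json_decode does on the server.

```javascript
// Attributes of a hypothetical model.
var attributes = { id: 7, title: 'Draft', tags: ['php', 'backbone'] };

// A save request carries the attributes as a JSON string in its body;
// this is the text the server reads from php://input.
var requestBody = JSON.stringify(attributes);
console.log(requestBody);

// Parsing the string restores the structure, much as
// json_decode($body, true) yields a nested PHP array.
var decoded = JSON.parse(requestBody);
console.log(decoded.tags.length);
```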

Destroy Requests

This time, you should set the ‘data’ member of the options argument to a JSON string. For example:

Backbone.sync('delete', myModel, {
  data: JSON.stringify(myModel),  // This time the developer encodes the data.
  success: function(model, resp) {
    …..
  },
  error: function(){
    ….
  }
});

In The Server Side

Read the data from ‘php://input’ and decode it using json_decode. For example:

$post_vars = json_decode(file_get_contents('php://input'), true);

Anyways …

You can make sure you read the data correctly by sending the contents of ‘$_SERVER’, $post_vars and ‘$_REQUEST’ to the error_log. To print an array use json_encode.

Sending a Response To the Client

The response is everything the server side application prints to the standard output. You should print your response data encoded into JSON, as follows:

echo json_encode($responsedata);

Reading a Response

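When you call ‘fetch’, ‘save’ or ‘destroy’, the first argument passed to the ‘success’ callback function is the model, updated with the data the server returned. Since it is a model object, get values from it using the method ‘get(attr-name)’ or the member ‘attributes’. (If you call Backbone.sync directly, the callback receives the decoded response itself, a plain object.)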

For example:

{success: function(msg){
  switch (msg.get('status')){  // Get the value of the attribute ‘status’ in the server’s response.
    // … one case per status value …
  }}}

To debug your code, you can use ‘alert(JSON.stringify(msg));’.
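
A complete version of such a handler might look like the following sketch. It runs without Backbone: makeModelLike is a stand-in that offers only get() and attributes, and the status values 'ok' and 'error', as well as the ‘message’ field, are assumptions made for this example; use whatever your server actually sends.

```javascript
// makeModelLike is a stand-in for a Backbone model: it only offers get() and attributes.
function makeModelLike(attributes) {
  return {
    attributes: attributes,
    get: function (name) { return attributes[name]; }
  };
}

// describeStatus is a hypothetical handler body; 'ok' and 'error' are assumed values.
function describeStatus(msg) {
  switch (msg.get('status')) {
    case 'ok':
      return 'Saved.';
    case 'error':
      return 'Server reported: ' + msg.get('message');
    default:
      return 'Unexpected status: ' + msg.get('status');
  }
}

// Simulate a JSON response printed by the server with json_encode.
var serverJson = '{"status":"error","message":"Title is required"}';
console.log(describeStatus(makeModelLike(JSON.parse(serverJson))));
```

Keeping a default branch makes an unexpected server reply visible instead of silently ignored.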