I am trying to log in to my ets.org/toefl account using PHP cURL, but the login fails. I usually get an error saying the server is busy, although logging in with a browser works. My code is below. Can anyone see what is wrong?

<?php
include('simple_html_dom.php');

$login_url = 'https://toefl-registration.ets.org/TOEFLWeb/logon.do';

$username='****';
$password='***';
$ck = 'cookie.txt';

$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0';
// extra headers
$headers[] = "Connection: keep-alive";
//$headers[]= "Accept-Encoding: gzip, deflate";


$ch = curl_init();

curl_setopt($ch, CURLOPT_HEADER,  0);
curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 

curl_setopt($ch, CURLOPT_COOKIEJAR, $ck);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ck);

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
//curl_setopt($ch, CURLOPT_URL, 'https://toefl-registration.ets.org/TOEFLWebextISERLogonPrompt.do');

$output = curl_exec($ch);
//echo $output;

$html = new simple_html_dom();
$html = str_get_html($output);
$e = $html->find(".loginform");
$a = $e[0]->find('input');
$str = $a[0]->outertext;
preg_match("/value=\"(.*)\"/",$str,$match);
$h_attr = $match[1];

$fields['org.apache.struts.taglib.html.TOKEN'] = $h_attr;
$fields['currentLocale']= 'en_US';
$fields['username'] = $username;
$fields['password'] = $password;
$fields['x'] = 11;
$fields['y'] = 4;
//print_r($fields);
//echo "\r\n";
$POSTFIELDS = http_build_query($fields); 
//echo $POSTFIELDS;

$headers[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Accept-Language: en-US,en;q=0.5";
$headers[]="Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do";

curl_setopt($ch, CURLOPT_URL, $login_url); 
curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $POSTFIELDS); 
$result = curl_exec($ch);
print $result;

(Update from comments)

Post by browser:

org.apache.struts.taglib.html.TOKEN=c1b88957e9914492fe8cc20b33ef1cdd&currentLocale=en_US&username=name&password=pass&x=23&y=3

Post by me:

org.apache.struts.taglib.html.TOKEN=345a9f935b2db8a69f55c5b4d3372190&currentLocale=en_US&username=name&password=pass&x=11&y=4

Post generated by php curl verbose:

POST /TOEFLWeb/logon.do HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0
Host: toefl-registration.ets.org
Cookie: au=MTM3Mjc4ODQwMg%3d%3d; server=3; JSESSIONID=23C39022E2641B8F5AC944295837315E
Connection: keep-alive
Accept: */*
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do
Content-Length: 134
Content-Type: application/x-www-form-urlencoded

Leigh
Lonewolf

2 Answers

Try comparing the HTTP headers sent by your cURL script with the headers your browser sends (the Network tab in Chrome DevTools shows them). The remote server may be rejecting the request because a header it expects is missing.
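One way to make that comparison less error-prone is to diff the two header sets in code. The sketch below is only an illustration: `parse_headers()` and `diff_headers()` are hypothetical helper names, not part of the question's script, and the usage lines assume you have pasted the browser's request headers into a string yourself. On the cURL side, setting `CURLINFO_HEADER_OUT` before `curl_exec()` lets you read back the request headers with `curl_getinfo()`.

```php
<?php
// Hypothetical helpers (not from the question's script).

// Turn a raw header block into a lowercase name => value map.
// Repeated headers are joined with ", " so duplicates stay visible.
function parse_headers(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\n|\r/', trim($raw)) as $line) {
        $pos = strpos($line, ':');
        // Skip blank lines and the request line ("POST /path HTTP/1.1").
        if ($pos === false || preg_match('#^[A-Z]+ \S+ HTTP/#', $line)) {
            continue;
        }
        $name  = strtolower(trim(substr($line, 0, $pos)));
        $value = trim(substr($line, $pos + 1));
        $headers[$name] = isset($headers[$name])
            ? $headers[$name] . ', ' . $value
            : $value;
    }
    return $headers;
}

// List every header whose value differs between the two requests,
// including headers present on only one side (reported as null).
function diff_headers(array $browser, array $script): array
{
    $diff = [];
    foreach ($browser + $script as $name => $unused) {
        $b = $browser[$name] ?? null;
        $s = $script[$name] ?? null;
        if ($b !== $s) {
            $diff[$name] = ['browser' => $b, 'script' => $s];
        }
    }
    return $diff;
}

// Usage sketch (needs network access, so it is left commented out):
// curl_setopt($ch, CURLINFO_HEADER_OUT, true);      // keep the outgoing request headers
// curl_exec($ch);
// $sent = curl_getinfo($ch, CURLINFO_HEADER_OUT);   // raw request header string
// print_r(diff_headers(parse_headers($copiedFromDevTools), parse_headers($sent)));
```

Headers such as `Cookie` and `Content-Length` will normally differ between runs, so the interesting entries are the ones that are missing entirely on the script side.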

Also ensure the cookie file (and the directory it lives in) can be written by the PHP process. From php.net:

When specifying CURLOPT_COOKIEFILE or CURLOPT_COOKIEJAR options, don't forget to "chmod 777" that directory where cookie-file must be created.
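Before changing permissions, it may be worth checking whether the path is actually the problem. The following is a minimal sketch using a hypothetical helper, `cookie_jar_is_usable()`; it only reports whether PHP can write to the given path and does not touch cURL itself.

```php
<?php
// Hypothetical helper (not from the question's script): report whether
// the PHP process can write a cookie jar at $path.
function cookie_jar_is_usable(string $path): bool
{
    if (file_exists($path)) {
        // The jar already exists: it must be a regular, writable file.
        return is_file($path) && is_writable($path);
    }
    // The jar does not exist yet: its directory must exist and be writable
    // so that the file can be created.
    $dir = dirname($path);
    return is_dir($dir) && is_writable($dir);
}

// Usage sketch: an absolute path avoids surprises when the working
// directory differs between the CLI and the web server.
// $ck = __DIR__ . '/cookie.txt';
// if (!cookie_jar_is_usable($ck)) {
//     die("Cannot write cookie jar at $ck\n");
// }
```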

beiller
  • Why vote this down? I think this method will shed light on the fact that this form is protected via some sort of parameters. – beiller Jul 02 '13 at 18:13
  • Hey Lonewolf could you add the POST constructed by your script? Where do you suppose the x and y come from? Are you certain you are capturing the TOKEN correctly? – beiller Jul 02 '13 at 18:17
  • @beiller The permissions are proper – Lonewolf Jul 02 '13 at 18:41
  • @beiller Thank you for expanding your original post and providing an explanation. I've removed my downvote. :) – War10ck Jul 02 '13 at 18:43
  • @Lonewolf - For next time, comments have limited formatting, so they are not a good place for multiple lines of code/debug text. You can always [edit your question](http://stackoverflow.com/posts/17432508/edit) to include additional details. – Leigh Jul 02 '13 at 21:59

I got it working somehow. I added certificate verification to the code (CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST, with a CA bundle set through CURLOPT_CAINFO). I also found that some delay needs to be present between the two calls, get_cookie() and login(). The working code is below:

<?php
include('simple_html_dom.php');

$login_url = 'https://toefl-registration.ets.org/TOEFLWeb/logon.do';
$cookie_page = 'https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do';

$username='******';
$password='******';

//$ck = 'E:\Projects\Web Development\toefl_script\cookie.txt';
$ck = 'D:\Nikhil\Projects\Wamp\toeflscript\cookie.txt';

//$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0';
$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20100101 Firefox/21.0';

$headers[] = "Connection: keep-alive";
$headers[] = "Accept: */*";


/* Begin Program Execution */

init_curl();
get_cookie();
sleep(30);
login();

function get_cookie()
{
    global $ch, $ck, $h_attr, $headers, $cookie_page;

    curl_setopt($ch, CURLOPT_URL, $cookie_page);

    //curl_setopt($ch, CURLOPT_VERBOSE, true);
    $output = curl_exec($ch);
    //echo $output;

    /*
    $html = new simple_html_dom();
    $html = str_get_html($output);
    $e = $html->find(".loginform");
    $a = $e[0]->find('input');
    $str = $a[0]->outertext;
    preg_match("/value=\"(.*)\"/",$str,$match);
    $h_attr = $match[1];
    */
}

function init_curl()
{
    global $ch, $ck, $h_attr, $headers, $agent;

    ini_set('max_execution_time', 300);

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER,  0);
    curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);

    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);

    curl_setopt($ch, CURLOPT_CAINFO, getcwd() . '/cacert.pem');

    curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 

    curl_setopt($ch, CURLOPT_COOKIEJAR, $ck);
    curl_setopt ($ch, CURLOPT_COOKIEFILE, $ck);
}

function login()
{
    global $ch, $login_url, $password, $username, $ck, $h_attr, $headers;

    //$fields['org.apache.struts.taglib.html.TOKEN'] = 'abc';//$h_attr;
    $fields['currentLocale']= 'en_US';
    $fields['username'] = $username;
    $fields['password'] = $password;
    $fields['x'] = 11;
    $fields['y'] = 4;

    $POSTFIELDS = http_build_query($fields); 
    //print_r($fields);
    //echo $POSTFIELDS;

    $headers[] = "Accept-Language: en-US,en;q=0.5";
    $headers[]="Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do";

    curl_setopt($ch, CURLOPT_URL, $login_url); 
    curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_POST, 1); 
    curl_setopt($ch, CURLOPT_POSTFIELDS, $POSTFIELDS); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
    $result = curl_exec($ch);
    print $result;
}
Lonewolf