Wednesday, July 29, 2015

Building a Click-Tracking System, Part the First

In my position as Special Projects Lead for FutureStay.com, I have several interesting projects on my plate. They mostly fall under the analytics umbrella: onboarding of users, dynamic pricing, that sort of thing. The first one we decided to build is a custom click-tracking system.

In this series, I want to document the process of building said click-tracking system: what software was used, what I learned by using said software, problems I came up against, etc. Hopefully others can learn from my experience (and mistakes). So let's go...

Why a custom click-tracking system? 

Doesn't Google Analytics do that for you? There are plenty of other free web analytics tools available. Why go through the hassle? Yes, Google Analytics is very good at what it does, but it's not granular enough for our tastes. We want to do things such as determine how the system is being used ("Why do users take ten clicks to get through this process instead of four?") and how people move through the system ("Why did they go from the payment page to the about page and then never return?") or through a single page. We also want to aggregate the raw data in ways that we're not aware of yet. Finally, playing with lots of data is fun! :-)

Lil Brother

When I develop solutions, I'm of the "Don't reinvent the wheel" school of philosophy, so naturally I went looking for software that already did what I wanted. Most click-tracking software out there does what Google Analytics does, which is a fine thing, but as discussed in the previous section, that's not what we need. Fortunately, I came across Lil Brother from the fine people at Shutterstock. Lil Brother "tracks clicks and other events in the browser and reports back in real time". Perfect!

While the documentation is sufficient, I think there are some things that should be spelled out better and one or two things that are just plain typos, hence this document. So let's get technical!

First off, grab the code from their repo. There are two aspects to the code, the client-side and the server-side. Let's start with the client-side code.

Client-Side Code

The first thing you need to be aware of is which lilbro.js library to load. If you read their documentation, it says to include the following line in your client-side code:

<script src="http://server:8000/lilbro.js"></script>

This is just plain wrong. The lib/lilbro.js file you find in the repo is for the server-side code, not the client-side. The client-side actually uses the three files located in the client/src directory: LilBro.BrowserDetect.js, LilBro.Schema.js, and LilBro.js. Include those in your client-side code and you're almost ready to rock. 

In the usual JavaScript fashion, you need to create a LilBro object, preferably on the body of the DOM. That's all you really need to do; Lil Brother will track every click on the page. It will also track focus and blur events if you add the watch_focus: true attribute. Be careful here! The example wrongly says track_focus; the attribute is actually called watch_focus.

At this point, I'm going to tell you to RTFM as the examples given are pretty good. I will, of course, make a few comments. :-)
  • LilBro essentially passes just an array of numbers as key-value pairs back to the server. It uses LilBro.Schema.js to determine what the keys mean, so be sure both the front-end and back-end are using the same schema file(s). If they aren't, you will see odd things, e.g. the fields will be in your object in Firebug but you won't see them on the back-end and you'll be all, like, 0_o
  • You can add extra attributes to the LilBro object you create by assigning the key-value pairs as a JSON object to event_base, but remember to update the schema (see above bullet).
  • You're going to need some type of taxonomy (structure) for your HTML ids and classes. If an element doesn't have an id or class, LilBro will climb the DOM looking for one. This is a Good Thing. However, we ran into a problem: two calendar widgets on our page both have class="day" on the date elements for styling, so we can't tell which widget a click came from.

Server-side Code

On the backend, LilBro comes with a node.js based server called, confusingly enough, lilbro.js. It works pretty well and easily.  You will probably have to install some dependencies, mainly nomnom.js. To do that run 

$ cd lil-brother
$ npm install

When I did that, I got an error that npm wanted to install nomnom version > 2.0.0 but the highest available was 1.8.2. If you get the same error, just edit package.json and set the version number to > 1.8.1. Worked for me!

The server can write to files or to a ZeroMQ queue.  We're writing to files for the moment. When I get back into the office later this week, I'm going to write a Python script to tail -f the file to dump the data into a MySQL database.

That's All For Now

This should get you up and running with Lil Brother so you can start playing around with it. I'll keep you posted about my progress. Let me know about yours.

Saturday, July 11, 2015

SugarCRM/SuiteCRM Programming Examples

This is a long post about programming against the SugarCRM/SuiteCRM SOAP server. It's all technical, so if you came by for some light, witty banter, I'm sorry to disappoint you.

The Problem

So a buddy of mine approached me with a problem. He's migrating data from a custom CRM to SuiteCRM. The old CRM has ten years' worth of data, mostly consisting of 147,000+ files. My buddy can handle the custom modules, importing the data and all that, but he's a point-and-click kinda guy and he's not going to point-and-click his way through 147,000+ uploads. That's where I came into the picture; I fix technical problems for businesses (I do more than that, but that's a post for a later time).

So I started digging through the SugarCRM/SuiteCRM docs. They've got quite a bit of documentation but it's not very useful. Much like Javadocs or man pages, they're reference documentation, which is great if you already know the topic and want to look something up, but lousy if you need to learn the topic. Examples of programming this CRM that I found around the net helped some but left out some important points that tripped me up for a while, hence this blog post.

So here's the setup: one SuiteCRM server running in a VM; two external drives, one holding 147,000+ files (about 3/4 of a terabyte), the other holding the files for the SuiteCRM installation, a custom module called Assignments, and an SQLite database holding information about which file(s) goes with which Assignment.

My job was to write a program which will do the following:
  • read the database to determine which Assignment a file belongs to
  • create a Document (set_entry())
  • upload the document (set_document_revision())
  • get the assignment id (get_entry_list())
  • set a relationship between the Assignment and the new uploaded Document (set_relationship())
  • update the database with a status
I have some illustrative code up on GitHub. It's not runnable code. It's just two files: the main program and the SuiteCRM code. I left out things like the logging code and the database code. Those are unique to my situation and would just muddy the message I'm trying to get across.
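To give a rough idea of how those steps might hang together, here is a minimal sketch of a per-file routine. It is not the main program from the repo: migrateFile(), the status strings, and the query format are made up for illustration, and $crm is assumed to be any object exposing the four methods walked through in the rest of this post.

```php
<?php
// Sketch only. migrateFile(), the status strings and the query format
// are hypothetical; the four method names come from this post.
function migrateFile($crm, array $rec)
{
    try {
        // 1. create the Document record
        $docId = $crm->processFile($rec['file_title']);

        // 2. upload the file contents as revision 1
        $crm->uploadFile($docId, 1, $rec);

        // 3. look up the Assignment this file belongs to
        $query = "assignment_no = '" . $rec['assignment_no'] . "'";
        $found = $crm->getAssignments($query, 0, 1, '');
        if (empty($found->entry_list)) {
            return 'no_assignment';
        }

        // 4. relate the Assignment to the new Document
        $crm->setAssignmentDocumentRelationship($found->entry_list[0]->id, $docId);
        return 'done';
    } catch (Exception $e) {
        // the CRM class has already logged the details by this point
        return 'failed';
    }
}
```

The returned status is the sort of thing that would be written back to the SQLite database in the last step of the list.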

So, let's walk through this:

Initialize SuiteCRM SOAP Object

Our constructor looks like this:

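class SugarCrmSoap {
      public $sess;
      public $sess_id;
      public $soapclient;
      public $log;
      public $soap_url;
      public $login_params;

      // __construct() replaces the old constructor named after the class,
      // which PHP 8 no longer treats as a constructor
      function __construct() {
        // the version in the path (v4_1) selects the SOAP interface
        $this->soap_url = 'http://192.168.1.10/sugar/service/v4_1/soap.php?wsdl';
        $this->login_params = array(
            'user_name' => 'admin',
            'password'  => md5('admin'),
            'version'   => '.01'
        );
        // the trace option allows for better debugging
        $this->soapclient = new SoapClient($this->soap_url, array('trace' => 1));
      }

      // (the methods shown in the following sections belong inside this class)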

(Yeah, I'm not too keen on how Blogger formats code. If anyone knows how to do better, let me know)

The soap_url is really important; it determines which version of the interface you use. While that may be obvious, most of the examples I saw didn't say which version of the SOAP interface they were using. This meant that when I copypasta'd their code, I ended up passing the wrong parameters in the wrong positions. :-/
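One way to keep the version visible is to build the URL from its parts. This is just a sketch: soapUrl() is a hypothetical helper, and the only path layout it assumes is the service/v4_1/soap.php?wsdl one used in the constructor above.

```php
<?php
// Hypothetical helper: make the interface version an explicit argument
// instead of burying it in the middle of a URL string.
function soapUrl($baseUrl, $version = 'v4_1')
{
    return rtrim($baseUrl, '/') . '/service/' . $version . '/soap.php?wsdl';
}

$url = soapUrl('http://192.168.1.10/sugar/');
```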

The trace option is very useful. We'll see what that does at the end.

Login

We need to create a Session and get its ID to pass around for authenticating. That's easy enough to do:
      
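function login(){
  $result = $this->soapclient->login($this->login_params);
  // keep the whole session object as well as its id;
  // the id is what authenticates every later call
  $this->sess    = $result;
  $this->sess_id = $result->id;
  return $this->sess;
}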

Main Loop

Now we're ready to do the work. We create a new document by passing in the name of the file to set_entry() like this:

function processFile($filename) {
  try {
    // Creating new Document '$filename'
    $result = $this->soapclient->set_entry(
        $this->sess_id,
        'Documents',
        array(
            array('name' => 'new_with_id',   'value' => true),
            array('name' => 'document_name', 'value' => $filename)
        )
    );
    return $result->id;
  } catch (Exception $e) {
    $this->catchError("processFile", $e);
  }
}
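The nested name/value arrays get tedious once you set more than a couple of fields. A small helper, sketched here under the assumption that every field is a simple name/value pair like the two above, can build that list from an ordinary associative array. toNameValueList() is my own name for it, not something SuiteCRM provides.

```php
<?php
// Hypothetical helper: turn array('document_name' => 'x.jpg') into the
// list of name/value pairs passed as the third argument of set_entry().
function toNameValueList(array $fields)
{
    $list = array();
    foreach ($fields as $name => $value) {
        $list[] = array('name' => $name, 'value' => $value);
    }
    return $list;
}

$nvl = toNameValueList(array(
    'new_with_id'   => true,
    'document_name' => 'example.jpg',
));
```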

Then we upload the file by setting a new Document Revision. The $docID is from the previous function and $rec is just an array of information about the file.

// $revision has no default value: PHP 8 deprecates an optional
// parameter that comes before a required one
function uploadFile($docID, $revision, $rec) {
  try {
    // file_get_contents spits out a warning about there being no file
    // then successfully gets the content. :-? Hence the
    // warning suppression
    $file_contents = @file_get_contents($rec['full_path']);

    $docArray = array( 'id'            => $docID,
                       'file'          => base64_encode($file_contents),
                       'document_name' => basename($rec['file_title']),
                       'filename'      => $rec['file_title'],
                       'revision'      => $revision,
                       'assignment_no' => $rec['assignment_no']
        );

    $result = $this->soapclient->set_document_revision($this->sess_id, $docArray);

    // New document_revision_id is $result->id
    return $result->id;

  } catch (Exception $e) {
    $this->catchError("uploadFile", $e);
    return (-1);
  }

}

When you upload a document, say example.jpg, SuiteCRM does not store a jpeg file called example.jpg. It stores a base 64 encoded file with a filename that looks like a UUID. I assume this lets them track revisions better.
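On the client side, the read-and-encode step from uploadFile() could be pulled out into its own function that checks the file first instead of suppressing the warning. This is only a sketch; encodeFileForUpload() is a hypothetical name, and it returns null where the original would have carried on with empty contents.

```php
<?php
// Hypothetical helper: read a file and base64-encode it for the 'file'
// field of the document revision. Returns null when the file is
// missing or unreadable rather than relying on warning suppression.
function encodeFileForUpload($path)
{
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }
    $contents = file_get_contents($path);
    return $contents === false ? null : base64_encode($contents);
}

// Round trip on a throwaway temporary file
$tmp = tempnam(sys_get_temp_dir(), 'doc');
file_put_contents($tmp, "hello\n");
$encoded = encodeFileForUpload($tmp);
unlink($tmp);
```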

Setting the Relationship

We finally get to the most important part of this program: setting a relationship between a (custom module) Assignment and the Document we just uploaded. As you might guess, that means getting the assignment_id:

function getAssignments($query='', $offset=0, $maxnum=0, $orderby=''){
  try {
    $result = $this->soapclient->get_entry_list(
        $this->sess_id,   // session
        'Assignments',    // module name
        $query,           // query
        $orderby,         // order by
        $offset,          // offset
        array(),          // select fields (empty: all)
        array(),          // link name to fields
        $maxnum,          // max results
        0,                // deleted
        false             // favorites
    );
    return $result;
  } catch (Exception $e) {
    $this->catchError("getAssignments", $e);
  }
}

This essentially does an SQL query on the back end. The $query parameter is the WHERE clause of that query minus the 'WHERE'.
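Since that string ends up inside an SQL statement, it is worth being picky about what goes into it. Here is a minimal sketch of building the $query value. The table and column names are assumptions (a custom module's real table name will be different), and assignmentQuery() is a hypothetical helper.

```php
<?php
// Hypothetical helper: build the $query string for getAssignments().
// 'assignments.assignment_no' is an assumed table/column name.
function assignmentQuery($assignmentNo)
{
    // Only accept simple values so nothing unexpected lands in the SQL.
    if (!preg_match('/^[A-Za-z0-9_-]+\z/', (string) $assignmentNo)) {
        throw new InvalidArgumentException("Bad assignment number: $assignmentNo");
    }
    return "assignments.assignment_no = '" . $assignmentNo . "'";
}

$assignStr = assignmentQuery('A-1001');
```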

This function returns the entire Assignment entry. The id is found thusly:

$assignment = $sugar->getAssignments($assignStr, 0, 1, '');

$assignId = $assignment->entry_list[0]->id;
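If the query matches nothing, there is no first entry to read the id from, so a guard is handy. A small sketch, assuming the result has the entry_list shape used above; firstEntryId() is a hypothetical helper and the two objects below are stand-ins, not real SOAP responses.

```php
<?php
// Hypothetical helper: pull the id out of a get_entry_list() result,
// returning null when the query matched nothing.
function firstEntryId($result)
{
    if (!is_object($result) || empty($result->entry_list)) {
        return null;
    }
    return $result->entry_list[0]->id;
}

// Stand-in objects shaped like the result used above
$hit  = (object) array('entry_list' => array((object) array('id' => 'abc-123')));
$miss = (object) array('entry_list' => array());
```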

And finally, we are ready for the final step: setting the relationship! SuiteCRM does some weird internal stuff to set relationships; it's not as simple as a 1-to-many database relationship. Oh no, that would be too easy! This is the whole reason we have to go through their SOAP server.
Our function for that is fairly straightforward:

function setAssignmentDocumentRelationship($assignId, $docId) {
  try {
    // setting relationship for assignment id $assignId and document id $docId
    $result = $this->soapclient->set_relationship(
        $this->sess_id,             // session
        "Assignments",              // module name
        $assignId,                  // module id
        "assignments_documents_1",  // link field name
        array($docId),              // related ids
        array(),                    // name/value list
        0                           // delete
    );
    return $result;
  } catch (Exception $e) {
    $this->catchError("setAssignmentDocumentRelationship", $e);
  }
}

Since an Assignment can have several Documents associated with it, I could have passed in several $docIds in the fifth parameter but that would have added complexity to the main program that I didn't feel was justified. 
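For the curious, the extra bookkeeping would look something like this sketch: collect the document ids per Assignment first, then make one call per Assignment. groupDocsByAssignment() is hypothetical and is not in my actual program.

```php
<?php
// Hypothetical helper: group uploaded document ids by assignment id so
// each Assignment would need only one set_relationship() call.
function groupDocsByAssignment(array $pairs)
{
    $grouped = array();
    foreach ($pairs as $pair) {
        list($assignId, $docId) = $pair;
        $grouped[$assignId][] = $docId;
    }
    return $grouped;
}

$grouped = groupDocsByAssignment(array(
    array('a1', 'd1'),
    array('a2', 'd2'),
    array('a1', 'd3'),
));
// each $grouped[$assignId] array could then go in as the fifth parameter
```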

Capturing errors

Remember up above when we initialized our SOAP object and we set 'trace' => 1? With that set, we can capture the headers of the last request which lets us write error functions like this one:

function catchError($function, $e) {

 $this->log->error( "====== REQUEST HEADERS =====");
 $this->log->error($this->soapclient->__getLastRequestHeaders());

 if ($function != 'uploadFile') {
   $this->log->error( "========= REQUEST ==========");
   $this->log->error($this->soapclient->__getLastRequest());
 }
 $this->log->error( "====== RESPONSE HEADERS =====");
 $this->log->error($this->soapclient->__getLastResponseHeaders());

 $this->log->error( "========= RESPONSE ==========");
 $this->log->error($this->soapclient->__getLastResponse());

 $this->log->error("$function error: " . $e->getMessage());
 // re-throw so the caller can decide whether to continue on
 throw $e;

}

The reason for the if ($function != 'uploadFile') is if an error occurs while we're uploading, the entire contents of the base-64 encoded file will appear in the log and you don't want that, believe me!
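A different way to get the same protection, sketched here as an alternative rather than what my code does, is to cap the length of anything sent to the log. truncateForLog() and its default limit are made up for illustration.

```php
<?php
// Hypothetical helper: shorten an over-long string before logging it,
// keeping the start and noting how many bytes were dropped.
function truncateForLog($text, $max = 2000)
{
    $len = strlen($text);
    if ($len <= $max) {
        return $text;
    }
    return substr($text, 0, $max) . '... [' . ($len - $max) . ' more bytes]';
}
```

With something like this wrapped around the __getLastRequest() output, even a failed upload would still leave the start of the request in the log.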

Conclusion

That's it: how to upload files and set relationships via SuiteCRM's SOAP server. Easy once someone shows you how, eh? ;-)

In my opinion the SuiteCRM documentation is not very helpful, and the other examples I've seen were either out of date or missing some important information (I hope I'm not!). I've also heard grumblings that information about programming this stuff is very hard to come by. I hope this helps someone.