Phil Hearn of MRDC did a four-way review recently on Ruby vs Quantum vs Merlin vs Uncle – see

http://www.mrdcsoftware.com/blog/ruby-quantum-merlin-uncle-what-matters-in-crosstab-software

We have converted many Quantum, Merlin and Uncle jobs over the years and appreciate their strengths. We have a lot of experience with Quantum, arguably still the gold standard for pure batch processing. Ed Ross once graciously reviewed our Ruby-Quantum interface (whereby we can read and execute Quantum code) and remarked “Ruby is where Quantum/Quanvert would have been, had development continued”.

We generally agree with Phil’s comments regarding Ruby: yes, the GUI can be a little overwhelming, because Ruby is a full DP tool as well as a graphical cross tabulator, and nobody else has been crazy enough to try to commit full DP functionality to a GUI (one reason we did so is transparency for tracking jobs). But you can run in Analyst mode (sans most DP) or Exec mode (basically, an interactive viewer), and the online version, Ruby Laser, is tailored explicitly to analysis and dashboards. As with Excel (which I still find overwhelming), one soon learns the pathways to the common tasks.

Phil’s main negative against Ruby is our use of Microsoft programming platforms for scripting engines – COM for VBScript or JScript using a text editor (old school), VBA via any MS Office application (typically as Excel macros), and .Net languages (mainly VB.Net or C#) using Visual Studio or free IDEs such as SharpDevelop. The point of this post is to clarify Red Centre Software’s reasons for this design.

We take polite exception to the following comment of Phil’s, or at least would limit its scope to only the most unusual and extreme circumstances:

“In practice, the problem is that if the user interface runs out of steam and you want something more complex or something repetitive, you may need to find a VB programmer. If you don’t have a VB programmer in the house, this may be impractical and expensive. So, Ruby can claim that they can do anything, but it means going outside the software in practice and using someone who has different skills in most cases.”

Scripting is just an option for those who prefer doing things that way. Many analysts run complicated tracking jobs, with substantial changes at each wave, using just the Ruby GUI. All standard batch procedures are supported with a minimal set of user actions, such as banners by unlimited sets of variables, holecounts, refiltering/reweighting folders of reports, regenerating reports and constructions against updated case data, and so on. The GUI for variables allows users to construct the most weird and wonderful structures from case data, and the table specification form supports (and can generate) all table syntax.

The sorts of tasks where scripting is essential are not in the normal purview of survey data analysis – it’s when you want to go beyond that, into territory where the dedicated languages used by Quantum, Merlin and Uncle, being of a different era, are of no use at all. I mean things like linking directly to R or SPSS for extended statistical analysis, connecting Twitter feeds to case data imports for automatic job updates with full reporting refreshed hourly, or feeding a verbatim into a neural net and getting back a new variable of sentiment scores which can be used immediately as a table axis, filter or weight.

Does this require deep programming skills? In our opinion, no. Nearly all the time, all that the scripting engines provide are basic language constructs like if…then…else, for…next, named subroutines to perform discrete tasks, variable types and assignment, and the underlying technology for interprocess communications. All the meat is in the proprietary Ruby API, which we write and maintain, and which is hosted by the script engine. If you want to script Ruby tables, you must learn the Ruby syntax for axes, filter and weight specifications, and in this regard Ruby is no different from Quantum et al.

A simple example: I want a demographic banner by Favourability ratings, with code means, standard deviation about the codes, column percents and respondent bases, filtered to those aware of Brand1 and weighted by income. The script for this is:

name = "Demogs by Fav 1"
top = "Count(cwf),Gender,Age,Region"
side = "FavRat_1(cwf;*;cmn;csd)"
filt = "BA(1)"
wght = "Income"
GenTab(name, top, side, filt, wght)

The Ruby-specific knowledge required to write this is: cwf means cases weighted filtered, * means all defined codes, cmn means code means, csd means code standard deviation, axis items are delimited by a comma between variables and by a semi-colon within variables, BA(1) as a filter means cases where BA has code 1, and a weight can be just the variable on its own. Understanding the VB-specific syntax in the above (use of = for assignment and parentheses around the GenTab parameters) is so trivial most would never even notice.
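To show how little extra VB it takes to turn that one table into a repetitive task, here is a minimal sketch that loops the same specification over three brands. It is written as a standalone VB.Net console module so it can be run without Ruby installed. The GenTab shown here is only a stand-in that prints what it is given, and the variables FavRat_2, FavRat_3 and the filter codes BA(2), BA(3) are assumptions made for illustration, not names taken from any real job.

```vbnet
Imports System

Module BrandTables

    ' Stand-in for the GenTab call used in the script above. In a real job
    ' the Ruby API supplies it; this version only prints the specification
    ' so that the sketch runs on its own.
    Sub GenTab(name As String, top As String, side As String,
               filt As String, wght As String)
        Console.WriteLine("{0} | top={1} | side={2} | filter={3} | weight={4}",
                          name, top, side, filt, wght)
    End Sub

    Sub Main()
        ' The banner and the weight are the same for every table
        Dim top As String = "Count(cwf),Gender,Age,Region"
        Dim wght As String = "Income"

        ' Assumption: the job holds FavRat_1 to FavRat_3, and BA codes 1 to 3
        ' mark awareness of the matching brand.
        For brand As Integer = 1 To 3
            Dim name As String = "Demogs by Fav " & brand
            Dim side As String = "FavRat_" & brand & "(cwf;*;cmn;csd)"
            Dim filt As String = "BA(" & brand & ")"
            GenTab(name, top, side, filt, wght)
        Next
    End Sub

End Module
```

Run as is, it simply prints the three specifications, one per brand. The only VB beyond the original script is a For…Next loop and string concatenation with &; everything inside the quotes is still plain Ruby table syntax.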

Can you write insanely complicated scripts? Yes. DP needs to have some fun in life. You can loop things up like a Gordian knot and create delegates and object hierarchies and write your own libraries, but none of that is necessary – it’s all pure bonus – the sorts of things you do because you can, not because the software has painted you into a corner and your only way out is a six-month advanced programming course.

But back to basics: the major advantages of the .Net platform are

  • Much more user-friendly for those with little or no programming experience
  • Professional editor, with syntax highlighting, real-time syntax checking as you type, search/replace
  • Optional code completion
  • Optional auto-formatting of all code
  • Run-time debugging and variable/object inspection
  • Execution can be stopped at any time
  • Any failures in the Ruby libraries will drop you at the problem line
  • Early binding to COM objects for faster performance
  • Drag/drop dialog design if a Windows Forms application
  • Far greater functionality for character encoding, math routines etc
  • Much faster for string operations (compare, concatenate, etc)
  • All related scripts can be made a project and dealt with collectively (for things like search/replace)
  • All related scripts can be run sequentially from Main()
  • Supports customised GUI front-ends to drive Ruby and display outputs
  • Access to the huge .Net software eco-system via NuGet packages
  • For Visual Studio, built-in R support

There is also a strong synergy between any VB* language and the wider business and research community. Among non-IT/programmer data analysts and researchers VB is routinely used to automate common tasks, particularly VBA as Excel macros. VB is the lingua franca of small dedicated business apps and spreadsheet automation. Therefore, many potential Ruby scripters already have a strong head start, and if scripting is new to you, then learning Ruby gets you Excel macro skills as a side benefit.

For more on how to use Ruby scripting, and for a short tutorial on the VB language(s) see Beginner Guide to VB for Ruby.
