Despite our best efforts to establish Triple-S as the preferred format for survey data interchange between the collection system and Ruby, the SAV (or the logically identical SPS/ASC pair) continues to reign supreme. Although the market share of SPSS is rapidly diminishing elsewhere, it seems entrenched in MR, for the near term at least. The appeal of a SAV would seem to be:
- Can do extra stats not possible in Ruby
- Easy to manipulate variables in Ruby, then back-import to the SAV for client delivery
- Is the lingua franca in MR by default
- Can verify Ruby tables in SPSS to assure clients that Ruby processing is sound
- Large (if slowly decreasing) pool of SPSS expertise within MR.
So, we have to import a lot of SAVs or SPS/ASC pairs. The problem is that, to avoid a one-to-one match of SPSS to Ruby variables (which makes for a practically unusable job in Ruby), a blend file has to be written which configures the variables for multi-response and loop structures at import time. And this is a pain. I know because I’ve done it many, many times.
The last time I needed to write a blend file it was for a huge questionnaire with hundreds of multi-response sets, so I had a think about how to automate the process, at least for multi-response. What I came up with was this:
- Check that all dichotomous variables in the SAV have the same format
- Copy the Name column to Excel
- Copy the Values column to Excel
Save the two columns as a tab-delimited text file called VarList.txt in the Source subdirectory of your job.
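The first check in the list can be partly automated once VarList.txt exists. The sketch below assumes each line holds a variable name, a tab, then the Values text as pasted from SPSS, and that the set name is whatever precedes the last underscore. It reports sets where only some members carry the {0, No} marker. FindMixedStems and the sample lines are hypothetical, not part of Ruby or of any real job.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module VarListCheck
    ' Returns the stems (text before the last underscore) whose members
    ' disagree on whether the Values text contains "{0, No}".
    Function FindMixedStems(lines As IEnumerable(Of String)) As List(Of String)
        Dim withMarker As New HashSet(Of String)
        Dim withoutMarker As New HashSet(Of String)
        For Each line As String In lines
            Dim fields As String() = line.Split(ControlChars.Tab)
            If fields.Length < 2 Then Continue For
            Dim cut As Integer = fields(0).LastIndexOf("_"c)
            If cut <= 0 Then Continue For ' name has no stem_index shape
            Dim stem As String = fields(0).Substring(0, cut)
            If fields(1).Contains("{0, No}") Then
                withMarker.Add(stem)
            Else
                withoutMarker.Add(stem)
            End If
        Next
        ' A stem seen both with and without the marker is inconsistent
        Dim mixed As New List(Of String)
        For Each stem As String In withMarker
            If withoutMarker.Contains(stem) Then mixed.Add(stem)
        Next
        Return mixed
    End Function

    Sub Main()
        ' Hypothetical sample lines, not taken from a real job
        Dim sample As String() = {
            "Q4_01" & ControlChars.Tab & "{0, No}...",
            "Q4_02" & ControlChars.Tab & "{1, Yes}...",
            "Q5_1" & ControlChars.Tab & "{0, No}..."
        }
        For Each stem As String In FindMixedStems(sample)
            Console.WriteLine("Inconsistent coding in set: " & stem)
        Next
    End Sub
End Module
```

With the sample lines, only Q4 is reported, because Q4_02 lacks the marker that Q4_01 has. To run it against a real job, pass System.IO.File.ReadLines on the VarList.txt path instead of the sample array.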
Note that all multi-response sets are stored in the same way, with {0, No} as the first code. This is used to identify candidates for blending. This VB.Net subroutine…
    ' Requires Imports System.Collections.Generic at the top of the file.
    ' sourcedir is assumed to be a String, defined elsewhere, holding the
    ' path of the job's Source directory (including the trailing separator).
    Sub WriteBlendFile()
        Dim fpath As String = sourcedir & "VarList.txt"
        Dim infile As New System.IO.StreamReader(fpath, True)
        Dim line, subarray(), subsubarray(), varname, varparams As String
        Dim blendvarlist As List(Of String()) = New List(Of String())

        While Not infile.EndOfStream
            '' copy the code definitions into string lines
            line = infile.ReadLine
            varparams = ""
            If InStr(line, "{0, No}") > 0 Then
                subarray = Split(line, vbTab)
                varname = subarray(0)
                subsubarray = Split(varname, "_")
                If UBound(subsubarray) = 1 Then '' like QUOTA_01
                    varparams = subsubarray(0)
                    If IsNumeric(subsubarray(1)) Then
                        If Left(subsubarray(1), 1) = "0" Then '' like q4_01
                            '' store the blend target varname and the #width
                            varparams = varparams & vbTab & subsubarray(1).Length
                        Else
                            varparams = varparams & vbTab & "-1"
                        End If
                    End If
                ElseIf UBound(subsubarray) = 2 Then '' like Q4_1_01
                    varparams = subsubarray(0) & "_" & subsubarray(1)
                    If IsNumeric(subsubarray(2)) Then
                        If Left(subsubarray(2), 1) = "0" Then '' zero-padded index
                            '' store the blend target varname and the #width
                            varparams = varparams & vbTab & subsubarray(2).Length
                        Else
                            varparams = varparams & vbTab & "-1"
                        End If
                    End If
                Else
                    '' do nothing
                End If
                '' only names with a numeric index carry a tab and become candidates
                If InStr(varparams, vbTab) > 0 Then blendvarlist.Add(varparams.Split(vbTab))
            End If
        End While
        infile.Close()

        fpath = sourcedir & "ToyotaBCT.bln"
        Dim outfile As New System.IO.StreamWriter(fpath, False, System.Text.Encoding.UTF8)
        '' start at 0 and run to the last entry so that the first and last sets are not skipped
        For i As Integer = 0 To blendvarlist.Count - 1
            '' first entry, or next blend target found, so write the entry
            If i = 0 OrElse blendvarlist(i)(0) <> blendvarlist(i - 1)(0) Then
                outfile.WriteLine("[" & blendvarlist(i)(0) & "]")
                If blendvarlist(i)(1) <> "-1" Then
                    outfile.WriteLine("pattern=" & blendvarlist(i)(0) & "_#" & blendvarlist(i)(1))
                Else
                    outfile.WriteLine("pattern=" & blendvarlist(i)(0) & "_#")
                End If
                outfile.WriteLine("label=L")
                outfile.WriteLine(vbNewLine)
            End If
        Next
        outfile.Close()
    End Sub
…writes this blend file
    [QUOTA1]
    pattern=QUOTA1_#2
    label=L

    [QUOTA6]
    pattern=QUOTA6_#2
    label=L

    [S1a]
    pattern=S1a_#
    label=L

    … etc
The above subroutine handles extensions like _01, _02 (a leading zero). It is not production code, however, and you will probably have to modify it for your circumstances.
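One modification worth considering: the two branches of the subroutine differ only in how many underscore-separated parts make up the blend target. A possible simplification, sketched here under the assumption that the index is always the text after the last underscore, copes with any number of parts. ParseBlendName is a hypothetical helper, not part of Ruby, and it tests for digits only rather than using IsNumeric.

```vbnet
Imports System

Module BlendNameParsing
    ' Splits a name such as Q4_1_01 into the blend target (Q4_1) and the
    ' digit width for the pattern: the index length when it is zero-padded,
    ' otherwise -1. Returns False when the text after the last underscore
    ' is missing or is not all digits.
    Function ParseBlendName(varname As String, ByRef target As String, ByRef width As Integer) As Boolean
        Dim cut As Integer = varname.LastIndexOf("_"c)
        If cut <= 0 OrElse cut = varname.Length - 1 Then Return False
        Dim index As String = varname.Substring(cut + 1)
        For Each ch As Char In index
            If Not Char.IsDigit(ch) Then Return False
        Next
        target = varname.Substring(0, cut)
        width = If(index.StartsWith("0"), index.Length, -1)
        Return True
    End Function

    Sub Main()
        Dim target As String = ""
        Dim width As Integer = -1
        ' Sample names follow the shapes mentioned in the subroutine's comments
        For Each candidate As String In {"QUOTA1_01", "Q4_1_01", "S1a_3", "AGE"}
            If ParseBlendName(candidate, target, width) Then
                Dim pattern As String = target & "_#" & If(width > 0, width.ToString(), "")
                Console.WriteLine("[" & target & "] pattern=" & pattern)
            Else
                Console.WriteLine(candidate & ": not a blend candidate")
            End If
        Next
    End Sub
End Module
```

For the four sample names this gives the patterns QUOTA1_#2, Q4_1_#2 and S1a_#, and rejects AGE. As in the original routine, the width is taken from whichever member of a set is examined, so it should be applied to the first variable in each set.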
Blend files work best when the SAV is internally consistent. Some simple guidelines are:
- Names as var_1, var_2, … and not var_01, var_02,…
- Multi-response and grid/cube variable sets all consistently named
- Variable descriptions all in the same format, e.g.
Q1_1 Brand Last Bought – McDonald’s
Q1_2 Brand Last Bought – Hungry Jack’s
Here, the format is <varname>_<var index> <var description> <hyphen> <code label>.
We do NOT want something like
Q1_2 Brand Last Bought – Hungry – Jack’s
(has two hyphens)
or
Q1_3 Wendy’s – Brand Last Bought
(var description and code label reversed)
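Both failure cases can be caught mechanically before the blend file is written. The sketch below assumes the labels of one variable set have already been collected into a list, and that the separator is a spaced en dash or a spaced plain hyphen. FindBadLabels is a hypothetical helper, and the sample labels are adapted from the examples above with ASCII punctuation.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module LabelCheck
    ' Accepted separators (an assumption): spaced en dash or spaced hyphen
    Private ReadOnly Separators As String() = {" " & ChrW(&H2013) & " ", " - "}

    ' Returns the labels in one variable set that break the expected
    ' "<var description> <hyphen> <code label>" shape.
    Function FindBadLabels(labels As IList(Of String)) As List(Of String)
        Dim bad As New List(Of String)
        Dim expected As String = Nothing
        For Each label As String In labels
            Dim parts As String() = label.Split(Separators, StringSplitOptions.None)
            If parts.Length <> 2 Then
                bad.Add(label)          ' no separator, or more than one
            ElseIf expected Is Nothing Then
                expected = parts(0)     ' first well-formed label sets the description
            ElseIf parts(0) <> expected Then
                bad.Add(label)          ' description differs, possibly reversed
            End If
        Next
        Return bad
    End Function

    Sub Main()
        Dim labels As String() = {
            "Brand Last Bought - McDonald's",
            "Brand Last Bought - Hungry - Jack's",
            "Wendy's - Brand Last Bought"
        }
        For Each label As String In FindBadLabels(labels)
            Console.WriteLine("Check label: " & label)
        Next
    End Sub
End Module
```

Here the second and third labels are reported. The check trusts the first well-formed label in the set, so if that one happens to be the reversed label, every other member is flagged instead. Either way the set gets a second look.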
If the SAV (or SPS) file is internally consistent, then you should be able to script a blend file writer appropriate for each job.