Pattern Matching within SSIS (How to replicate SQL LIKE)

One of the problems with moving a SQL insert statement to SSIS based packages is that logic which is freely available in SQL needs some work arounds to get the same result in SSIS. Such work arounds needs to be developed when working with pattern matching.

In SQL, doing pattern matching is fairly easy. We may match patterns for logical tests using the SQL LIKE clause as follows:

…,[SLAM_HRG_Code] = CASE
LEFT(r.[Service],1) LIKE ‘[A-Za-z]‘
AND SUBSTRING(r.[Service],2,1) LIKE ‘[A-Za-z]‘
AND ISNUMERIC(SUBSTRING(r.[Service],3,2)) = 1
AND SUBSTRING(r.[Service],5,1) LIKE ‘[A-Za-z]‘

The above logic will return a value in HRGCode where the service matches a pattern such as AB09A, otherwise will return NULL.

However, how to do this in an SSIS data loader?
You need to create a Data Flow Script

In order to replicate this, we need to replicate the logic via a script component within the row flow of the Data Flow within a data loader task (2nd tab of the SSIS environment). But don’t be put off, the logic is actually fairly simple.

The steps are:

1) In the Data flow, add a new script task (note, you are not adding it to the main control flow (1st tab), but the data flow – in-between the source data and the destination.
2) Within the script properties, change the language to visual basic (1st set of options)
3) On the input columns, add the columns that will be used in your logic test
4) On the output, add a new data column (select output, then output columns, then press the ADD button)– you will need to change the data type as required in the column properties. For text strings, I suggest using DT_WSTR and give it an appropriate length.
5) Next, you need to code the script – go back to the first set of options (script) and press the EDIT SCRIPT button – a script editor will appear. You are now in the land of code.
6) At the top of the script, you will see a series of IMPORTS lines – add the following line (which adds pattern matching functionality)…. “Imports System.Text.RegularExpressions” (add without the quotes)
7) Now you pattern matching logic goes in the Input0_ProcessInputRow logic – for each row which passes down the line (each record), the subroutine will be called passing in the ROW variable, with the columns you have specified.
8.) So, using the above SQL pattern matching, the VB logic is as follows (note, in SQL, left most character of a substring (get part of the text) is “1”, whereas in the script the left hand character starts at position zero):

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

‘ Pattern Match to generate an HRG Code

‘ First, deal with null
Row.HrgCode_IsNull = True ‘ Set value to null (if it does not match the pattern)
If Row.Service_IsNull Then ‘ Add this clause, tests to see if the incoming test value is null – in which case, don’t process
Row.HrgCode_IsNull = True

‘ Deal with pattern match

If Regex.Match(Row.Service.Substring(0, 1), “[A-Za-z]“).Success = True Then
If Regex.Match(Row.Service.Substring(1, 1), “[A-Za-z]“).Success = True Then
If IsNumeric(Row.Service.Substring(2, 2)) = True Then
If Regex.Match(Row.Service.Substring(4, 1), “[A-Za-z]“).Success = True Then
Row.HrgCode = Row.Service.Substring(0, 5) ‘ Success – all tests are ok
End If
End If
End If
End If
End If

2 Trackbacks / Pingbacks for this entry

arnold, July 26th, 2014 on 6:47 pm

darryl, July 27th, 2014 on 2:45 am