Preventing SQL Injection in BizTalk

I was reviewing a solution recently using the BizTalk WCF-OracleDB Adapter and detected an interesting security vulnerability.

Before you dismiss this as “I’m using SQL Server, I’ll be right mate”, this issue probably also affects the BizTalk WCF-SQL Adapter and while I haven’t addressed fixes for WCF-SQL specifically in this article, there may be helpful overlap.

In our case we were using plenty of code in Construct/Assign shapes to create new Oracle Select message instances. Something like this:

// Create message
InvSelectMsg = Utility.CreateMessage("OurSchemas.InvoiceSelect"); 
InvSelectMsg.COLUMN_NAMES = "INVOICE_ID"; // SELECT
InvSelectMsg.FILTER = System.String.Format("REFERENCE = '{0}'", reference); // WHERE

This looks innocuous enough, but for anyone who’s worked on web apps without an ORM (including the performance-constrained or those simply caught in the crossfire of an ORM battleground) this stands out as a clear-cut SQL Injection nightmare.

Sure enough, testing confirmed there was a vulnerability, which so far I’ve seen when:

Counter-intuitively, SQLEXECUTE messages are less likely to be affected by this issue than the “strongly-typed” message schemas.

As a side note, one of the more concerning things for me is that I’ve regularly heard lines such as “Oracle prevents SQL Injection” by people who when pushed don’t seem to know what SQL injection is. What I think they mean is that Oracle prevents multiple statement execution, which while preventing a section of SQL Injection attacks, does not make Oracle immune.

SQL Injection?

The above sample renders to some SQL like this:

SELECT INVOICE_ID FROM OracleInvoiceView WHERE REFERENCE = 'Some Invoice Ref'

But consider the case where REFERENCE includes a single quote character (e.g. Invoice for Joey's Pianos). The query is now:

SELECT INVOICE_ID FROM OracleInvoiceView WHERE REFERENCE = 'Invoice for Joey's Pianos'

You can probably see from the syntax highlighting that the query ends at Joey' and then has the random extra text Pianos' sitting after the complete query which simply causes a parse error (the best-case in a SQL Injection scenario).

But what if that query was less innocuous? What if, Joey decided to send his REFERENCE as Joey' OR 1=1 --? The query would now return every INVOICE_ID in the system, letting Joey see his competitor’s data:

--// This query shows Joey all invoices. 
--// Note the end -- to comment out the end single-quote and avoid an error 
SELECT INVOICE_ID FROM OracleInvoiceView WHERE REFERENCE = 'Joey' OR 1=1 --' 

Even worse, Joey might send '; DELETE FROM BillsOutstanding; -- which could (depending on configuration as Oracle will usually prevent multiple statement execution preventing this particular extreme case) lead to both a SELECT and DELETE being executed against the database and all of your outstanding bills being wiped from the database.

SELECT INVOICE_ID FROM OracleInvoiceView WHERE REFERENCE = ''; 
DELETE FROM BillsOutstanding; --'

If Joey was subtle enough in what was modified or deleted you might never discover there was an issue.

Exploits of a Mom

[Source: xkcd]

Why hasn’t everyone noticed this already?

I’m not sure, I have a couple of theories:

  • Everyone else was smart enough to fix this issue and thought it too obvious to mention.
  • There’s a really good article out there already that I haven’t come across yet.
  • Everyone else is using the SQLEXECUTE schemas, and only using user-input via the PARAMETERSET.
  • BizTalk is often used for integration of controlled systems where this issue might not arise. (I’d still expect it to cause errors with standard non-malicious data though).

Why doesn’t this (usually) affect SQLEXECUTE messages?

It appears that SQLEXECUTE schemas circumvent this issue by using Oracle Bind Arguments, so the filter clause sent to Oracle looks like this WHERE REFERENCE = :variable, where the :variable value is set from the PARAMETERSET of the SQLEXECUTE message. For this reason SQLEXECUTE messages are most easily created in a map:

SQLEXCEUTE BizTalk Map

These PARAMETERSET values are treated specially by Oracle to prevent injection (you’ve created a parametrised query, so the values are never executed as PL/SQL). Of course, if you generate the actual SQLSTATEMENT section of your SQLEXECUTE message using user input (for example, System.String.Format("SELECT Id FROM SomeView WHERE Column='{1}'", filterValue)) then you’re back to square-one and facing SQL Injection issues. One interesting case I haven’t tested yet is SQLEXECUTE messages built using the PARAMETERSET using wildcard/LIKE filters.

So what do I do?

There’s a range of advice out there on how to address SQL Injection in your language of choice, including:

Usually this boils down to making sure that user-input characters are always treated as user-input and not part of the actual SQL query syntax. This is usually achieved by:

  • Parametrising the query.
  • Escaping non-parametrised special characters (e.g. single quotes)
  • Ensuring non-string values match their expected data type
  • Implementation specific details (depending on the specific SQL database and client used)

For Oracle in particular some resources I came across included:

Specifically?

My solution for resolving Oracle SQL Injection in BizTalk, which is by no means fool-proof or applicable to every scenario, relies upon:

  1. Calling a utility function (to be detailed below) when setting FILTER values
  2. Ensuring that the format string provided to the utility function places single-quotes around string parameters (WHERE val = '{0}' rather than WHERE val = {0})

Some of the known shortcomings of this method, which could be addressed with some further thought are:

  1. Does not ensure that numerics (or other non-string values?) match their expected data type (relies on the calling developer to enforce this)
  2. Relies on developers to ensure their query format-string has quotes around string parameters
  3. Relies on the ToString method of your provided parameters to return the desired representation for your query value (can’t see this being an issue, but it bears noting)

At least one other alternative to my method is enforcing developers to create FILTERs with a fully parametrised class – something similar in usage to System.Data.SqlClient.SqlCommand – although you would need to ensure the implementation fully supports your target database and usage (in our case Oracle).

Out with the code already!

The most primitive bit of code is a simple string escape for single-quotes. Something like:

static string EscapeParameter(object unescapedParameter)
{
    if (unescapedParameter == null)
        return String.Empty;

     // Escape quotes
     return unescapedParameter.ToString().Replace("'", "''"); 
}

You can then wrap this up with something like:

static string EscapeWhere(string format, params object[] unescapedParameters) 
{ 
    /* Escape parameters */
    object[] escapedParameters = unescapedParameters
      .Select((param, index) => EscapeParameter(param))
      .ToArray();

    /* Return filter string with escaped parameters */
    return String.Format(format, escapedParameters);
}

This allows simple usage like this:

SelectMsg = Utility.CreateMessage("OurSchemas.InvoiceSelect"); // Create message
SelectMsg.COLUMN_NAMES = "INVOICE_ID"; // SELECT
SelectMsg.FILTER = Utility.EscapeWhere("REFERENCE = '{0}'", reference); // WHERE

However, this only supports your standard equivalence-style filters. What about wildcard filtering (WHERE val LIKE '%input%')?

Why is LIKE special?

Wildcards/LIKE statements in your WHERE clause are a special case for SQL Injection, at least in Oracle.

Imagine the case where you want to search for products starting with 20% Off:

WHERE Title LIKE '20% Off%' --// This doesn't give us what we want!

Because percent (%) is a special wildcard character when using LIKE, we won’t get the results we are after (it will match 20 Special Things About Falling Off a Horse for instance).

To address this, Oracle (like many languages) allows you to escape special characters like % with another special character. This escape character is usually a backlash (\) although in Oracle you can define your escape character by following your LIKE statement with an ESCAPE '\' statement.

You can include the actual characters % or _ in the pattern by using the ESCAPE clause, which identifies the escape character. If the escape character precedes the character % or _ in the pattern, then Oracle interprets this character literally in the pattern rather than as a special pattern-matching character. You can also search for the escape character itself by repeating it. For example, if @ is the escape character, then you can use @@ to search for @.

Source: Pattern-matching Conditions, Oracle Database SQL Reference

This means we can get what we were after like this:

WHERE Title LIKE '20\% Off%' ESCAPE '\' --// What we were after!

This is great, except now our customers have a whole new set of special characters they can provide that change the behaviour of our queries to Oracle. While this probably doesn’t leave your database open to all of the SQL Injection attacks we’ve discussed, it probably still allows your customers access to data they probably shouldn’t have.

This means if we have a parameter used in a LIKE filter we need to:

  1. Escape all special Oracle wildcard characters
  2. Escape whichever ESCAPE character we’re using
  3. Ensure all LIKE statements have a corresponding ESCAPE clause (this has to be applied per-LIKE which is something that I didn’t find clear from the docs)

The Oracle documentation lists the special wildcard characters as:

An underscore (_) in the pattern matches exactly one character (as opposed to one byte in a multibyte character set) in the value.

A percent sign (%) in the pattern can match zero or more characters (as opposed to bytes in a multibyte character set) in the value. The pattern ‘%’ cannot match a null.

This meant I needed to update my EscapeParameter function to:

  1. Escape existing ESCAPE characters (e.g. \)
  2. Escape LIKE wildcard characters (% and _)
  3. Only do this when the parameter was being used in a LIKE clause

I also needed to make sure the EscapeWhere function was:

  1. Smart enough to work out which parameters need wildcard escaping or not.
  2. Adding in an ESCAPE clause for every LIKE clause (or rely on calling developers to remember this, and be consistent in their selection of an ESCAPE character).

Final Code

Incorporating these new requirements led to the code below.

Before you jump in remember THIS CODE PROBABLY HAS ISSUES, TEST THOROUGHLY! (this is true for any code you find online).

Updated EscapeWhere method

/// String parameters should be enclosed in single-quotes
/// Parameters of other data types (e.g. numerics) should be validated 
/// by the caller before calling this method.
/// USE WITH CAUTION AND EXTENSIVE TESTING
static string EscapeWhere(string format, params object[] unescapedParameters) 
{ 
    /* Regex for LIKE clauses
      * LIKE (Whitespace) 
      * (Single Quote) (Optional Anything But Single Quote)
      * (Start Brace) (Number) (End Brace)
      * (Optional Anything But Single Quote) (Single Quote) */ 
    string likeRegex = @"LIKE\s+'[^']?{(\d+)}[^']?'";

    /* Extract indices of parameters that need wildcard escaping */
    List<int> wildcardParameterIndices = Regex.Matches(format, likeRegex)
      .Cast<Match>()
      .Select(m => m.Groups[1]) // Select the "Decimal" group of each Match
      .Select(g => int.Parse(g.Value)) // Extract the parameter number
      .ToList();

    /* Append ESCAPES to LIKE statements. */
    format = Regex.Replace(format, likeRegex, @"$0 ESCAPE '\'");

   /* Escape parameters, including wildcard escaping where applicable */
   object[] escapedParameters = unescapedParameters
     .Select((param, index) 
        => EscapeParameter(param, wildcardParameterIndices.Contains(index)))
     .ToArray();

   /* Return WHERE filter string with escaped parameters */
   return String.Format(format, escapedParameters);
}

Updated EscapeParameter method

(Which Prettify struggles with due to C# String Literals)

/// If escapeWildcards is true, assumes parameter used in a LIKE clause
/// and also assumes an "ESCAPE '\'" statement will follow the parameter.
/// USE WITH CAUTION AND EXTENSIVE TESTING
static string EscapeParameter(object unescapedParameter, bool escapeWildcards)
{
    if (unescapedParameter == null)
        return String.Empty;

    // Escape quotes
    string escapedParameter = unescapedParameter.ToString().Replace("'", "''"); 

    // Escape wildcards when used in a `LIKE` clause
    if (escapeWildcards)
    {
        escapedParameter = escapedParameter
          .Replace(@"\", @"\\") // Escape escaping slashes (before adding more)
          .Replace("_", @"\_") // Escape underscore for LIKE statements
          .Replace("%", @"\%"); // Escape percentage for LIKE statements
    }

    return escapedParameter;
}

This allows the following usage

SelectMsg = Utility.CreateMessage("OurSchemas.InvoiceSelect"); // Create message
SelectMsg.COLUMN_NAMES = "INVOICE_ID"; // SELECT
SelectMsg.FILTER = Utility.EscapeWhere("NAME LIKE '%{0}%' AND REFERENCE = '{1}'", 
    name, reference); // WHERE

Now as I’ve mentioned, this isn’t tested for every possible scenario. Ideally I’d upload a Visual Studio Test Project with some sample tests, but for the purposes of this blog post I have instead thrown a quick script through LINQPad as Joseph Albahari has promised me an island I really like LINQPad and it’s amazing.

EscapeWhere("WHERE Col = {0}", 1).Dump(); // Test non-string
EscapeWhere("WHERE Col = '{0}'", "Test").Dump(); // Test string
EscapeWhere("WHERE Col = '{0}'", "Test'").Dump(); // Test single-quote
EscapeWhere("WHERE Col = '{0}'", "Test''").Dump(); // Test 2x single-quote
EscapeWhere("WHERE Col LIKE '{0}'", "Test").Dump(); // Test LIKE
EscapeWhere("WHERE Col LIKE '%{0}%'", "Test").Dump(); // Test LIKE %
EscapeWhere("WHERE Col LIKE '%{0}_'", "Test").Dump(); // Test LIKE _
EscapeWhere("WHERE Col LIKE '%{0}%'", "Test%").Dump(); // Test Escapes %
EscapeWhere("WHERE Col LIKE '%{0}%'", "Test_").Dump(); // Test Escapes _
EscapeWhere("WHERE Col LIKE '%{0}%'", @"Test\%").Dump(); // Test Escapes \
// Test 2x LIKE
EscapeWhere("WHERE Col LIKE '%{0}%' AND Col2 LIKE '%{1}%'", "Test", "Test2").Dump(); 
// Test LIKE + non-LIKE
EscapeWhere("WHERE Col LIKE '%{0}%' AND Col2 = '{1}'", "Test", "Test2").Dump(); 

Which prints the following output:

/*
WHERE Col = 1
WHERE Col = 'Test'
WHERE Col = 'Test'''
WHERE Col = 'Test'''''
WHERE Col LIKE 'Test' ESCAPE '\'
WHERE Col LIKE '%Test%' ESCAPE '\'
WHERE Col LIKE '%Test_' ESCAPE '\'
WHERE Col LIKE '%Test\%%' ESCAPE '\'
WHERE Col LIKE '%Test\_%' ESCAPE '\'
WHERE Col LIKE '%Test\\\%%' ESCAPE '\'
WHERE Col LIKE '%Test%' ESCAPE '\' AND Col2 LIKE '%Test2%' ESCAPE '\'
WHERE Col LIKE '%Test%' ESCAPE '\' AND Col2 = 'Test2'
*/

For peace of mind I’d like someone to point me to a reliable and well-tested alternative I can leverage in a similar manner. In the meantime if you do happen to find any issues, or have some feedback, please make sure to let me know either in the comments or via email.