MD's Technical Sharing



Sunday, June 13, 2010

Extension method and VB.NET With construct

Recently I came across a CodeProject article about extension methods in VB.NET, where one reader complained that he could not get an extension method to work when calling it inside VB.NET's With construct. The reader's comment is very long and contains irrelevant code, but the issue is valid. This post will demonstrate the issue and summarize some of my findings.

Suppose you have the following code:

Snippet #1:

    <System.Runtime.CompilerServices.Extension()>
    Public Sub ModifyString(ByRef source As String)
        source += " modified"
    End Sub

    Sub Main()
        Dim str As String = "original"
        str.ModifyString()
        Debug.WriteLine(str)
    End Sub 

When the code is run, you will get the value "original modified" as expected. Now change the extension method call to be inside a With construct:

Snippet #2:
 
    Sub Main()
        Dim str As String = "original"
        With str
            .ModifyString()
        End With
        Debug.WriteLine(str)
    End Sub

You will now get "original", i.e. the string str is never modified! This looks like a bug rather than a mistake in the code, and nothing in the source hints at what went wrong. Nevertheless, let's look at the IL code for the above 2 code samples.

Snippet #1 in IL (No 'With' is used):


         .method public static void Main() cil managed
        {
            .custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
            .entrypoint
            .maxstack 1
            .locals init (
                [0] string str)
            L_0000: nop
            L_0001: ldstr "original"
            L_0006: stloc.0
            L_0007: ldloca.s str
            L_0009: call void ConsoleApplication1.Module1::ModifyString(string&)
            L_000e: nop
            L_000f: nop
            L_0010: ret
        }

Snippet #2 in IL ('With' is used):

         .method public static void Main() cil managed
        {
            .custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
            .entrypoint
            .maxstack 1
            .locals init (
                [0] string str,
                [1] string str2)
            L_0000: nop
            L_0001: ldstr "original"
            L_0006: stloc.0
            L_0007: ldloc.0
            L_0008: stloc.1
            L_0009: ldloca.s str2
            L_000b: call void ConsoleApplication1.Module1::ModifyString(string&)
            L_0010: nop
            L_0011: ldnull
            L_0012: stloc.1
            L_0013: nop
            L_0014: ret
        }

The difference is obvious when a text comparison tool such as ExamDiff is used. When a With construct is used, the compiler generates extra code to create a copy of the string variable (str is copied to str2) and passes the address of that copy to the extension method ModifyString! So whatever changes are made to the string have no effect on the original variable, in spite of the ByRef keyword that passes the string by reference. This explains why we get the original value of the string variable.
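
To make the hidden copy concrete, here is a rough hand-written equivalent of what the compiler appears to emit for Snippet #2. It is only a sketch based on the IL above: the variable name temp is my own invention (the compiler-generated local shows up as str2 in the decompiled IL), and the module name is assumed.

    Imports System.Diagnostics
    Imports System.Runtime.CompilerServices

    Module Module1
        <Extension()>
        Public Sub ModifyString(ByRef source As String)
            source += " modified"
        End Sub

        Sub Main()
            Dim str As String = "original"
            ' Hypothetical stand-in for the hidden local created by With
            Dim temp As String = str
            ' The ByRef argument binds to the copy, not to str
            temp.ModifyString()
            ' The IL shows the hidden local being cleared at End With
            temp = Nothing
            ' str was never passed to ModifyString, so it is unchanged
            Debug.WriteLine(str)
        End Sub
    End Module

Written out this way, it is no surprise that str still holds "original": the only variable ever handed to ModifyString is temp.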

Now change the code to:

Snippet #3:

    Sub Main()
        Dim str As String = "original"
        With str
            str.ModifyString()
        End With
        Debug.WriteLine(str)
    End Sub

We still use the With construct, but instead of using shorthand to call the extension method, we explicitly refer to the string variable. Guess what, now you'll get the correct result "original modified"! Let's look at the IL code to see what happened:

Snippet #3 in IL:

         .method public static void Main() cil managed
        {
            .custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
            .entrypoint
            .maxstack 1
            .locals init (
                [0] string str,
                [1] string str2)
            L_0000: nop
            L_0001: ldstr "original"
            L_0006: stloc.0
            L_0007: ldloc.0
            L_0008: stloc.1
            L_0009: ldloca.s str
            L_000b: call void ConsoleApplication1.Module1::ModifyString(string&)
            L_0010: nop
            L_0011: ldnull
            L_0012: stloc.1
            L_0013: nop
            L_0014: ret
        }

A copy of the original string variable (str2) is created as usual, but it is never used. Instead, the address of the original string variable (str) is passed to the extension method. This explains why everything works as intended.

The conclusion is to never use With...End With together with extension methods, as you may get unexpected results. As for the solution, well, I'll leave it up to whoever designs the .NET framework...
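
If you control the extension method, one possible way to sidestep the problem is to return the new string instead of modifying the argument ByRef. The following is only a sketch of that idea, not an official fix; the name AppendModified and the module name are made up for this example.

    Imports System.Diagnostics
    Imports System.Runtime.CompilerServices

    Module Module1
        ' Hypothetical replacement for ModifyString that returns its result
        <Extension()>
        Public Function AppendModified(ByVal source As String) As String
            Return source & " modified"
        End Function

        Sub Main()
            Dim str As String = "original"
            With str
                ' The hidden copy is only read here, so it does no harm;
                ' the result is assigned back to str explicitly
                str = .AppendModified()
            End With
            Debug.WriteLine(str)
        End Sub
    End Module

Because the method only reads its argument, it does not matter whether it receives str or the copy made by With, and str ends up as "original modified". The cost is that the caller has to assign the result explicitly.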

UPDATE (17 June 2010): The issue was submitted to Microsoft Connect here. They acknowledged the issue, yet decided to do nothing, not even add a compilation warning.

3 comments:

  1. The problem is not just the extension method, it is that the extension method takes the source parameter by reference. This is not an expected use case for extension methods, and in fact C# disallows it.

    I do agree that if VB is not going to handle this case correctly, it should not allow ByRef extension methods.

  2. Yes, passing the parameter by reference (and not by value) causes the problem mentioned in my post. I am not sure why C# disallows this, but I believe there are valid use cases for it. An example is below:

    <System.Runtime.CompilerServices.Extension()>
    Public Sub ToHTMLBoldString(ByRef str As String)
    str = String.Concat("<b>", HttpUtility.HtmlEncode(str), "</b>")
    End Sub

    Dim test As String = "Hello World"
    test.ToHTMLBoldString()

  3. It depends on how you look at it. I would say that is a bad extension method, because it transparently modifies a reference that appears to be immutable. Most string manipulation methods return a new string instance; this reinforces the fact that strings are immutable, and makes working with strings easier. Breaking this paradigm results in non-intuitive usage that will likely lead to bugs.

    This is a specific case of a more general design issue. C# requires that arguments which are passed by reference include the "ref" keyword at the call site. This is the result of a design decision that all reference modifications should be visible, for easier reading comprehension. Of course, VB allows arguments to be passed by reference transparently, so there is obviously some disagreement.

