Ivan Krivyakov's Blog

Premature optimization is the root of all evil

June 24, 2010

Researching XML Serializers for .NET – work in progress

After a little discussion at work on which XML serializer for .NET to use, I decided to do a little research. The candidates were the regular XmlSerializer, the XAML serializer (XamlWriter / XamlReader) and the DataContractSerializer from WCF.

Frankly, all of them suck in different ways, but the DataContractSerializer shows the best results so far (for our purposes). I was looking for three things in a serializer:

1. Few or no restrictions on what can be serialized.
2. No unnecessary noise in the resulting XML.
3. Support for versioning.

The second requirement is somewhat subjective, but we want our XML to be as human-readable as possible. Class names with some attributes work well. Long namespace URLs on every corner don’t.

Restrictions

All serializers use CLR attributes to alter their behavior, which puts you out of luck if you want to customize serialization of a class that you cannot modify. Also, there can be only one way to serialize a class; if you need to serialize it differently under different circumstances, this is difficult.

XmlSerializer does not do dictionaries. Anything implementing IDictionary is specifically excluded: “This was partly due to schedule constraints and partly due to the fact that a hashtable does not have a counterpart in the XSD type system”. Apparently, dictionaries are not used widely in Microsoft, since they did not find the time to revisit it in 7 or so years. It will, however, happily serialize generic lists. There are also versions of serializable dictionaries out there, and you can always write your own serialization of difficult parts by implementing IXmlSerializable.

XAML Serializer produces relatively concise XML by putting primitive properties in attributes. It also handles WPF attributes like colors very well. Unfortunately, XAML serializer is very picky about generics in general, and lists and dictionaries in particular. They might occasionally work, but it is very quirky. Unlike other serializers, you cannot get around it by implementing your own custom serialization strategy – AFAIK there is no equivalent of IXmlSerializable for XAML. Also, by default XAML serializer will include all null attributes, so if your class has a lot of nulls, your XML will be large.

WCF Serializer can serialize dictionaries and lists. It is the only serializer that can (in special mode) preserve references (the same object referenced twice) and process circular dependencies.
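As a rough illustration of the reference-preserving mode, here is a minimal sketch. The Node class and the ReferenceDemo helper are hypothetical names made up for this example. It assumes a framework version that has DataContractSerializerSettings (.NET 4.5 or later); on older versions the same option is the preserveObjectReferences argument of the DataContractSerializer constructor.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical type used only for this illustration.
[DataContract]
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Next { get; set; }
    [DataMember] public Dictionary<string, int> Scores { get; set; }
}

public static class ReferenceDemo
{
    // Serializes the graph to memory and reads it back,
    // with object references preserved so that cycles survive.
    public static Node RoundTrip(Node root)
    {
        var settings = new DataContractSerializerSettings
        {
            PreserveObjectReferences = true
        };
        var serializer = new DataContractSerializer(typeof(Node), settings);

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);
            stream.Position = 0;
            return (Node)serializer.ReadObject(stream);
        }
    }
}
```

If two nodes point at each other through Next, the copy returned by RoundTrip should contain the same cycle, and the dictionary contents should come back as well.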

Shape of XML

By default XML serializer produces verbose output: every single thing is an element. It can be made very nice, but it may require applying lots of attributes to your classes.

XAML serializer does not allow much control over the output. It has quite a few namespace declarations, and may also get littered with null values.

WCF serializer does not give you much control either, and everything is an element. If you work in “preserve object references” mode, everything has an “id” attribute on it.

Versioning

XML serializer does not tie your code to class names, so with some tweaking you can achieve a considerable amount of backward compatibility. It is also possible to serialize the output with one class and deserialize with another.

XAML serializer is the worst – it tightly couples the output to classes, assembly names, and down to property names, which makes versioning difficult. It will also fail if a particular property is not available upon deserialization. The only viable option I can think of is to add XSLT before deserializing.

WCF serializer allegedly has some built-in versioning considerations, but I need to research what they are.

May 23, 2010

C# Lambdas: Do you know what you captured?

Posted a larger and more structured article on the (somewhat surprising) capture rules for outer variables inside C# lambdas.

http://www.ikriv.com/en/prog/info/dotnet/lambdas.html

May 17, 2010

Custom build step on file

BTW, in earlier versions of Visual Studio (definitely VC++ 6, but maybe later) it was possible to add an arbitrary custom compilation step for a file, and that worked just fine, with no need to modify projects and the like. They removed this capability in favor of “custom tools”, which, in my opinion, are inferior: you can no longer use simple text transformation command-line utilities “as is”; you need some COM-bloated integration layer around them to make them either MSBuild tasks, or custom tools, or add-ins, or whatever.

T4 – Too Troublesome to Tackle?

Text Template Transformation Toolkit (T4) is a code generator built into Visual Studio 2008 and 2010. Yes, you have it now on your machine :) It was proposed to me as a possible solution to the C# macros problem.

I played with it a little bit and my current verdict is it would be quite difficult to use in a serious project. Here’s why:

1. No automated build integration. This is a killer, especially if your templates include other templates. The only time template code is executed is when you save a template file. Not when you (re)build your solution. Not when you get your stuff from source control. If foobar.tt has changed and there are 50 files that include foobar.tt, you will have to hunt them down, open them in Visual Studio and save them to re-generate the code. That’s 51 files.

This is not T4’s fault per se – all custom tools suffer from this problem, but it does not make a developer’s life any easier. There is some build integration in VS 2010, but “in order to configure a Visual Studio project for build-time template transformation, you have to manually modify the MSBuild definition in the project file”. Not cool.

2. No proper source control integration. When you add your .tt file to source control, the generated file is added along with it, and every time you change anything, both files get checked out.

3. No ability to group all files of a class under one node. This is not a show stopper, but an inconvenience. I would like all code for class Foo to be under one node. Something like Foo.cs with Foo.props.tt below it, which generates Foo.props.cs. I am not sure how it can be accomplished with T4.

See also:
T4: Text Template Transformation Toolkit (Oleg Sych blog)
Understanding T4-MSBuild integration

May 14, 2010

C#: Trouble with Lambdas in For Loops

I had an interesting bug the other day. I wrote a foreach loop along these lines:

foreach (var entry in controlsByName)
{
    entry.Key.SomeEvent += (sender, args) => { ProcessControlName(entry.Value); };
}

Looks innocent enough, right? There is a big catch here. In functional languages we are accustomed to the fact that everything is immutable. I subconsciously transferred this notion to C# lambdas. I thought that the lambdas capture outer variables by value and carry this immutable value with them forever. This is wrong!

C# language specification (found at C:\Program Files\Microsoft Visual Studio 9.0\VC#\Specifications if you have Visual Studio 2008) states in paragraph 7.14.4 that outer variables are captured by reference. They don’t use these exact words, but that’s the idea. If you change the value of the variable, all lambdas that captured it will be affected.

Even more surprisingly, there will be only one copy of the loop variable such as entry above, and all lambdas will share this single copy. This means that the code above has a bug. After the loop has finished, if SomeEvent is fired by one of the controls, the name passed to ProcessControlName() will always be the name of the last control, regardless of which control fired the event!
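The usual way around this is to copy the loop variable into a local declared inside the loop body, so that each lambda captures its own variable. Below is a minimal sketch; CaptureDemo and its method names are made up for illustration. It uses a plain for loop, where a single variable is shared by all iterations in every version of C#. For foreach, the behavior described above applies to the compilers of that time; starting with C# 5 the foreach variable is fresh on each iteration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class used only for this illustration.
public static class CaptureDemo
{
    // Every lambda captures the same variable i, so after the loop
    // they all return its final value (count).
    public static List<Func<int>> SharedCapture(int count)
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            result.Add(() => i);
        }
        return result;
    }

    // The local copy is a new variable on each iteration,
    // so each lambda keeps the value it saw when it was created.
    public static List<Func<int>> PerIterationCapture(int count)
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            result.Add(() => copy);
        }
        return result;
    }
}
```

With a count of 3, the lambdas from SharedCapture all return 3, while those from PerIterationCapture return 0, 1 and 2. Applied to the event handler above, the same idea would mean copying entry.Value into a local before subscribing.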

It turns out you can even exploit this variable sharing to make lambdas communicate, like this:

private void GetTangledLambdas(out Func<int> getter, out Action<int> setter)
{
    int x = 0;
    getter = () => x;
    setter = v => {x = v;};
}

[TestMethod]
public void Lambdas_Can_Communicate_Through_Captured_Variable()
{
    Func<int> getter;
    Action<int> setter;
    GetTangledLambdas(out getter, out setter);
    setter(10);
    Assert.AreEqual(10, getter());
    setter(20);
    Assert.AreEqual(20, getter());
}

This, of course, is not even close to functional programming, even though it uses lambdas :)

May 6, 2010

Lost your thumbnails? Refresh Windows Components

For real. When you install a program that associates itself with image files, you may lose your thumbnails capability in Windows XP. It shows a giant icon of the program instead of the thumbnail. How to fix it? It’s elementary, Watson. Just go to Control Panel -> Add or Remove Programs -> Windows components, don’t change anything and click “next” a couple of times. That’s it! Rumor has it that if you loudly say “Microsoft rules” three times while doing that, it works even better. :)

May 4, 2010

Webdav – a nice alternative to FTP

Recently I needed to let remote users transfer several hundred megabytes of data to my machine. In the old days I would just go ahead and set up an FTP server, but it is so-o 1975. It is insecure, requires punching holes in the firewall, et cetera, et cetera.

So I googled for the alternatives and chose WebDav. I set up a WebDav folder on my Apache with relative ease, and then they used the built-in Windows client (“My Network Places”) to access it. Of course, for it to be secure one needs to have https, and it is not very easy to set up, but fortunately I’ve already got it.

Of course, we used manual uploads. The user just opens my webdav folder in Windows Explorer and drags files to/from it. If automated uploads are required, webdav may be harder to tackle than FTP, since there are fewer command line clients. But you need only one that works, right?

Bottom line – if you have your own web server with HTTPS, webdav is highly recommended for secure file sharing.

BTW, I ran afoul of Windows data redirection again – Apache simply would not pick up my configuration changes, until I realized I am making them in a legacy editor (FAR) without elevated permissions, and all changes go into my local profile.

May 3, 2010

Live+Press: Victory!

I have downloaded the Live+Press plugin for WordPress that can synchronize my WordPress-based blog and my LiveJournal. Of course, with my luck, nothing worked out of the box. After several debugging attempts I found that:

1. PHP will automatically escape special characters in any posted form field, so that “foo’bar” becomes “foo\’bar”.

2. This happens even to passwords.

3. The password hash then comes out wrong and LJ won’t accept it.

4. Needless to say, my password contained a special character.

Moral: never, ever do good deeds prematurely. If someone wants to write strings to the database, let the database worry about the special characters. Not everything POSTed is intended for a database, so there is no point in automatically escaping it (and it is even harmful).

PS. Why does the trackback link look so ugly? Well, I will deal with it later. Meanwhile, sorry for its appearance.

April 29, 2010

Once more about macros in C#

I am writing a WPF app. WPF is very fond of all kinds of notifications about property changes. Thus, I’ve got a class with a bunch of properties similar to this:

    class Person
    {
        void OnPropertyChanged(string propertyName)
        {
        …
        }
 
        public string LastName
        {
            get { return _lastName; }
 
            set
            {
                if (_lastName != value)
                {
                    _lastName = value;
                    OnPropertyChanged("LastName");
                }
            }
        }
        string _lastName;
 
        public int Age
        {
            get { return _age; }
 
            set
            {
                if (_age != value)
                {
                    _age = value;
                    OnPropertyChanged("Age");
                }
            }
        }
        int _age;
    }

This is long, boring, and bloated. Of course, I created a code snippet for generating the properties, but it works only once, at the time of writing; the code is still bloated, and all later changes must be done by hand. Besides, the snippet is unable to change case automatically, so that Age becomes _age.

I would like to see something like:

    class MyClass
    {
        public Observable<string> LastName;
        public Observable<int> Age;
    }
 
    property Observable<T>
    {
        T _value;
        get { return _value; }
        set
        {
            if (_value != value)
            {
                _value = value;
                OnPropertyChanged(#PropertyName);
            }
        }
    }

Of course, the syntax is approximate. Unfortunately, nothing of the sort exists. Generics are of no help here, and there are no macros in C#.
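A generic helper method cannot give the declarative syntax wished for above, but a commonly used partial workaround can at least shrink each setter to a single line. The sketch below is only an illustration under that assumption: ObservableBase and SetField are hypothetical names, not part of WPF, and the property name still has to be passed as a string.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical base class; only INotifyPropertyChanged comes from the framework.
public abstract class ObservableBase : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    protected void OnPropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }

    // Assigns the field and raises the notification only if the value changed.
    protected bool SetField<T>(ref T field, T value, string propertyName)
    {
        if (EqualityComparer<T>.Default.Equals(field, value))
        {
            return false;
        }
        field = value;
        OnPropertyChanged(propertyName);
        return true;
    }
}

public class Person : ObservableBase
{
    string _lastName;
    int _age;

    public string LastName
    {
        get { return _lastName; }
        set { SetField(ref _lastName, value, "LastName"); }
    }

    public int Age
    {
        get { return _age; }
        set { SetField(ref _age, value, "Age"); }
    }
}
```

The backing fields and the property declarations themselves remain, so this reduces the repetition rather than removing it.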

April 14, 2010

WPF drag and drop

I think I have finally found a drag-and-drop solution that seems to do everything I need. It is quite an elaborate piece of software and I am only starting to investigate it, but the demo looks very promising:

http://www.codeproject.com/KB/WPF/gong-wpf-dragdrop.aspx

UPD: Indeed, great stuff. One of the best drag-and-drop solutions out there.