Premature optimization is the root of all evil

July 23, 2016

Shibboleth Installation and Configuration

Shibboleth is an open-source SAML implementation that is used for single sign-on. We are developing a SAML IdP, and I was testing it against Shibboleth. Here are some pieces of information that I want to keep for my records. I used Shibboleth with IIS 8.

Installation

I downloaded the latest package from the download page. At the time of writing it was shibboleth-sp-2.5.6.0-win64.msi. The installer targets IIS 6, so I had to perform the manual steps outlined on the IIS 7 Installer page.

The scripts on that page more or less work, but watch out for the extra newline before “/+[path”: you will need to remove it for the script to work. If you don’t, you will get this error:

Handler "Shibboleth" has a bad module "ManagedPipelineHandler" in its module list

If this happens, make sure to repeat the manual steps for mapping the .sso extension to the Shibboleth filter.
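
For reference, the mapping boils down to two appcmd calls along these lines (a sketch from memory, not the literal script from the wiki; adjust the DLL path to your install):

%windir%\system32\inetsrv\appcmd set config /section:isapiFilters /+"[name='Shibboleth',path='C:\opt\shibboleth-sp\lib\shibboleth\isapi_shib.dll',enabled='true']"
%windir%\system32\inetsrv\appcmd set config /section:handlers /+"[name='Shibboleth',path='*.sso',verb='*',modules='IsapiModule',scriptProcessor='C:\opt\shibboleth-sp\lib\shibboleth\isapi_shib.dll',requireAccess='Script']"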

After the installation is done, Shibboleth will protect the content of the http://yourhost/secure virtual directory.

Configuration

My Shibboleth configuration file. Note: this is not the actual file we use; some company-specific information was removed.

Shibboleth configuration is stored in the shibboleth2.xml file. The standard configuration process involves going through the file and replacing default values such as host names with our specific data. However, the default configuration uses the SAML discovery protocol, which we do not support, so I had to perform more significant modifications:

  1. Copy IdP’s SAML metadata to Shibboleth configuration directory (C:\opt\shibboleth-sp\etc\shibboleth).
     
  2. Add the following node under <ApplicationDefaults>:
    <MetadataProvider type="XML" file="YourIdpMetadata.xml"/>
     
  3. Remove <SSO> and <Logout> nodes under ApplicationDefaults/Sessions.
     
  4. Add <SessionInitiator> node in their place:
    <SessionInitiator type="SAML2" Location="/Login" isDefault="true" id="Intranet"
    relayState="cookie" entityID="https://your.idp/EntityId">
    </SessionInitiator>

I am not certain what the role of the Location attribute is. It appears to be ignored: SAML requests are sent to the URL from the metadata.
     
  5. If you leave it at that, Shibboleth will come up with a very cryptic message when you try to access http://yourhost/secure:
    Unable to locate a SAML 2.0 ACS endpoint to use for response.
    “ACS” here stands for “Assertion Consumer Service”. To get rid of the error, add the following node under <SessionInitiator>:

    <md:AssertionConsumerService Location="/SAML2/POST" index="1" 
    Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" conf:ignoreNoPassive="true" />

    The documentation claims that an example is distributed with the default shibboleth2.xml, but that is no longer the case. However, some examples can be found in the example-shibboleth2.xml file.
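
Putting steps 3–5 together, the relevant part of the <Sessions> element ended up looking roughly like this (the Sessions attributes shown are the stock ones, trimmed; entity IDs are placeholders):

<Sessions lifetime="28800" timeout="3600" checkAddress="false" handlerSSL="false">
  <SessionInitiator type="SAML2" Location="/Login" isDefault="true" id="Intranet"
      relayState="cookie" entityID="https://your.idp/EntityId">
    <md:AssertionConsumerService Location="/SAML2/POST" index="1"
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" conf:ignoreNoPassive="true"/>
  </SessionInitiator>
</Sessions>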

Restarting Shibboleth

Some configuration changes are picked up automatically, but for others you have to restart Shibboleth. Run the following script as administrator:

net stop shibd_default
iisreset
net start shibd_default

Log level

Log levels are set in the configuration file shibd.logger. This is a standard log4j-style configuration file: change “INFO” in the second line to “DEBUG” to get more detailed output. Don’t forget to restart Shibboleth after that.
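
For reference, the top of the stock shibd.logger looks roughly like this (quoting from memory):

# set overall behavior
log4j.rootCategory=INFO, shibd_log, warn_log

The log4j.rootCategory line is the one to edit.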

What’s Next

I will discuss how to integrate Shibboleth SP with your IdP to control access to various resources.

July 22, 2016

Outlook glitch

A couple of hours ago my Outlook at work stopped working, getting stuck on “Trying to connect”. The Exchange server appeared to be alive. Restarting and rebooting did not help.

I managed to unstick it by going into File -> Account Settings -> Account Settings (button) -> Double click on account -> More settings -> Security -> Always prompt for logon credentials (wow, what a way to structure the settings dialog!).

The next time Outlook ran, it asked me for a user name and password; I entered them and voilà – now it could connect. After that I went back to the same setting and unchecked the box.

The weird part is that it never gave me an “invalid credentials” prompt, it would just hang.

July 16, 2016

SAML and Shibboleth

What is SAML

I am currently dealing with a project that involves single sign-on (SSO) and the SAML protocol. Single sign-on means you enter your credentials once, and then use your identity on multiple web sites. The web sites must talk to each other to verify your identity without exposing your user name and password. Two leading protocols for that purpose are OAuth and SAML.

An interesting twist is that we already had a quasi-SAML SSO implementation, written a long time ago by some consultants from a galaxy far, far away, but it fell quite short of actually implementing the standard. SAML is an XML-based standard coming from OASIS, like SOAP, WSDL, WS-Security, etc. Like most XML-based standards coming from OASIS, SAML is a verbose soup of XML tags governed by complex rules, which are easily misunderstood. Some of the errors in our old implementation came from simply not reading the standard carefully enough, but others arose from fundamental misunderstandings, which I will cover later.
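
To give a taste of that verbosity: even a minimal SAML 2.0 authentication request looks roughly like the following (simplified, with made-up host names; real requests carry more attributes and are often signed):

<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    ID="_a1b2c3d4" Version="2.0" IssueInstant="2016-07-16T09:30:47Z"
    AssertionConsumerServiceURL="https://sp.example.com/SAML2/POST"
    ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST">
  <saml:Issuer>https://sp.example.com/shibboleth</saml:Issuer>
  <samlp:NameIDPolicy AllowCreate="true"
      Format="urn:oasis:names:tc:SAML:2.0:nameid-format:transient"/>
</samlp:AuthnRequest>

And this is just the request; the response, with its assertion, subject confirmation, and signatures, is several times bigger.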

Shibboleth

[read more…]

June 16, 2016

REST: PUT request and calculated fields

If I have a REST server and do a PUT on a resource, should the server return the new resource state in the response, or should the response body be empty?

RFC 7231 is suspiciously silent about that. It has a lengthy discussion of what status code to return, but says virtually nothing about the response body. It is clear that an empty response body is legal, since one of the possible return codes is 204 No Content, but there is no recommendation about non-empty bodies.

One can make arguments for both an empty and a non-empty body. PUT is allegedly supposed to create or replace the resource in its entirety (although the RFC does not explicitly say that), so the client should already know what resultant state it seeks. Thus, sending back the full-fledged resource state will only waste bandwidth.

On the other hand, the RFC states that the server is allowed to modify client input to make it conform to internal rules, e.g. convert the input to a different format. Additionally, the resource may contain calculated fields that are hard or impossible to obtain on the client: modification time, number of hits, current orbital momentum, a link to the XKCD cartoon the server deems most relevant to the subject, etc. In most of those scenarios the client will have to issue an immediate GET request to fetch the data back from the server, so why not just serve it back right away and save a round-trip?

The presence of calculated fields raises other interesting questions. E.g., if the client makes the same PUT request twice and gets different calculated fields back, does that mean the PUT implementation is not idempotent? If the client sends in wrong values for the calculated fields, should the server reject the PUT request? But I digress.

If you search the Internet, you will find equally convincing opinions (with lots of upvotes) that the body should be empty, because it’s THE RIGHT WAY, and that the body should not be empty, because it makes client’s life unnecessarily hard.

After giving it some thought, I think the server SHOULD return the new resource state when practical. I initially wrote a server that returned an empty body, and promptly found myself doing a GET in the client after every successful PUT. This does not feel right. Of course, there are cases when the state is too large to return (think of PUTting a 100MB file), but I feel these are the exception rather than the norm.
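
For what it’s worth, here is a minimal sketch of that approach in ASP.NET Web API terms (the Widget type and the in-memory store are made up for illustration):

using System;
using System.Collections.Generic;
using System.Web.Http;

public class Widget
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime Modified { get; set; } // server-calculated field
}

public class WidgetsController : ApiController
{
    private static readonly Dictionary<int, Widget> Store = new Dictionary<int, Widget>();

    // PUT api/widgets/42: replace the resource, then echo its new state back,
    // including the server-calculated fields, so the client can skip the follow-up GET.
    public IHttpActionResult Put(int id, Widget widget)
    {
        widget.Id = id;
        widget.Modified = DateTime.UtcNow; // a calculated field the client cannot supply
        Store[id] = widget;
        return Ok(widget); // 200 OK with the resulting state, rather than 204 No Content
    }
}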

PS. I suspect the RFC says nothing about the response body, because the committee members could not agree on what recommendation to PUT in :)

Links:
http://tools.ietf.org/html/rfc7231#section-4.3
http://blog.ploeh.dk/2013/04/30/rest-lesson-learned-avoid-204-responses/
http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
http://stackoverflow.com/questions/797834/should-a-restful-put-operation-return-something
http://stackoverflow.com/questions/29217881/in-rest-if-updating-an-object-automatically-changes-the-last-modified-date-s
http://stackoverflow.com/questions/9450611/put-and-idempotent
http://51elliot.blogspot.com/2014/05/rest-api-best-practices-3-partial.html
http://programmers.stackexchange.com/questions/211275/should-an-http-api-always-return-a-body

June 14, 2016

OOP Gone Wild

In a piece of code I inherited from some smart, well-meaning, but slightly disorganized East European consultants, I found the following class hierarchy:

RestClient (abstract)
   JsonClient (abstract)
      XyzClient*
          SessionClient
      ApiClient


* XYZ is our proprietary service. The name was changed to protect the innocent.

It’s been a while since I’ve seen a class hierarchy 4 levels deep, especially one with concrete classes inheriting from other concrete classes. Somehow these days people prefer composition. None of the abstract classes contains any abstract methods, just protected virtuals with implementations. So why are those classes abstract, then?
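
For contrast, a composition-based version of the same thing might look like this (a sketch with hypothetical names, not the actual code):

using System.Net.Http;
using Newtonsoft.Json;

// One concrete, sealed client that owns its collaborators instead of inheriting them.
public sealed class XyzClient
{
    private readonly HttpClient _http;                  // REST transport concerns
    private readonly JsonSerializerSettings _settings;  // JSON serialization concerns

    public XyzClient(HttpClient http, JsonSerializerSettings settings)
    {
        _http = http;
        _settings = settings;
    }

    // XYZ-specific calls are built on top of the two collaborators.
}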

I believe someone established long ago that inheriting concrete classes from other concrete classes is a bad idea, but I have already forgotten who, and I hope his advice was not based on the C++ “slicing” phenomenon.

The separation of responsibilities between the classes is somewhat unclear. ApiClient is described as “Base client class for REST API clients that manages XYZ session”. I wonder, then, what the difference is between it and SessionClient. Anyhow, it probably does not matter, since ApiClient is not used anywhere :) In fact, deeper investigation shows that all of these classes are dead code. We do have a place that makes a REST call and parses JSON, but it does not use any of those classes, and neither does any other piece of callable code.

Don’t do this at home, folks.

May 30, 2016

Looking back at WPF/Silverlight/WinRT

Last week I had to write a small WPF application, and I was surprised how hard it is to get WPF right. By “right” I mean not just getting the code to work, but writing easily maintainable code I won’t feel ashamed of later. I haven’t been programming in WPF for about a year. A year is not long enough to lose touch, but long enough to get some perspective. While I was in the thick of it, I did not realize how ridiculous some things really are.

Of course, WPF is virtually dead, so its problems may not really matter, but WPF’s alleged successor “Modern UI” a.k.a. “Metro” a.k.a. “WinRT” inherited most of WPF’s baggage. So, whenever I say “WPF”, it really means “WPF, Silverlight, Metro, and all other XAML-based technologies”. In fact, WPF’s successors made some problems worse, e.g. by excluding MarkupExtension.

WPF problem number one is that it is verbose. You have to write too much code to achieve simple things. Verbosity in WPF comes in multiple flavors:

  • XAML is verbose.
  • NotifyPropertyChanged is verbose and repetitive.
  • Value converters are verbose.
  • Dependency properties are verbose.
  • Attached properties are outright scary.
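
Take the NotifyPropertyChanged item: a view model with a single bindable property needs roughly this much ceremony (PersonViewModel is a made-up name, but the pattern is the standard one):

using System.ComponentModel;

public class PersonViewModel : INotifyPropertyChanged
{
    private string _name;

    public string Name
    {
        get { return _name; }
        set
        {
            if (_name != value)
            {
                _name = value;
                OnPropertyChanged("Name");
            }
        }
    }

    public event PropertyChangedEventHandler PropertyChanged;

    private void OnPropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
            handler(this, new PropertyChangedEventArgs(propertyName));
    }
}

Multiply this by every property on every view model, and the repetitiveness becomes apparent.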

WPF problem number two is the lack of a consistent programming model. Putting business logic in views leads to mixing it with presentation, and your data binding looks awkward. Putting business logic in view models leads to ridiculous questions like “how do I handle double click”, “how do I open a new window from my view model”, or “how do I set focus to a control from my view model”. Whichever way you do it, it is either complex, creates unwanted dependencies, or both.

[read more…]

May 27, 2016

You ain’t gonna need it… You ain’t gonna need it…

Two parables about the YAGNI principle: both true stories.

A long time ago in a company far, far away there was a piece of software. Every time we designed a new feature, we were told that “it must work when the computer is disconnected from the network”. Sometimes it did not matter; sometimes it complicated the design a lot. Punch line: no one ever actually checked that the software works when disconnected from the network. I tried it once, and it promptly hung on startup. Meaning: the rule about working without the network existed only in theory, and its only real effect was to make things more complicated. No one ever cared about it in practice, but no one dared to question it either, so it poisoned our lives for years.

A different company, a different piece of software, not so long ago. Pretty much every “update” operation in the software had a parameter for an “as of” date. So you could, say, cancel your subscription as of next Monday. This parameter existed only on the server side: no actual UI had it, and all real requests were to be executed immediately. Nevertheless, the parameter was passed around the entire body of the server code. No one asked questions like what would happen if I tried to cancel my subscription as of a date before it was activated (the subscription equivalent of killing your own grandfather), or if I activated the subscription in the year 2038, but then got my account closed in 2017. There were no checks for that in the code. When we removed the parameter, not a single unit test failed (and we had quite a few). Meaning: someone thought it would be cool to have an “as of” date, but they never contemplated all the consequences, and the parameter was never used in practice. Its only real effect was to introduce annoying bugs when the client clock and the server clock went out of sync, giving users an opportunity to try to kill their own grandfather.

March 30, 2016

How to copy trusted root certificates to another machine

I created a VM from an image with a very clamped-down security setup. In particular, it had a very limited set of trusted root CAs. It would not even trust https://www.microsoft.com. So, I decided to copy the list of root CAs from my machine to that machine.

Exporting root CAs is easy: go to Control Panel, Administrative Tools, Manage Computer Certificates, then in the tree expand Trusted Root Certification Authorities and select Certificates. Select all items (Ctrl+A), right-click, All Tasks, Export. I chose the .sst format and got myself a nice .sst file.

Importing that file into the VM proved to be more difficult. After some googling I found this article that contains a PowerShell snippet that does the job:

# Load the .NET security assembly
[reflection.assembly]::LoadWithPartialName("System.Security")
# Read all certificates from the exported .sst file
$certs = new-object system.security.cryptography.x509certificates.x509certificate2collection
$certs.import("certificates.sst")
# Open the machine-wide AuthRoot store for writing and add the certificates
$store = new-object system.security.cryptography.X509Certificates.X509Store -argumentlist "AuthRoot", LocalMachine
$store.Open([System.Security.Cryptography.X509Certificates.OpenFlags]"ReadWrite")
$store.AddRange($certs)
$store.Close()

I copied this snippet into a file named import.ps1 and then executed it from PowerShell (“./import.ps1”). It worked great. I am not sure why Microsoft provides an export UI but leaves us to hunt for a way to import, but that’s a different question.

February 24, 2016

Access denied when creating a certificate enrollment request

As part of creating a self-signed certificate, we use the following code:

var enroll = new CX509Enrollment();
enroll.InitializeFromRequest(cert);            // cert is the IX509CertificateRequest built earlier
enroll.CertificateFriendlyName = friendlyName;
string csr = enroll.CreateRequest();           // may fail with ACCESS DENIED

The latter is a call to the COM method IX509Enrollment::CreateRequest(). If you are not running with elevated privileges, it will return “access denied”, because it wants write access to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SystemCertificates. Note that in the latest versions of Windows most programs run without elevated privileges, even if the current user has administrator rights. The program executing the above code must be specifically started via “Run as administrator”.
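
If the program always needs elevation, one option (my addition, not part of the original fix) is to declare it in the application manifest, so Windows shows the UAC prompt at startup instead of failing later:

<!-- app.manifest -->
<trustInfo xmlns="urn:schemas-microsoft-com:asm.v2">
  <security>
    <requestedPrivileges>
      <requestedExecutionLevel level="requireAdministrator" uiAccess="false" />
    </requestedPrivileges>
  </security>
</trustInfo>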

January 17, 2016

A year of using Git: the good, the bad, and the ugly

I have been working with Git for about a year now, and I think I am ready for a summary. Before Git I used SVN, and before that Perforce, TFS, and some other products, but I will use SVN as the prototypical “non-Git” system.

Git buys you flexibility and performance for the price of greatly increased complexity of workflow.

SVN will work better for you if

  1. All your developers work under single management, AND
  2. All developers can get relatively fast access to a central server.

Conversely, you should prefer Git if

  1. There are multiple independent groups of developers that can contribute to the project, AND/OR
  2. It is difficult to provide all developers with fast access to a central server.

Here is my take on the good, the bad, and the ugly sides of Git.

GOOD: great performance, full local source control that works even when disconnected from the server, ability to move work quickly between servers without losing history.

BAD: complexity of workflow, more entities to keep track of, lots of new confusing terms, the “one repo is one project” policy, limited/outdated information about the remote, misleading messages. Commit identifiers are not sequential integers, which complicates build/version number generation.
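
For the build number problem there is a reasonable workaround (my own suggestion, not something Git advertises): derive a number from the commit graph, e.g.

git rev-list --count HEAD
git describe --tags

The first command prints the number of commits reachable from HEAD, which grows monotonically on a given branch; the second prints the nearest tag plus the distance from it, e.g. v1.2-14-g2414721.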

UGLY: you can lose work by deleting branches or tags.

[read more…]