Monday, March 29, 2010

No One Size Fits All

One of the things that gets me particularly hot and bothered under the collar is when people who should know better stand up and claim something as objective truth (I’m going to limit myself to software engineering here, but you can probably infer the rest), when it’s clearly a matter of opinion and circumstance.

Many pundits proselytize agile this way.

For example, people say things like “you should be aiming for 90% test coverage”, and round the room people nod sagely and take notes in their little pads, whilst I’m screaming inside my head and fighting the urge to tackle the speaker to the floor and finish him off there and then.

No. There is No One Size Fits All.

It’s kinda the software equivalent of the cover shot, the airbrushed reality held up for us all to feel inadequate against. You’re not doing TDD, therefore you are stupid. You’re not using IOC so your project will fail. And yes, your bum does look big in that form-bound-to-tableAdapter.

Give me a break.

Don’t get me wrong: I like unit tests as much as the next man. That is, unless the next man is a rabid evangelical fanatic, feverishly copulating over a copy of Extreme Programming Explained. Tests have a vital role in controlling quality, costs and regressions. But their value lies in helping you achieve your goals: they have no intrinsic worth in and of themselves. And they are just one tool in the toolbox, whose relative value on a project is entirely contextual, based on the team, the requirements, the business landscape and the technologies.

So the answer, as always, is ‘it depends’. And this should always be your talisman for detecting shysters everywhere. If someone deviates from this pattern:

Q: (insert important question here)
A: It depends

…then you know they are either lying, or don’t know. If the question is worth asking, this should be the answer.

If you’re actually giving the answer you probably want to give a bit more than just a literal ‘it depends’ answer, otherwise you still look like you don’t know. You want to couch your answer in terms of various options, and the parameters within which each option becomes viable. But the answer is always ultimately a question for the asker, because there is no truth and all things are relative and beauty is in the eye of the beholder and so on.

So for example the level of automated unit testing on your team should consider things like whether any of your team have written any tests before; the opportunity cost (quality vs. time-to-market); the relative ratios of manual testing vs. developer costs; and especially the amenability of your tech stack to automated testing.

It’s a common - but facile - argument to suggest that being hard to test is somehow the fault of your design, when you may have to work with products like BizTalk, SharePoint, Analysis Services, Reporting Services, Integration Services and – hey – we might even have some legacy code in here too. Do these somehow not count? Because in my experience this is where many (if not most) of the problems actually lie.

Similarly, many pundits have taken the ‘people over practices’ mantra to mean ‘hire only the top n%’ (where n is < 10), whereas on your team you need to consider the local market, your costing structure and your growth model. Clearly, not everyone can hire above the average, so how do you scale?

And sorry Dr Neil, but bugs are a fact of life. Nothing else in this world is perfect, why should software be any different? Everything has limits, some designed, some unforeseen, but always there is a choice: fix it, or do something else. And that’s a business cost/benefit decision, not a religious tenet: is it worth the cost of fixing? If you are sending people to the moon, or running nuclear power stations[1], you look at things very differently than if you’re running a two-week online clickthro campaign for Cialis[2]. Get over it. Bugs are risks, and there is more than one way of managing risk. Remember product recall cost appraisals? Fight Club? Oh well.

Ultimately there is only what works for you, on your project, for your client. Everything else is at best constructive criticism, at worst (more common) a fatal distraction.

There is No One Size Fits All

See also: Atwood and Spolsky’s Podcast 38

 

[1] Though of course in either of those cases you wouldn’t be violating the EULA by using the CLR, or – I suspect – reading this blog anyway.
[2] You’re kidding, right? Look it up.

Friday, March 26, 2010

Break Back Into a Locked-Out SQL Instance

This is how to get ‘back into’ a SQL instance when the local Administrators group has been ‘locked out’ by not being sysadmin on the SQL instance (and the SA password has been lost, or other admin accounts are unknown or inaccessible).

On more than one occasion people who should know better have flat-out told me that this can’t be done, so just while I have the link handy:

…if SQL Server 2005 is started in single-user mode, any user who has membership in the BUILTIN\Administrators group can connect to SQL Server 2005 as a SQL Server administrator. The user can connect regardless of whether the BUILTIN\Administrators group has been granted a server login that is provisioned in the SYSADMIN fixed server role. This behavior is by design. This behavior is intended to be used for data recovery scenarios.

http://support.microsoft.com/default.aspx?scid=kb;en-us;932881&sd=rss&spid=2855

This is also true for SQL 2008. See Starting SQL Server In Single-User Mode.
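For the record, the dance goes roughly like this (a sketch only: the service names assume a default instance, the account name is a placeholder, and you want nothing else grabbing the single connection before you do):

# stop the instance (and the Agent, which would otherwise steal the single connection)
net stop SQLSERVERAGENT
net stop MSSQLSERVER

# restart in single-user mode: any local Administrator can now connect as sysadmin
net start MSSQLSERVER /m

# rescue a login and put it (back) in the sysadmin role
sqlcmd -S . -E -Q "CREATE LOGIN [MYDOMAIN\SomeAccount] FROM WINDOWS"
sqlcmd -S . -E -Q "EXEC sp_addsrvrolemember 'MYDOMAIN\SomeAccount', 'sysadmin'"

# back to normal multi-user operation
net stop MSSQLSERVER
net start MSSQLSERVER
net start SQLSERVERAGENT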

Tuesday, March 23, 2010

Twitter

Microblogging?! Isn’t blogging bad enough?

“It’s a cacophony of people shouting their thoughts into the abyss without listening to what anyone else is saying”

This could have been me in the pub on any of a number of times someone was unfortunate enough to ask my opinion, but it’s not, it’s Joel Spolsky, and that makes it right, or at least marginally more authoritative.

Sadly, as the post above details, Joel is ‘retiring’ from the type of long opinionated tirades we’ve grown to love, and moving into more ‘objective’ territory (I suggest he bypass Atwood altogether, and get it on with McConnell directly). But from where will we get our invective? Wherefore the curmudgeon of the internet, the grumpy old man of programming? I think, with one huge exception, I’ve argued Joel’s side on most software engineering debates I ever had.

How will I know what to think now?

Monday, March 22, 2010

3 Races with .Net Events

I didn’t even know about #3 till recently, so time for a quick recap:

Race Between Null Check and Invocation

Since an event with no subscribers appears as a ‘null’, you have to check the event has been wired before you call it, right? Which is typically done like this:

// post an event
if (ThingChanged != null)
    ThingChanged(this, args);


This is the wrong way of doing it. In a multi-threaded environment the last subscriber to the event might unsubscribe between the null check and the invocation, causing a null reference exception:



// post an event
if (ThingChanged != null)
    // other thread unsubscribes here
    // next line now causes null ref exception
    ThingChanged(this, args);


Despite the MSDN guidance [1], this is a very, very common mistake to make. I found one in the first place I looked (a CodePlex project), and also in the ‘overview’ page for the guidance above :-(. And most of the time you get away with it just fine: limited (if any) concurrency and a tendency to wire events 'for life' means it's very unlikely to happen. But as you ramp up the parallelism, and start hooking and unhooking events dynamically during execution, this will eventually bite you.



The easy fix is to cache the delegate locally first:



var handlers = ThingChanged;
if (handlers != null)
    handlers(this, args);


(I usually distribute this as a snippet to attempt to make sure people on my team do this automatically, as it’s easy to fall back on bad habits. The snippet also sets this up as a ‘protected virtual OnThingChanged’ method, uses EventHandler<T> and generally tries to encourage correct usage. ReSharper can also generate the correct usage for you)
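For reference, the shape that snippet expands to is roughly this (a sketch only; the event name and EventArgs type are placeholders for whatever your class actually exposes):

public event EventHandler<EventArgs> ThingChanged;

protected virtual void OnThingChanged(EventArgs args)
{
    // snapshot the delegate so the null check and the invocation
    // both see the same (immutable) invocation list
    var handlers = ThingChanged;
    if (handlers != null)
        handlers(this, args);
}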



These days you can ‘wrap up’ the pattern above as an extension method, but it’s not as flexible as actually just creating a member. You don’t have anywhere to put specific pre-event raising logic, and derived classes can’t override OnThingChanged to do their own thing first (something many UI controls and WebForms pages do a lot of).
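For illustration, such an extension method might look something like this (my naming, not a framework API):

public static class EventExtensions
{
    // usage: ThingChanged.Raise(this, args);
    // extension methods can be invoked on a null delegate reference,
    // so the null check lives safely inside
    public static void Raise<TEventArgs>(this EventHandler<TEventArgs> handlers, object sender, TEventArgs args)
        where TEventArgs : EventArgs
    {
        if (handlers != null)
            handlers(sender, args);
    }
}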



Finally you can use the field initializer for the event to assign an empty delegate, and prevent the event field from ever being null. This isn’t actually my preference, but it is quite neat:



	public event EventHandler<EventArgs<Thing>> ThingChanged = delegate{};

// invoking the event then never needs the null check:
ThingChanged(this, args);


I don’t like the idea of a wasted empty delegate call, but I’m just fussy.



Delivery of Event to Stale Subscriber



Unfortunately the pattern above appears to trade one race condition for another, since now the event list that’s invoked is cached (and hence stale). A subscriber can unsubscribe but still subsequently receive an event if the deregistration occurs after the list is cached.



This has been discussed at length on StackOverflow, and on Eric Lippert’s blog, but the salient detail here is that this is unavoidable. The same race occurs if a subscriber unsubscribes during traversal of the event invocation list, but before that subscriber has been notified, or even between taking a reference to an item in the list and invoking it. So even the ‘empty delegate’ version has the same issue.



Eric says:




“event handlers are required to be robust in the face of being called even after the event has been unsubscribed”




…i.e. check your internal state, and act accordingly. In particular, for IDisposable classes, this means that you should not throw an ObjectDisposedException from your event handlers, even if you are disposed. Just don’t do anything.
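In practice that just means an early-out guard at the top of the handler, something like this (a sketch; the field and method names are made up):

private void HandleUpstreamThingChanged(object sender, EventArgs e)
{
    // we can legitimately be called after unsubscribing (or after Dispose)
    // because of the race above, so bail out quietly rather than throwing
    if (_disposed)
        return;

    // ...normal handling...
}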



It is a pity that this requirement is not more widely socialized than just his blog :-(



Race Condition on Event Assignments Within Declaring Class



I had no idea about this until recently, when Chris Burrows started updating his blog again. Event assignments are normally ‘thread safe’ (synchronised during the += / -= to avoid races on updating the (immutable) delegate list in-place), but referencing the event from within the declaring class doesn’t bind to the event: it binds to the underlying private delegate. And there’s no automatic compiler voodoo synchronisation going on for you when you add and remove things from that.



If you are doing this you must lock on something, and to maintain consistency with the compiler-generated protection for the public event field, you have to lock(this). But again this will only be an issue if multiple threads are (un)subscribing simultaneously anyway, so if your in-class event hook up is in your ctor, before your ‘this’ reference got leaked or you spun off a background worker, you are safe as-is (I think).
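For what it’s worth, the pre-.Net 4 ‘safe’ version looks something like this (a sketch, with placeholder names):

public class Publisher
{
    public event EventHandler ThingChanged;

    private void HookInternalHandler()
    {
        // inside the declaring class (pre-.Net 4 compilers) += binds straight to the
        // private backing delegate, so take the same lock the compiler-generated
        // accessors use if other threads might be (un)subscribing at the same time
        lock (this)
        {
            ThingChanged += OnSomethingInternal;
        }
    }

    private void OnSomethingInternal(object sender, EventArgs e) { /* ... */ }
}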



For .Net 4 this issue has been fixed: using the += / -= syntax binds to the compiler-generated thread-safe assignment whether you are in the class or not. You can still do unsafe things with the private field if you start explicitly using Delegate.Combine, but that’s just weird anyway.



What’s nice here is the fix was part of removing the locking altogether. Now updates to events occur via an Interlocked.CompareExchange[1] spin, which is a classic no-lock approach:



public void add_Something(EventHandler value)
{
    EventHandler handler2;
    EventHandler something = this.Something;
    do
    {
        handler2 = something;
        EventHandler handler3 = (EventHandler)Delegate.Combine(handler2, value);
        something = Interlocked.CompareExchange<EventHandler>(ref this.Something, handler3, handler2);
    }
    while (something != handler2);
}


This is actually a pretty good pattern to copy if you are targeting very high parallelism, because these atomic compare-and-swap operations are considerably faster than Monitor.Enter (which is what lock() does), so it’s nice to see a good ‘reference’ implementation to crib off (and one that will be pretty ubiquitous too).



Bonus: Robust Event Delivery



Nothing to do with race conditions per se, but sometimes a subscriber to your event will throw an exception, and by default this will prevent all the subsequent subscribers from receiving the notification. This can be a real swine to diagnose sometimes, especially as the order of event invocation isn’t something you have any control over (strictly speaking it’s non-deterministic, however it always appears to be FIFO in my experience).



Anyway, if you want robust event delivery you should broadcast the event yourself, in a loop, collect the exceptions as you go and raise some kind of MultipleExceptionsException at the end:



protected virtual void OnSomething(EventArgs e)
{
    var handlers = Something;
    if (handlers != null)
    {
        var exceptions = new List<Exception>();
        foreach (EventHandler handler in handlers.GetInvocationList())
        {
            try
            {
                handler(this, e);
            }
            catch (Exception err)
            {
                exceptions.Add(err);
            }
        }
        if (exceptions.Count > 0)
            throw new MultipleExceptionsException(exceptions);
    }
}


At this point the extension method approach beckons because this just blew right out.
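If you do go that way, the method above translates into something like this (again my naming; AggregateException only exists from .Net 4 onwards, otherwise keep the home-grown MultipleExceptionsException):

// add to the same static class as the Raise() extension method earlier
public static void RaiseRobustly<TEventArgs>(this EventHandler<TEventArgs> handlers, object sender, TEventArgs args)
    where TEventArgs : EventArgs
{
    if (handlers == null)
        return;

    var exceptions = new List<Exception>();
    foreach (EventHandler<TEventArgs> handler in handlers.GetInvocationList())
    {
        try { handler(sender, args); }
        catch (Exception err) { exceptions.Add(err); }
    }

    if (exceptions.Count > 0)
        throw new AggregateException(exceptions);
}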



Bonus: Lifetime Promotion Via Event Registration



Remember that subscribing to an event is giving someone a reference to you, i.e. an extra reference that can prevent you from being garbage collected. You are tying your lifetime to that of the objects that you are listening to.



Typically this isn’t a problem, because the publisher is a more short-lived object than the subscriber, but if the publisher sticks around a while (or for ever, if it’s a static event) it’s very important that subscribers unsubscribe themselves when they are done, or they will never get GC’d.
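A minimal sketch of what that looks like (the types here are made up for illustration):

public class Ticker
{
    public event EventHandler PriceChanged = delegate { };
    public void Publish() { PriceChanged(this, EventArgs.Empty); }
}

public sealed class TickerDisplay : IDisposable
{
    private readonly Ticker _ticker;

    public TickerDisplay(Ticker ticker)
    {
        _ticker = ticker;
        _ticker.PriceChanged += OnPriceChanged;   // ties our lifetime to the ticker's
    }

    private void OnPriceChanged(object sender, EventArgs e) { /* update something */ }

    public void Dispose()
    {
        // detach from the (long-lived) publisher so we can be collected
        _ticker.PriceChanged -= OnPriceChanged;
    }
}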

Tuesday, March 16, 2010

PowerShell <3

I was in a deep directory tree, and I wanted to CD into the same relative location in another tree (another parallel TFS workspace as it happens):

cd ($pwd -replace 'ITS_Spare','ITS')

Ah, bliss[1]

 

[1] In a deeply sad and nerdy way, admittedly

Monday, March 15, 2010

Why No ‘Average’ Aggregation Type in SSAS?

Just coming back onto some cube work for the first time in 12 months or so, and managed to get myself confused by the AverageOfChildren semi-additive aggregation type, just like I did the first time round :-/

(Aside: AverageOfChildren is a semi-additive average: it sums over all dimensions except time, and averages over time. This makes sense (for example) if you want to compare average monthly sales over the year for each of your branch offices. It’s useless, however, when you want to know the average sale price per unit)

Whilst creating calculated measures of the type [Measures].[Something Sum] / [Measures].[Something Count] is all very well, I don’t understand why this isn’t just supported out of the box, as a native Average aggregation type. Sure there are always ‘flavours’ of averages that need specific MDX (weighted averages for example), but surely the simple flat average over row count is so common as to warrant ‘out of the box’ support.

You might say having two ‘average’ aggregations would confuse, but AverageOfChildren causes plenty of confusion on its own: having two (and documentation on which to use when) would probably make the situation considerably better.

Fool me twice: shame on me.



Update June 2011: Rather than just complain, I raised a Connect Issue: Add AverageOfRows aggregation type for simple averages. Please vote for it

Cannot find one or more components: Please reinstall the application

[old post, languished in Drafts for a while…]

First day back from the Australia Day public holiday was no fun:

"Cannot find one or more components. Please reinstall the application"

Asking Visual Studio 2008 to ‘repair’ itself didn’t work (it actually came up with the error a few times during the repair), and neither did a complete uninstall-reinstall cycle (which also raised the same error a few times). :-(

A bit of Googling around determined that others had had this ‘cannot find one or more components’ error, and it seemed to normally track back to a missing ATL90.dll. And that’s exactly what Process Monitor told me too, though for a different version of the dll. This folder was empty:

C:\WINDOWS\WinSxS\x86_Microsoft.VC90.ATL_1fc8b3b9a1e18e3b_9.0.30729.4148_x-ww_353599c2

The version number in that path, ATL90.dll 9.0.30729.4148, is the version number of a hotfix patch to the Visual C++ 2008 SP1 redistributable. Which isn’t applicable if you have Visual Studio 2008 installed, so god only knows how it got installed in the first place, but now it wasn’t, only Windows still thought it was, and no amount of Visual Studio 2008 / SP1 reinstalling was going to change its mind. Nor was there a hotfix entry in Add / Remove programs to get rid of it.

Eventually I gave up, uninstalled Visual Studio (again), installed the Visual C++ 2008 SP 1 redistributable, and then the ATL security update for that, and hey presto, an ATL90.dll in the right place:

[screenshot: ATL90.dll back in place in the WinSxS folder]

But subsequently uninstalling both of those didn’t then remove this, nor (as far as I could tell) the registry entry pointing to that location:

Key:   HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\SideBySide\PatchedComponents
Name:  {AD82707A-1779-38E5-9823-C666D7C05797}
Value: c:\WINDOWS\winsxs\x86_Microsoft.VC90.ATL_1fc8b3b9a1e18e3b_9.0.30729.4148_x-ww_353599c2\atl90.dll

So I just left it there, installed VS 2008 + SP1 over the top, and now things work, although I am not exactly filled with confidence for the future.

[update: seems to have worked fine since…]

Thursday, March 11, 2010

3 PowerShell Array Gotchas

I love PowerShell, and use it for pretty much everything that I’m not forced to compile. As a result I’ve got fairly competent, and people have suggested to me that I should pull my finger out and do more blogging about PowerShell tricks and tips and suchlike. And they are right.

As a first pass, here are 3 PowerShell gotchas, all which revolve around array handling. PowerShell does some funky stuff here to make certain command line operations more intuitive, which can easily throw you if you are still thinking C# [1]

Converting Function Arguments To An Array By Mistake

When you call a .Net method, you use the familiar obj.Method(arg1,arg2) syntax, with a comma between each parameter.

However when you call a script function (or a cmdlet), you must omit the commas. If you don’t, you pass your arguments as an array to the first parameter, and many times the resulting error won’t immediately tip you off.

PS > function DumpArgs($one,$two){ write-host "One: " $one; write-host "Two: " $two }

PS > dumpargs 1 2 # Correct
One: 1
Two: 2

PS > dumpargs 1,2 # Incorrectly pass both to 1st parameter
One: 1 2
Two:


Subtle, yes? Anyone who has ever done any PowerShell ever has done this at least once. Ever.



Passing An Array to a .Net Constructor or Method



When you call a .Net method, as described above, that familiar comma syntax is actually still creating an array from your arguments: an array is just what PowerShell uses to call .Net methods (and ctors) via the reflection APIs.



So there is a reverse gotcha. How do you call a .Net method (or ctor) that takes an array as its single parameter? Whether you create an array in-line (using comma syntax) or up-front as a variable, you will likely be told ‘Cannot find an overload for "xxx" and the argument count: "n"’, as PowerShell fails to find a method with the same number of parameters as the length of your array:



PS > $bytes = [byte[]]0x1,0x2,0x3,0x4,0x5,0x6
PS > $stream = new-object System.IO.MemoryStream $bytes
New-Object : Cannot find an overload for "MemoryStream" and the argument count: "6".


If you make the byte array smaller you’ll get other errors, as the length matches one of the ctor overloads but not the types; or you may get semantic errors when the values bind to an overload but fail imperative validation logic, eg:



PS > $bytes = [byte[]]0x1,0x2,0x3
PS > $stream = new-object System.IO.MemoryStream $bytes
New-Object : Exception calling ".ctor" with "3" argument(s): "Offset and length were out of bounds…


Worst of all, sometimes it can ‘work’ but not in the way you intended.

As an exercise, imagine what would happen if the array was 0x1,0x0,0x1 [3].



The (somewhat counter-intuitive) solution here is to wrap the array – in an array. This is easily done using the ‘comma’ syntax (a comma before any variable creates a length-1 array containing the variable):



PS > $bytes = 0x1,0x2,0x3,0x4,0x5,0x6
PS > $stream = new-object System.IO.MemoryStream (,$bytes)
PS > $stream.length
6


Indexing into a String, Expecting a String[]



PowerShell loves to unpack things: it’s like kids at Christmas. So if a function ‘returns’ a collection with only one item in it (only one line in the file, or one file in the directory) you will get the item back, and not the collection.



Since a string itself can be indexed (as if it were char[]), this can lead to weird behaviour:



PS C:\Users\Piers> (dir .\Links | % { $_.Name })[0]
Desktop.lnk
PS C:\Users\Piers> (dir .\Links\desk* | % { $_.Name })[0]
D


In the first case the indexer retrieves the first file name as expected. However in the second only one item matched the wildcard. As a result we didn’t get back an array, but an item (the string name), and that's what we indexed into (yielding the first character). Not what we wanted.



The easy fix is to always use @ to ensure an expression produces an array, even if it only evaluates to a single item:



PS C:\Users\Piers> @(dir .\Links\desk* | % { $_.Name })[0]
Desktop.lnk


(NB: This is different from the ‘comma’ syntax described above that always introduces a parent array)



Bonus: Enumerating Collections of Arrays Without Unpacking



On a similar note, if you have a collection of arrays, and you pipe it, you will only ‘see’ the individual items, not the arrays. Again, the answer here is to wrap the arrays in arrays: only one level of unravelling is performed.
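A quick made-up example: a function that emits two arrays, before and after wrapping:

PS > function Get-Rows { (1,2); (3,4) }
PS > (Get-Rows | measure-object).Count      # each array gets unravelled on output
4
PS > function Get-Rows2 { ,(1,2); ,(3,4) }  # wrap each array in a 1-element array
PS > (Get-Rows2 | measure-object).Count     # now only the wrapper gets unravelled
2
PS > Get-Rows2 | % { $_ -join '+' }         # so the pipeline sees the arrays intact
1+2
3+4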



Bonus: Enumerating a Hashtable



By contrast, Hashtables don’t unravel automatically, though you might imagine they do. For example:



PS > $items = @{One=1;Two=2;Three=3}
PS > $items | % { $_ }

Name Value
---- -----
Two 2
Three 3
One 1

PS > # and yet...
PS > $items | % { $_.Name }
PS > # returns nothing.
PS > $items | % { $_.ToString() }
System.Collections.Hashtable


We're not actually enumerating the hashtable's *contents*, rather we are enumerating the hashtable as if it were a single item in a list. It just has very specific default rendering behaviour (which is why we see the contents spat out).



This normally happens for non-IEnumerable types, but presumably happens deliberately for Hashtable (which is enumerable) because it's quite 'special' within PowerShell. Anyway, to get round this you have to make the enumeration explicit:



PS > $items.GetEnumerator() | % { $_.Name }
Two
Three
One


Enough Bonus Already!



 



[1] In Hanselminutes 200 [2], Jon Skeet makes the point that C# and Java are – syntax-wise – similar enough for it to be confusing, as opposed to obvious ‘in your face’ transitions (eg between Java and VB.Net). I had the same experience when I travelled to the USA having spent 6 months in South America: when everyone was speaking Spanish you expected things to be different, but somehow in the US, because they spoke the same language (nearly), my guard was down, and so every so often you’d be totally thrown by something being different. Anyway, I reckon it’s like that between C# and PowerShell. It’s .Net and has {}’s so you are lured into a false sense of security and end up with trap all over your face. Or something.



[2] What is wrong with this URL? For show 200. That just freaks me out.



[3] No, the answer’s not down here. It’s an exercise for the reader.
