Sunday, June 03, 2007

Production Debugging for Hung ASP.Net 2 applications – a crash course

If your app is down right now skip the intro – I’ll understand.

Introduction


First up: debugging – even in the development environment – should never be a substitute for detailed diagnostic logging and appropriate – meaningful – error handling. Any time I fix an exception, if the problem is initially non-obvious, I first fix the logging code. Then and only then do I fix the exception.
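To make that concrete, here's the shape I mean – a minimal sketch (the class and method names are invented; I'm using log4net, which crops up elsewhere on this blog): log the context around the risky call, then rethrow:

using System;
using log4net;

public class DocumentGenerator
{
    private static readonly ILog log = LogManager.GetLogger(typeof(DocumentGenerator));

    public void Generate(int documentId)
    {
        // log the inputs *before* the risky work - a hang or crash
        // still leaves a trail of what was being attempted
        log.InfoFormat("Generating document {0}", documentId);
        try
        {
            // ... the actual work ...
        }
        catch (Exception ex)
        {
            // capture the context the stack trace alone won't give you
            log.Error(string.Format("Failed generating document {0}", documentId), ex);
            throw;
        }
    }
}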

Baking in a support mentality early in the development process is the only way to avoid an application that’s hard (and by that I mean expensive) to support. The customer might be paying for the support time, but if you can’t turn around support issues quickly – and practically any issue in a production business application needs turning around quickly – they start questioning your competence.

But sometimes logs aren’t enough, and even with the best will in the world, sometimes they’re not sufficiently detailed (they won’t - for example – be much help tracking down that memory leak). Sometimes you need to know what’s thrashing right now; to see the smoking gun.

Enter the world of crash dump debugging.

I’d been following Tess’s blog for a while now, but still had a big hole in my understanding. Sure I now knew I could use the !clrstack command to start looking at managed thread stacks, but where the hell was I typing that? This then is my introduction to Production Debugging for .Net applications (or ‘how to understand what Tess is talking about’).

I’m going to be talking about debugging hung .Net 2 ASP.Net applications, but much of this is relevant to debugging any .Net application in a production or testing environment.

Getting the Dump


Assuming you don’t have time to read a more detailed article on this (and the Microsoft guide is 170+ pages!), just do this:

WARNING: Installing the debugging tools on your production server isn’t recommended. That’s not to say you have any choice in the matter, but you should be aware there are security and availability implications that – if your app is hung – you probably don’t care about right now.

Install the Microsoft Debugging Tools for Windows and IISState on the machine where the hang occurs. You’ll need the Debugging Tools on your workstation too, but you can do this later if it’s happening right now.

If your app is thrashing the CPU, use Task Manager to get the process ID (PID) of the W3WP process that’s using the most CPU (there may be more than one). NB: PIDs aren’t displayed by default: use View\Select Columns to turn them on.

Otherwise, at the command prompt, enter cscript iisapp.vbs to determine the process ID that matches your app pool.

At the command prompt CD to wherever you installed IISState (C:\IISState is the default). If there is existing content in the output folder, delete or rename the folder. Then execute the following:
IISState -p <PID> -d

(…replacing <PID> with the PID you determined above)

WARNING: At this point your app will become unresponsive until the dump completes

IISState will put a dump file (.dmp) and a text summary (.log) of the process’s threads etc… in the output directory.

Rename the output folder to something that reflects the date/time, so future dumps don’t overwrite it.

Since we’re doing a ‘hang’, ideally at this point you’d wait a few mins, and then take another dump. If you don’t, you may regret it later and / or have to wait for the issue to re-arise before you can definitively pin it down. However, if the world and his dog want the site back up right now then skip it – it’s your call.

Kill that process. IIS will recycle it, and things should go back to normal for now. Panic over.

Analyzing the Dump


Copy the renamed output directory down onto your machine. For the process ID you dumped there will also be an IISState summary file, eg IISState-5090.log. This log is a dump of the stack trace of all the threads in the process, and attempts to perform some additional diagnostics for you. Unfortunately, it doesn’t (at present) support .Net 2, so most likely it’s not going to tell you what the problem is. You should at least be able to see a few threads that look something like this:
Thread ID: 13
System Thread ID: d10
Kernel Time: 0:0:4.186
User Time: 0:20:25.702
Thread Type: Managed Thread. Possible ASP.Net page or other .Net worker
No valid SOS data table found.


Begin System Thread Information

# ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 01a9e76c 06ee91cd 0x6eea301
01 01a9eb6c 06cc6e91 0x6ee91cd
02 01a9ee44 06cc6278 0x6cc6e91
03 01a9ee80 06cc3ddd 0x6cc6278
04 01a9eeb8 06cc381c 0x6cc3ddd
05 01a9ef3c 0736fa0f 0x6cc381c
06 01a9ef84 0736f65f 0x736fa0f
07 01a9efec 0725853d 0x736f65f
08 01a9f050 79e9272b 0x725853d
09 01a9f060 79e926ad mscorwks!DllRegisterServerInternal+0x1d6b
0a 01a9f0e0 79e9450b mscorwks!DllRegisterServerInternal+0x1ced
0b 01a9f220 79e943f3 mscorwks!DllRegisterServerInternal+0x3b4b
0c 01a9f37c 07258408 mscorwks!DllRegisterServerInternal+0x3a33
0d 05b30b78 00000008 0x7258408

What does this mean? Well IISState doesn’t understand what’s going on with the stack below that mscorwks dll, which screams ‘.Net 2 stack’ at you (as does the comment against Thread Type). But the likelihood is there’s not much else we’re going to get out of that file[1]. So it’s time to look at the mini dump.

Open a Visual Studio 2005 Command Prompt window, and from there launch C:\Program Files\Debugging Tools for Windows\windbg.exe (this ensures the .Net 2 framework is in the PATH for the WinDBG process, and makes life simpler – if you don’t have VS2005 installed, set the PATH up manually). Do File \ Open Crash Dump and browse to the .dmp file you created with IISState. If it asks you about saving workspace information just hit No.

It’ll open a text window with some summary information about the dump. This window has a little command bar at the bottom. This is where you do stuff.

The first thing it probably says is “Symbol search path is: *** Invalid ***”. So we type .symfix into the command bar to set it up to dynamically load symbols from Microsoft as required. You can set the _NT_SYMBOL_PATH environment variable so you don’t have to do this each time, but I’ve not got round to that yet.

It probably also says
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(13cc.84): Wake debugger - code 80000007 (first/second chance not available)

…but this is exactly what it says it is – it’s the console debugger attaching (how IISState got the dump), and not a ‘real’ exception per se.

You can type ~ to get a list of the threads, or ~* kb to get them with a bit of stack trace too, but we’re still only looking at unmanaged threads here, so it’s pretty meaningless to me.

Type .load SOS to load the managed code debugging extension SOS.dll, then type !threads to see all the managed threads. Now that’s more like it.

NB: If you’re debugging on a different platform to the original crash (eg server was Win2003, you’re on XP), then you’re going to get an error at this point “Failed to load data access DLL, 0x80004005”. The error message contains the fix: execute .cordll -ve -u -l and the correct version of mscordacwks.dll for the dump will automatically be loaded, eg:
0:000> .cordll -ve -u -l
CLRDLL: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscordacwks.dll:2.0.50727.42 f:0 doesn't match desired version 2.0.50727.91 f:0
CLRDLL: Loaded DLL C:\Program Files\Debugging Tools for Windows\sym\mscordacwks_x86_x86_2.0.50727.91.dll\442337B7562000\mscordacwks_x86_x86_2.0.50727.91.dll
CLR DLL status: Loaded DLL C:\Program Files\Debugging Tools for Windows\sym\mscordacwks_x86_x86_2.0.50727.91.dll\442337B7562000\mscordacwks_x86_x86_2.0.50727.91.dll


Your !threads output might look like this:

ThreadCount: 9
UnstartedThread: 0
BackgroundThread: 9
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
ID OSID Thread OBJ    State   GC       Context       Domain   Count APT Exception
      12    1  d10 000dce98   1808220 Disabled 00000000:00000000 000d4008     2 Ukn (Threadpool Worker)
      14    2  9f4 000ede10      b220 Enabled 00000000:00000000 000daa20 0 MTA (Finalizer)
      15    3  6d0 00103ad8    80a220 Enabled 00000000:00000000 000daa20     0 MTA (Threadpool Completion Port)
      16    4 14b0 001074d0      1220 Enabled 00000000:00000000 000daa20     0 Ukn
      10    5  4f8 0012e408   880a220 Enabled 00000000:00000000 000daa20     0 MTA (Threadpool Completion Port)
      18    9 1184 06a32e20    800220 Enabled 00000000:00000000 000daa20     0 Ukn (Threadpool Completion Port)
      26    6 15a0 06ada200   180b220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Worker)
      28    b   e0 06a5bd40   880b220 Enabled 00000000:00000000 000daa20     0 MTA (Threadpool Completion Port)
      29    c  fd0 06a3c368   180b220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Worker)


Notice how the last column tells you what type of thread it is. You can get the managed stack trace for each one of these threads by selecting which thread you’re interested in from the Processes and Threads window (up on the toolbar) and typing !clrstack, but I find it more useful to just look at all of them straight off: ~*e !clrstack

You’ll see a lot of this:
OS Thread Id: 0x84 (0)
Unable to walk the managed stack. The current thread is likely not a managed thread.
You can run !threads to get a list of managed threads in the process


...because what you're doing is looking at the managed stack trace for all threads (and many of them won't be .Net threads remember), but eventually you’ll see some managed stack traces pop up.

In my case it was a bit of an open-and-shut case, because there was only one .Net thread actually doing anything in the process – kind of a giveaway:

OS Thread Id: 0xd10 (12)
ESP       EIP    
[… trimmed 3rd party vendor stack trace…]
01a9eec4 06cc381c Aspose.Pdf.Pdf.Save(System.IO.Stream)
01a9eed4 069d4a05 MyApp.Server.Utils.WordToPdf.AsposeImpl.Generate(System.IO.MemoryStream)
01a9ef44 0736fa0f MyApp.Server.Document.Implementations.AbstractWordToPdfDocumentGenerationImpl.Generate(...)
01a9ef8c 0736f65f MyApp.Server.Document.GenerationManager.Generate(...)
01a9eff4 0725853d MyApp.Server.ServiceImplementation.DocumentService.GenerateDocumentById(GenerateDocumentByIdRequest)
[… removed ASP.Net framework stack trace …]
01a9f710 65fbe244 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
01a9f744 65fbde92 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
01a9f750 65fbc567 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
01a9f900 79f2d05b [ContextTransitionFrame: 01a9f900]
01a9f950 79f2d05b [GCFrame: 01a9f950]
01a9faa8 79f2d05b [ComMethodFrame: 01a9faa8]


[NB: I've had to edit that for brevity and wrapping]

Ok, so we’re dying in a 3rd party component. But we’re still going to need to replicate this somehow because (presumably) it doesn’t do this all the time. Fortunately in this case all we need to know is the arguments supplied to our GenerateDocumentById() method to attempt to reproduce in dev. They’re in the dump too, so let’s see what they were.

Select that thread (12) using ~12s (or the Processes and Threads window – the ‘s’ suffix just tells the debugger to set that thread as the current one) and use !clrstack -p to view the stack with parameters. It’s a big list, but in there somewhere is the bit I’m interested in:

0:012> !clrstack -p
[… blah blah …]
01a9eff4 0725853d MyApp.Server.ServiceImplementation.DocumentService.GenerateDocumentById(GenerateDocumentByIdRequest)
PARAMETERS:
this = 0x02e596b0
generateDocumentByIdRequest = 0x02e595b4


!dumpobject (or !do) then tells me the contents of that generateDocumentByIdRequest instance, given its address as shown above (0x02e595b4):

0:012> !do 0x02e595b4
Name: MyApp.Server.ServiceContracts.GenerateDocumentByIdRequest
MethodTable: 05dcb5e0
EEClass: 06f935ec
Size: 12(0xc) bytes
   (C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\blah blah blah\MyApp.Server.ServiceContracts.DLL)
Fields:
MT Field   Offset    Type VT Attr    Value Name
790fedc4  40001c4  4         System.Int32 0 instance  8229991 documentId


The bit I care about on this is the documentId field itself, and being a value-type field (Int32) I can see the value right there: 8229991 (for a reference type like System.String I'd get another address to the instance itself, ie I'd have to perform another !do).

And in this case, that was that. Well, it turned out to affect more than just this document ID, but at least I had enough to reproduce in the dev environment. And just in case anyone read the stack trace carefully: by the time we got to this stage Aspose already had a fix out for this issue, so I can’t really knock them. Finally, whilst I was at it, I had a look at the objects in memory and began to wonder whether I’ve got a memory leak, so there’s probably a follow-up to come.

Other SOS commands to look at:


!threadpool (also this is where the CPU utilization hides)
!eeheap -gc (see memory allocations that are subject to the GC)
!dumpheap -stat (see object allocations by type)
!help (gives help on the last extension loaded, so SOS commands in our case)

Other Resources:


Installing and Configuring IISState

SOS: It’s not just an ABBA Song Anymore: A good introductory article on MSDN with a dreadful title

If Broken It Is… Fix it You Should [Tess’s Blog]: This will all make a lot more sense now

Production Debugging for .NET Framework Applications: This 170 page document from Microsoft’s Patterns and Practices team is allegedly out of date, but is an invaluable reference, and what I based most of this article on

Debugging Microsoft® .NET 2.0 Applications: I haven’t read this MSPress book, but the v1 version got quite a lot of citations when I was researching this stuff




[1] So why did we take it? Because if it turns out it’s an IIS issue, and not your code, then someone will be able to get something out of it. Just not you.

Saturday, May 19, 2007

Nesting Transactions in Stored Procedures (Part 1)

Nesting transactional stored procedures is an absolute minefield.

Many projects I've worked on have just avoided opening transactions within stored procs altogether, and rely entirely on ADO / ADO.Net transactions in the application (sketched below). This is certainly a simple way to avoid the problem, but isn't always appropriate. At some point there's always a data integrity rule that you want enforced at the lowest possible level (to avoid it being accidentally bypassed - I'm thinking auditing, or the classic 'move money between accounts' scenario). Putting that rule in a stored proc, and making that the only port of call for writes to the table involved makes it pretty safe. This is even more important if it's not just your app using the database.
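By way of illustration, here's roughly what the avoidance pattern looks like from the application side - a minimal ADO.Net sketch (the proc and parameter names are invented):

using System.Data;
using System.Data.SqlClient;

public class AccountTransfer
{
    // One application-level transaction spanning both procs; no BEGIN TRAN
    // inside the procs themselves, so there's no nesting to manage.
    public void Transfer(string connectionString, int fromId, int toId, decimal amount)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (SqlTransaction tran = conn.BeginTransaction())
            {
                try
                {
                    ExecProc(conn, tran, "dbo.DebitAccount", fromId, amount);
                    ExecProc(conn, tran, "dbo.CreditAccount", toId, amount);
                    tran.Commit(); // both legs committed, or neither
                }
                catch
                {
                    tran.Rollback();
                    throw;
                }
            }
        }
    }

    private static void ExecProc(SqlConnection conn, SqlTransaction tran, string proc, int accountId, decimal amount)
    {
        using (SqlCommand cmd = new SqlCommand(proc, conn, tran))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@accountId", accountId);
            cmd.Parameters.AddWithValue("@amount", amount);
            cmd.ExecuteNonQuery();
        }
    }
}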

But nevertheless avoidance is appropriate in many cases, if only because handling the nesting of transactions within nested stored proc calls is a complex matter. To summarize the problems:

Rollback and commit are asymmetric in nested transaction scenarios

Rollback aborts the whole nested stack, whereas commit just decreases the nesting level (only the final commit actually commits anything). So error handling has to behave differently in the proc at the 'root' of the transaction than it does in nested procs. Ulp!

SQL error handling is hard, and poorly-understood

Even basic TSQL error handling is poorly understood: having an exception raised in the calling application but the rest of the stored proc continue executing is pretty counter-intuitive, right? Whadyamean the transaction committed anyway!?!

The fact is that most SQL errors don't affect batch completion without explicit handling, some abort the batch but not the transaction, and some abort both. Furthermore, XACT_ABORT changes which do which. Erk!


Fortunately we don't have to come up with a solution, because there are already well-defined patterns for handling both these issues. There's an excellent article on the subject in Code Magazine that's been the basis for my approach for a few years now:

Handling SQL Server Errors in Nested Procedures by Ron Talmage (I use the multi-level model)

Unfortunately barely any developers appear to be aware of the problem, let alone the solution patterns. This is particularly problematic, because either of Ron's solutions must be applied consistently in order to work properly. If that's an issue, maybe you're best going back to the avoidance pattern.

Anyway, the big question for me is "how has this pattern been affected by TRY ... CATCH functionality in Sql Server 2005?". We'll look at that next time...

Thursday, April 19, 2007

How to Disable Kerberos on Windows 2003 using NTAuthenticationProviders

If your IIS website / service
  • Is running as an account other than NetworkService
  • or isn't being accessed via the server's AD name (eg through an alternate URL, or load balanced alias)
...then Kerberos authentication will fail, because the client can't map the URL it's using onto a registered Service Principal Name (SPN) - it doesn't know who the server is, so it can't start the ticket exchange process.

In both cases getting the right SPN added into AD will fix things, but sometimes this can be problematic. However you can explicitly downgrade the virtual directory to only use NTLM authentication by setting the NTAuthenticationProviders property. There's no UI for this, so you have to set it on your IIsWebVirtualDir in the metabase, eg (for IIS 6 \ Windows 2003):

cscript C:\inetpub\adminscripts\adsutil.vbs //nologo SET W3SVC/1/Root/MyApp/NTAuthenticationProviders NTLM

Then everything (?) works again. Hooray.

This is discussed at the very bottom of this article: http://support.microsoft.com/kb/215383
NB: For Windows 2000 you can only set this at a site-level, not an application level, as the article outlines.

Working Directory Independence for Batch Files

%~p0\

(or: Many times your batch file will want to access resources in the same folder as the batch file itself. This can be tricky if the user calls the batch file from another folder, since the working directory is not the directory the batch file is in. Rather than resorting to pushd / popd everywhere, one can use the extended command line parameter handling to convert the full path to the batch file into the path of its container. %0 is the path the batch was called with, so %~p0 is the path to that location - and %~dp0 includes the drive letter too, if you need it.)

Thursday, March 29, 2007

Thursday, March 22, 2007

Access denied when piping to :Null

It's quite a common thing in my batch files (yes, .BAT is still alive) to limit the verbosity of the output displayed to the user. It's a 'seeing-the-wood-for-the-trees' thing: I don't want to have the user wade through screens of output to know that a CACLS worked ok (and potentially miss something important)

So I quieten things down by piping to null:
aspnet_setreg -k:SOFTWARE\MyCompany\MyApp -u:"%username%" -p:"%password%" >:null

...and listen for ERRORLEVEL instead.

But recently I discovered that this fails with Access Denied if you don't have write permissions to the current directory (yes, you need write permissions to write to :null. Just don't ask). Not only that, but the whole statement fails, not just the piping bit.

So these days I pipe to %temp%\null instead.

Monday, March 12, 2007

When is a type not public?

[Keith's one this]

...when it's a nested type. So you have to do:

if (type.IsPublic || type.IsNestedPublic)
{
    // do stuff to public types
}
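A quick demonstration of why the extra check is needed (Outer and Nested are hypothetical types):

using System;

public class Outer
{
    public class Nested { }
}

class Demo
{
    static void Main()
    {
        Type t = typeof(Outer.Nested);
        Console.WriteLine(t.IsPublic);       // False - IsPublic is only true for non-nested types
        Console.WriteLine(t.IsNestedPublic); // True
    }
}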
I can't even remember what we were doing when we ran into this, which (to me) indicates I'd better blog it before I forget about it altogether.

Friday, March 09, 2007

Encryption of connection credentials in .Net

In which I discuss techniques to manage the storage of a secret configuration setting (connection credentials) on a production webserver, when the development team (and their automated build/deploy process) shouldn't actually know the secret itself (and especially shouldn't store it in source control).


It's a bit of a ramble to myself as I think this stuff through.

What's new in .Net 2

In .Net 2, the configuration infrastructure includes support for encrypted connection strings out of the box. One can simply call aspnet_regiis to encrypt (almost) any given section of a configuration file:

aspnet_regiis -pe "connectionStrings" -app "/MachineDPAPI" -prov "DataProtectionConfigurationProvider"

Pasted from <http://msdn2.microsoft.com/en-us/library/ms998280.aspx>

However

  • This encrypts the whole section, which makes inspection and verification of other settings harder
  • This requires the .config file to have the credentials in it in the first place
    • So either the credentials sit in source control at the dev end :-(
    • Or the install process has to both prompt for them, and add them into the web.config before encryption. This is a pain, but manageable.

A variation on this technique is available for web farm scenarios, where the config is encrypted with a custom exportable RSA key.

aspnet_regiis -pe "connectionStrings" -app "/WebFarmRSA" -prov "CustomProvider"

aspnet_regiis -px "CustomKeys" "C:\CustomKeys.xml" -pri

Pasted from <http://msdn2.microsoft.com/en-us/library/ms998283.aspx>

Apart from being web-farm aware, this technique has the advantage that the public part of the key could be exported to the build server, and the settings encrypted prior to deployment

However this still means the build server (and by inference source control) contains the production credentials in plain text. This is generally what we want to avoid most - internal information leakage is far more likely than a compromise of the production environment, especially in an intranet scenario.

How did we manage this in .Net 1.1?

Under .Net 1.1 the approach was to use the ASPNET_SETREG tool to move settings into the registry, under DPAPI encryption, then adjust the configuration using a special syntax that said 'read this from the registry instead'.

aspnet_setreg.exe -k:SOFTWARE\MY_SECURE_APP\identity -u:"yourdomainname\username" -p:"password"

Pasted from <http://support.microsoft.com/kb/329290>

<identity impersonate="true"
    userName="registry:HKLM\SOFTWARE\MY_SECURE_APP\identity\ASPNET_SETREG,userName"
    password="registry:HKLM\SOFTWARE\MY_SECURE_APP\identity\ASPNET_SETREG,password" />

Pasted from <http://support.microsoft.com/kb/329290>

This has the following advantages:

  • Secret only needs to be known once, on initial install, and is thereafter stored in the registry
  • Source control and build process doesn't need to know the secret
  • Same config file deployed to both boxes in a web farm

Whilst under .Net 1.1 this was only supported for specific keys:

<identity userName="…" password="…" />

<processModel userName="…" password="…" />

<sessionState stateConnectionString="…" sqlConnectionString="…" />

Pasted from <http://support.microsoft.com/kb/329290>

…a common pattern, and one I've used a couple of times, was to use aspnet_setreg to put arbitrary settings into the registry in an encrypted manner, and use custom DPAPI code to retrieve them. There was no equivalent of the DPAPI wrapper class 'DataProtection' in .Net 1.1, so Keith Brown provided one:

"Version 2.0 of the .NET Framework introduces a class called DataProtection that wraps DPAPI. It's simple to use; in fact, it looks almost exactly like the wrapper class I provided above. I've shown an example in figure 70.2."

Pasted from <http://pluralsight.com/wiki/default.aspx/Keith.GuideBook.HowToStoreSecretsOnAMachine>

So where are we going with this?

Well to my mind at least the .Net 1.1 registry approach was actually better than the .Net 2 model - it separated the configuration and the secret in a way that perfectly reflected the normal division of labour between development and infrastructure teams (where the devs are expected to provide the configs, but they shouldn't know the secret).

There should be no reason why one cannot use the .Net 1.1 technique (ie use ASPNET_SETREG to store the secret), whilst still using the .Net 2 DPAPI wrapper (ProtectedData - the class Keith's quote calls 'DataProtection') to decrypt it; see the sketch after the list below.

  • This does require careful monitoring of the ACL on the registry key in question
  • There's no real requirement to use ASPNET_SETREG over a custom exe (using the DPAPI), bar convenience. That having been said, a custom EXE could set the registry key ACL at the same time, which I remember as being a headache otherwise.
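For what it's worth, here's a minimal sketch of the read-and-decrypt side. It assumes (as aspnet_setreg does) a machine-scope DPAPI blob stored in the registry with no additional entropy, and a Unicode plaintext - verify both against your own environment:

using System;
using System.Security.Cryptography; // ProtectedData lives in System.Security.dll
using System.Text;
using Microsoft.Win32;

public static class RegistrySecretReader
{
    public static string ReadSecret(string keyPath, string valueName)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(keyPath))
        {
            // the value is a REG_BINARY DPAPI blob
            byte[] encrypted = (byte[])key.GetValue(valueName);
            byte[] decrypted = ProtectedData.Unprotect(encrypted, null, DataProtectionScope.LocalMachine);
            return Encoding.Unicode.GetString(decrypted);
        }
    }
}

// eg ReadSecret(@"SOFTWARE\MY_SECURE_APP\identity\ASPNET_SETREG", "password")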

Tuesday, December 12, 2006

Watching files for changes

The moral of the story is, if you want to watch a file for changes, don't just listen to the Changed event on a FileSystemWatcher.

Different applications save files in different ways. Some just overwrite the original file, in which case Changed fires. Others attempt to create a two-phase commit transaction, so either the changes get saved or your original file is preserved (in case of a power outage). However depending on how they do this, they might never raise a Changed event against the file you're looking for - they might (for example) do all the changes to a temp file, then rename it.
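So hook the whole event set. Here's a minimal sketch of the kind of harness I mean (my reconstruction - not the actual FileWatcherTestUI code referred to below):

using System;
using System.IO;

class WatchDemo
{
    static void Main()
    {
        FileSystemWatcher watcher = new FileSystemWatcher(@"C:\", "*.*");
        watcher.Changed += new FileSystemEventHandler(OnTouched);
        watcher.Created += new FileSystemEventHandler(OnTouched);
        watcher.Deleted += new FileSystemEventHandler(OnTouched);
        // the crucial one: editors that save via temp-file-plus-rename
        // may only ever Rename the file you actually care about
        watcher.Renamed += new RenamedEventHandler(OnRenamed);
        watcher.EnableRaisingEvents = true;

        Console.WriteLine("Watching - press Enter to stop");
        Console.ReadLine();
    }

    static void OnTouched(object sender, FileSystemEventArgs e)
    {
        Console.WriteLine("[{0:o}] {1}: {2}", DateTime.Now, e.ChangeType, e.FullPath);
    }

    static void OnRenamed(object sender, RenamedEventArgs e)
    {
        Console.WriteLine("[{0:o}] Renamed: {1} (from {2})", DateTime.Now, e.FullPath, e.OldName);
    }
}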

Watching C:\*.* with a FileWatcherTestUI.NonBufferedFileSystemWatcher and updating a file 'UpdateSql.txt':


Notepad
[2006-07-10T10:45:27.1245966+08:00] Changed: C:\UpdateSQL.txt
[2006-07-10T10:45:27.1402217+08:00] Changed: C:\UpdateSQL.txt
[2006-07-10T10:45:27.2183472+08:00] Changed: C:\UpdateSQL.txt

VS 2003
[2006-07-10T10:45:17.3432840+08:00] Created: C:\ve-329.tmp
[2006-07-10T10:45:17.3589091+08:00] Changed: C:\ve-329.tmp
[2006-07-10T10:45:17.3589091+08:00] Changed: C:\ve-329.tmp
[2006-07-10T10:45:17.3589091+08:00] Changed: C:\ve-329.tmp
[2006-07-10T10:45:17.3745342+08:00] Changed: C:\UpdateSQL.txt
[2006-07-10T10:45:17.4214095+08:00] Changed: C:\UpdateSQL.txt
[2006-07-10T10:45:17.4526597+08:00] Changed: C:\UpdateSQL.txt
[2006-07-10T10:45:17.4682848+08:00] Deleted: C:\ve-329.tmp

VS 2005
[2006-07-10T10:45:37.3277869+08:00] Created: C:\ve-32B.tmp
[2006-07-10T10:45:37.3277869+08:00] Changed: C:\ve-32B.tmp
[2006-07-10T10:45:37.3434120+08:00] Changed: C:\ve-32B.tmp
[2006-07-10T10:45:37.3902873+08:00] Changed: C:\ve-32B.tmp
[2006-07-10T10:45:37.4371626+08:00] Changed: C:\ve-32B.tmp
[2006-07-10T10:45:37.4527877+08:00] Created: C:\UpdateSQL.txt~RFf7eb451.TMP
[2006-07-10T10:45:37.4684128+08:00] Deleted: C:\UpdateSQL.txt
[2006-07-10T10:45:37.4684128+08:00] Changed: C:\UpdateSQL.txt~RFf7eb451.TMP
[2006-07-10T10:45:37.4840379+08:00] Renamed: C:\UpdateSQL.txt (from ve-32B.tmp)
[2006-07-10T10:45:37.4996630+08:00] Deleted: C:\UpdateSQL.txt~RFf7eb451.TMP

Prefer Singletons to Static classes

It's a common scenario: you have some core facade class with a load of stateless methods on it that delegate activity down to other layers. There didn't seem much point in having the overhead of instantiating the class to use the methods, so you made all the methods static. You've made a static class.

Trouble is, static classes just aren't as flexible as the alternative: a singleton. For example:

* Static classes can't implement interface contracts
* Static classes can't be passed around or used polymorphically
* Inheritance of static classes is a minefield
* You can't make static members virtual or abstract.

This might not be an issue straight up - in many cases it may never be an issue at all - but when it is you'll be kicking yourself.

Take the scenario where you decide somewhere along the line that your facade isn't the be-all-and-end-all, and that you want to swap implementations in different circumstances. You don't want your objects to know about it, so you refactor the facade into some kind of proxy-by-composition. Still you're tied to having only one implementation active at any one time.

By contrast if you'd started as a singleton, it's pretty easy to change which subclass gets setup on the singleton. It's also far easier to migrate to a dependency-injection type architecture later down the line to support multiple implementations being used in parallel (this also then decouples the objects from the facade/singleton, which is a benefit in plugin-style architectures). And it's far easier to refactor some of the functionality out of the class into a base class, and start replicating your facade functionality in other scenarios.

On a more mundane point, when setting up a static facade you're doing work in the static class constructor. When this throws, it throws TypeInitializationException. With a singleton you get the choice - lazy init via static field init (still throws TypeInitializationException), or via first-property access, which will throw an exception you'll have a chance of debugging.
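For reference, the kind of minimal singleton skeleton I'm talking about (names illustrative):

public class Facade
{
    private static Facade instance;

    protected Facade() { }

    public static Facade Instance
    {
        get
        {
            // first-access init: a failure here throws a debuggable exception,
            // not a TypeInitializationException
            if (instance == null) instance = new Facade();
            return instance;
        }
    }

    // swap in a subclass (or a test double) without callers noticing
    public static void SetImplementation(Facade implementation)
    {
        instance = implementation;
    }

    public virtual void DoWork()
    {
        // delegate down to the other layers...
    }
}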

Go the singleton straight off (a rule I've once again re-learnt the hard way).

Sunday, October 22, 2006

VB.Net finally bug (.net 1.1)

Forgot to post this a while back, but a collegue (Matt) showed me this line that I wrote:

Finally
    If Not cursorScope Is Nothing Then cursorScope.Revert()
End Try

He was debugging the null reference exception that was thrown when cursorScope.Revert() was called even though cursorScope was Nothing.

Yes, that's right, the IF condition was being blatantly ignored, and yes, this did seem to be related to being in the finally clause.

He broke it into a multi-line If / End If to make it all work. Go figure.

Using Generics to improve code readability

Obviously Generics in .Net 2 provides the ability to write type-parameterised code, which is great for collections and the like. But it was only when I started using it a bit that I realised the potential for much wider cast-elimination.

Take for example some kind of factory method (or any method that has a type as a parameter, and returns an instance of that type):

SomeType instance = (SomeType)Factory.CreateInstance(typeof(SomeType));

Using generics can cut right through:

SomeType instance = Factory.CreateInstance<SomeType>();
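For reference, the declaration behind that generic call might look something like this (a sketch, assuming a parameterless constructor is all you need):

public static class Factory
{
    // the type parameter replaces both the typeof(...) argument and the cast
    public static T CreateInstance<T>() where T : new()
    {
        return new T();
    }
}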

It's particularly noticeable in VB.Net (because it has such munted cast syntax in the first place), but the result is normally a lot more legible. Unfortunately the framework is full of missed opportunities to clean up:

thing = DirectCast(Enum.Parse(GetType(SomeType), someValue), SomeType)

could become

thing = Enum.Parse(Of SomeType)(SomeValue)

That's got to be better, right?
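Until the framework catches up, nothing stops you writing the wrapper yourself - a one-liner sketch in C# (EnumUtils is my name for it):

using System;

public static class EnumUtils
{
    // retrofits the generic shape the framework missed
    public static T Parse<T>(string value)
    {
        return (T)Enum.Parse(typeof(T), value);
    }
}

// usage: SomeType thing = EnumUtils.Parse<SomeType>(someValue);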

Friday, October 13, 2006

WorkflowRuntime.Dispose() doesn't clean up properly

Unfortunately, in the current RC5 build of Windows Workflow (the one that comes with .Net 3 RC1), calling Dispose() on the WorkflowRuntime does not clean up after itself properly. Specifically:
  • it doesn't unload workflows in memory
  • it doesn't reset the workflow performance counters
To do all that you have to call StopRuntime() first.

Try it for yourself:

using (WorkflowRuntime workflowRuntime = new WorkflowRuntime())
{
    workflowRuntime.AddService(new SqlWorkflowPersistenceService(Properties.Settings.Default.WorkflowPersistenceConnectionString));

    WorkflowInstance instance = workflowRuntime.CreateWorkflow(typeof(WorkflowConsoleApplication1.Workflow1));
    instance.Start();

    //workflowRuntime.StopRuntime(); // uncomment me to get the instance to be persisted
}


This is particularly unfortunate given that all of the samples for WF (and the boilerplate code that gets created in new WF projects) just allow the runtime to be disposed when it drops out of scope. And needless to say it goes against the grain of how Dispose() works in practice: as a 'clean up gracefully' rather than just a 'free unmanaged resources' (think SqlConnections getting closed, Transactions being rolled back etc...)

https://connect.microsoft.com/wf/feedback/ViewFeedback.aspx?FeedbackID=224627

Friday, August 11, 2006

Starting Over

A project manager I used to work with had a favorite phrase he liked to bandy around:
"You can't polish a turd"
There's always a point beyond which a particular class or module should just be tossed out and re-written, rather than fixed. A point where something's gone so badly in the wrong direction that it needs to be humanely put down so we can get on with the rest of our lives. But when to start over?

Only the developers who've worked in a given area know really how horrible it is, but even then it's hard to get an objective opinion. Many developers are over-precious, defensive of their own code, and unduly negative of the work of others [1]. Then again, other developers 'at the coal face' can be blind to how screwed something is, happily chiselling away when it's time to grab the dynamite. Even the system's architects can fail to grasp just how unworkable it is from a grunt's perspective. Management just can't win.

Here's some pointers to identify 'start over' candidates:
  • It's going to cost more to fix than to build from scratch
  • It's going to be quicker to build from scratch than to fix
  • There's a high defect rate in that area
  • There's no test coverage in that area, so any fixes walk a regression minefield
  • Even when fixed, it'll be sub-optimal compared to building from scratch
  • Even when fixed, it doesn't really do what you need it to
  • Even when fixed, it imposes significant architectural constraints on the rest of the system (eg perpetuates a coupling you're trying to get rid of, or continues with a deprecated metaphor)
  • Even when fixed, there's a significant maintenance cost
  • There's no-one left on the project who understands how it works, yet it's not stable enough to just leave alone, or was left half-done
  • There's no documentation
Any of these are warning lights, but more than a couple should warrant a long hard look.

For the sake of completeness, all of these can be applied at the project level too. This is a really tough call, but sometimes a clean break is for the best. Salvage what you can and move on. The most reusable part of a project is always the IP anyway, not the implementation. Even if you don't reuse one line of code, you'll still have learnt something.


[1] It happens to us all. Every so often I find myself panning code I wrote myself (and subsequently forgot about)

Tuesday, August 08, 2006

Why ReSharper is (still) the dog's bollocks

Here's a pretty good post describing why ReSharper is still the dog's bollocks.

I think ReSharper is so utterly brilliant that I'd be deeply suspicious of any C# team that wasn't using it. It's not without its faults, I'll give you, but even then it's a joy to use, and dumps all over the built-in refactoring support in VS2005-C#.

Microsoft and CodeRush just don't get refactoring. They seem to think it's some kind of point-and-click design activity, all green bendy arrows or fiddly tool tips ('making a video game out of your code'). What they don't seem to understand is that I'm typing. If I want to go all clicky-clicky I'll go back to the design pane thanks, or the Class Designer.

ReSharper is zero friction. It's the TestDriven.Net of refactoring tools. Extract Local Variable: Ctrl-Alt-V done. Override method: Alt-Insert, select, done. That thing you were about to type: Ctrl-Shift-Space (pretty much). It's so quick and easy it's not funny. People looking over your shoulder will have difficulty keeping up.

Like many XP techniques, refactoring is one that suddenly makes a lot more sense with the right tool. Using ReSharper has totally changed the way I think about my code.

Saturday, August 05, 2006

Dealing with TechnicalDebt

I kind of copped out in a previous post by saying I didn't have any way of dealing with technical debt short of avoiding it in the first place. That was just plain lazy of me.

Dealing with Technical Debt cuts to the heart of the 'debt' metaphor's limitations. Debt implies regular interest payments, but that's not necessarily the case with technical debt. There's quite a continuum, from your 'high street bank' style debt (obvious, regular 'payments' impeding forward momentum) to your loan-shark style debt: maybe you'll get away without paying anything for a long time, but one night in a dark alley he'll want your kneecaps. The latter, with its unpredictable impact, is the more common scenario.

This then clarifies our approach towards technical debt: treat it as a risk. So, like all risks, you can:
  • Mitigate - Have a fall-back position if the debt becomes an issue
  • Accept - Assess that the risk is lower than the cost of fixing. You might get lucky
  • Reduce - Either deal with the debt, or avoid building new functionality on debt-ridden areas
  • Transfer - Make it SomeoneElsesProblem. Tricky.
Handwringing or whinging about violating encapsulation doesn't tend to cut the mustard with project managers (and rightly so). Attempting to quantify implementation shortcomings in terms of project risk is far more likely to give the problem the attention it deserves.

Friday, July 21, 2006

Error Reporting - To send or not to send?

Don't think of it as a 'can I be bothered to help Microsoft find their bugs' button. Think of it as a 'vote for this f#$%ing issue to be fixed' button. I find that clarifies the decision somewhat...

Broken URLs

Whyohwhyohwhy can't browsers deal with urls with line breaks being pasted into the address bar? All but the shortest urls get broken in emails, and yet the blindingly obvious solution (no it's not Shrinkster) still evades us.

Even a menu option ('Navigate to mangled url in clipboard') would do.

Friday, July 07, 2006

Avoid solution folders in VS2005 for your UserControl projects

In VS2003, whenever you had a form open in the designer, any user controls in your project[*] appeared in a 'My User Controls' tab on the toolbox.

This was essential because (for some reason unbeknownst to man) you can't drag a WinForms user control from the Solution Explorer onto a form, you have to do it from the toolbox (which sucks: the ASP.Net team managed it just fine, so what's the issue?).

In VS2005 the same functionality exists (with the additional feature that it can be disabled for performance reasons with the AutoToolboxPopulate option). But it doesn't work - or not for us at any rate.

Eventually it turns out (thanks Cam) that it doesn't work for projects within solution folders. How crazy is that?
"The fix probably won't make it into Service Pack 1, but I will forward your feedback to the appropriate team for consideration."
[Ben Bradley, MSFT, link above]
However the bug report says it's fixed already:

http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=114542

...so I'm crossing fingers for SP1 (whenever that comes out).

[Update: Didn't publish this for ages, but no, SP1 doesn't fix it. Poo]

Sunday, May 07, 2006

TechnicalDebt

TechnicalDebt is a term used for all the hacks (in the pejorative sense) in the codebase. It's the shortcuts, the quick-and-dirties that come back to haunt you, the cruft.

The term was coined by Ward Cunningham, and is used in agile circles, though it is essentially at odds with YAGNI. Martin Fowler explains it well.

As I see it, there are two major problems in dealing with Technical Debt:
  • Inflexible project schedule. Unless you've got flexibility in your project planning (built in slack, or agile schedule) you've got no time to fix it. It is, after all, work you didn't plan for.
  • Inability to articulate benefits. Sure, maybe we shouldn't have done it this way, but we have, so why change it if it means yet more time spent.
This last is particularly pernicious. It's nigh-on impossible to estimate the eventual cost at the time the debt is 'incurred', and in many cases pretty darn hard to measure the cost of the debt whilst you're paying it. Without the data, how can you possibly justify the cost of rework?

So why can't we measure the cost? Well, if it's a simple quick-and-dirty that must be fixed for further progress to be made, then it's just the cost of the rework - the debt is repaid. But it's never that clean-cut: the decision can have almost subliminal rippling effects on all the code that touches it, and the developers who have to deal with the compromised areas may not be aware that there was a better way available originally. So the debt possibly doesn't ever get 'repaid' as such, and the 'interest payments', such as they are, are hidden within the cost of other work.

[Random link that illustrates the costs of unpaid debt]

Plus of course even if you're aware of the hit, unless you do the same activity twice - once paying the tax and once without - it's impossible to measure the true cost. If you are directly aware of the impact, that probably means the impact is severe.

More often than not, technical debt is actually dealt with just by members of the project team caring enough about their future sanity to just fix it. But that's not a scalable solution: it depends on key people essentially doing work out of hours just to stay on track. Even this isn't viable if the problems involved exceed the 'weekend refactor' threshold (the amount your top developer can get done in a big burst over the weekend). Over this level requires a concerted effort, or at least a staged one, which has to be planned for and so can't be fixed 'under the radar'.

I don't have any magic formula for dealing with TechnicalDebt. My strategy has always centered around avoidance rather than anything else, and being really bullish about having to fix things then and there. It's a far more scalable strategy, but I'm very aware that it's not possible on all projects. How you deal with the debt then I'm not sure. My experience has been that you don't, and that forward progress just becomes progressively more expensive. I guess you hope your project finishes before the debt becomes insurmountable.

Thursday, March 16, 2006

Software Quality Characteristics

In CodeComplete, McConnell defines 15 'characteristics of software quality':

External Characteristics
  • Correctness
  • Usability
  • Efficiency
  • Reliability
  • Integrity
  • Adaptability
  • Accuracy
  • Robustness
Internal Characteristics
  • Maintainability
  • Flexibility
  • Portability
  • Reusability
  • Readability
  • Testability
  • Understandability
It's important to work out early on in a project what your priorities are (and especially what they aren't) so that you can steer the project effectively. It's also important not to underestimate the effect of all the tacit decisions made by developers in the course of their work: the unspoken assumptions. Without clear guidelines, developers will benignly pursue their own ideas about 'the right way' to do something. You can tick off all the use cases you like, but if they favour performance or reusability, and your client's real goals are accuracy and maintainability, then you're heading for trouble. It's always the non-functionals that bite you, IMHO.

Fortunately McConnell also points out that given clear guidance, developers will actually do what they're told, which is reassuring. Setting clear goals is a pretty key management tenet, but one easily forgotten. Some of these characteristics (say performance, or code-complexity) can even yield metrics, so you can deal with the project graphically: BigVisibleCharts.

On my previous project, we used this list as the basis for our peer code reviews, out of which came the categorisation I posted previously, but to re-iterate (slightly re-worded):
  • Customer Facing (most of the External Characteristics)
  • Maintenance and Design (most of the Internal Characteristics)
  • Defects (Just hunting for bugs)
I think it’s a good split. Each review role had a cheat sheet which consisted of the relevant characteristics (as defined above) to look for, and some check lists. This helps ensure visibility of the quality priorities into the code review, which makes them doubly hard to ignore.


PS: If you weren't sure what some of the characteristics meant, maybe you should re-read your copy of CodeComplete.

Wednesday, March 01, 2006

Focused Code Reviews

I was going to write something laying out the arguments for code reviews, but Hacknot beat me to it (and did a better job):

Hacknot: In Praise Of Code Review

To which I'll only add that not only is Code Review cheaper than formal testing, but it picks up whole classes of defects that can never be found by traditional testing (eg: plain unmaintainable code)

On my last project we did peer code reviews, with three random reviewers for the week, each performing a different role in the review:
  • Customer Focused & Usability (do the messages make sense, does the flow work, are we accurate and not robust (or vice versa if required))
  • Maintenance and Legibility (self-documenting code, documentation comments, architectural synergies)
  • Bugslaying (checking for obvious foobars, and reviewing test coverage)
We found this split worked pretty well at keeping reviewers focused, without which people can tend to miss the hard-to-spot stuff. Additionally having multiple people review each unit ensures that someone's bound to pick up on any serious issues, whilst also maximizing the cross-pollination of approaches.

Also I'm all for paper reviews. It's easier to annotate on paper, plus I tend to think along the lines that if someone missed it once on-screen, maybe printing it out reduces the chance someone else will do the same thing.

Monday, February 27, 2006

When to Automate

Generally speaking there are two main reasons to automate a process:
  • It's going to be done many times (so it'll save time), or
  • It needs to be done exactly the same each time
Doing a release build falls into both those categories, so from that perspective automated server builds (ala NAnt / CruiseControl.Net) start to look like a bit of a no-brainer (especially if they can do the deployment for you).

Interestingly, the Braidy Tester points out there's another important reason to automate, one that - as developers - we invoke all the time: Morale. Even if I'm only going to do something once, if it doesn't take me much longer to automate it, then I will, for the sake of my own sanity. I'm a developer, not a computer.

It's worth factoring in the 'hidden cost' of not automating when making leadership calls, and err on the side of automation when it's a close call. Is not automating worth the morale cost of making people do really boring work?

Monday, February 20, 2006

Fluent Interfaces vs Currying

Martin Fowler's written recently about Fluent Interfaces, which are basically class API's designed from the perspective of creating highly-legible, flowing code. I've been dallying a bit with this over the last year, eg:
    TaskUtils.CopySettingsFrom(source).To(destination);
or
    Assert(someString, new Contains("some substring"));
... having been inspired by the NMock designers. There's even some examples in the .Net class library (eg: the SqlParameterCollection's Add method returns the parameter just added).

I'd only just got round to thinking of it as Currying (and prior to that, Syntactic Sugar). Terminology explosion! Martin's example is a bit more complicated than mine, and the more complicated schemes have their disadvantages, but it's always nice to see people going the extra mile to make their code usable. It adds polish, finesse.
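For the record, there's not much machinery behind that first example - a minimal sketch (the Settings type and the TaskUtils internals here are illustrative, not the real code):

public class TaskUtils
{
    public static SettingsCopier CopySettingsFrom(Settings source)
    {
        return new SettingsCopier(source);
    }
}

public class SettingsCopier
{
    private readonly Settings source;

    public SettingsCopier(Settings source)
    {
        this.source = source;
    }

    public void To(Settings destination)
    {
        // copy each setting across; returning 'destination' here would
        // let calls chain further in the same fluent style
    }
}

public class Settings { /* illustrative placeholder */ }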

A sharp tool[1] like ReSharper, or VS2005 (C# only) can really make this kind of stuff a breeze - you just type
    someobject.Add(thing).Substract(thing)
...and ReSharper will prompt you to create the appropriate Add and Subtract methods, with its best guess as to the parameters you want. The result - clearer, more usable API's - is one of the reasons why the TDD guys have been advocating this approach all along....


[1] As in Chapter 12, 'The Mythical Man Month', Frederick P. Brooks

Wednesday, February 15, 2006

VSS 2005 doesn't allow 'Leave this file' for writable files during Get Latest

Which is a bit of a pain really:

Unable to keep writable version of file in VSS 2005?

I tried the hack / workaround and it didn't work for me, so I've 'worked around' the problem by uninstalling VSS 2005 and going back to v6 (NB: you need to re-run SourceSafe\Win32\SSInt.exe to re-register VSS6 for integrated source-control with VS / VB).

VSS 2005 was only mildly useful in that it allowed me to use an external diff tool (BeyondCompare, WinMerge) to view my changes, rather than the built-in one, but even then I could only diff against the latest version (there's no dialog that accepts the numbered-version syntax, eg File.cs;16, for diffing against a previous version), so I won't miss it much.

Saturday, February 11, 2006

Non transitive equality in VB.Net

One of the good bits of advice that floats around is to avoid overriding Equals() unless you know what you're doing[1]. The requirements for equality (reflexive, symmetric and transitive) are easily broken if you're not concentrating.

It's a pity no-one told the VB team[2]...
Public Sub TestStringEquality()
    Dim someString As String
    Assert.IsTrue(someString Is Nothing, "Of course someString is Nothing")
    Assert.IsTrue(someString = String.Empty, "But it's also empty...")
    Assert.IsTrue(String.Empty Is Nothing, "...so doesn't that mean that String.Empty = Nothing ?")
    ' last test fails
End Sub


[1] And that's without getting into the whole GetHashCode() trauma: When (not) to override Equals?, The Rules for GetHashCode
[2] This is tongue-in-cheek - the Is operator isn't really equality, I know. It's 'samey'.

Thursday, February 09, 2006

Doh! VB.Net and C# have different order-of-execution for field initializers...

He stared at the code for some time. The field was Nothing, when there was clearly a field initializer present. What the ...?

It turns out that C# and VB.Net differ in the order in which field initializers and constructors fire in an inheritance tree, whereas I - being somewhat C# orientated - had assumed the C# way was the .Net / CLR way (those kinds of assumptions always get you in the end).

So in C# (as you know) all fields are initialized before any instance constructors run, working from most derived to base class, then the constructors fire from the base out:
Out: Initializing static field initializer in BaseClass
Out: Initializing static field initializer in DerivedClass
Out: Initializing field initializer in DerivedClass
Out: Initializing field initializer in BaseClass
Out: Initializing field in constructor in BaseClass
Out: Initializing field in constructor in DerivedClass
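Here's a minimal C# harness that reproduces the instance-side of that ordering (static initializers omitted; the harness itself is mine, not from the original repro):

using System;

class BaseClass
{
    private string baseField = Announce("field initializer in BaseClass");

    public BaseClass()
    {
        Announce("field in constructor in BaseClass");
    }

    protected static string Announce(string what)
    {
        Console.WriteLine("Out: Initializing " + what);
        return what;
    }
}

class DerivedClass : BaseClass
{
    private string derivedField = Announce("field initializer in DerivedClass");

    public DerivedClass()
    {
        Announce("field in constructor in DerivedClass");
    }

    static void Main()
    {
        new DerivedClass(); // prints the four instance-side lines in the C# order above
    }
}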
However in VB.Net all the base class field initializers and constructors run before the derived class gets a look in:
Out: Initializing shared field initializer in BaseClass
Out: Initializing shared field initializer in DerivedClass
Out: Initializing field initializer in BaseClass
Out: Initializing field in constructor in BaseClass
Out: Initializing field initializer in DerivedClass
Out: Initializing field in constructor in DerivedClass
This is probably due to the slightly looser rules VB has about what can go in a field initializer (eg dates). I'm guessing they fudge the IL so all those field initializers end up rolled into the constructor, but aaaaaaaaaarrrggghh... .

(I wonder what happens with cross-language inheritance....)

Of course none of this would be a problem if people[*] avoided the 'calling virtual methods from a constructor' trap, but at least in C# you can use the field initializers to (partly) ameliorate the problem. In VB, no.

ConstructorsShouldNotCallBaseClassVirtualMethods
Brad Abrams : New Design Guideline: Virtual Members

* by people, obviously I mean whomever originally wrote the class I was struggling with

Wednesday, February 08, 2006

Weird stuff about VB.Net (from a C# perspective)

I knew that VB.Net was different, but it's funny how many extra things you find when you have to work on it day in day out:

Language
  • Weird syntax (of course :-)
  • No documentation comments, or support for tool-tip documentation at break time (hover over String: C# "System.String represents text etc..." VB: "String" :-(
  • Can have dates as constants
  • WithEvents
  • Exposes static methods on instance members, but only by casted type (so no real advantage from a polymorphic perspective)
  • Shadow by name, not by signature
  • With keyword (kinda handy at times)

Syntax hording / Legacy syntax

The VB.Net team seem to be dragging a lot of compatibility cruft with them.
eg. 7 different ways of declaring arrays, depending on where you feel like placing the brackets and the array length (watch out for the UBound / out-by-one syntax):

Dim someArray1() As String ' -> Nothing
Dim someArray2(1) As String ' -> Array[2] of Nothing
Dim someArray3() As String = New String(1) {} ' -> Array[2] of Nothing
Dim someArray4() As String = New String() {"One"} ' -> Array[1] as specified

Dim someArray5 As String() ' -> Nothing
' Dim someArray6 As String(1) ' -> Actually not allowed
Dim someArray7 As String() = New String(1) {} ' -> Array[2] of Nothing
Dim someArray8 As String() = New String() {"One"} ' -> Array[1] as specified

Also...
  • Assigning return values to the property / method name rather than using Return, which as a consequence requires...
  • Default return value of 'Nothing'
And they seem to be making more as they go along:
  • Me.New() vs MyClass.New()

IDE
  • Funny (slightly dysfunctional) pulldowns
  • No pre/post build events
  • No 'close all windows but this' (how weird is that?)
  • Simplistic flat namespace model (everything in a namespace named by the assembly, hard to change this model)
  • Debugger 'rolls up' class hierarchy if not dealing with most derived type - can make it impossible to see some fields / properties on base class that are later shadowed
  • 'This' debugger window actually is a 'Me' window. However what's displayed in all the debugger windows does change from language to language (see above), so what's with not changing the window name...

Runtime
  • Different order of field initialization (base first)
  • Strange non-transitive equality with empty strings


Good article from McConnell: 'Stuck in a VB.Net ghetto'

Using Log4Net's FileAppender in a web garden

Whilst you might not think you're running a web garden (that is: running your website in two or more processes simultaneously on the same machine), it's worth checking in case your server SOE catches you by surprise (hint: this happened to me):



Obviously in this scenario it's not just log4net you've got to watch out for, but anything you're saving on the local filesystem is a candidate for cross-process resource contention. In log4net's case, the RollingFileAppender can block itself (in the other process).

There's a couple of ways round this. You can configure the FileAppender to minimize the amount of time it locks the file (the MinimalLock attribute), but for diagnostic simplicity I prefer to configure each web application instance to log to a different file. Something like this:


<appender name="File" type="log4net.Appender.RollingFileAppender">
  <file type="log4net.Util.PatternString" value="C:\temp\MyApp_[%processid].log" />
  <datePattern value="yyyyMMdd" /> <!-- roll the file daily -->
  <rollingStyle value="Date" />
  <appendToFile value="true" />
  <layout type="log4net.Layout.PatternLayout">
    <conversionPattern value="%date{ABSOLUTE} [%thread] %-5level %logger.%method() - %message [%identity]%newline" />
  </layout>
</appender>


This way if anything screwy happens in one of the instances (it didn't start altogether?) it's easier to isolate.

PS: Blogger is so not the right engine to be doing this XML stuff in...

Monday, February 06, 2006

The VS forms designer sucks

When you drag a control onto a form, what's the first thing you do?

You rename it (to something sensible rather than TextBox1), and clear or set the .Text property from the default. Both of which involve fiddling about in the property grid finding the right entries, which is slow and tedious. Why?

Presumably Microsoft think I spend all my time resizing and repositioning the controls, rather than tweaking their properties. Well, sorry... but no, most of the time it's just drag and drop, possibly adjusting the width to fit after I set the text.

There must be a better way... (thinks: probably some kind of VS add-in that prompts for the control's caption, and bases its name on that with some kind of standard prefix...)

Thursday, January 12, 2006

How does VS.Net's Call Stack window infer the language the assembly was written in?

From the .pdb file it seems, sadly. Take that away, and it blanks out.

So I still don't know of any reliable way of determining what language an assembly was written in purely from the assembly / manifest. Of course it shouldn't matter - but it's nice to know...

VB.Net vs C#

(As if the debate hadn't been done to death already, right?)

I'm not a big fan of VB. Frankly the syntax sends beads of sweat down the small of my back, and attempting to find the right method using those pull-down thingies gives me fits. However even I'd have to admit, there's no real 'big ticket' items that you could wave about to say 'VB is shit' (or 'C# is shit' if you're that way inclined).

Instead what we're into here is market differentiation. This is normally known as price discrimination, but in this case it's more like developer differentiation (note how I avoided saying something like 'intelligence discrimination' ;-). Because ultimately we're advocating the same thing - .Net - and buying another house for BillG with our licences.

Whilst we sit divided, bickering over the relative merits of curly braces vs End Sub, what we should be discussing is the more important issues like:
  • when Microsoft will properly support test-driven development in its toolset?
  • whether J2EE has caught up / overtaken .Net yet (how come all the good stuff - think log4net, NMock, NHibernate etc... - is all Java ports)?
  • is there yet a sensible framework for developing on Linux?
It's like a debate over white vs black iPods. Us .Net developers really need to get out more...

Monday, January 09, 2006

How to abort VS solution build when project build fails

On bigger projects one of the most irritating bits about Visual Studio .Net is its 'carry on regardless' approach to multiple project compilation. A project fails, so instead of halting so you can fix it, you get a zillion build errors in all the other projects that reference it. This just slows down the build-fix cycle, not least because the 'root cause' error is drowned out in noise (the Task List window is useless in this regard, one has to find the first instance of the word 'Error' in the build output, and work from there).

There's a simple macro fix for this: terminate the build after the first error. However, every time I move machine I forget where to find it (it strangely eludes my google-fu). So here it is:

http://www.ftponline.com/vsm/2003_10/online/hottips/sabbadin/



Private Sub BuildEvents_OnBuildProjConfigDone( _
        ByVal Project As String, _
        ByVal ProjectConfig As String, _
        ByVal Platform As String, _
        ByVal SolutionConfig As String, _
        ByVal Success As Boolean) _
        Handles BuildEvents.OnBuildProjConfigDone

    If Success = False Then
        ' Kill the rest of the solution build as soon as one project fails...
        DTE.ExecuteCommand("Build.Cancel", "")
        ' ...and say so in the Build output pane
        Dim win As Window = DTE.Windows.Item(EnvDTE.Constants.vsWindowKindOutput)
        Dim OW As OutputWindow = CType(win.Object, OutputWindow)
        OW.OutputWindowPanes.Item("Build").OutputString( _
            "ERROR IN " & Project & " Build Stopped" & _
            System.Environment.NewLine)
    End If
End Sub
Why this isn't in the IDE from the get-go amazes me. Did no-one at Microsoft ever write a multi-project solution in VS.Net?

Incidentally why is there no upward limit on the errors reported even within a project? One project I worked on had one massive code-gen'd file. When this broke, it could generate tens of thousands of build errors, which totally locked up the IDE for 5 minutes at a time. It wasn't responsive enough to manually Cancel Build, so you'd have to sit it out. I emailed the VS team lead (literally years ago) and his response was something like 'what a good idea'. So where's the fix?

Homily #1: Late binding is for code-gen only

Today's message: don't use late binding in any code that you (rather than the machine) wrote.

Why would you? No intellisense, no compiler safety net, diminished performance (in some cases). Even when you're sure you've got the right member, did you spell it right (and did you get the casing right, where applicable)?

It's not like it's hard to do either, eg:
  • Generate a strongly-typed dataset, and load the data into that instead, or
  • Use one of any number of code-gen tools to build objects from your data
  • For 'magic values' held in a database (or otherwise), use code-gen to import them into your code as constants / enumerations, and use them from there
It's the OnceAndOnlyOnce principle again: any time you build a 'wrapper' object like this you're centralizing the knowledge and maintenance of the binding contract in one place, and whilst you're not isolating the application from it (typically you'd expose members with names matching the data columns), any breakages are caught by the compiler.
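To make that concrete, here's a minimal C# sketch (the table, column and typed-dataset names are mine, purely for illustration):

using System.Data;

class LateBindingDemo
{
    static object LateBound(DataSet ds)
    {
        // "Customers" and "CustomerId" are magic strings: a typo or a
        // renamed column only surfaces as a runtime exception
        return ds.Tables["Customers"].Rows[0]["CustomerId"];
    }

    // With a code-gen'd typed DataSet (hypothetical 'CustomersDataSet'),
    // the same access is checked by the compiler, with intellisense thrown in:
    //   int id = typedDs.Customers[0].CustomerId;
}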

Sunday, December 04, 2005

Bug: IE doesn't resolve URIs to behaviors correctly

URIs in stylesheets are resolved relative to the stylesheet, not relative to the page that loaded the stylesheet.

It's the only way that makes any sense. Take images for example:
/* this is somestyles.css */
body
{
    background-image: url(images/background.gif);
}

If you resolved the path to background.gif from the page, rather than from somestyles.css, then every page would have to sit at the same level as the images directory, which precludes having a site hierarchy.

Fortunately this is not how the world works - URIs are resolved based on the base URI of the stylesheet, and everyone's happy...

...except IE (6, but presumably 5+) doesn't do this for DHTML behaviours - the path to the HTC is resolved based on the page that the behaviour applies to. WTF!?

That I never noticed this before is a testament to how little I've ever used DHTML behaviours (I regard them as some kind of voodoo), but the lack of public outcry presumably means no-one else is using them much either. Or did I miss some kind of weird DOCTYPE setting to force IE into 'not crap' mode?

(Sure you can make the path root-relative, ie:
.someClass
{
    behavior: url(/behaviors/Moo.htc);
}
...but imagine if you're writing an app in a subdirectory (which is the normal VS.Net paradigm):
.someClass
{
    behavior: url(/MyApp/behaviors/Moo.htc);
}
'MyApp' might be the name you use in development, but it might be MyAppDaily for the daily build, and it might be just / (as in, run at the root of the site) when you eventually deploy. Want to keep changing your CSS per-environment? I don't think so)

References:
The documentation - not very explicit on how it's expected to work
Other people who have noticed: 'Keith' (comment 15), Dean Edwards, 'Lithium' on microsoft.public.scripting.jscript, but unless my Google-fu is failing there's not many of them.

log4net Context problems with ASP.Net thread agility

Even after my recent revelations about how ASP.Net's agile threading model precludes storing things in thread slots, ThreadStatics or CallContext almost without exception[1], it still took me until just the other day to realize an important ramification of that:
log4net contexts are broken[2] under ASP.Net
A log4net context is a bag where you can put data that can be referenced by the log4net appenders when the logging calls are rendered. This supports the model where you just publish logging data that's relevant to your immediate scope, and the log4net renderer (based on the pattern you put in your config) appends all the extra data that you probably want to log (like the CustomerID of whoever's logged in, or somesuch).

There are three log4net contexts: GlobalContext, ThreadContext and LogicalThreadContext, which do pretty much what you'd expect - provide places to hang contextual data for logging calls that's either global, thread-local or 'logical call' local (ie propagates across remoting calls). Unfortunately that's just not good enough in an ASP.Net environment, since we now know that neither CallContext nor Thread Local Storage slots get propagated onto the new thread when ASP.Net does a thread switch. Since that's where the ThreadContext and LogicalThreadContext are actually stored, ASP.Net's thread switch either blows away your context (in the case of LogicalThreadContext), or overwrites it (in the case of ThreadContext).
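For reference, this is how you'd normally set a context property (the property name is just an example), which then renders wherever %property{CustomerId} appears in an appender's conversionPattern:

using log4net;

class RequestLoggingSetup
{
    static void OnBeginRequest(string customerId)
    {
        // Everything logged on this thread can now render the value
        // via %property{CustomerId}
        ThreadContext.Properties["CustomerId"] = customerId;
    }
}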

There's nothing fundamentally wrong with log4net, it's just you can't use log4net's ThreadContext or LogicalThreadContext within ASP.Net and expect to get a per-request logging context, though you'd be forgiven for imagining otherwise.

Demo: Storing the requested page URL in ThreadContext, so it's liable to be overwritten if a thread switch happens because two requests come in concurrently:
[3164] INFO  ASP.global_asax /ASPNetThreadingDemo/SlowPage.aspx - Begin request for /ASPNetThreadingDemo/SlowPage.aspx
[3164] INFO  ASP.global_asax /ASPNetThreadingDemo/SlowPage.aspx - End request for /ASPNetThreadingDemo/SlowPage.aspx
[1996] INFO  ASP.global_asax /ASPNetThreadingDemo/SlowPage.aspx - Begin request for /ASPNetThreadingDemo/SlowPage.aspx
[3164] INFO  ASP.global_asax /ASPNetThreadingDemo/FastPage.aspx - Begin request for /ASPNetThreadingDemo/FastPage.aspx
[1996] INFO  ASP.global_asax /ASPNetThreadingDemo/SlowPage.aspx - End request for /ASPNetThreadingDemo/SlowPage.aspx
[1996] INFO  ASP.global_asax /ASPNetThreadingDemo/SlowPage.aspx - End request for /ASPNetThreadingDemo/FastPage.aspx

See how the last logging call has 'SlowPage' in context, even though the request that's finishing is actually for 'FastPage' (I was expecting exactly the reverse, but hey).

It's pretty ironic really, since it was log4net I used to confirm how the threading really worked in the first place, but of all the things I looked at, log4net contexts wasn't one of them. Trouble is, if you're using log4net contexts in ASP.Net this is exactly the kind of thing you'd[3] be doing: setting things up in BeginRequest so they're there for all logging calls throughout the system.

The Workaround:
Fortunately log4net's so flexible that it already contains the solution. For all the contexts, log4net supports the concept of delayed resolution of the context value. To be more specific, log4net gets the context value by calling ToString() on the contents of the context bags when they're frozen and attached to a logging call. If, rather than adding the actual value, you add an object whose ToString() provides the value, then you've deferred resolution of what the actual value was.

So in this case, rather than adding the value of CustomerID directly into the ThreadContext, you'd add a CustomerIdProvider object instead:
Imports System.Web

Public Class CustomerIdProvider
    ' log4net calls ToString() when the logging event is rendered, so the
    ' value is resolved from the current request's HttpContext at log time
    Public Overrides Function ToString() As String
        Return Convert.ToString(HttpContext.Current.Items("CustomerId"))
    End Function
End Class
Since this now leverages HttpContext, you can put it in GlobalContext as well - the value it resolves is automatically per-request, so there's no cross-thread contamination.

It's a bit of a drag, but you're going to have to do this for all your non-global log4net context data[4]. It's pretty easy to generalise the class above into an all-purpose HttpContextValueProvider though, so you won't have to spawn a myriad of classes.
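Something like this C# sketch, say (the class name and null-handling are mine):

using System.Web;

// Defers resolution of any HttpContext.Items entry until logging time
public class HttpContextValueProvider
{
    private readonly string key;

    public HttpContextValueProvider(string key)
    {
        this.key = key;
    }

    public override string ToString()
    {
        HttpContext context = HttpContext.Current;
        if (context == null)
            return string.Empty; // eg logging from a non-request thread

        object value = context.Items[key];
        return value == null ? string.Empty : value.ToString();
    }
}

// Usage, once per key:
//   log4net.GlobalContext.Properties["CustomerId"] =
//       new HttpContextValueProvider("CustomerId");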

The Solution:

log4net's going to have to have an AdaptiveContext. This is a context bag that's stored either in CallContext or in HttpContext, depending on the environment that the application is running within. I'm already doing this for anything I previously put in CallContext in my business layer, but I haven't got round to grafting one onto log4net yet.


[Update: I posted a follow-up post about the need to also implement IFixingRequired - please read. Also, please note the date on this post is 2005, I don't claim this as all current information. YMMV]


[1] Ok, if you don't store anything in TLS / CallContext prior to Page.Init, then you'll be ok (given the current implementation anyway).
[2] ...depending on how you're using them
[3] I.E. 'me, until I realised all this'
[4] Despite all this, log4net's still the best logging framework I know of by far.

Wednesday, November 16, 2005

Get VS2005 for free!

(or 'What to do if the boss won't let you upgrade')

...if you've already got VS 2003 that is.

I know many developers sit and watch the VS2005 demos and think how brilliant it all is, and how it's a pain they'll not be using it for x months/years because their employer is so backwards / tight / sceptical. But there's a whole list of things I see in demos time and time again that can be done right now if you know how. Sure, you won't always get the IDE support, and it might not be as well rounded, but if you've only been using vanilla VS2003, then you really want to think about some of the following. Buy me a beer with the cash you save.

IDE
Unit Testing: Download NUnit and TestDriven.NET. Note how double-clicking the test failure in TestDriven.NET actually takes you to the failing line, unlike VS2005. Enjoy using NestedTestCase like you can't in Whidbey.

Refactoring and Code Snippets: Download ReSharper and wonder how you ever did without it (or CodeRush + Refactor! if you can't use ReSharper because you're a VB guy)

Nullable Types: Just use those Sql types in the System.Data.SqlTypes namespace (SqlInt32 and friends), and you're 90% there.

Functional programming / dependency injection: The new generic collections support cute methods to filter the collection (FindAll), where you pass in the filter behaviour as a generic delegate (eg Predicate). Great - but don't confuse the behavior-injection bit with the generic bit. If you're writing custom collections today, you can write Visitor-esque methods that do the same thing with either an interface or a non-generic delegate. If you're code-gen'ing, so much the better.
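A hedged .Net 1.1-style sketch of the same idea (the delegate and collection names are mine):

// The behaviour-injection half of FindAll, without generics
public delegate bool StringFilter(string candidate);

public class FilterableStringCollection : System.Collections.CollectionBase
{
    public void Add(string item)
    {
        List.Add(item);
    }

    // The caller passes the filter behaviour in, just like the generic FindAll
    public FilterableStringCollection FindAll(StringFilter filter)
    {
        FilterableStringCollection matches = new FilterableStringCollection();
        foreach (string candidate in List)
            if (filter(candidate))
                matches.Add(candidate);
        return matches;
    }
}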

Developing without IIS: Write your own webserver, by spinning up a SimpleWorkerRequest and pointing it to a directory full of ASPX files
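A minimal sketch of that (the paths are assumptions, error handling omitted; note the assembly has to be resolvable from the target directory's bin folder for CreateApplicationHost to work):

using System;
using System.Web;
using System.Web.Hosting;

public class MiniServer : MarshalByRefObject
{
    public static void Main()
    {
        // Spin up an ASP.Net AppDomain over a directory of ASPX files...
        MiniServer server = (MiniServer)ApplicationHost.CreateApplicationHost(
            typeof(MiniServer), "/", @"C:\MySite");
        // ...and push a request through it
        server.ProcessPage("default.aspx");
    }

    public void ProcessPage(string page)
    {
        // page is relative to the virtual root; output goes to the console
        SimpleWorkerRequest request = new SimpleWorkerRequest(page, "", Console.Out);
        HttpRuntime.ProcessRequest(request);
    }
}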

DebuggerVisualisers: Open that mcee_cs.dat file in (vs)\Common7\Packages\Debugger and you'll know what to do

DebuggerProxies: Refactor the offending class. No excuses.

Strongly-typed config files: Use an XmlSerializerSectionHandler to deserialize your .config blocks directly onto settings classes. Retrieve the settings class early (eg Application_OnStart) to get FailFast (sure it's not at compile-time, but...)
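The handler itself is only a few lines - a sketch, assuming the convention of naming the settings type in a 'type' attribute on the section:

using System.Configuration;
using System.Xml;
using System.Xml.Serialization;

// Deserializes a .config section straight onto the settings class
// named by the section's 'type' attribute
public class XmlSerializerSectionHandler : IConfigurationSectionHandler
{
    public object Create(object parent, object configContext, XmlNode section)
    {
        string typeName = section.Attributes["type"].Value;
        XmlSerializer serializer = new XmlSerializer(System.Type.GetType(typeName, true));
        return serializer.Deserialize(new XmlNodeReader(section));
    }
}

// Usage: register the handler in <configSections>, then
//   MySettings settings = (MySettings)ConfigurationSettings.GetConfig("mySettings");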

Tracepoints: Just use log4net and have that diagnostic stream available all the time, even when the debugger's not attached (you can update the logging configuration on the fly, see...)

Class Designer: Sparx Systems' EnterpriseArchitect is an excellent UML tool that's not expensive and does a pretty good job (especially compared to how pants Visio is)

Object Test Bench: Write your playing-with-the-class (spiking) code as a unit test. That way, when you've finished playing, you've also written a regression test for the behaviour you're about to use elsewhere.
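For example, a throwaway NUnit 2.x spike that graduates into a regression test (the scenario is illustrative):

using NUnit.Framework;

[TestFixture]
public class DateParsingSpike
{
    // Started life as 'how does ParseExact behave?'; kept as a regression test
    [Test]
    public void ParsesCompactIsoDates()
    {
        System.DateTime parsed = System.DateTime.ParseExact(
            "20050104", "yyyyMMdd", null);
        Assert.AreEqual(2005, parsed.Year);
        Assert.AreEqual(1, parsed.Month);
    }
}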

ASP.Net

Master Pages: Either use one of the free reverse-implementations for ASP.Net 1, or use a HTML editing tool that does support templates, like DreamWeaver. While you're at it note that DreamWeaver still beats the pants off the HTML editor in Whidbey, even if the gap is closing.

In-HTML intellisense for anchors etc...: Use a half-decent HTML editor that's got broken link detection (see above)

ObjectBindingSource: It's a little known fact that any object that implements IComponent can already be used as a data binding source in ASP.Net, though it's fiddly to get it on the form unless it actually derives from Component. I think (total conjecture here) this is what CSLA does.

TwoWayDataBinding: There are existing TwoWayDataBinding implementations for ASP.Net 1 (here's another) that don't require you to subclass all your controls like an eejit

Operator Overloading (& Continue, using blocks etc...): Use C# :-)

[phew. Did I miss any?]

Friday, November 04, 2005

Avoid public constants in .Net

I always advise people to avoid / be wary of public constants in .Net (internal / private is fine), and here's a good explanation of why:

http://haacked.com/archive/2005/01/02/1796.aspx
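The short version: the compiler inlines const values into referencing assemblies at compile time, so a client built against an old value keeps using it until it's recompiled. A sketch (names are mine):

// In LibraryAssembly.dll:
public class Limits
{
    // Baked into callers at *their* compile time: redeploy the library
    // with a new value and old clients still see 10
    public const int MaxItems = 10;

    // Looked up at runtime instead, so redeploying the library is enough
    public static readonly int MaxItemsSafe = 10;
}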

Why aren't bank notes perforated?

Never got the right change? Trouble breaking a note?

Why not just tear it in half?

$20 -> 2x $10
$10 -> 2x $5

Easy as.

When we had 'real' money (that is: coins whose value was equal to the amount of gold in the coin itself) you could do this. So what's wrong with perforating bank notes so we can do the same with them?

Wednesday, November 02, 2005

ThreadStatic, CallContext and HttpContext in ASP.Net

Summary:
Even if you think you know what you're doing, it is not safe to store anything in a ThreadStatic member, CallContext or Thread Local Storage within an ASP.Net application if there is the possibility that the value might be set up prior to Page_Load (eg in an IHttpModule, or the page constructor) but accessed during or after.

[Update: Aug 2008 In view of the fairly large number of people continuing to link to this post I feel the need to clarify that this thread-swapping behaviour happens at a very specific point in the page lifecycle and not whenever-it-feels-like-it. My wording after the Jef Newsom quote was unfortunate. That aside, I've been immensely gratified (and flattered) by the number of times I've seen this post cited within design discussions around dealing appropriately with HttpContext. I'm glad people found it useful.]

There's a lot of confusion about how to implement user-specific singletons in ASP.Net - that is to say, global data that's only global to one user or request. This is not an uncommon requirement: publishing Transactions, security context or other 'global' data in one place, rather than pushing it through every method call as tramp data, can make for a cleaner (and more readable) implementation. However it's a great place to shoot yourself in the foot (or head) if you're not careful. I thought I knew what was going on, but I didn't.

The preferred option, storing your singletons in HttpContext.Current.Items, is simple and safe, but ties the singleton in question to being used within an ASP.Net application. If the singleton's down in your business objects, this isn't ideal. Even if you wrap the property-access in an if statement
if (HttpContext.Current != null)
{
    /* store in HttpContext */
}
else
{
    /* store in CallContext or ThreadStatic */
}
... then you've still got to reference System.Web from that assembly, which tends to encourage more 'webby' objects in the wrong place.

The alternatives are to use a [ThreadStatic] static member, Thread local storage (which pretty much amounts to the same thing), or CallContext.

The problems with [ThreadStatic] are well documented, but to summarize: Scott Hanselman gets it right that ThreadStatic doesn't play well with ASP.Net, but doesn't fully explain why.

Storage in CallContext alleviates some of these problems, since the context dies off at the end of the request and GC will occur eventually (though you can still leak resources until the GC happens if you're storing Disposables). Additionally CallContext is how HttpContext gets stored, so it must be ok, right? Irrespective, you'd think (as I did) that provided you cleaned up after yourself at the end of each request, everything would be fine:
"If you initialize a ThreadStatic variable at the beginning of a request, and you properly dispose of the referenced object at the end of the request, I am going to go out on a limb and claim that nothing bad will happen. You're even cool between contexts in the same AppDomain

"Now, I could be wrong on this. The clr could potentially stop a managed thread mid-stream, serialize out its stack somewhere, give it a new stack, and let it start executing. I seriously doubt it. I suppose that it is conceivable that hyperthreading makes things difficult as well, but I also doubt that."
Jef Newsom

Update: This was the misleading bit. I do explain further later on that this thread-swap can only happen between the BeginRequest and the Page_Load, but Jef's quote creates a very powerful image I failed to immediately correct. My bad.

Trouble is, that's almost what happens. Under load ASP.Net can migrate inbound requests from its IO thread pool to a queue taken up by its worker thread pool:
So at some point ASP.NET decides that there are too many I/O threads processing other requests. [...] It just takes the request and it queues it up in this internal queue object within the ASP.NET runtime. Then, after that's queued up, the I/O thread will ask for a worker thread, and then the I/O thread will be returned to its pool. [...] So ASP.NET will have that worker thread process the request. It will take it into the ASP.NET runtime, just as the I/O thread would have under low load.
Microsoft ASP.NET Threading Webcast
Now I always knew about this, but I assumed it happened early enough in the process that I didn't care. It appears however that I was wrong. We've been having a problem in our ASP.Net app where the user clicks one link just after clicking another, and our app blows up with a null reference exception for one of our singletons (I'm using CallContext not ThreadStatic for the singleton, but it turns out it doesn't matter).

I did a bit of research about how exactly ASP.Net's threading works, and got conflicting opinions-masquerading-as-fact (requests are thread-agile within a request vs requests are pinned to a thread for their lifetime) so I replicated my problem in a test application with a slow page (sleeps for a second) and a fast page. I click the link for the slow page and before the page comes back I click the link for the fast page. The results (a log4net dump of what's going on) surprised me.

What the output shows is that - for the second request - the BeginRequest events in the HttpModule pipeline and the page constructor fire on one thread, but the Page_Load fires on another. The second thread has had the HttpContext migrated from the first, but not the CallContext or the ThreadStatic's (NB: since HttpContext is itself stored in CallContext, this means ASP.Net is explicitly migrating the HttpContext across). Let's just spell this out again:
  • The thread switch occurs after the IHttpHandler has been created
  • After the page's field initializers and constructor run
  • After any BeginRequest, AuthenticateRequest or AcquireRequestState-type events that your Global.ASA / IHttpModules are using.
  • Only the HttpContext migrates to the new thread
This is a major PITA, because as far as I can see it means the only persistence option for 'ThreadStatic'esque behavior in ASP.Net is to use HttpContext. So for your business objects, either you're stuck with the if(HttpContext.Current!=null) and the System.Web reference (yuck) or you've got to come up with some kind of provider model for your static persistence, which will need setting up prior to the point that any of these singletons are accessed (see the sketch below). Double yuck.
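For what it's worth, the provider model needn't be huge. A hedged C# sketch (all the names are mine) of the kind of thing I mean:

using System.Runtime.Remoting.Messaging;
using System.Web;

// Business code talks to ContextStore.Current; only the web app's startup
// code (eg Application_OnStart) needs to know to swap in HttpContextStore
public abstract class ContextStore
{
    public static ContextStore Current = new CallContextStore();

    public abstract object Get(string key);
    public abstract void Set(string key, object value);
}

public class CallContextStore : ContextStore
{
    public override object Get(string key) { return CallContext.GetData(key); }
    public override void Set(string key, object value) { CallContext.SetData(key, value); }
}

// Lives in the web project, so the business assembly never references System.Web
public class HttpContextStore : ContextStore
{
    public override object Get(string key) { return HttpContext.Current.Items[key]; }
    public override void Set(string key, object value) { HttpContext.Current.Items[key] = value; }
}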

Please someone say it ain't so.

Appendix: That log in full:
[3748] INFO  11:10:05,239 ASP.Global_asax.Application_BeginRequest() - BEGIN /ConcurrentRequestsDemo/SlowPage.aspx
[3748] INFO 11:10:05,239 ASP.Global_asax.Application_BeginRequest() - threadid=, threadhash=, threadhash(now)=97, calldata=
[3748] INFO 11:10:05,249 ASP.SlowPage_aspx..ctor() - threadid=3748, threadhash=(cctor)97, threadhash(now)=97, calldata=3748, logicalcalldata=3748
[3748] INFO 11:10:05,349 ASP.SlowPage_aspx.Page_Load() - threadid=3748, threadhash=(cctor)97, threadhash(now)=97, calldata=3748, logicalcalldata=3748
[3748] INFO 11:10:05,349 ASP.SlowPage_aspx.Page_Load() - Slow page sleeping....

[2720] INFO 11:10:05,669 ASP.Global_asax.Application_BeginRequest() - BEGIN /ConcurrentRequestsDemo/FastPage.aspx
[2720] INFO 11:10:05,679 ASP.Global_asax.Application_BeginRequest() - threadid=, threadhash=, threadhash(now)=1835, calldata=
[2720] INFO 11:10:05,679 ASP.FastPage_aspx..ctor() - threadid=2720, threadhash=(cctor)1835, threadhash(now)=1835, calldata=2720, logicalcalldata=2720, threadstatic=2720

[3748] INFO 11:10:06,350 ASP.SlowPage_aspx.Page_Load() - Slow page waking up....
[3748] INFO 11:10:06,350 ASP.SlowPage_aspx.Page_Load() - threadid=3748, threadhash=(cctor)97, threadhash(now)=97, calldata=3748, logicalcalldata=3748
[3748] INFO 11:10:06,350 ASP.Global_asax.Application_EndRequest() - threadid=3748, threadhash=97, threadhash(now)=97, calldata=3748
[3748] INFO 11:10:06,350 ASP.Global_asax.Application_EndRequest() - END /ConcurrentRequestsDemo/SlowPage.aspx

[4748] INFO 11:10:06,791 ASP.FastPage_aspx.Page_Load() - threadid=2720, threadhash=(cctor)1835, threadhash(now)=1703, calldata=, logicalcalldata=, threadstatic=
[4748] INFO 11:10:06,791 ASP.Global_asax.Application_EndRequest() - threadid=2720, threadhash=1835, threadhash(now)=1703, calldata=
[4748] INFO 11:10:06,791 ASP.Global_asax.Application_EndRequest() - END /ConcurrentRequestsDemo/FastPage.aspx


The key bit is what happens when FastPage's Page_Load fires. The ThreadID is 4748, but the threadID I stored in HttpContext in the ctor is 2720. The hash code for the logical thread is 1703, but the one I stored in the ctor is 1835. All data I stored in the CallContext is gone (even that marked ILogicalThreadAffinative), but HttpContext is still there. As you'd expect, my ThreadStatic is gone too.
