Cup(Of T)

Friday, July 04, 2008

Using PowerShell to export the Windows Feeds list

Moved computers recently, and one of the things I realised I lost was my RSS feeds list. It was probably a blessing (I just tend to accumulate subscriptions otherwise), and maybe I should be using a reading service of some nature, but there you are.

Anyway given I'm all Mesh'd up, I though I'd copy my feeds list into my Mesh folder (like my bookmarks), so I'd have a backup and this wouldn't happen again. Only I couldn't find where the feeds list actually lives. Instead there's a whole API for dealing with it...

...which is surprisingly easy to use, and works like a treat in PowerShell (I'm always amazed at it's ability to 'just work' with things like COM objects). So I just exported the list instead:


# Dump the contents of the Windows Feeds store to an XML file

$erroractionpreference="stop";
[xml]$feedsListDocument = "<feeds/>"
$feedsList = $feedsListDocument.get_DocumentElement();
$feedManager = new-object -com "Microsoft.FeedsManager"

@"
<feeds>
$(
 $feedManager.RootFolder.Feeds | % {
  $feed = $_;
  $feedXml = $feed.Xml(-1, 0, 0, 0, 0)
  '<feed Name="{0}">{1}</feed>' -f $feed.Name,$feedXml
 }
)
</feeds>
"@

Easy as. The XML it spits out is overly large (since it includes all the article contents from the cache), but for the MB involved it barely seems worth refining it.

Update 2008-07-17: So like the very next day I realised I could have just sync'd the feed list into Outlook, and asked it to export it as OPML. But syncing into Outlook blew my tiny mailbox quota (these feeds are suprisingly large) so I ended up back doing this again anyway. Then it turned out that IE can export the feed list as OPML too (File \ Import and Export - you'd think I'd have noticed originally) - but I still like having a script because I can schedule it.

Note to self: It is definitely time to find a blog that can cope with XML a bit better

Finally: PowerShell as build language

I've never really got into MSBuild, which surprised some people given how much time in the last four years I've spend mucking about with CCNet / NAnt. It was partly that we did a bit of investigation when MSBuild came out, and saw a couple of things we didn't really like about it and decided to wait for v2 (ie Team Build in TFS 2008). Partly.

More fundamentally however the problem is that MSBuild is just too similar to NAnt, and my considered opinion after years of usage is that NAnt sucks, or to be more specific, XML is a terrible 'language' for writing executable code. Fowler puts it pretty well:

"After all until we tried it I thought XML would be a good syntax for build files"
http://www.martinfowler.com/articles/rake.html

Sure it's fine for the IDE to write that stuff out (though even then you have to look at it and wince, right), but for humans who want to customise their build process? Jesus wept. Variable scope: gone. Explicit parameters for subroutines (targets): gone. Its fine when it's simple, but once you start looping and branching and using temporary variables it's just a great big mess of angle brackets that even it's mother pities. And debugging? Now there's a story...

There's a time and a place for the angle bracket tax, and this isn't it. Square peg, round hole.

So given how amenable for DSLs PowerShell has proven to be, I've been holding my breath for some kinda PowerShell rake-alike.

And here it is: Introducing PSake

(Also Microsoft themselves are thinking about it, and canvassing for opinions about whether it's a good idea or not.)

Sadly (actually quite the opposite) I'm not actually having to deal with the build process on my current project, so I don't really have much excuse to play with it. But I dream of a future in which the TFS Team Build project wizard kicks out a PS1 file instead. It'd certainly make fixing some of it's shortcomings a whole heap easier (that's a subject for a future post)

Edit [14/7/08]: Most requested feature for MSBuild? Debugging. Obviously this'll be interpreted by the MSBuild team as a need for a debugger, but maybe they should have used a language that already had one.

Thursday, June 26, 2008

Beyond Compare 3 supports 3 way merge, is totally awesome

Beyond Compare 3 is out in beta. It supports 3 way merges!

I found this out literally minutes before I started what turned into a 2 day mergeathon between two large and divergent branches in TFS, with *lots* of merge conflicts to manually resolve, and I can honestly say I'd probably still be merging if I hadn't downloaded it. It's just fantastic.

I'll probably post some screenshots etc... soon, but if you're struggling merging with BC2 and/or the built-in diff/merge support then you really should check it out.

Monday, June 23, 2008

MSDN Downloads and the fly-out menus trauma

Raymond's just posted about the rationale behind the windows menu show delay, and goes on to point out various web properties that blatantly ignore the underlying usability requirement.

Sadly finding examples is like shooting fish in a barrel. I remember Jakob Nielsen winging about this last millennium, but as the technology moved forwards: Director, DHTML then Flash, the ease with which anyone can design their own UI and distribute it widely over the internet has lead to a flood of bad UI. Even as Vista attempts to move forwards, the new Silverlight version of MSDN Downloads re-re-implements the fly-out-menus concept, with almost unusable results.

Maybe this is a necessary pain we have to move through, but it kinda sucks that we can't explore novel and interesting UI concepts without making them totally unusable. I'm no UI designer, but at least I don't pretend to be, or work as one.

The templating within WPF is a great example of an enabling technology here, where the usability can be codified into a control by 'experts', but still delegate most of the 'funky look' to the end-designer. In this case if WPF / Silverlight had shipped with a decent fly-out menus control, maybe the MSDN Downloads team wouldn't have got it so horribly wrong, and I wouldn't have had to uninstall Silverlight in frustration.

I guess there is hope then that this isn't just another enabling technology that enables people to make a real arse of things.

PS: Check out this bizzarro comment on Raymond's blog:
"Let's not get into the "gynaecologist's interface" that is Vista's Start Menu, shall we?"
WFT?

Friday, May 30, 2008

Don't be Stupid

Years ago I was working on a project and I came up with a fantastic idea to help limit the level of regressions in the codebase I was working on. Rather than write unit tests as little throwaway test harnesses, I moved them into the codebase, and created a little app to execute them. It even did this via reflection, so as we added more tests, they got run too.

I thought I was being pretty clever.

I was being very stupid. I'd just re-invented xUnit, and didn't even know. [1]

It's a particular type of stupidity that manifests itself only in those who'd otherwise regard themselves as anything but: we get so wrapped up in our great idea that we stop to consider that someone else might have done this already. Programmers are particularly badly afflicted by this, mostly because it suits our vanity to create it ourselves.

There was already an automated testing community, that had over time evolved what worked and what didn't, the practical upshot of which - for a .Net developer at the time - was that NUnit already existed. I could have spent the time writing more tests instead. Or better still, more screens, which is what I was actually being paid for.

The last three applications I've worked on have all involved considerable custom frameworks (stupid) including a custom databinding scheme (very stupid). They were written by clever people, most of whom I respect, but they did some stupid things that less able programmers wouldn't have been able to do. Clever isn't always a complement in the agile camp, and this is why.

Of course 5 years of hindsight is a wonderful thing, and I've written my share of head-slappingly dumb code too. And it's all too easy to succumbed to the 'quick fix' fallacy when the boss is breathing down your neck : after all it's so much easier to get started writing your own framework than to learn to use someone else's.

But once you start down the dark path, forever will it haunt your destiny[2]. Which is why I make this plee to you now:

Please, before you put finger to keyboard again, consider whether what you're about to write has already been written.

Don't be stupid.

[1] To be fair to my erstwhile self, at least I was actually doing some testing, which was more than had been done before on that project
[2] Or that particular project at any rate

Tuesday, May 13, 2008

Enabling multiple RDP sessions in Vista

After many days of frigging around I realised those thegreenbutton.com Vista multiple remote desktop hacks (that you find from google) are all broken by SP1. That page' on missingremote.com that is supposed to draw all this together still hasn’t been updated with this new info.

However add SP1 to your search and you find this other thread, which works: http://thegreenbutton.com/forums/permalink/242509/255166/ShowThread.aspx#255166

Ah, the joys of subsequent-threads-with-lower-page-rank-than-the-original-now-outdated-info.

Thursday, May 08, 2008

Running ASP.Net webservices under a service account

Most of the time I run websites and webservices in an app pool that's running as Network Service. It just saves a whole truck load of time and hastle:
* no passwords to worry about
* already trusted for kerberos delegation
* can still use it to talk to a database under integrated security (you just grant access to the machinename$ account in the domain).

Hey - this is what this account was *invented* for.

However, sometimes a specific service account is a must. Reasons include:
* Needing to differentiate access rights between applications running on the same host
* Needing to authenticate back across a one-way domain trust
* Specific policy mandates

Unfortunately you can't just add any account to IIS_WPG and use it, because the ACL on windows\temp is wrong: and grants access to network service rather than to the group. Miss this one, and you'll just get serialization errors left right and center.

So I do this:


Net localgroup iis_wpg /add mydomain\myserviceaccount
cacls %systemroot%\temp /E /G IIS_WPG:C

...then when you change the identity of the app pool you won't get 'Service Unavailable'.

Sunday, March 16, 2008

Don't override Equals

A colleague had a problem the other day which turned out to be due to an overridden Equals operator. In this case it was a straightforward bug in the implementation, but after he saw my horror struck face I had to introduce him to the whole 'don't override Equals' philosophy[1]. On the pretext that you've not come across it, here's the argument in full:

You have two objects that came from different places, and need to know if they represent essentially the same data.
You can't override Equals unless you also override GetHashCode. If two objects are equal, they must have the same hashcode, or collections are screwed.
GetHashCode must return the same value for an instance throughout it's lifetime, or Hashtable's are screwed
Your object isn't readonly, so you need an immutable field in the instance to base the hashcode on.
But if you modify one instance's data to equal another, that field can't change, so the hashcodes are still different.
You're screwed

And that's without getting into the problems associated with a correct implementation of Equals in the first place (getting the reflexive, symmetric and transitive bit right). Generally speaking some kind of IsEquivilent method is a whole heap less trouble, but it depends what you're up to. You might think about loading your objects through some kind of registry, so references to the 'same' data actually end up pointing to the same instance. Then everything just works...

More reading:

UPDATE 10/04/08: Some clarifications: I'm talking about not overriding Equals/GetHashCode for reference types here. It's not such a problem for value types [as IDisposable points out in the comments]. And I've futher clarified some of my assertions about GetHashTable in the comments.

[1] PS: Like all advice, this has exceptions. But the chances are they don't apply in your case. No, really.

Thursday, January 31, 2008

Care required passing arrays to .Net methods in Powershell

In Powershell, argument lists for .Net methods are treated as arrays:

$instance.MyMethod($arg1,$arg2);

...which can be confusing if you want to pass an array as a single argument:

$instance.MyMethod($myArray);

New-Object : Cannot find an overload for "MyMethod" and the argument count: ""

Instead, force the array-argument to be contained within a single-member array:

# Note the extra comma below
$instance.MyMethod(,$myArray);

Makes sense when you think about it, but definitely a gotcha.

[In my case, I was caught out with the byte[] overload constructor for a MemoryStream]

Wednesday, January 30, 2008

Blobs out with SQL 2008

Recently I re-visited the blobs in/blobs out argument with a colleague. You know the one, one of you says blobs shouldn't be stored in database (principally because the last time he tried it 'blobs in' in VB 6 access to the blob data was a pain in the arse), then the other one says no they should be in the database (because the last time they tried it 'blobs out' all the files got mixed up / out of sync / weren't backed up). Etc...

Anyway, not only has Paul Randal posted a good summary of the pros and cons, but he did so as an intro to a new SQL 2008 data type 'FileStream' that attempts to bridge the two approaches (the 'have your cake and eat it' approach).

I'm cautious. Transactions at the filesystem level are a real mess (as some of the OneNote blogs make clear, especially with non-MS implementations of SMB like SAMBA). Your database backup is presumably still huge and unwieldy (or missing the blob data, which is worse?).

The main advantage of this approach seems to be that SQL can access the blob data faster through NTFS than via it's own internal MDF formats. But you've apparently still got to go via SQL to get the data, you can't (for example) just serve up images-stored-as-blobs directly via IIS. Or maybe I've missed something. Either way, the upside all seems to be focused on blob streaming performance, which may or may not be the most relevant factor for your app.

So it's possible that next year's arguments will be blobs in vs blobs out vs filestream, and still no one-size-fits all. Ah well.

Thursday, January 03, 2008

Path already mapped in workspace error with CCNet and TFS

Had a problem with CCNet that kept me here till midnight where try as I might, I just couldn't get a build to not fail with the dreaded "Path ... is already mapped in workspace ..." error:

Microsoft.TeamFoundation.VersionControl.Client.MappingConflictException: The path C:\Builds\etc\Working is already mapped in workspace someworkspace

We use a different workspace for every CCNet project to avoid collissions, and to maintain uniqueness we keep the workspace name the same as the CCNet project name. I couldn't find the workspace in question, and was pretty sure I'd already deleted it. In fact I'd used TF Sidekicks to delete all the build user's workspaces, and it still didn't work. So what was up?

Fortunately in a post 'How to handle "The path X is already mapped in workspace Y"' I learnt of the mappings cache file on the client PC, in the user's Local Settings\Application Data\Microsoft\Team Foundation\1.0\Cache\VersionControl.config file. Just nuking workspaces on the server isn't enough!

So to be sure I blew away the build server's local profile entirely, and that finally fixed it.

Wednesday, November 07, 2007

The new starter experience

I normally avoid 'link-posts', but again Hacknot is right on the money with 'If They Come, How Will they Build It?', an eminently familiar analysis of the plight of a new developer on a project with an oral documentation culture.

In fact I'd go slightly further than Hacknot, and state that the initial experience of a new developer on the project is one of the most important things to get right. First impressions do matter, and if your first impression of a project is the frustration of:

Not having a login
Not having internet access
Not being able to get latest
Not being able to build
Not being able to locate any documentation
Not having clear lines of escalation
Not having clear rules of engagement
Not knowing what's expected of you
Not having a mentor

...you're going to be hard pushed not to be prejudicing your opinions of the professionalism of the rest of the project. You'll start disheartened, but on the other end of this unfortunate indoctrination you're going to be just like them. You won't regard the absence of the list above as anything other than normal. You'll accept that that's just not how things are done around here. You will love big brother.

[ahem. got carried away there]

I regard the absence of guides and documentation as more than a major time-waster: it's a self-perpetuating morale hole for all future team members to climb into and die.

New staff play a vital part in ensuring a project's approach doesn't atrophy. If you waste their 'fresh' time frustrating them with missing documentation and runaround, you won't get the benefit of seeing things from their eyes. They'll have clammed up and learnt to live with how it is, and by the time you ask them they'll have forgotten that they used to care.

FixBrokenWindows - they're not all in your code

Tuesday, October 30, 2007

Great Powershell One-liners

Or, 'highly useful things you can do in Powershell':

Watching the last 20 lines of the last (log?) file in a directory:

(gci)[-1] | gc | select-object -last 20

Finding where .Net is installed:

(get-itemproperty HKLM:\software\microsoft\.netframework).InstallRoot

Pinging a URL [1]:

$temp = (new-object net.webclient).DownloadString($url)
if ($?) { write-host Passed -foregroundcolor "green" }

Getting the DNS suffix for a machine:

(Get-WmiObject -class "Win32_ComputerSystem").Domain

[1] Ok, that one's not a one-liner.

Why Powershell Rocks

I try and remain reasonably sanguine / sceptical / cynical about most new technologies, but there's two about at the moment that I just can't find anything to fault with: Vista's Media Center, and Windows Powershell.

So (apart from the fawning) what's so good about Powershell?

The key point is that Powershell is a shell that thinks it's a scripting language. Or is it the other way round? Well it's both anyway. So things you can do in a BAT file, you can do in Powershell:

xcopy /I /F /Y Somefile.abc ..\SomeFolder

[In many cases, like the above, you can just cut and paste the same line into Powershell and it will work]

Ever tried doing that in VBScript/JScript? Either you wrote lines of Scripting.FileSystemObject code, or you shelled out (in which case you've got the line above plus the 'shelling out' code).

But BAT files can only take you so far. Further if you're stubborn or clever, but even then there's a wall. Creating files with todays date in the name is a real swine. Setting IIS properties involves an external VBScript. HTTP-pinging a URL and checking it's up... no chance.

For these you have to write a VBScript or an (.Net?) EXE, where you can take advantage a host of supporting libraries and benefit from 'real' programmer concepts like variable scope, functions and conditionals. But integrating them with your BAT files (ie retting return values back) is quite a challenge. And you end up with your foot in both camps

Powershell does both.

I'll give you an example of the kind of hybrid approach this engenders: setting ACLs for the anonymous user on a website. Now you can assume that it's IUSR_%computername%, but in some scenarios[1] (renamed for security, ghosted image / machine renamed) it's not. So you've got to look up the user first, which is pretty tricky in a BAT file (even with adsutil.vbs), but then set some file permissions, which is nigh-on impossible in VBScript. In PowerShell this is easy:

$iisobj = [wmi]"root/MicrosoftIIsV2:IIsWebVirtualDirSetting.Name='W3SVC/1/Root/MyWebSite'"
$userName=$iisobj.AnonymousUserName
$path=$iisobj.Path

CACLS $path /E /G $userName:R

Note how I did something very 'script' - dealing with objects and properties, and followed it up with something very 'batch' - just calling a shell command (CACLS). And it just worked.

Now in my real script I didn't use CACLS, I used the .Net System.Security.AccessControl classes, because I'm a .Net developer and that's what I thought of first. I wrapped them up into a neat reusable function, so the code above actually didn't look so different. But I ended up writing 20+ lines of code, where CACLS would have done the job just as well. I'm learning too.

There's lots of other great things about Powershell too:

Some great syntax improvements, like range operators, and reverse-indexing arrays
Fantastic support for script parameters, and default values
Here-strings

...but they're all secondary to this one, which is that Powershell is all you need[2]. That's pretty compelling to me.

[1] I.E. My scenario.
[2] Ok, unless you're doing something really wacky/Win32 [3] where the .Net BCL doesn't have the support yet.
[3] [4] Is there a difference?
[4] Recursing footnotes? Can I get away with that?

Tuesday, October 16, 2007

IE Advanced Security Configuration affects Powershell execution policy

Like a few other MS apps, but hardly obvious to us mortals, Powershell determines whether something is 'local' based on the Internet Explorer Security Zones. This means that (by default) scripts run from a UNC are considered part of the Local Intranet zone, and so can run unsigned under the RemoteSigned execution policy:

However, if your Win 2003 server is running the Internet Explorer Enhanced Security Configuration, UNC paths are not included in the zone unless explicitly added:

So your scripts will refuse to run unless you reduce the execution policy down to 'unrestricted', leaving you with lots of nasty messages:

Run only scripts that you trust. While scripts from the Internet can be useful, this script can potentially harm your computer. Do you want to run etc... [D] Do not run [R] Run once [S] Suspend [?] Help (default is "D"):

Yuck (and very frustrating till you work it out).

If IE Enhanced Security Configuration is mandated, the only way round this is to explicitly add the UNCs where your scripts are located into the Local Intranet zone (nb: don't add a http: prefix!)

You can do this manually, or you can just import a set of registry keys if you need to do a whole load in one go (or the same across multiple servers). If you do it through the UI the settings are user-scoped, if you do it via the registry you can add settings either for the current user, or at a machine-wide level. See Adding Sites to the Enhanced Security Configuration Zones in MSDN and Description of Internet Explorer security zones registry entries in the knowlege base for more details.

Monday, July 16, 2007

Log4net in Asp.Net redux: Implement IFixingRequired on your Active Property Values

A while back I noted that the built-in contexts in log4net were broken in the face of ASP.Net's thread agility (and pointed this out to the log4net community).

The workaround I suggested, which was also suggested on the log4net developers list, was to leverage log4net's support for deferred evaluation of logging properties. These are known as Active Property Values, but that's just a fancy/short way of saying 'any object you like, who's ToString() method results in the value that you actually want logged'. This was a pretty neat workaround, eg:

      log4net.GlobalContext.Properties["requestUrl"] = new HttpContextRequestUrlProvider();

    private class HttpContextRequestUrlProvider
    {
        public override string ToString()
        {
            HttpContext context = HttpContext.Current;
            if (context == null) return null;
            return context.Request.RawUrl;
        }
    }

However in one of my logging databases, I recently noticed logging entries attributed to me that had clearly come from someone else, which made me worry for a few hours whether this pattern was broken, or broken when using an appender that buffers (like the AdoNetAppender).

Digging through the source code showed that BufferingAppenderSkeleton does indeed attempt to 'fix' logging entries when they go into the buffer, to guard against exactly these kind of multithreading logging mishaps:

 // Because we are caching the LoggingEvent beyond the
// lifetime of the Append() method we must fix any
// volatile data in the event.
loggingEvent.Fix = this.Fix;

This causes all kinds of intrinsic log4net values (like thread ID, UserName), and the message itself to be 'fixed': i.e. fully evaluated (and rendered via the layout) now, rather than later. Otherwise all the logging events in the buffer would end up being written out with the values in-play at the time the appender flushed (ie potentially another thread/users's context).

This fixing also 'fixes' properties from the log4net contexts (ThreadContext, LogicalThreadContext and GlobalContext) if they implement IFixingRequired. And that's what I missed - one obscure interface:

 // Fix any IFixingRequired objects
IFixingRequired fixingRequired = val as IFixingRequired;
if (fixingRequired != null)
{
 val = fixingRequired.GetFixedObject();
}

But I've missed that for the last couple of years. So… actually I'd rather not think about the implications of that. Meanwhile, if anyone who does any log4net dev would actually like to update the Active Property Value doco, that'd be good, thanks.

Better still, 18 months later, can we have our Adaptive Context now please?

Sunday, June 03, 2007

Production Debugging for Hung ASP.Net 2 applications – a crash course

If your app is down right now skip the intro – I’ll understand.

Introduction

First up: debugging – even in the development environment – should never be a substitute to detailed diagnostic logging and appropriate – meaningful – error handling. Any time I fix an exception, if the problem is initially non-obvious, I first fix the logging code. Then and only then do I fix the exception.

Baking in a support mentality early in the development process is the only way to avoid an application that’s hard (and by that I mean expensive) to support. The customer might be paying for the support time, but if you can’t turnaround support issues quickly – and practically any issue in a production business application needs turning around quickly – they start questioning your competence.

But sometimes logs aren’t enough, and even with the best will in the world, sometimes they’re not sufficiently detailed (they won’t - for example – be much help tracking down that memory leak). Sometimes you need to know what’s thrashing right now; to see the smoking gun.

Enter the world of crash dump debugging.

I’d been following Tess’s blog for a while now, but still had a big hole in my understanding. Sure I now knew I could use the !clrstack command to start looking at managed thread stacks, but where the hell was I typing that? This then is my introduction to Production Debugging for .Net applications (or ‘how to understand what Tess is talking about’).

I’m going to be talking about debugging hung .Net 2 ASP.Net applications, but much of this is relevant to debugging any .Net application in a production or testing environment.

Getting the Dump

Assuming you don’t have time to read a more detailed article on this (and the Microsoft guide is 170+ pages!), just do this:

WARNING: Installing the debugging tools on your production server isn’t recommended. That’s not to say you have any choice in the matter, but you should be aware there are security and availability implications that – if your app is hung – you probably don’t care about right now.

Install the Microsoft Debugging Tools for Windows and IISState on the machine where the hang occurs. You’ll need the Debugging Tools on your workstation too, but you can do this later if it’s happening right now.

If your app is thrashing CPU, use TaskManager to get the process ID (PID) of the W3WP process that’s using the most CPU (there may be more than one). NB: PIDs aren’t displayed by default: use View\Select Columns to turn them on.

Otherwise, at the command prompt, enter cscript iisapp.vbs to determine the process ID that matches your app pool.

At the command prompt CD to wherever you installed IISState (C:\IISState is the default). If there is existing content in the output folder, delete or rename the folder. Then execute the following:

IISState –p  <PID> -d

(…replacing <PID> with the PID you determined above)

WARNING: At this point your app will become unresponsive until the dump completes

IISState will put a dump file (.dmp) and a text summary (.log) of the process’s threads etc… in the output directory.

Rename the output folder to something that reflects the date/time, so future dumps don’t overwrite it.

Since we’re doing a ‘hang’, ideally at this point you’d wait a few mins, and then take another dump. If you don’t, you may regret it later and / or have to wait for the issue to re-arise before you can definitively pin it down. However, if the world and his dog want the site back up right now then skip it – it’s your call.

Kill that process. IIS will recycle it, and things should go back to normal for now. Panic over.

Analyzing the Dump

Copy the renamed output directory down onto your machine. For
the process ID you dumped there will also be an IISState summary file, eg IISState-5090.log. This log is a dump of the stack trace of all the threads in the process, and attempts to perform some additional diagnostics for you. Unfortunately, it doesn’t (at present) support .Net 2, so most likely it’s not going to tell you what the problem is. You should at least be able to see a few threads that look something like this:

Thread ID: 13
System Thread ID: d10
Kernel Time: 0:0:4.186
User Time: 0:20:25.702
Thread Type: Managed Thread. Possible ASP.Net page or other .Net worker
No valid SOS data table found.

Begin System Thread Information

# ChildEBP RetAddr  
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 01a9e76c 06ee91cd 0x6eea301
01 01a9eb6c 06cc6e91 0x6ee91cd
02 01a9ee44 06cc6278 0x6cc6e91
03 01a9ee80 06cc3ddd 0x6cc6278
04 01a9eeb8 06cc381c 0x6cc3ddd
05 01a9ef3c 0736fa0f 0x6cc381c
06 01a9ef84 0736f65f 0x736fa0f
07 01a9efec 0725853d 0x736f65f
08 01a9f050 79e9272b 0x725853d
09 01a9f060 79e926ad mscorwks!DllRegisterServerInternal+0x1d6b
0a 01a9f0e0 79e9450b mscorwks!DllRegisterServerInternal+0x1ced
0b 01a9f220 79e943f3 mscorwks!DllRegisterServerInternal+0x3b4b
0c 01a9f37c 07258408 mscorwks!DllRegisterServerInternal+0x3a33
0d 05b30b78 00000008 0x7258408

What does this mean? Well IIState doesn’t understand what’s going on with the stack below that mscorwks dll, which screams .Net 2 stack to you (as does the comment against Thread Type). But the likelihood is there’s not much else we’re going to get out of that file[1]. So it’s time to look at the mini dump.

Open a Visual Studio 2005 Command Prompt window, and from there launch C:\Program Files\Debugging Tools for Windows\windbg.exe (this ensures the .Net 2 framework is in the PATH for the WinDBG process, and makes life simpler –if you don’t have VS2005 installed do it manually). Do File \ Open Crash Dump and brose to the dmp file you created with IISState. If it asks you about saving workspace information just hit no.

It’ll open a text window with some summary information about the dump. This window has a little command bar at the bottom. This is where you do stuff.

The first thing it probably says is “Symbol search path is: *** Invalid ***”. So we type .symfix into the command bar to set it up to dynamically load symbols from Microsoft as required. You can do something to your %path% so you don’t have to do this each time, but I’ve not got round to that yet.

It probably also says

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(13cc.84): Wake debugger - code 80000007 (first/second chance not available)

…but this is exactly what it says it is – it’s the console debugger attaching (how IISState got the dump), and not a ‘real’ exception per-se.

You can type ~ to get a list of the threads, or ~ *kb to get them with a bit of stack trace too, but we’re still only looking at unmanaged threads here, so pretty meaningless to me.

Type .load SOS to load the managed code debugging extension SOS.dll, then type !threads to see all the managed threads. Now that’s more like it.

NB: If you’re debugging on a different platform to the original crash (eg server was Win2003, you’re on XP), then you’re going to get an error at this point “Failed to load data access DLL, 0x80004005”. The error message contains the fix: execute .cordll -ve -u -l and the correct version of mscordacwks.dll for the dump will automatically be loaded, eg:

0:000> .cordll -ve -u -l
CLRDLL: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscordacwks.dll:2.0.50727.42
f:0 doesn't match desired version 2.0.50727.91 f:0
CLRDLL: Loaded DLL C:\Program Files\Debugging Tools for Windows\sym\mscordacwks_x86_x86_2.0.50727.91.dll\442337B7562000
  \mscordacwks_x86_x86_2.0.50727.91.dll
CLR DLL status: Loaded DLL C:\Program Files\Debugging Tools for 
  Windows\sym\mscordacwks_x86_x86_2.0.50727.91.dll\442337B7562000\mscordacwks_x86_x86_2.0.50727.91.dll

Your !threads output might look like this:

ThreadCount: 9
UnstartedThread: 0
BackgroundThread: 9
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
      ID OSID Thread   OBJ    State   GC       Context           Domain   Count APT Exception
      12    1  d10 000dce98   1808220 Disabled 00000000:00000000 000d4008     2 Ukn (Threadpool Worker)
      14    2  9f4 000ede10      b220 Enabled  00000000:00000000 000daa20     0 MTA (Finalizer)
      15    3  6d0 00103ad8    80a220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Completion Port)
      16    4 14b0 001074d0      1220 Enabled  00000000:00000000 000daa20     0 Ukn
      10    5  4f8 0012e408   880a220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Completion Port)
      18    9 1184 06a32e20    800220 Enabled  00000000:00000000 000daa20     0 Ukn (Threadpool Completion Port)
      26    6 15a0 06ada200   180b220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Worker)
      28    b   e0 06a5bd40   880b220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Completion Port)
      29    c  fd0 06a3c368   180b220 Enabled  00000000:00000000 000daa20     0 MTA (Threadpool Worker)

Notice how the last column tells you what type of thread it is. You can get the managed stack trace for each one of these threads by selecting which thread you’re interested in from the Processes and Threads window (up on the toolbar) and typing !clrstack, but I find it more useful to just look at all of them straight off: ~*e !clrstack

You’ll see a lot of this:

OS Thread Id: 0x84 (0)
Unable to walk the managed stack. The current thread is likely not a managed thread. 
You can run !threads to get a list of managed threads in the process

...because what you're doing is looking at the managed stack trace for all threads (and many of them won't be .Net threads remember), but eventually you’ll see some managed stack traces pop up.

In my case it was a bit of an open-and-shut case, because there was only one .Net thread actually doing anything in the process – kinda a giveaway:

OS Thread Id: 0xd10 (12)
ESP       EIP     
[… trimmed 3^rd party vendor stack trace…]
01a9eec4 06cc381c Aspose.Pdf.Pdf.Save(System.IO.Stream)
01a9eed4 069d4a05 MyApp.Server.Utils.WordToPdf.AsposeImpl.Generate(System.IO.MemoryStream)
01a9ef44 0736fa0f MyApp.Server.Document.Implementations.AbstractWordToPdfDocumentGenerationImpl.Generate(...)
01a9ef8c 0736f65f MyApp.Server.Document.GenerationManager.Generate(...)
01a9eff4 0725853d MyApp.Server.ServiceImplementation.DocumentService.GenerateDocumentById(GenerateDocumentByIdRequest)
[… removed ASP.Net framework stack trace …]
01a9f710 65fbe244 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
01a9f744 65fbde92 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
01a9f750 65fbc567 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
01a9f900 79f2d05b [ContextTransitionFrame: 01a9f900] 
01a9f950 79f2d05b [GCFrame: 01a9f950] 
01a9faa8 79f2d05b [ComMethodFrame: 01a9faa8]

[NB: I've had to edit that for brevity and wrapping]

Ok, so we’re dying in a 3^rd party component. But we’re still going to need to replicate this somehow because (presumably) it doesn’t do this all the time. Fortunately in this case all we need to know is the arguments supplied to our GenerateDocumentById() method to attempt to reproduce in dev. They’re in the dump too, so let’s see what they were.

Select that thread (12) using ~12s (or the Processes and Threads window, and no, I don’t know what the ‘s’ suffix means) and use !clrstack –p to view the stack with parameters. It’s a big list, but in there somewhere is the bit I’m interested in:

0:012> !clrstack –p
[… blah blah …]
01a9eff4 0725853d MyApp.Server.ServiceImplementation.DocumentService.GenerateDocumentById(GenerateDocumentByIdRequest)
     PARAMETERS:
          this = 0x02e596b0
          generateDocumentByIdRequest = 0x02e595b4

!dumpobject (or !do) then tells me the contents of that generateDocumentByIdRequest instance given it’s address as shown above (0x02e595b4):

0:012> !do 0x02e595b4
Name: MyApp.Server.ServiceContracts.GenerateDocumentByIdRequest
MethodTable: 05dcb5e0
EEClass: 06f935ec
Size: 12(0xc) bytes
   (C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\blah blah blah\MyApp.Server.ServiceContracts.DLL)
Fields:
MT        Field    Offset    Type    VT Attr      Value   Name
790fedc4  40001c4  4         System.Int32 0  instance  8229991 documentId

The bit I care about on this is the documentId field itself, and being a value-type field (Int32) I can see the value right there: 8229991 (for a reference type like System.String I'd get another address to the instance itself, ie have to perform another !do).

And in this case, that was that. Well, it turned out to affect more than just this document ID, but at least I had enough to reproduce in the dev environment. And just in case anyone read the stack trace carefully, by the time we got to this stage Aspose already had a fix out for this issue, so I can’t really knock them. Finally whilst I was at it, I had a look at the objects in memory and began to wonder whether I’ve got a memory leak, so there’s probably a follow up to come.

Other SOS commands to look at:

!threadpool (also this is where the CPU utilization hides)
!eeheap –gc (see memory allocations that are subject to the GC)
!dumpheap –stat (see object allocations by type)
!help (gives help on the last extension loaded, so SOS commands in our case)

Other Resources:

Installing and Configuring IISState

SOS: It’s not just an ABBA Song Anymore: A good introductory article on MSDN with a dreadful title

If Broken It Is… Fix it You Should [Tess’s Blog]: This will all make a lot more sense now

Production Debugging for .NET Framework Applications: This 170 page document from Microsoft’s Patterns and Practices team is allegedly out of date, but is an invaluable reference, and what I based most of this article on

Debugging Microsoft® .NET 2.0 Applications: I haven’t read this MSPress book, but the v1 version got quite a lot of citations when I was researching this stuff

[1] So why did we take it? Because if it turns out it’s an IIS issue, and not your code, then someone will be able to get something out of it. Just not you.

Saturday, May 19, 2007

Nesting Transactions in Stored Procedures (Part 1)

Nesting transactional stored procedures is an absolute minefield.

Many projects I've worked on have just avoided opening transactions within stored procs altogether, and rely entirely on ADO / ADO.Net transactions in the application. This is certainally a simple way to avoid the problem, but isn't always appropriate. At some point there's always a data integrity rule that you want enforced at the lowest possible level (to avoid it being accidentally bypassed - I'm thinking auditing, or the classic 'move money between accounts' scenario). Putting that rule in a stored proc, and making that the only point of call for writes to the table involved makes it pretty safe. This is even more important if it's not just your app using the database.

But nevertheless avoidance is appropriate in many cases, if only because handling the nesting of transactions within nested stored proc calls is a complex matter. To summarize the problems:

Rollback and commit are asymmetric in nested transaction scenarios

Rollback aborts the whole nested stack, whereas commit just decreases the nesting level (only the final commit actually commits anything). So error handling has to behave differently in the proc at the 'root' of the transaction than it does in nested procs. Ulp!

SQL error handling is hard, and poorly-understood

Even basic TSQL error handling is poorly understood: having an exception raised in the calling application but the rest of the stored proc continue executing is pretty counter-intuitive, right? Whadyamean the transaction committed anyway!?!

The fact is that most SQL errors don't affect batch completion without explicit handling, some abort the batch but not the transaction, and some abort both. Furthermore, XACT_ABORT changes which do which. Erk!

Fortunately we don't have to come up with a solution, because there are already well-defined patterns for handling both these issues. There's an excellent article on the subject on Code magazine that's been the basis for my approach for a few years now:

Handling SQL Server Errors in Nested Procedures by Ron Talmage (I use the multi-level model)

Unfortunately barely any developers appear to be aware of the problem, let alone the solution patterns. This is particularly problematic, because either of Ron's solutions must be applied consistently in order to work properly. If that's an issue, maybe you're best going back to the avoidance pattern.

Anyway, the big question for me is "how has this pattern been affected by TRY ... CATCH functionality in Sql Server 2005?". We'll look at that next time...

Thursday, April 19, 2007

How to Disable Kerberos on Windows 2003 using NTAuthenticationProviders

If your IIS website / service

Is running as an account other than NetworkService
or isn't being accessed via the server's AD name (eg through an alternate URL, or load balanced alias)

...then Kerberos authentication will fail, because the client doesn't know who the server is to start the ticket exchange process.

In both cases getting the right SPN added into AD will fix things, but sometimes this can be problematic. However you can explicitly downgrade the virtual directory to only use NTLM authentication by setting the NTAuthenticationProviders property. There's no UI for this, so you have to set it on your IIsWebVirtualDir in the metabase, eg (for IIS 6 \ Windows 2003):

cscript C:\inetpub\adminscripts\adsutil.vbs //nologo SET W3SVC/1/Root/MyApp/NTAuthenticationProviders NTLM

Then everything (?) works again. Hooray.

This is discussed at the very bottom of this article: http://support.microsoft.com/kb/215383
NB: For Windows 2000 you can only set this at a site-level, not an application level, as the article outlines.

Working Directory Independence for Batch Files

%~p0\

(or: Many times your batch file will want to access resources in the same folder as the batch file. This can be tricky if the user calls the batch file from another folder, since the working directory is not the directory the batch file is in. Rather that resorting to pushd / popd everywhere, one can use the extended command line parameter handling to convert the full path to the batch file into a relative path to it’s container. %0 is the path the batch was called with, so %~p0 is the path to that location)