Playing MythBusters with Sitecore setup suggestions

Recently a colleague of mine told me about a suggestion they’d been given about setting up an instance of Sitecore. They were told that you should put your license file into a subfolder of your data directory because the license check enumerates files and folders in the directory containing the file. So if the folder contained other things, this would slow down the check. This sounded odd to me as you have to specify the exact path of the license in your config, so I thought I’d do some investigating, and see if I could prove or disprove the suggestion.

So, putting on my best beret at a jaunty “for science!” angle, here’s what I discovered:

How do you test a theory like this?

My plan was that I needed to record the startup and request times for some variations of how the data folder was set up. Having thought about what they might be, I came up with three variations that needed testing:

  • Whether the data folder is a sibling of the site’s website folder, or a child of it.
  • Whether there are lots of files and folders in the data directory or not.
  • Whether the license file is in the data folder, or in a subfolder of it.

To confirm the theory, timings should be visibly slower if there are lots of files present and the license is in the same folder as them. If the extra files are missing, or the license is in a subfolder, it should run at “normal” speed. And my gut says that whether the license is in /data/ or /website/data should make no difference.

So three variations with two states each gives us eight possible tests to run. It’s a bit of a pain to set them up and run them manually, so what’s needed is some script to automate it all. I spent a bit of time hacking up some quick and dirty PowerShell to set up these states on an instance of Sitecore v8.2 I happened to have to hand.
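To make the “eight tests” arithmetic concrete, here’s a rough Python sketch of enumerating the combinations. It is not the PowerShell I actually ran, and the three flag names are made up for illustration.

```python
from itertools import product

# Hypothetical names for the three yes/no variations described above.
# The junk flag is last so that it flips on every consecutive test.
VARIATIONS = ("data_inside_website", "license_in_subfolder", "junk_in_data_folder")


def build_scenarios():
    """Return one dict per combination of the three two-state variations."""
    return [
        dict(zip(VARIATIONS, states))
        for states in product((False, True), repeat=len(VARIATIONS))
    ]


if __name__ == "__main__":
    for number, scenario in enumerate(build_scenarios(), start=1):
        print(number, scenario)
```

With this ordering the even numbered scenarios are the ones with junk in the data folder, and the odd numbered ones are tidy.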

Testing timings of events usually benefits from performing the event multiple times and averaging the result, so I set up the script to take each scenario in turn, and work out an average for the “first request” time, and an average for the “subsequent request” time. I chose to make it alternate between “junk in the data folder” and “tidy data folder” even though that would make the tests slower overall – but my gut said I should see a more obvious pattern in the data if it worked that way.
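The measuring loop is simple enough to sketch. This is an illustrative Python version rather than my PowerShell, and `restart` and `request` are hypothetical callables standing in for “recycle the site” and “fetch a page”; how you do those will depend on your own setup.

```python
import time
from statistics import mean


def time_scenario(restart, request, runs=25):
    """Average the first-request and subsequent-request times over `runs` repeats.

    `restart` and `request` are assumed, caller-supplied callables.
    """
    first_times, later_times = [], []
    for _ in range(runs):
        restart()  # force the next request to pay the startup cost

        start = time.perf_counter()
        request()  # first request after a restart
        first_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        request()  # subsequent request against the warm site
        later_times.append(time.perf_counter() - start)

    return mean(first_times), mean(later_times)
```

Averaging over repeats won’t remove an outlier entirely, but it does stop a single slow request from dominating a result.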

Finally I made the script write the times and averages out to a CSV file so I could make graphs in Excel easily.
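For completeness, writing results out in a form Excel will open is only a few lines. Again this is a Python sketch, and the column names are my own invention for the example rather than the ones in the real output.

```python
import csv

# Hypothetical column names, chosen for this example only.
FIELDNAMES = ["test", "first_request_avg_s", "subsequent_request_avg_s"]


def write_results(path, rows):
    """Write one CSV row per test; each row is a dict keyed by FIELDNAMES."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```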

The very hacky script’s source is available to download if you want. Note that I’ve not made any attempt to make paths and instance names configurable, or make the code pretty. This was written just to run my tests, not to work on your machine.

So what results do I get?

I set the script up to use:

  • 1500 random files and 1500 random folders created to make a “worst case” data folder.
  • 25 requests averaged for each test result
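Building that “worst case” folder, and tidying it up again between tests, could look something like the Python sketch below. The `junk-` naming and the helper names are assumptions for the example; my PowerShell did the equivalent in its own way.

```python
import uuid
from pathlib import Path


def add_junk(data_folder, file_count=1500, folder_count=1500):
    """Create randomly named files and folders, returning the paths created."""
    root = Path(data_folder)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for _ in range(file_count):
        path = root / f"junk-{uuid.uuid4().hex}.tmp"
        path.write_text("junk")
        created.append(path)
    for _ in range(folder_count):
        path = root / f"junk-{uuid.uuid4().hex}"
        path.mkdir()
        created.append(path)
    return created


def remove_junk(created):
    """Delete only what add_junk created, leaving the real data files alone."""
    for path in created:
        if path.is_dir():
            path.rmdir()
        else:
            path.unlink()
```

Tracking the created paths means the “tidy” state can be restored without touching anything Sitecore itself put in the folder.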

I ran the script twice, and pulled the results together into a single graph:

(The startup times are on the left axis, and the request times are on the right hand one – though the absolute times probably aren’t that meaningful, just the differences between them)

Immediately, that data doesn’t seem to back up the original assertion.

The even numbered tests (marked with the red arrows) are the ones with the junk data added to the data folder. But they don’t seem to show a predictable increase in time compared to the other tests. Mostly the lines are pretty stable, and requests with and without the junk data are roughly similar. That suggests that the various test settings don’t make any noticeable difference to the speed of site startup or requests.

There is an obvious glitch in the timings for the second run of the startup time request for test #6, but given that is one odd result out of all of them, my gut feel is that this is probably just something going on in the background of my machine messing with the timings. (I did try rerunning the tests later. I get at least one glitchy result on a fairly regular basis, but I never see a predictable “even numbered tests all take longer” result.)

The raw data is available to download, if you feel the need to examine it.

Confirming things

So based on those results I’m pretty confident that the contents of your data folder don’t make any real difference to license check times. But to be sure, I tried working out what happens in the Sitecore code while it’s checking licenses. This isn’t easy – licensing is part of the “Sitecore.Nexus” library, which is obfuscated. Hence reading the logic of it is pretty tricky.

But there are a few things I can see here which make me pretty convinced that it’s not doing any enumeration. The first is that the code directly references the license file config setting, and checks that the specific file exists:

And the second is, that if I follow the call tree, the value of the license file property gets passed down to a load call for an XmlDocument:

So the specific file you describe in config appears to be verified and loaded.

And the final thing is that I can’t see any construct that looks like it’s enumerating a set of files here. While the obfuscation makes life tough, nothing sticks out as being the equivalent of a “for each file in the folder” type call.
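To illustrate the difference I was looking for, here’s a small Python sketch of the two patterns. To be clear, this is not Sitecore code and not a reconstruction of what Nexus does; the function names and the `license.xml` default are invented for the example.

```python
import os
import xml.etree.ElementTree as ET


def load_license_direct(license_path):
    """Check one exact path and load it; sibling files are never looked at."""
    if not os.path.isfile(license_path):
        raise FileNotFoundError(license_path)
    return ET.parse(license_path).getroot()


def find_license_by_enumeration(folder, file_name="license.xml"):
    """Walk every entry in the folder until the file turns up.

    This is the pattern whose cost would grow with the folder's contents.
    """
    for entry in os.scandir(folder):
        if entry.is_file() and entry.name == file_name:
            return ET.parse(entry.path).getroot()
    raise FileNotFoundError(file_name)
```

What I could see in the decompiled code looks like the first shape: a check on a specific path followed by a load, with no loop over a directory.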

In conclusion…

I call “busted”. From what I’ve seen, you can put your license where you want – processing times do not seem to be meaningfully affected.
