What went wrong with my media?

Edited to add: With help from Sitecore Support, there is now an explanation for what was going on here.


If you spend your life working with software, you can’t help but collect a few stories of issues that defied your understanding – and I came across a great example with Sitecore recently. I haven’t managed to decide if this is an issue that can happen to others, or whether it was completely specific to the setup of this particular site. But since I got few useful results from Google when I was trying to solve this, I figure it’s worth writing about it just in case someone else sees a similar problem in the future…

The issue:

A developer on my team (and a bit later the client themselves) reported issues with media items on a particular v8.0 site. The developer noticed that there were some PDFs available for download which caused a 500 error to be thrown. The client noticed some images on the site had become broken. Under the surface these were both manifestations of the same problem. Requests for these specific media items failed with an exception, but requests for other media items worked fine. The site’s 500-page handling hid the details, but in the logs you would see this:

5496 19:20:22 ERROR Application error.
Exception: System.Web.HttpParseException
Message: The code block is missing a closing "}" character.  Make sure you have a matching "}" character for all the "{" characters within this block, and that none of the "}" characters are being interpreted as markup.

Source: System.Web.WebPages.Razor
   at System.Web.WebPages.Razor.RazorBuildProvider.EnsureGeneratedCode()
   at System.Web.WebPages.Razor.RazorBuildProvider.get_CodeCompilerType()
   at System.Web.Compilation.BuildProvider.GetCompilerTypeFromBuildProvider(BuildProvider buildProvider)
   at System.Web.Compilation.BuildProvidersCompiler.ProcessBuildProviders()
   at System.Web.Compilation.BuildProvidersCompiler.PerformBuild()
   at System.Web.Compilation.BuildManager.CompileWebFile(VirtualPath virtualPath)
   at System.Web.Compilation.BuildManager.GetVPathBuildResultInternal(VirtualPath virtualPath, Boolean noBuild, Boolean allowCrossApp, Boolean allowBuildInPrecompile, Boolean throwIfNotFound, Boolean ensureIsUpToDate)
   at System.Web.Compilation.BuildManager.GetVPathBuildResultWithNoAssert(HttpContext context, VirtualPath virtualPath, Boolean noBuild, Boolean allowCrossApp, Boolean allowBuildInPrecompile, Boolean throwIfNotFound, Boolean ensureIsUpToDate)
   at System.Web.Compilation.BuildManager.GetVirtualPathObjectFactory(VirtualPath virtualPath, HttpContext context, Boolean allowCrossApp, Boolean throwIfNotFound)
   at System.Web.Compilation.BuildManager.CreateInstanceFromVirtualPath(VirtualPath virtualPath, Type requiredBaseType, HttpContext context, Boolean allowCrossApp)
   at System.Web.WebPages.BuildManagerWrapper.CreateInstanceOfType[T](String virtualPath)
   at System.Web.WebPages.VirtualPathFactoryManager.CreateInstanceOfType[T](String virtualPath)
   at System.Web.WebPages.WebPageHttpHandler.CreateFromVirtualPath(String virtualPath, IVirtualPathFactory virtualPathFactory)
   at System.Web.WebPages.WebPageRoute.DoPostResolveRequestCache(HttpContextBase context)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

Looking at the site from one of the servers (bypassing the friendly error pages) and requesting one of the broken image items showed this:

[Screenshot: the ASP.NET error page for the failing request]

That’s a pretty strange error to see for a media item – why on earth would the Razor template engine be trying to parse a binary file that exists in the content database?

Digging further…

When you RDP’d to the individual servers in the farm, the same error occurred on all machines, and for exactly the same set of items. This seemed to rule out the problem relating to load balancing, firewalls or the external caching that the site used. So initially we considered the idea that it might be broken data in the Web database. In Content Editor (looking at Master) you could download the affected items OK, so we tried republishing affected media items – with no effect.

Then we tried renaming items, in case there was an odd character in a filename. These tests showed further odd behaviour:

  • Renaming a broken item to make its name shorter would (in some cases) resolve the error.
  • Renaming it to be longer would never resolve the issue.
  • Reverting the name would cause the error to come back again, even if you were typing the old name from scratch rather than copying & pasting.
  • Despite being able to fix some broken items by shortening their names, there were working items with (much) longer names.
  • When requesting an item from the public site, you could deliberately change the last few characters of some item names to something random and still see the error above, rather than the expected 404 response for a missing item.

My colleagues tried copying the broken content back from production to recreate the issue on our development platform – but no luck. The same items in the same locations did not cause any errors there.

Looking at the IIS logs from production, there were clear differences between the “failed” and successful requests for media items:

For a working request, the logs looked exactly as you would expect:

2017-05-08 11:22:41 127.0.0.1 GET /~/media/Folder/Folder/thumbnails/Folder/A-working-media-item.ashx - 443 - 127.0.0.1 Mozilla/5.0+(Windows+NT+6.3;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/57.0.2987.133+Safari/537.36 - 200 0 0 84

But for the requests returning the error, the logged URLs were different to those being requested:

2017-05-08 11:13:36 127.0.0.1 GET /~/media/Folder/Folder/thumbnails/Folder/sitecore_media.ashx/~/media/Folder/Folder/thumbnails/Folder/A-broken-media-item.ashx - 443 - 127.0.0.1 Mozilla/5.0+(Windows+NT+6.3;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/57.0.2987.133+Safari/537.36 - 500 0 0 874

These duplicated the folder and included a reference to “sitecore_media.ashx” – which is the underlying handler for media requests.
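If you want to find out how many requests are affected, a small script can scan the IIS logs for this pattern. What follows is only a rough sketch: it assumes the field order seen in the two sample lines above (URL in the fifth field, status code fourth from the end), and the function name is my own invention rather than anything from Sitecore or IIS.

```python
def find_rewritten_media_requests(log_lines, handler="sitecore_media.ashx"):
    """Return (uri, status) pairs for log lines whose URL has the handler name mid-path.

    Assumes space-separated W3C log lines laid out like the samples in this post:
    date time s-ip method uri-stem ... status substatus win32-status time-taken
    """
    hits = []
    for line in log_lines:
        # Skip blank lines and the "#Fields:" style header lines IIS writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 9:
            continue  # not a request line in the assumed layout
        uri = fields[4]
        status = fields[-4]
        # A healthy media URL ends with the item name; the broken ones have
        # the handler name followed by a second copy of the path.
        if "/" + handler + "/" in uri:
            hits.append((uri, status))
    return hits
```

Feeding it the lines of a log file (for example `find_rewritten_media_requests(open(path))`, where `path` is wherever your logs live) should list each rewritten URL next to its status code.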

This broken URL also looked like part of the error message being reported at the bottom of the screenshot above:

Source file: /~/media/Folder/Folder/thumbnails/Folder/sitecore_media.ashx/~/media/Folder/Folder/thumbnails/Folder/A-broken-media-item.ashx/default.cshtml

The presence of “default.cshtml” might be related to why we were getting a Razor error, but it still doesn’t explain anything about what caused the underlying issue.

I also noticed that some of the breaking URLs had the “/en/” language prefix and some didn’t. (The site was published in two languages, so the presence of a prefix was expected.) When you looked at the error messages being shown for those that included the prefix, they looked subtly different:

Source file: /en/~/media/Folder/Folder/thumbnails/Folder/sitecore_media.ashx/en/~/media/Folder/Folder/thumbnails/Folder/A-working-media-item.ashx.cshtml

Note how the URL got a bit longer (because of the two copies of the prefix) but the “/default” vanished. Now the URL ends with “.ashx.cshtml” in the YSOD’s source file and in the IIS log.

Stranger and stranger…

Despite the change of URL between what was requested and what was logged, looking at the network trace in Chrome, there was no redirect being issued. That seemed to rule out an issue with any server-side redirection code or data.

Scrabbling around for a solution!

I was thoroughly confused by this point – none of this behaviour made much sense to me, and we had failed to recreate the problem anywhere other than the production site. That said “configuration mistake” to me… So I adopted a common strategy for diagnosing confusing issues: take stuff away until you find the thing which makes the problem stop.

Taking one of the production servers out of the load balancing configuration, I tried removing all the custom config patches we had deployed. While this broke the majority of the site (as you would expect), requesting the broken media items still responded and returned the same error. I also tried reverting changes in the web.config file. That had no effect on the error either.

But what did become obvious was that there were some files on the production server that should not be there. A bit of over-zealous deployment on the part of the TDS update package generator (combined with some dodgy settings in the Visual Studio project) had accidentally deployed the solution’s gulp file, and some other text files, to the root of the site. But removing these had no effect either.

The solution – but I don’t know why *sadface*

We also noticed that there was one wholly unexpected file at the root of the site: a 900 KB file named NOT_A_VALID_FILESYSTEM_PATH. Investigating it, this turned out to be a PDF file (which belonged to the client’s content) that had somehow been renamed and stored in the site root.

Deleting that file fixed all of the broken media items. Restoring it again broke them. So clearly this was at least part of the cause…
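If you suspect the same thing on your own servers, checking for the file is easy to script. This is a minimal sketch, not a Sitecore tool: the function name is made up for illustration, and the only check it does on the content is whether the file starts with the “%PDF-” marker that PDF files begin with.

```python
from pathlib import Path

STRAY_NAME = "NOT_A_VALID_FILESYSTEM_PATH"


def inspect_stray_file(site_root):
    """Look for the stray file in the site root and sniff its first bytes.

    Returns None when the file is absent, otherwise a small dict describing it.
    """
    path = Path(site_root) / STRAY_NAME
    if not path.is_file():
        return None
    with path.open("rb") as fh:
        header = fh.read(5)
    return {
        "path": str(path),
        "size_bytes": path.stat().st_size,
        # PDF files start with the bytes "%PDF-"; other content types are
        # not identified here.
        "looks_like_pdf": header.startswith(b"%PDF-"),
    }
```

Run it against the website root on each server in the farm; a non-None result means the file is there and worth a closer look.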

How did it get there? I don’t really know. The particular file name is mentioned on Stack Overflow as being a symptom of calling Server.MapPath() on dodgy data. Sitecore Support back that up, saying:

We’ve seen some similar issues before. The possible cause of the problem might be the “NOT_A_VALID_FILESYSTEM_PATH” file in your website root folder. It is created by IIS web server when the media extension is incorrect and both of the following conditions are true:

1. The argument provided to the “Server.MapPath” method includes a character that can’t appear in a valid filename, such as colon (“:”) or question mark (“?”).
2. The <httpRuntime> section of the web.config file includes the following attribute: relaxedUrlToFileSystemMapping="true"

If you have this file in the website root folder, please perform the following:

1. Change the “relaxedUrlToFileSystemMapping” attribute value to “false”.
2. Remove or rename the “NOT_A_VALID_FILESYSTEM_PATH” file.
3. Make sure that all your media items have correct value in the “Extension” field without special characters like “:”, “?”, “^”, etc.
4. Also check out whether site names in the sites configuration section are valid and do not have any special characters.
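A couple of the checks in that advice can be sketched in code. The snippet below is my own illustration, not something Support supplied. It assumes you have the web.config contents as a string, and that you have pulled the “Extension” field values out of Sitecore by some other means (not shown). The set of suspect characters is the Windows reserved filename characters plus the “^” that Support mentioned.

```python
import xml.etree.ElementTree as ET

# Characters Windows does not allow in filenames, plus "^" from Support's list.
SUSPECT_CHARS = set('<>:"/\\|?*^')


def relaxed_mapping_enabled(web_config_xml):
    """True if any <httpRuntime> element sets relaxedUrlToFileSystemMapping="true"."""
    root = ET.fromstring(web_config_xml)
    for node in root.iter("httpRuntime"):
        value = node.get("relaxedUrlToFileSystemMapping", "false")
        if value.strip().lower() == "true":
            return True
    return False


def suspect_extension_chars(extension):
    """Return the sorted suspect characters found in a media item's Extension value."""
    return sorted(
        {ch for ch in extension if ch in SUSPECT_CHARS or ord(ch) < 32}
    )
```

An empty list from `suspect_extension_chars` means the value passed this particular check; it says nothing about whether the extension is otherwise sensible.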

That tends to suggest that an attempt to upload a media item with an invalid name might have caused this odd file to appear. Hence changing the “relaxedUrlToFileSystemMapping” attribute to force an exception when a dodgy URL is passed to Server.MapPath() sounds like a reasonable way to spot any future issues with this. But unfortunately this doesn’t explain how the presence of that file causes the media item problems we’ve seen. (None of the affected media items seem to have odd extensions either.) Copying that dodgy file back to a development instance of the site doesn’t cause the same behaviour to occur. So that suggests there’s something specific to the production site going on here…

So I’ll admit I’m still confused. If any of you have experienced something similar and know what’s going on, then please let me know in the comments or via Twitter.

But at least now if anyone else does see this oddness, there’ll be something in Google about it…


3 thoughts on “What went wrong with my media?”

  1. Pingback: It’s never the runtime… (Except when it is) | Jeremy Davis
  2. Hi Jeremy

    Thanks a ton for this post.

    We started having this issue some time ago on our development environments. Just with image files.

    We experienced the same things in regards to changing the file names of the images affected, but we only had the issue on mobile devices. (or when using the device emulation in chrome/firefox)

    The exact same image url would load fine on desktops, but would throw the YSoD on mobile devices.

    Your solution fixed everything.

    //Regards
    Thomas
