Enabling automated index rebuilds

Another helpful addition to the “scripted installs” functions I’ve been looking at for the last few weeks is the ability to trigger a full rebuild of a search index. Last week we looked at deferring the indexing of items installed by a package, to help speed up the scripted install of packages. So it makes sense to be able to trigger a build as well…

A similar pattern to installing the packages can be used to trigger the index build. We can add a simple ASPX file with the code to build an index, and then make an HTTP request to trigger it.

Running the index build

[NB: The code presented here was put together for scripting private development instances. If you were to use it on publicly accessible sites you would have to pay close attention to security, as it presents an opportunity for denial of service attacks against your site. As mentioned in the comments, you should probably consider IP Address restrictions.]

Sitecore exposes an API for index building, but it changes between v6.6 and v7.0 because the search infrastructure changes between these versions. Only one line needs to change here, so the code below includes both, with the v7 version commented out. Based on modifying the package install endpoint we’ve looked at before, the code is as follows:

<%@ Assembly Name="Sitecore.Client" %>
<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="System" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<%@ Import Namespace="System.Configuration" %>
<%@ Import Namespace="log4net" %>
<%@ Import Namespace="Sitecore.Web" %>

<%@ Language="C#" %>
<html>
<script runat="server" language="C#">
    public void Page_Load(object sender, EventArgs e)
    {
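        // Drop empty entries, so a missing or blank "indexes" parameter gives an empty array
        var indexes = WebUtil.GetQueryString("indexes").Split(new char[] { '|' }, StringSplitOptions.RemoveEmptyEntries);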
        if (indexes.Length == 0)
        {
            Response.Write("No Indexes specified");
            return;
        }
        Sitecore.Context.SetActiveSite("shell");
        foreach(var index in indexes)
        {
            if(!string.IsNullOrWhiteSpace(index))
            {
                Response.Write("Index: " + index + "<br/>");
                bool result = Update(index);
                if(result)
                {
                    Response.Write("Updated Index: " + index + "<br/>");
                }
            }
        }
    }

    protected static bool Update(string index)
    {
        // Include this line for Sitecore v6.6
        var idx = Sitecore.Search.SearchManager.GetIndex(index);

        // Include this line for Sitecore v7.0 and up
        // var idx = Sitecore.ContentSearch.ContentSearchManager.GetIndex(index);

        if(idx != null)
        {
            idx.Rebuild();
            
            return true;
        }

        return false;
    }

    protected String GetTime()
    {
        return DateTime.Now.ToString("t");
    }
</script>
<body>
    <form id="MyForm" runat="server">
    <div>
        This rebuilds Sitecore 6.x search indexes.</div>
    Current server time is
    <% =GetTime()%>
    </form>
</body>
</html>

You can call this by making a request to whatever you called this page and passing a querystring that names the indexes to build. For example /IndexRebuild.aspx?indexes=indexNameOne|indexNameTwo will start a build for two indexes.

Triggering the rebuild from script

To make use of this, it’s necessary to add a couple of functions to the PowerShell script. As with the automated package install, we need to be able to add the code above to the target Sitecore instance, and then call it.

First we need to tell the config for the instance we’re installing where to find the .ASPX file for building the index. We can also add some configuration listing the indexes to rebuild:

<config>
  <params>

    <param name="IndexBuildTool">.\files\IndexRebuild.aspx</param>

  </params>

  <indexes>
    <index>MyCustomIndex</index>
  </indexes>

</config>

The set of indexes can be retrieved with a bit of PowerShell similar to the way we’re extracting packages to install:

function Get-ConfigIndexes() {
    return Select-XML "/config/indexes/index/text()" $xml
}
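
As a quick illustration of what that function returns, here is a minimal sketch that runs the same XPath against an inline copy of the sample config. The inline $xml document and the second index name (AnotherIndex) are assumptions made up for this example; in the real script $xml is loaded from the config file. Select-Xml returns SelectXmlInfo objects rather than plain strings, so the sketch reads each text node's Value to get the names.

```powershell
# Inline stand-in for the config file; the real script loads $xml elsewhere.
# "AnotherIndex" is a made-up name, added only to show multiple results.
$xml = [xml]@"
<config>
  <indexes>
    <index>MyCustomIndex</index>
    <index>AnotherIndex</index>
  </indexes>
</config>
"@

# Same XPath as Get-ConfigIndexes, unwrapped to plain strings.
$indexNames = @(Select-Xml -XPath "/config/indexes/index/text()" -Xml $xml |
    ForEach-Object { $_.Node.Value })

$indexNames | ForEach-Object { Write-Host "Configured index: $_" }
```

Unwrapping to strings like this is optional, but it can make the values easier to reuse when building URLs or log file names.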

The .ASPX can then be copied to the new Sitecore instance:

function Add-IndexBuild() {
    $siteName = Get-ConfigParam "InstanceName"
    $sitecoreFolder = "C:\Inetpub\wwwroot\$($siteName)\Website"
    
    $buildTool = Get-ConfigParam "IndexBuildTool"
    
    Write-Host "Adding index build tool to Sitecore..."
    Copy-Item $buildTool $sitecoreFolder -force
}

And then the config can be processed to update the indexes:

function Rebuild-Indexes() {
    $siteName = Get-ConfigParam "InstanceName"
    
    foreach($index in Get-ConfigIndexes) {             
        Write-Host "Updating search index $index"
    
        # call tool
        $url = "http://$($siteName)/IndexRebuild.aspx?indexes=$index"
        $result = Invoke-WebRequest -Uri $url -TimeoutSec 600 -OutFile ".\$siteName-IndexBuild-$index.log" -PassThru
        
        if($result.StatusCode -ne 200) {
            Write-Host "StatusCode: $($result.StatusCode)"
            throw "Index build failed for $($index)"
        }
    }

    write-host "Index build done..."
}

Since this is re-worked from the package install code, it generates a separate HTTP request for each index in configuration. Keeping each request small is intended to reduce the chance of any of these calls timing out. But the code could just issue one request for all the configured indexes if this was more appropriate.
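
If a single request suits you better, the URL could be built along these lines. This is only a sketch: Get-IndexRebuildUrl is a hypothetical helper that isn't part of the original script, and the host name mysite.local is a placeholder. It joins the index names with the same pipe separator the .ASPX page splits on.

```powershell
# Hypothetical helper (not in the original script): builds one URL naming
# every index, separated by '|' to match the Split('|') in the .ASPX page.
function Get-IndexRebuildUrl([string]$siteName, [string[]]$indexNames) {
    $query = $indexNames -join "|"
    return "http://$($siteName)/IndexRebuild.aspx?indexes=$query"
}

# "mysite.local" is a placeholder host name for the example.
$url = Get-IndexRebuildUrl "mysite.local" @("indexNameOne", "indexNameTwo")
Write-Host $url
```

The resulting URL can be passed to Invoke-WebRequest in the same way as before, though a single request that rebuilds several indexes will probably need a longer -TimeoutSec value.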


Edited to add: I’ve been writing this series of posts from the perspective of automating Sitecore v6.6. However, if you’re working with newer releases you may well want to look into the work that the team behind the Sitecore PowerShell Extensions have been doing on Remoting. This provides an interesting alternative approach to triggering things like index rebuilds from outside Sitecore.


3 thoughts on “Enabling automated index rebuilds”

  1. Is it worth mentioning the implications of this kind of url being open to the public. We’ve done a very similar thing in the past so the approach is a good one, with caution, you only want known people / ip’s / services able to rebuild your indexes 🙂

    • Very good point – I think I mentioned earlier in this series of posts that this was being used for scripting internal development instances of Sitecore. I’ll add an edit to reiterate that point when I get a chance, as you’re quite right that these endpoints would present at least a Denial of Service risk if exposed publicly.

  2. Pingback: Another package install performance boost | Jeremy Davis
