SharePoint 2013: Continuous Crawl and the Difference Between Incremental and Continuous Crawl

SharePoint 2013 introduced a new type of crawl named "Continuous Crawl".
Old-school administrators like me, coming from SharePoint 2010, had two crawl types available and configurable on the Search Service Application:

  • Full: crawls all content,
  • Incremental: as the name says, it crawls only content that has been modified since the last crawl.

The disadvantage of these crawls is that once one is launched, you cannot launch a second one in parallel on the same content source. Content changed in the meantime has to wait until the current crawl finishes and another crawl runs before it is integrated into the index and can be found via search.
An example:

  • An incremental crawl named ALFA is started and will take 50 minutes,
  • After 10 minutes of crawling, a new document is added, so we need a second incremental crawl named BETA to get the document into the index,
  • BETA cannot start before ALFA finishes, so this item will have to wait at least 40 minutes to be integrated into the index.

 

So we can't keep the index up to date with the latest changes, because every crawl introduces latency.
In most cases this behavior is perfectly acceptable for your clients, but for those who want to search their content immediately after it is added to SharePoint, there is now a new solution in SharePoint: "Continuous Crawl".

 

The Continuous Crawl
To summarize: the "Continuous Crawl" is a type of crawl that aims to keep the index as current as possible.

Its operation is simple: once activated, it launches a crawl at regular intervals. The major difference from an incremental crawl is that crawls can run in parallel: a new crawl does not wait for the previous one to complete before it starts.

Important Points:

  • "Continuous Crawl" is only available for content sources of type "SharePoint Sites",
  • By default, a new crawl is started every 15 minutes, but the SharePoint administrator can change this interval with PowerShell,
  • Once started, a "Continuous Crawl" can't be paused or stopped; you can only disable it.
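As a hedged sketch of changing the interval: one documented way is the ContinuousCrawlInterval property on the Search service application. The snippet assumes a farm with a single Search service application, and the value 5 (minutes) is only an illustration.

```powershell
# Load the SharePoint cmdlets when running outside the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search service application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Show the current interval in minutes (15 by default).
$ssa.GetProperty("ContinuousCrawlInterval")

# Set the interval to 5 minutes (illustrative value).
$ssa.SetProperty("ContinuousCrawlInterval", 5)
```

A shorter interval means more frequent crawls and therefore more load on the crawled servers, so it is worth testing before lowering it in production.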

If we take our example above with "Continuous Crawl":

  •  Our ALFA crawl starts and will take at least 50 minutes,
  •  After 10 minutes of crawling, an item that was already crawled is modified and requires a new crawl,
  •  Crawl "BETA" is scheduled without waiting for ALFA to finish,
  •  The crawl "BETA" starts in (15 - 10) = 5 minutes,
  •  Therefore this item needs to wait only 5 minutes (instead of at least 40 minutes) before a crawl picks it up.
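The arithmetic of the two examples can be sketched in a few lines of PowerShell. This is only an illustration of the waiting time before BETA starts; the variable names are made up for the example, and it assumes continuous crawls start exactly on interval boundaries counted from the start of ALFA.

```powershell
# Times are in minutes, taken from the ALFA/BETA example.
$alfaDuration       = 50   # ALFA runs for 50 minutes
$changeAt           = 10   # the item changes 10 minutes after ALFA starts
$continuousInterval = 15   # default continuous crawl interval

# Incremental crawl: BETA cannot start until ALFA has finished.
$incrementalWait = $alfaDuration - $changeAt

# Continuous crawl: BETA starts at the next interval boundary after the change.
$nextStart      = [math]::Ceiling($changeAt / $continuousInterval) * $continuousInterval
$continuousWait = $nextStart - $changeAt

"Incremental: BETA starts after $incrementalWait minutes"
"Continuous:  BETA starts after $continuousWait minutes"
```

With the values above, the script reports 40 minutes for the incremental case and 5 minutes for the continuous case. In both cases the time BETA itself needs to process the item comes on top.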


1 - How to enable it?

In Central Administration, click on your Search Service Application, and then click "Content Sources" in the menu.

 

Click "New Content Source" in the menu.

 

Choose "SharePoint Sites".

Select « Enable Continuous Crawls »

 

  • The content source has been created, and we can see its status: "Crawling Continuous".
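The same setting can also be switched on from PowerShell for an existing content source. The following is a minimal sketch, assuming a single Search service application; the content source name "Local SharePoint sites" is a placeholder to replace with your own.

```powershell
# Load the SharePoint cmdlets when running outside the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search service application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# "Local SharePoint sites" is a placeholder content source name.
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# Continuous crawl only applies to content sources of type SharePoint.
if ($cs.Type -eq "SharePoint") {
    $cs.EnableContinuousCrawls = $true
    $cs.Update()
}
```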

 

2 - How to disable it?

  •  From the content source page, choose the "Enable Incremental Crawls" option. This will disable the continuous crawl.
  •  Save changes.
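From PowerShell, disabling comes down to setting the same flag back to false. Again a sketch under the same assumptions: a single Search service application and a placeholder content source name.

```powershell
# Load the SharePoint cmdlets when running outside the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has exactly one Search service application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# "Local SharePoint sites" is a placeholder content source name.
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# Turn continuous crawl off for this content source and save the change.
$cs.EnableContinuousCrawls = $false
$cs.Update()
```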

 

3 - How to see if it works?

Click on your Search Service Application, then on "Crawl Log" in the "Diagnostics" section.

 

Select your Content Source and click on « View crawl history »

Or via PowerShell, execute the following cmdlets (replace "Search Service" with the name of your Search Service Application):
$SearchSA = "Search Service"
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $SearchSA | select *
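Since "select *" returns every property, a narrower variant may be easier to read. This sketch keeps only the properties that matter here; "Search Service" is again a placeholder for the name of your Search Service Application.

```powershell
# Load the SharePoint cmdlets when running outside the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder name: replace with your own Search Service Application.
$SearchSA = "Search Service"

# List each content source with its crawl state and continuous crawl flag.
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $SearchSA |
    Select-Object Name, Type, CrawlState, EnableContinuousCrawls, CrawlStarted, CrawlCompleted |
    Format-Table -AutoSize
```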


Impact on our Servers

The impact of a "Continuous Crawl" is the same as that of an incremental crawl.
Even with crawls running in parallel, the "Continuous Crawl" stays within the parameters defined in the "Crawler Impact Rule", which controls the maximum number of simultaneous requests that can be sent to the server (default 8).

 

4 - SharePoint Online

This feature is not available in SharePoint Online 2013. You can read about it here:

http://technet.microsoft.com/en-us/library/jj819291.aspx
