This guide highlights a solution for crawling related data with FAST Search in SharePoint Server 2010. It has three sections: the problem statement, a proposed solution, and the implementation of that solution. Review them in order, as each section builds on concepts presented in the previous one.
This section describes why relational crawled properties are needed, using a practical scenario in which a crawled property must be produced for related data.
Suppose we have four SharePoint lists named Cities, Universities, Courses, and Colleges. The data for these lists is as follows:
As the schema above shows, the Colleges list contains no city information. So when a user searches for a record related to a city, FAST Search retrieves records from the Cities list only.
We want related records to be retrieved as well when a user searches for a city (i.e., records from the Universities and Colleges lists, since City is a lookup column in the Universities list and University is a lookup column in the Colleges list).
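The key to chaining these lists is the way SharePoint stores a lookup column value: the related item's list ID and its display text, joined by the ";#" separator. A minimal sketch of parsing such a value (the literal values are hypothetical, chosen to match the scenario above):

```csharp
// Hypothetical stored lookup values for the scenario above:
//   Colleges["University"] for Kirori Mal College -> "2;#DU"
//   Universities["City"]   for DU                 -> "1;#Delhi"
string raw = "2;#DU";
string[] parts = raw.Split(new string[] { ";#" }, StringSplitOptions.None);
int relatedItemId = Convert.ToInt32(parts[0]); // ID of the related item (here, the DU item in Universities)
string displayValue = parts[1];                // display text shown in the column (here, "DU")
```

Following the ID from Colleges to Universities, and then from Universities to Cities, yields the city for any college.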
FAST Search allows its item processing pipeline to be customized through a mechanism called pipeline extensibility. The item processing pipeline prepares an item from a content source for indexing and searching; this preparation includes text extraction, language detection, and tokenization, and additional searchable metadata can be generated from the content. The following diagram depicts how pipeline extensibility works in FAST Search, showing the standard sequence of 'stages' from crawl to index.
We can solve the problem by customizing the pipeline extensibility of FAST Search (creating a pipeline extensibility stage), which enables us to run a custom processing command for each item that is fed through the pipeline. This lets us enrich item data before it is made searchable.
According to the problem statement, when a user searches for a city, the related records should also be retrieved. The following steps need to be performed to put this solution in place:
private static readonly string SiteURL = ConfigurationManager.AppSettings["SiteURL"];
private static readonly string List1 = ConfigurationManager.AppSettings["List1"];                     // Colleges list
private static readonly string List2 = ConfigurationManager.AppSettings["List2"];                     // Universities list
private static readonly string List1ColumnName = ConfigurationManager.AppSettings["List1ColumnName"]; // University column
private static readonly string List2ColumnName = ConfigurationManager.AppSettings["List2ColumnName"]; // City column
private const string Seprator = ";#";
The corresponding keys in the application's configuration file:

<appSettings>
  <add key="SiteURL" value="http://servername:PortNumber" />
  <add key="List1" value="Colleges" />
  <add key="List2" value="Universities" />
  <add key="List1ColumnName" value="University" />
  <add key="List2ColumnName" value="City" />
</appSettings>
The WCF service resolves the city for a given college item by following the lookup chain (Colleges → Universities → City):

SPListItem objItem = web.Lists[List1].GetItemById(itemID);
string strvalue = Convert.ToString(objItem[List1ColumnName]);
// A lookup value is stored as "<id>;#<value>"; the first part is the related item's ID.
string[] arrUnivValues = strvalue.Split(new string[] { Seprator }, StringSplitOptions.None);
if (arrUnivValues.Length > 1)
{
    int univItemId = Convert.ToInt32(arrUnivValues[0]);
    objItem = web.Lists[List2].GetItemById(univItemId);
    strvalue = Convert.ToString(objItem[List2ColumnName]);
    string[] arrCityValues = strvalue.Split(new string[] { Seprator }, StringSplitOptions.None);
    if (arrCityValues.Length > 1)
    {
        strCityValue = arrCityValues[1];
    }
}
The console application (CustomCode.exe) reads the input document handed over by the pipeline, calls the WCF service to fetch the city value, and writes it to the output file as a new crawled property:

static int Main(string[] args)
{
    try
    {
        CustomCode objCustomCode = new CustomCode();
        objCustomCode.DoProcessing(args[0], args[1]);
    }
    catch (Exception e)
    {
        // This will end up in the crawl log, since exit code != 0
        Console.WriteLine("Failed: " + e.Message + "/" + e.StackTrace);
        return 1;
    }
    return 0;
}

public void DoProcessing(string inputFile, string outputFile)
{
    XDocument inputDoc = XDocument.Load(inputFile);
    string strCityValue = string.Empty;

    // Read the list ID crawled property from the input document.
    var listID = from cp in inputDoc.Descendants(CRAWLED_PROPERTY)
                 where cp.Attribute(PROPERTY_SET).Value.Equals(CrawledCategorySharePoint_PropertySetID) &&
                       cp.Attribute(PROPERTY_NAME).Value == CrawledProperty_SPList &&
                       cp.Attribute(VAR_TYPE).Value == VAR_TYPE_TEXT
                 select cp.Value;

    // Read the list item ID crawled property.
    var itemID = from cp in inputDoc.Descendants(CRAWLED_PROPERTY)
                 where cp.Attribute(PROPERTY_SET).Value.Equals(CrawledCategorySharePoint) &&
                       cp.Attribute(PROPERTY_NAME).Value == CrawledProperty_ItemID &&
                       cp.Attribute(VAR_TYPE).Value == VAR_TYPE_INTEGER
                 select cp.Value;

    XElement outputElement = new XElement(DOCUMENT);
    if ((listID != null && listID.First().Length > 0) && (itemID != null && itemID.First().Length > 0))
    {
        Logger.WriteLogFile(inputFile, FILE_SUFIX_INPUT);
        Service1Client client = new Service1Client();
        try
        {
            client.ClientCredentials.Windows.AllowNtlm = true;
            client.ClientCredentials.Windows.AllowedImpersonationLevel =
                System.Security.Principal.TokenImpersonationLevel.Impersonation;
            strCityValue = client.GetMessage(listID.First(), Convert.ToInt32(itemID.First()));
            client.Close();
        }
        catch (Exception ex)
        {
            strCityValue = ex.Message;
            if (client.State == System.ServiceModel.CommunicationState.Opened)
            {
                client.Close();
            }
        }
        outputElement.Add(new XElement(CRAWLED_PROPERTY,
            new XAttribute(PROPERTY_SET, CrawledCategorySharePoint),
            new XAttribute(PROPERTY_NAME, CrawledProperty_City),
            new XAttribute(VAR_TYPE, 31), strCityValue));
    }
    // Write the enriched document so the pipeline can pick up the new property.
    outputElement.Save(outputFile);
}
Edit the PipelineExtensibility.xml file (found in the etc folder of the FAST Search installation) and update it as shown below to include the custom command and the crawled property details:
<PipelineExtensibility>
  <Run command="C:\pipelinemodules\CustomCode.exe %(input)s %(output)s">
    <Input>
      <CrawledProperty propertySet="158d7563-aeff-4dbf-bf16-4a1445f0366c" varType="31" propertyName="ows_taxId_SPLocationList"/>
      <CrawledProperty propertySet="00130329-0000-0130-c000-000000131346" varType="20" propertyName="ows_ID"/>
    </Input>
    <Output>
      <CrawledProperty propertySet="00130329-0000-0130-c000-000000131346" varType="31" propertyName="ows_city"/>
    </Output>
  </Run>
</PipelineExtensibility>
After saving the file, reset the document processors so that they read the updated configuration, using the command: psctrl reset
Output file's content:

<?xml version="1.0" encoding="utf-8"?>
<Document>
  <CrawledProperty propertySet="00130329-0000-0130-c000-000000131346" propertyName="ows_city" varType="31">Delhi</CrawledProperty>
</Document>
Input file's content:

<?xml version="1.0" encoding="UTF-8"?>
<Document>
  <CrawledProperty propertySet="158d7563-aeff-4dbf-bf16-4a1445f0366c" varType="31" propertyName="ows_taxId_SPLocationList">{fc03eeb1-56c7-4dfd-af1e-0a22dc06b4d2}</CrawledProperty>
  <CrawledProperty propertySet="00130329-0000-0130-c000-000000131346" varType="20" propertyName="ows_ID">2</CrawledProperty>
</Document>
After successfully configuring FAST Search and customizing pipeline extensibility, browse the site and check the search results. Searching for Delhi as the city shows DU as a university in Delhi, as well as Kirori Mal College as a college in Delhi.