Sandeep Khandelwal's Blog

SharePoint, ASP.net & other related stuffs

Search Express Server 2008

December 12, 2008 08:27 by Sandeep Khandelwal

My, my, my. I just finished setting up a Search Express server for a WSS 3.0 server. The product is certainly a giant leap forward for solution integrators like IntegrationNow: it lets us quickly deliver SharePoint solutions to clients who want to explore SharePoint for internal collaboration only and are hesitant to make a heavy investment in MOSS.

The single biggest feature in MOSS (IMHO) is the enterprise search capability, by which content from file shares, networks, websites, business data, SQL Server, SAP, Oracle, Documentum, FileNet and numerous other systems can be crawled, indexed and searched right from the Search Center. Now Microsoft has decided to offer that capability for free under the name Search Express. That is great, because it gives you extensive search capability, including file shares, URLs etc., without the cost of MOSS.

Search Server Express 2008 is a free product from Microsoft (I think I mentioned that already) that lets you configure search (query & index) for WSS 3.0 SP1 sites (Search Express requires SP1 to be installed on the WSS 3.0 boxes). For a complete list of features in Search Express 2008, please visit HERE. I must say that I am impressed with what can be done out of the box with Search Express Server 2008. In this post, I will cover the differences between Search Express Server and the search built into MOSS.

First of all, the basics. If you recall, search is a feature that resides in the SSP (Shared Services Provider) in MOSS. Since the SSP is not native to WSS 3.0, after you install the SE2008 server you will see a new Shared Services Provider site that has only the Enterprise Search features in it. Please note that other SSP features like Excel Services, Business Data Catalog, My Sites etc. are not added by the SE installation. Also, the search administration site has a totally different look and feel compared to MOSS. In MOSS, you would go to Search Settings, and that is where you would find a whole slew of links to create scopes, content sources, managed property mappings etc. With SE2008, instead of being available under Search Settings, these links are added to the Central Administration -> SharedServices site. Typically the URL to the SSP is <central admin url>/ssp/admin/SearchAdministration.aspx. The screen would look something like this.

And the Search Center actually resides on a different web application. Its URL is important to remember, because you will need it to configure your sites to get search results from this location instead of the default WSS 3.0 search. The beauty is that you can customize your search and results pages, create best bets and customize queries just like you could in MOSS Enterprise. And it is security trimmed. Isn't that amazing?

If you are in the boat where you would like a uniform search platform for your WSS 3.0 sites, I strongly recommend taking a look at SE2008. In my next post, I will walk through the process of installing SE2008 and configuring your site collection to use the new SE server. Until then, happy collaborating.


Enterprise Search Relevancy explained

December 8, 2008 05:07 by Sandeep Khandelwal

In this post, I will try to explain the factors that can impact the relevance score in enterprise search. First of all, you have to give credit to the Microsoft Research team and the MSN internet search team for coming up with such an incredible search engine that actually works. Enterprise search is a feature that is available with the MOSS Enterprise Search license. It is truly a search engine: it crawls and keeps tabs on not only SharePoint data and files but also external data and files, including network shares, LOB systems (via BDC), people search via LDAP, My Sites etc. The best part is the extensibility of the Search Center: you can add new content sources, scopes and managed property mappings, and make creative use of XSLT to format and display search results in an enterprise-themed format.

The most misunderstood aspect of enterprise search is relevance. Simply put, the relevance score is calculated by a number of content relevance algorithms. For the keywords a user has typed in, the search results are sorted in order of relevance score. There are a number of factors that can impact the relevance score.

Click Distance:

A web-based application uses hyperlinks to connect one item to another. The number of clicks it takes a user to get to a document or search result impacts the relevance score. Needless to say, the result with the smallest click distance tends to appear at the top of the search results. The crawler looks at the authoritative pages (pages that contain topic-specific, unique content) and all the links that connect the pages, and assigns them relevance scores. The closer a page is to an authoritative page, the higher the relevance score assigned to it.
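To make the idea concrete, here is a minimal sketch of how click distance could be computed with a breadth-first search over a link graph. This is only an illustration, not how the MOSS crawler is actually implemented; the function name, the sample pages and the choice of "/" as the only authoritative page are all assumptions.

```python
from collections import deque


def click_distances(links, authoritative):
    """Return the fewest clicks needed to reach each page from any authoritative page."""
    # Authoritative pages are the starting points, so their distance is 0.
    dist = {page: 0 for page in authoritative}
    queue = deque(authoritative)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in dist:  # first visit is the shortest path in BFS
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "/": ["/hr", "/it"],
    "/hr": ["/hr/policies"],
    "/it": ["/it/helpdesk", "/hr/policies"],
    "/hr/policies": ["/hr/policies/leave.docx"],
}

distances = click_distances(links, ["/"])
for page, clicks in sorted(distances.items(), key=lambda item: item[1]):
    print(clicks, page)
```

Pages with a smaller number here would be the ones that get the click distance boost.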

URL Depth:

URL depth also impacts the relevance score. The more slashes (/) in the URL, the lower the relevance score.
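A rough way to picture URL depth is to count the path segments of each URL. The sketch below does that with Python's standard urllib.parse module; counting non-empty path segments is my assumption for illustration, the "intranet" host name is made up, and the real engine's exact calculation may differ.

```python
from urllib.parse import urlparse


def url_depth(url):
    """Count the non-empty path segments in a URL (a stand-in for counting slashes)."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


# Hypothetical URLs on a made-up intranet host.
urls = [
    "http://intranet/sites/hr/docs/policy.docx",
    "http://intranet/",
    "http://intranet/sites/hr",
]

# Shallower URLs come first, mirroring the idea that they get a higher score.
by_depth = sorted(urls, key=url_depth)
for url in by_depth:
    print(url_depth(url), url)
```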

Hyperlink Anchor Text:

The text used to describe a hyperlink is called anchor text, for example <a href="docs.aspx">My Documents</a>. In this example, "My Documents" is the anchor text used to describe the URL "docs.aspx". Anchor text plays no role in deciding whether a result appears for a search query, but it plays an important role in assigning the relevance score. In other words, if a user searches for "crawl" and no document contains the keyword "crawl" (although there are hyperlinks that say "Crawl"), the search query ignores the link description and no results are displayed.
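If you want to see which anchor text points at which URL on a page, a small script can pull the pairs out of the HTML. The sketch below uses Python's standard html.parser module; the AnchorTextCollector class is a name I made up, and this only shows how anchor text can be extracted, not how the MOSS crawler processes it.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collect (href, anchor text) pairs from the <a> tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.anchors = []   # finished (href, text) pairs
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text pieces seen inside the current link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


collector = AnchorTextCollector()
collector.feed('<a href="docs.aspx">My Documents</a>')
print(collector.anchors)
```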

Document Title:

Each document has some metadata inherently stored with it, including the author's name, last modified date and title. The title plays an important role in assigning the relevance score. Most people do not pay much attention to the title, but the search engine is smart enough to ignore the default title assigned by the editing tool and look at the first page instead to set the title relevance. For example, a PowerPoint file often has the default title "Slide 1", but the search engine will ignore that and try to read the first slide and assign title relevance that way.
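The fallback described above can be sketched in a few lines. This is only my guess at the general shape of the logic, not the engine's real code: the effective_title function is hypothetical, the only default title taken from this post is "Slide 1", and treating the first non-empty line of the first page as the title is an assumption.

```python
def effective_title(metadata_title, first_page_lines, default_titles=("Slide 1",)):
    """Pick the stored title unless it is empty or a known default, else use the first page."""
    title = (metadata_title or "").strip()
    known_defaults = {default.lower() for default in default_titles}
    if title and title.lower() not in known_defaults:
        return title
    # Assumption: the first non-empty line of the first page stands in for the title.
    for line in first_page_lines:
        if line.strip():
            return line.strip()
    return title  # nothing better found, keep whatever was stored


print(effective_title("Slide 1", ["", "Q4 Sales Review", "Agenda"]))
print(effective_title("Budget 2009", ["Some heading"]))
```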

File Type Biasing:

Simply put, file type biasing means that certain document types are ranked ahead of other document types. The default ranking order in Enterprise Search is:
  1. Web pages
  2. PowerPoint presentations (ppt, pptx)
  3. Word documents (doc, docx)
  4. XML files (xml)
  5. Excel workbooks (xls, xlsx)
  6. Plain text files (txt)
  7. SharePoint list items
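As an illustration of the order above, the sketch below sorts a handful of file names by file type. Keep in mind that in the real engine the file type is one factor blended into the relevance score, not a strict sort. The extensions I map to "web pages" (aspx, htm, html), the file names, and the choice to treat anything unrecognized as a list item are all assumptions.

```python
import os

# Rank per file extension, following the order listed in this post.
FILE_TYPE_RANK = {
    "aspx": 1, "htm": 1, "html": 1,  # web pages (extensions are an assumption)
    "ppt": 2, "pptx": 2,
    "doc": 3, "docx": 3,
    "xml": 4,
    "xls": 5, "xlsx": 5,
    "txt": 6,
}
LIST_ITEM_RANK = 7  # assumption: anything else is treated like a list item


def file_type_rank(name):
    """Return the bias rank for a file name based on its extension."""
    extension = os.path.splitext(name)[1].lower().lstrip(".")
    return FILE_TYPE_RANK.get(extension, LIST_ITEM_RANK)


# Hypothetical search results.
results = ["notes.txt", "budget.xlsx", "roadmap.pptx", "default.aspx", "spec.docx"]
ordered = sorted(results, key=file_type_rank)
print(ordered)
```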

 

It is really important to understand how the MOSS search engine assigns relevance scores and places links/documents at the top of the search results. In future posts, I will talk about best bets and how they can help highlight some of the common search keywords in an organization.


About the author

I work as a SharePoint Consultant and Lead ECM Solution Expert for Integration Now (a pioneer in SharePoint solutions in the Midwest region). Besides holding PMP, MCP, MCTS and other technical certifications, I also have an MBA (Finance) from UMKC. I lead and oversee SharePoint engagements in four states around Kansas City (MO, KS, IA & NE).
