Blog author:
Ketvin | Posted on: 9/19/2012 | Category:
Others Blogs
Out of the box, SharePoint provides a mechanism for excluding publishing
pages from the search index by any number of criteria, but what if you
want to exclude only parts of a page? This becomes useful when you have
content on numerous pages that contains common keywords.
For example, our production site has a master page that supplies the
header, left navigation, and footer for every page. If that navigation
contains a "Careers" link, then a search for "Careers" returns every
page in the site, instead of just the Careers page and the pages related
to it. To prevent this, we needed a way to keep SharePoint from
indexing that shared content when it performs a crawl.
Web Control
I decided that the System.Web.UI.WebControls.Panel control would be a
good model for my control. A Panel can easily be dropped into the page
layout using SharePoint Designer, and it can contain other HTML and
controls. I didn't want to inherit from the Panel control, though,
because it adds unwanted 'div' tags to the rendered output.
The key to the Panel control's behavior is the following pair of
attributes on the class:
[ParseChildren(false), PersistChildren(true)]
These attributes cause the content nested inside the control to be parsed and persisted as child controls rather than as properties of the control.
User Agent
The second part of the equation is knowing when to show or hide the contents of the web control.
The SharePoint crawler identifies itself by including "MS Search" in the
User-Agent header of its HTTP requests, which ASP.NET exposes through the
UserAgent property of the request.
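The check described above can be isolated in a few lines. This is only a minimal sketch: the class name CrawlerDetection and the method IsSearchCrawler are hypothetical names chosen for illustration, and the sketch assumes the crawler's user agent contains the text "MS Search" as described.

```csharp
using System;

// Hypothetical helper, for illustration only (not part of SharePoint).
public static class CrawlerDetection
{
    // Returns true when the user agent contains the marker text.
    // The default marker "MS Search" is the value assumed in this post.
    public static bool IsSearchCrawler(string userAgent, string marker = "MS Search")
    {
        // A missing user agent or an empty marker is treated as "not a crawler".
        if (string.IsNullOrEmpty(userAgent) || string.IsNullOrEmpty(marker))
        {
            return false;
        }

        // Compare case-insensitively so "ms search" and "MS Search" both match.
        return userAgent.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Inside a page or control, the value to pass in would be Request.UserAgent. Comparing without regard to case avoids having to lower-case both strings by hand.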
Code
Putting this all together we come up with the following class:
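using System;
using System.Web.UI;
using System.Web.UI.WebControls;

// Hides its child content when the request comes from the search crawler.
[ParseChildren(false), PersistChildren(true)]
public class SearchCrawlExclusionControl : WebControl
{
    private string userAgentToExclude;

    // User agent fragment that marks a crawler request; defaults to "ms search".
    public string UserAgentToExclude
    {
        get { return string.IsNullOrEmpty(userAgentToExclude) ? "ms search" : userAgentToExclude; }
        set { userAgentToExclude = value; }
    }

    protected override void CreateChildControls()
    {
        string userAgent = this.Context.Request.UserAgent;

        // Case-insensitive match, so a mixed-case UserAgentToExclude also works.
        bool isCrawler = !string.IsNullOrEmpty(userAgent) &&
            userAgent.IndexOf(UserAgentToExclude, StringComparison.OrdinalIgnoreCase) >= 0;

        // Show the content to everyone except the crawler.
        this.Visible = !isCrawler;
        base.CreateChildControls();
    }
}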
Using It
Register the web control in the page layout:
<%@ Register Tagprefix="SearchUtil" Namespace="ABC.SharePoint.WebControls" Assembly="ABC.SharePoint, Version=1.0.0.0, Culture=neutral, PublicKeyToken=xxxxxx" %>
After adding the register tag to the page layout, we can wrap all the content we want to exclude with our control:
<SearchUtil:SearchCrawlExclusionControl ID="SearchCrawlExclusionControl1" runat="server">
<div>Some Content To Exclude</div>
</SearchUtil:SearchCrawlExclusionControl>
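Test this:
After wrapping all the div and other HTML content you want to exclude in this control,
run an incremental or full crawl of your web site. You can also preview what the crawler sees by changing your browser's user agent, as described in the sections that follow.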
How to edit the User Agent string in Mozilla Firefox
To change the User Agent string, enter about:config in the Firefox
address bar, the location where you normally enter a URL (link). I
recommend noting the original value first, which you can see by
entering just about: in the address bar.
Now right-click in the list of preferences, open the menu entry "New",
and select "String". Enter the preference name
"general.useragent.override", without the quotes. Next, enter the new
User Agent value you want Mozilla Firefox to use.
I added my name and a link to my web site to the original value. You can
also pick one from the list of User Agent strings. Check the new value
by entering about: in the address bar.
How to edit the User Agent string in Google Chrome
Here's how to change the user agent:
1. Open the Developer Tools (Ctrl+Shift+I on Windows/Linux, Command+Option+I on Mac OS X).
2. Click the "settings" icon at the bottom of the window.
3. Check "override user agent" and select one of the options (Internet Explorer
7/8/9, Firefox 4/7 for Windows/Mac, iPhone, iPad and Nexus S running Android 2.3).
You can also select "other" and enter a custom user agent.
Note: To test SharePoint search, select "other" and enter MS Search as the user agent.
How to edit the User Agent string in Internet Explorer
To change it, open the registry and find the key
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\User Agent\Post Platform].
Each value name listed in this key is sent to the remote web server
as an additional entry in the user agent string. To remove any
additional information, delete the values within the [Post Platform] key.
To add an entry, create a string value and give it the name you want
to be sent.
Restart Internet Explorer for the changes to take effect.
Note: To test SharePoint search, add a string value named MS Search.
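As an alternative to changing a browser's user agent, the same check can be scripted. The sketch below requests a page while sending a chosen User-Agent header and then looks for the text that should be hidden. The class CrawlViewCheck and its method names are hypothetical, and the sketch assumes the site accepts the current Windows credentials.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical test helper, for illustration only.
public static class CrawlViewCheck
{
    // Downloads the page at the given URL while sending the given User-Agent header.
    public static string Fetch(string url, string userAgent)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = userAgent;
        request.UseDefaultCredentials = true; // assumption: Windows authentication

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // True when the excluded text does not appear in the HTML served to the crawler.
    public static bool IsHiddenFromCrawler(string crawlerHtml, string excludedText)
    {
        return crawlerHtml.IndexOf(excludedText, StringComparison.OrdinalIgnoreCase) < 0;
    }
}
```

To use it, call Fetch with the address of one of your own pages and "MS Search" as the user agent, then pass the result to IsHiddenFromCrawler together with a phrase from the wrapped content, such as "Some Content To Exclude". Fetching the same page with a normal browser user agent should still return that phrase.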