It's been an interesting and full week. I've been attending the MS-sponsored Publishers/Authors summit conference, and while the details are under NDA, I had a good time seeing other SQL Server authors, met several publishers, and discussed my book proposal. I also saw some of my friends from SQL Dev and Communities at the after-conference party at the Red Hook Brewery, where a great time was had by all! I too am in the Yukon beta, and when it's OK to post information on this, I'll discuss some of the advantages and enhancements relative to Yukon FTS and all things textual. Speaking of Yukon, Eric Brown of Microsoft SQL Marketing is interviewed in a “talking to” article in the October 2003 MSDN Magazine that is now online.
Eric Brown's Oct. 2003 MSDN Magazine interview - http://msdn.microsoft.com/msdnmag/issues/03/10/TalkingTo/default.aspx
With SQL Server 2000 FTS, query results depend on which OS platform it is installed on. If you upgrade just the OS platform from Windows 2000 Server (Win2K) to Windows Server 2003 (Win2003), you can see a significant difference in results (and for most users, a *better*, more expected difference). The OS platform supplies the “word breaker” DLL: infosoft.dll on Win2K and langwrbk.dll on Win2003. The latter is a new Microsoft-developed wordbreaker that is also included with Windows XP Pro (and used by SQL Server 2000 Developer's Edition on WinXP). Its results are "better", or at least what people would expect as better, although different is not always better. Note that the workaround for this issue on Win2K is to use the Neutral “Language for Word Breaker”, but then you lose the ability to use the language-specific INFLECTIONAL FTS query keyword, as the Neutral wordbreaker “breaks” the words based upon the white space between words.
I've tested the Neutral wordbreaker with SQL Server 2000 on both Win2K and Win2003, and a search string of "T-SQL" is broken into "T" and "SQL". Note the use of double quotes in the search string, as this indicates a phrase, i.e., a multiple-word search string. However, in this case one of the words is a single letter, and single letters are normally "noise words" in all of the noise word files, so the "T" is ignored and a SQL FTS query will return results for "SQL" alone. You can remove "T" (or other single letters) from noise.dat, the Neutral wordbreaker noise word file. On Win2K, you will also need to remove "T" (or other single letters) from the noise.* files under your \WINNT\System32 directory, including noise.enu (US_English) and noise.eng (UK_English), as well as from the noise word files under your SQL Server default folder of \FTDATA\SQLServer\Config. You must stop the “Microsoft Search” service before you can save your changes to the FTDATA noise word files. When your modifications are completed, restart the service, run a Full Population, and then re-test your SQL FTS query.
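To make the "T-SQL" behaviour concrete, here is a rough Python sketch of what I believe is going on. It is only a model, not the actual wordbreaker: `break_words`, `effective_terms` and `remove_single_letters` are names I made up for illustration, the split-on-punctuation rule is an assumption that simply matches the "T" / "SQL" result described above, and I'm assuming the noise word files are plain text with whitespace-separated words.

```python
import re


def break_words(text):
    """Split a search string into words.

    Assumption: any run of non-alphanumeric characters separates words,
    which reproduces the observed "T-SQL" -> "T", "SQL" split.
    """
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def effective_terms(search_string, noise_words):
    """Return the words that are left after noise words are dropped."""
    noise = {w.lower() for w in noise_words}
    return [t for t in break_words(search_string) if t.lower() not in noise]


def remove_single_letters(noise_text, letters=None):
    """Return noise-word file text with single-letter entries removed.

    letters=None removes every single-letter word; otherwise only the
    listed letters are removed. Lines left empty are dropped.
    """
    targets = None if letters is None else {c.lower() for c in letters}
    kept_lines = []
    for line in noise_text.splitlines():
        kept = []
        for word in line.split():
            is_single_letter = len(word) == 1 and word.isalpha()
            if is_single_letter and (targets is None or word.lower() in targets):
                continue  # skip this noise word
            kept.append(word)
        if kept:
            kept_lines.append(" ".join(kept))
    return "\n".join(kept_lines)


# Hypothetical noise file contents, for illustration only.
sample_noise = "about\nafter\na b c t\nthe\n1 2"

before = effective_terms("T-SQL", sample_noise.split())
cleaned = remove_single_letters(sample_noise, letters="t")
after = effective_terms("T-SQL", cleaned.split())
```

With the sample list, `before` contains only "SQL" and `after` contains both "T" and "SQL". If you adapt `remove_single_letters` to edit the real noise word files, work on copies first, and remember that the index only reflects the change after a Full Population.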