#.think.in
learn.create.enjoy

#.think.in infoDose #7 (10th Nov - 15th Nov)

November 17, 2008 09:23 by brodie

Management/Process

Architecture

ASP.NET

jQuery

WCF

MEF - Managed Extensibility Framework

Silverlight

UI Design

Testing/Debugging

Singularity Watch

Other


Tags:
Categories:
Comments (0)

SQL Server 2005 Full Text Indexing

November 14, 2008 09:31 by tarn

Until now I've always had problems getting SQL Server 2005 Full Text Indexing (FTI) to perform well in real-world scenarios. Recently I found it was because I wasn't implementing it correctly, so I'm posting this tip, which will hopefully help me and others get it right next time.

The query is a user-entered keyword search on text columns using the FTI predicates, combined with normal T-SQL predicates on other relational columns. The query runs over a moderately large number of rows and returns paged results.

FTI is basically the ability to search text columns where the text has been indexed with natural language, relevance and ranking considerations built in. XML and other data types can also be indexed natively, but I haven't used that. FTI in SQL Server 2005 uses the Microsoft Search Service. This service is not actually part of SQL Server itself; it is also used by Exchange and SharePoint. I think you can use the service directly with a SharePoint SDK.

To use Full Text Indexing in SQL Server you must enable full text indexing for the database, and then for the table and columns containing the searchable text. It can all be set up from SQL Management Studio, but note that it creates a new physical folder in addition to the normal database MDF and LDF files. Once this has been done, the additional FTI predicates can be used on the selected columns.
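The same setup can also be scripted rather than clicked through. The following is only a minimal sketch for SQL Server 2005: the catalog name DocumentCatalog and the key index name PK_Document are made up for illustration, and it assumes dbo.Document has a unique, single-column, non-nullable index on DocumentId.

```sql
-- Enable full-text indexing for the current database (SQL Server 2005).
EXEC sp_fulltext_database 'enable';

-- Create a catalog to hold the index. This is what produces the extra
-- folder on disk. "DocumentCatalog" is a hypothetical name.
CREATE FULLTEXT CATALOG DocumentCatalog AS DEFAULT;

-- Index the searchable text column. KEY INDEX must name a unique,
-- single-column, non-nullable index on the table; "PK_Document" is
-- an assumed name for the primary key index on DocumentId.
CREATE FULLTEXT INDEX ON dbo.Document (DocumentText)
    KEY INDEX PK_Document
    ON DocumentCatalog
    WITH CHANGE_TRACKING AUTO;
```

With CHANGE_TRACKING AUTO the index is kept up to date as rows change, so no manual population step is needed after the initial build.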

As SQL Server Express doesn't support FTI, I think it is worth considering optionally supporting normal T-SQL text matching as a fallback, alongside querying with the FTI functions. It's nice to be able to develop or even deploy with SQL Server Express, even though the text searching is then not indexed by the Search Service. But this really depends on circumstances; I just wanted to note that supporting both is an option.
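One way to support both is sketched below. It is an illustration only, not what we actually shipped: it checks whether dbo.Document has an active full-text index and otherwise falls back to a plain LIKE scan. Both branches use dynamic SQL so that the branch not taken is never compiled. The variable names are assumptions, and the LIKE branch will not return identical results to CONTAINS (no word breaking, no stemming).

```sql
DECLARE @keyword nvarchar(100);
SET @keyword = N'FTI';   -- user-entered keyword (example value)

IF OBJECTPROPERTY(OBJECT_ID(N'dbo.Document'), 'TableHasActiveFulltextIndex') = 1
BEGIN
    -- Build a prefix term such as "FTI*", dropping any double quotes
    -- the user typed so the search condition stays well formed.
    DECLARE @search nvarchar(110);
    SET @search = N'"' + REPLACE(@keyword, N'"', N'') + N'*"';

    EXEC sp_executesql
        N'SELECT DocumentId FROM dbo.Document WHERE CONTAINS(DocumentText, @search)',
        N'@search nvarchar(110)',
        @search = @search;
END
ELSE
BEGIN
    -- Fallback for editions without FTI: an unindexed substring match.
    -- LIKE wildcards (%, _, [) inside @keyword are not escaped here.
    EXEC sp_executesql
        N'SELECT DocumentId FROM dbo.Document WHERE DocumentText LIKE N''%'' + @keyword + N''%''',
        N'@keyword nvarchar(100)',
        @keyword = @keyword;
END
```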

I found a query similar to this example was taking ages over a large amount of data. It was unacceptable: we had to render all pages in less than 4 seconds under moderate load, and we expected to get under two seconds. We were getting search times alone in double figures for some keywords.

SELECT *
FROM
(
      -- Explicit column list: SELECT * would return DocumentId more than once,
      -- which is not allowed in a derived table
      SELECT [Document].DocumentId, [DocumentInformation].PublishDate,
             ROW_NUMBER() OVER (ORDER BY [DocumentInformation].PublishDate ASC) AS RowNumber
      FROM Document
      JOIN DocumentInformation ON DocumentInformation.DocumentId = Document.DocumentId
      JOIN UserAccount Owner ON Document.OwnerId = Owner.UserId
      JOIN UserAccount Publisher ON Document.PublisherId = Publisher.UserId
      WHERE [Document].Published = 1 AND
            [Owner].IsActive = 1 AND
            [Publisher].IsActive = 1 AND
            -- Prefix search: matches words starting with FTI
            CONTAINS([Document].DocumentText, '"FTI*"')
) x
WHERE RowNumber BETWEEN 1 AND 10
ORDER BY RowNumber

 

The version below adds an inner query on the Document table alone, using just the CONTAINS predicate on the document text. The additional tables are then joined to its results and the normal T-SQL predicates are applied.

SELECT *
FROM
(
    SELECT p.DocumentId, [DocumentInformation].PublishDate,
           ROW_NUMBER() OVER (ORDER BY [DocumentInformation].PublishDate ASC) AS RowNumber
    FROM
        -- Run the full-text predicate on its own against the Document table first
        (SELECT DocumentId, OwnerId, PublisherId, Published
        FROM dbo.Document
        WHERE CONTAINS(DocumentText, '"FTI*"')) p
        -- Then join the other tables to the full-text matches and filter relationally
        JOIN DocumentInformation ON [DocumentInformation].DocumentId = p.DocumentId
        JOIN UserAccount Owner ON p.OwnerId = Owner.UserId
        JOIN UserAccount Publisher ON p.PublisherId = Publisher.UserId
        WHERE p.Published = 1 AND
              [Owner].IsActive = 1 AND
              [Publisher].IsActive = 1
) x
WHERE RowNumber BETWEEN 1 AND 10
ORDER BY RowNumber

 

I gave this a go after reading an FTI best practice article indicating that, because the search service runs in a separate process, the relational predicates can't be used to exclude rows from the full-text search. The result of this subtle change was amazing: the searches were noticeably quicker and returned the same result set. Under load testing we found this change alone improved the average execution time of all pages on the website by a massive 30%.
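In practice the keyword and the page come from the user rather than being hard coded. A rough sketch of how the restructured query might be wrapped for that follows. The procedure name and parameter names are hypothetical, the column list is limited to columns that appear in the queries above, and it assumes @keyword is a non-empty word or phrase.

```sql
CREATE PROCEDURE dbo.SearchDocuments   -- hypothetical name
    @keyword  nvarchar(100),
    @page     int,                     -- 1-based page number
    @pageSize int
AS
BEGIN
    SET NOCOUNT ON;

    -- Turn the keyword into a prefix term such as "FTI*",
    -- dropping embedded double quotes.
    DECLARE @search nvarchar(110);
    SET @search = N'"' + REPLACE(@keyword, N'"', N'') + N'*"';

    -- Row range for the requested page, e.g. page 2 of 10 rows is 11 to 20.
    DECLARE @firstRow int, @lastRow int;
    SET @firstRow = (@page - 1) * @pageSize + 1;
    SET @lastRow  = @page * @pageSize;

    SELECT x.DocumentId, x.PublishDate
    FROM
    (
        SELECT p.DocumentId, di.PublishDate,
               ROW_NUMBER() OVER (ORDER BY di.PublishDate ASC) AS RowNumber
        FROM
            -- Full-text predicate on its own, as in the second query above
            (SELECT DocumentId, OwnerId, PublisherId, Published
             FROM dbo.Document
             WHERE CONTAINS(DocumentText, @search)) p
        JOIN DocumentInformation di ON di.DocumentId = p.DocumentId
        JOIN UserAccount Owner ON p.OwnerId = Owner.UserId
        JOIN UserAccount Publisher ON p.PublisherId = Publisher.UserId
        WHERE p.Published = 1 AND
              Owner.IsActive = 1 AND
              Publisher.IsActive = 1
    ) x
    WHERE x.RowNumber BETWEEN @firstRow AND @lastRow
    ORDER BY x.RowNumber;
END
```

Calling it with something like EXEC dbo.SearchDocuments @keyword = N'FTI', @page = 1, @pageSize = 10 would return the same first page as the hard-coded example.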

In addition to this, the FTI best practices document recommends a number of further optimizations to improve full text performance.

We ended up implementing all of those recommendations to try to get the most out of the FTI service. After all this we got search times down to about a second under light load. This was cool, but we had grander plans for an in-memory search using expression trees for search filters and LINQ for querying the in-memory objects. I hope to chronicle this in future posts, as initial performance testing shows much quicker search times and a significant increase in load capacity from removing searching from SQL Server, which was previously our performance bottleneck.

In SQL Server 2008 they have apparently integrated FTI into the SQL Server process, which should remove the separate search process that appears to have been the cause of my original performance problem. But so far I've only heard bad things; hopefully these issues will be cleaned up before I start using it.

2005 Full-Text Search Architecture

2008 Full-Text Search Architecture



Liberation Day - Power To Developers

November 13, 2008 22:40 by tarn

I'm not in any way formally associated with Microsoft, other than having primarily used their tools to develop software over the last few years, so I was a little excited to be invited to Liberation Day in Sydney after placing 2nd in DevSta. I thought it would be great to see Steve Ballmer, infamous for this famous clip and this. He is, of course, also the Microsoft CEO and was going to announce Azure, the new Microsoft cloud computing platform. He was sure to be entertaining anyway.

As I'd never been to Sydney, I decided I could spend a couple of days on Bondi Beach with my girlfriend and drop into the conference while I was there. I'm expecting to be looking for a new job from next year, probably not in Sydney, but I thought it would also be good to see who was there and meet a few people anyway. I was expecting there would be a couple of DevSta judges there too.

There were heaps of people at the event; I wouldn't be surprised if there actually were the 1,000 developers the flyer claimed there would be. It was cool, I can't imagine an event with that many developers in Melbourne. It was streamed live and can be replayed on the Power To Developers site. I watched Steve get up and do his thing, and it was fun, not rock'n'roll, but fun. Unfortunately I wasn't feeling very well, almost feverish, and I had to duck out to find a chemist.

Gianpaolo Carraro's presentation attempted to demonstrate how easy developing cloud solutions for Azure was with Visual Studio 2008. I thought it was pretty cool despite most of his demos failing in ways he couldn't have imagined. The Azure platform sounded pretty awesome, basically allowing custom .NET assemblies to be uploaded and invoked on Microsoft servers. He also managed to successfully demonstrate the cloud emulator for locally running and debugging cloud applications. I also really like the idea of being able to write a LINQ query joining two data tables from a SQL Server Service in the cloud.

Mike Culver presented the Amazon Web Services cloud solutions to the Vic.NET user group about a year ago and I think he sold their technology and the possibilities better. He showed the architecture of a video compression service that used a message queue service to queue requests and a controller application that can automatically rent, build and deploy servers capable of processing the requests. I thought it was pretty cool. I think Microsoft are a long way behind, but we'll see what the software giant is capable of. I'm certainly keen to get in and give it a go.

I still wasn't feeling well and missed most of Tim Sneath, which was a shame; Brodie had seen him in London years ago and said he "knew his stuff". At the networking drinks I chatted with Michael Kordahi, which was kind of fun; he'd judged my entry and he is a Silverlight guy. He introduced me to a couple of people, one of whom was Jeremi Kelaher, who does Strange Devices Podcasts. On the way out I also ran into Tatham Oddie, who I'd seen present MVC stuff at REMIX and Vic.NET. I would have liked to have chatted with Andrew Coates, who is a great presenter and was also a DevSta judge, but I didn't end up seeing him this time.



#.think.in infoDose #6 (3rd Nov - 8th Nov)

November 10, 2008 09:32 by brodie

Notable Events

  • Obama wins the American election - w00t!
  • Viewed wins Melbourne Cup
  • Steve Ballmer rocks into Sydney to address developers

News Flash - End of LINQ to SQL!?

This week there was a lot of speculation around the future of LINQ to SQL, here are just a few of the links in the blogosphere...

Management/Process

Architecture

ASP.NET

ASP.NET MVC

Silverlight

WCF

Singularity Watch

Other

Books


Silverlight 2 Unleashed

Quotes

"Figure out the absolute least you need to do to implement the idea, do just that, and then polish the hell out of the experience" —John Gruber



#.think.in infoDose #5 (27th Oct - 31st Oct)

November 3, 2008 09:45 by brodie

Notable Events

Obviously the biggest event of the week was the Microsoft Professional Developers Conference.  Every developer blogger around the globe will be bursting at the seams with PDC goodness, but I guess the best place to go is the site itself and download yourself some of the sessions - there's something for everyone.

...and it looks like the marketing guys gave us a new .NET logo

PDC Announcements

As you'd imagine there were announcements aplenty, here is just a slice of what was announced ....

Architecture

UI

Silverlight

jQuery

Tools

WCF

Been doing some work with WCF this week, and here are some of the better links/screencasts I've been using ...

patterns & practices Improving Web Services Security Guide - Home

Books

Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries (2nd Edition) (Microsoft .NET Development Series): Krzysztof Cwalina, Brad Abrams

Coding4Fun 10 .NET Programming Projects for Wiimote, YouTube, World of Warcraft, and More

Other

Quotes

Purpose of Incorporation : "To establish a place of work where engineers can feel the joy of technological innovation, be aware of their mission to society, and work to their heart's content" - Masaru Ibuka, Sony



#.think.in infoDose #4 (20th Oct - 25th Oct)

October 27, 2008 09:22 by brodie

Notable Events

Silverlight

Architecture

ASP.NET AJAX

ASP.NET MVC

jQuery

Testing

Usability

Developer Skills

Singularity Watch

Other

Tips