Headnix — Clip It!

Thursday, May 29, 2008

MySQL full text search and test data

I like using lipsum as test data. When I need test strings, I usually generate them from lipsum text that I keep in a text file.

There’s one thing to watch for when you generate a large amount of test data from a small lipsum source.

I need to test MySQL's full-text search using my test data. However, my lipsum source is small and the generated table is large (~10,000 rows). myisam_ftdump shows the following information:

Total rows: 9939
Total words: 1299056
Unique words: 161
Longest word: 12 chars (consectetuer)
Median length: 6
Average global weight: -1.871320
Most common word: 9906 times, weight: -5.704388 (mauris)

Hm… there are only 161 unique words. Remember that MySQL's natural language full-text search ignores a search word if it appears in 50% or more of the rows. With so few unique words spread across so many rows, it is almost certain that any given word appears in more than 50% of the rows, so a natural language search for it returns nothing.
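To see why a small vocabulary runs into the 50% rule, here is a rough Python sketch. It is not how my table was built: the word list, row count, and row length are made-up stand-ins, and the words are drawn uniformly at random. It generates rows from a tiny vocabulary and measures the fraction of rows each word appears in.

```python
import random


def generate_rows(vocabulary, n_rows, words_per_row, seed=0):
    """Build test rows by sampling words uniformly from a small vocabulary."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        " ".join(rng.choices(vocabulary, k=words_per_row))
        for _ in range(n_rows)
    ]


def document_frequencies(rows):
    """Return {word: fraction of rows that contain the word at least once}."""
    counts = {}
    for row in rows:
        # set() so a word repeated within one row is counted once for that row
        for word in set(row.split()):
            counts[word] = counts.get(word, 0) + 1
    return {word: n / len(rows) for word, n in counts.items()}


# Hypothetical stand-in for a small lipsum source (not my real word list)
VOCABULARY = [
    "lorem", "ipsum", "dolor", "amet", "consectetuer",
    "adipiscing", "elit", "mauris", "nulla", "vestibulum",
]

rows = generate_rows(VOCABULARY, n_rows=500, words_per_row=60)
frequencies = document_frequencies(rows)

# Words at or above the 50% mark are the ones a natural language search ignores
too_common = sorted(w for w, f in frequencies.items() if f >= 0.5)
print(f"{len(too_common)} of {len(frequencies)} words appear in at least 50% of rows")
```

The fewer unique words there are relative to the length of each row, the closer every word gets to appearing in every row, which is the situation the myisam_ftdump output above describes.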

To get around this, I need to search IN BOOLEAN MODE, which does not apply the 50% threshold.
