techutopia
Total posts: 202
14 Aug 2014 16:30

Hi,

I'm getting strange results with record searching in a section ...

Search Mode : Detect Automatically

Search Title : Yes

Record Title Mode : Standard

The record types have been re-indexed in DB Tool.

I am searching by text only; no other filters are applied.

I have a record, and its title is "The Three Mariners"

Here are the searches I have performed and whether the record is shown:

the - YES

the three - NO

the mariners - NO

three mariners - NO

mariners - YES

mariners three - NO

mariners the - NO

three - YES

three the - NO

the three mariners - NO!

TBH I was expecting the record to show for all the searches above.

Any advice appreciated.

Thanks,

Dale.

Last Modified: 08 Sep 2014


pepperstreet VIP
Total posts: 3,837
14 Aug 2014 18:58

It seems to behave like the search in this forum. I always wondered why it is hard to find my OWN topics and what the best way to enter search terms is. Personally, I am used to entering multiple words (up to 3). But in this forum, that leads to strange or inaccurate results.

My first thought was to group multiple words with "", e.g. "my multiple search terms". I think this pattern is not used or recognized by Cobalt search. Apparently Cobalt treats it as a single word, including the "" as additional characters?!

At least in this forum, I seem to get better results by using + and - in front of your words. e.g. +multiple +search +terms

Did you try +/- in your setting?

Otherwise...,
you might set the search mode to %LIKE% and repeat your tests. That may find all 3 words as a matching "phrase".


Related link

Documentation - Fulltext search


pepperstreet VIP
Total posts: 3,837
14 Aug 2014 22:46

@ Sergey

BTW, here is a "strange" result in this forum, even with a single search term. Should it really work this way, or do I search and think differently ;) ?

[Screenshot: mj_cob_search_results_relevance_titles]


techutopia
Total posts: 202
14 Aug 2014 23:14

Hi,

Thanks pepperstreet for your advice ...

I set the search to %LIKE% and got these results ...

the - YES

the three - YES

the mariners - NO

three mariners - YES

mariners - YES

mariners three - NO

mariners the - NO

three - YES

three the - NO

the three mariners - YES

It's a little bit more true to what I would expect, but still not good.

On a general public access website it's not reasonable to expect users to add + or - in search terms ... users are simply not used to thinking that way generally. Do they do this in Google? Nope.

Warm regards,

Dale.


pepperstreet VIP
Total posts: 3,837
15 Aug 2014 01:02

techutopia On a general public access website it's not reasonable to expect users to add + or - in search terms ... users are simply not used to thinking that way generally. Do they do this in Google? Nope.

Yep, agree with you.

I have spent a lot of time with Cobalt and its predecessors but, funnily enough, I did not use or test the search field in depth. Otherwise I would have made a lot of similar posts ;) I always liked the "Filters" and "Count". I personally hate entering something in an input... it is like "guessing"... not searching (at least without an AJAX suggest feature).


Sergey
Total posts: 13,748
15 Aug 2014 02:17

What can I say. MySQL fulltext search is applied here, so the whole process of finding matches happens there. Whatever it returns, Cobalt shows.


techutopia
Total posts: 202
15 Aug 2014 11:08

Hi all,

Here are the results with adding a "+" ...

+the - NO

+the +three - NO

+the +mariners - YES

+three +mariners - YES

+mariners - NO

+mariners +three - YES

+mariners +the - YES

+three - NO

+three +the - NO

+the +three +mariners - YES

It may be coincidence, but the shorter queries seem to be worse? (But maybe I'm just reading something into it.)

I thought a solution may be to 'force' the +, but then I thought "What happens if a user types in a word that isn't there?"

Here's a couple of tests ...

+the +three +mariners +bob - YES

+the +three +mariners +pepper - NO

Note that neither 'bob' nor 'pepper' is in the record, but one works and the other doesn't?
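The idea of forcing the "+" could be sketched roughly as below. This is only an illustration, not Cobalt code: it assumes the search string ends up in a MySQL boolean-mode fulltext query, and `force_required` is a hypothetical helper name. It strips any boolean operators the user typed and marks every remaining word as required.

```python
# Characters that act as operators in MySQL boolean-mode fulltext queries.
BOOLEAN_OPERATORS = '+-<>()~*"@'


def force_required(user_input: str) -> str:
    """Prefix every word with '+' so that each one is required (hypothetical helper)."""
    terms = []
    for raw in user_input.split():
        # Drop operators the user typed at the start or end of a word.
        word = raw.strip(BOOLEAN_OPERATORS)
        if word:
            terms.append("+" + word)
    return " ".join(terms)


print(force_required("the three mariners"))  # +the +three +mariners
```

With every term required, a single word that matches nothing excludes the record, which is what the 'pepper' test above shows.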

This is way over my head. :-)

Thanks,

Dale.


pepperstreet VIP
Total posts: 3,837
16 Aug 2014 02:47

Your last comment has really strange results ;)

techutopia +three - NO

Confirmed :( Any single search term with "+" shows no results!?

techutopia +the

Not 100% sure... but I guess short and unimportant words are not added to the index at all, e.g. and, or, the, not etc. At least they are not considered "relevant".


techutopia
Total posts: 202
20 Aug 2014 01:06

Hi all,

Would anyone be able to suggest a 'hack' to make searching better?

I really do not like hacking cobalt at all, and have not done this yet, but 'search' working with keywords better (or rather, in a way the user expects) is a 'must have' for me.

Thank you.

Dale.


Sergey
Total posts: 13,748
26 Aug 2014 23:54

Maybe you could just set LIKE search in the configuration. That way it will not use the fulltext index and will search with LIKE '%something%'.
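A rough model of what LIKE '%something%' does with a multi-word search is sketched below. It assumes the whole search string is used as one pattern and that the column collation is case-insensitive; the function names are made up for illustration, and LIKE wildcards inside the search text are ignored. Under those assumptions the %LIKE% results reported earlier in this thread are what one would expect: the words must appear next to each other and in the same order as in the title.

```python
def like_contains(title: str, pattern: str) -> bool:
    """Rough model of: title LIKE '%pattern%' with a case-insensitive collation."""
    return pattern.lower() in title.lower()


def all_words_match(title: str, query: str) -> bool:
    """Hypothetical alternative: one LIKE per word, all required, in any order."""
    return all(like_contains(title, word) for word in query.split())


title = "The Three Mariners"
for query in ["the three", "the mariners", "three mariners", "mariners three"]:
    phrase = like_contains(title, query)
    per_word = all_words_match(title, query)
    print(f"{query!r}: phrase={phrase}, per-word={per_word}")
```

The per-word variant matches all four queries, which is closer to what a visitor would expect, but it is not something the thread confirms Cobalt can do.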


techutopia
Total posts: 202
26 Aug 2014 23:58

Thanks Sergey.

I did do this already (please see comment above - reference 14 Aug 2014 23:14) - the results that I got are still not what a user would expect I think.

Ta.

Dale.


Jeff VIP
Total posts: 745
27 Aug 2014 06:32

@all

I think the whole Cobalt text search needs to be refactored (sorry, Sergey). It works fine in titles only, but is unusable in full text searches (there's no real logic in the results).

Search Alternatives

  1. Google search - accurate and fast
  2. Joomla's smart search - also accurate and fast + auto predict list BUT although all words are indexed, words selected from the list are sometimes false positives
  3. Ajax Live Search - accurate and fast + image support + fun to use

The main reason to use Cobalt text search, is that it produces a filtered list. Combined with other filters, it is indeed a powerful search/filter tool.

If only Cobalt could use Joomla's smart search to find records for text searches, but I'm afraid that's science fiction ;-)


techutopia
Total posts: 202
27 Aug 2014 13:10

Hi all,

I eventually got rid of the search box on the main menu bar - as it didn't work as expected.

I then installed the filter module and turned on the keyword search ... and got much better results!

the - YES

the three - NO

the mariners - YES

three mariners - YES

mariners - YES

mariners three - YES

mariners the - YES

three - YES

three the - NO

the three mariners - YES

So, I can only assume that this search works differently.

Hope that helps someone,

Warm regards,

Dale.


pepperstreet VIP
Total posts: 3,837
27 Aug 2014 15:54

techutopia I then installed the filter module and turned on the keyword search ... and got much better results!

Interesting!?! I thought it had exactly the same features and functionality? @Sergey Any ideas?


techutopia
Total posts: 202
05 Sep 2014 15:01

Hi,

Could this be something to do with a MySQL setting? I noticed that shorter words are more of a problem, and found this ... do you think I'm onto something?

"The minimum and maximum lengths of words to be indexed are defined by the ft_min_word_len and ft_max_word_len system variables. (See Section 5.1.4, “Server System Variables”.) The default minimum value is four characters; the default maximum is version dependent. If you change either value, you must rebuild your FULLTEXT indexes. For example, if you want three-character words to be searchable, you can set the ft_min_word_len variable by putting the following lines in an option file:

[mysqld]
ft_min_word_len = 3

http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html
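To see what the minimum word length means for the searches in this thread, here is a small sketch. It is a simplified model, not MySQL's actual parser: it only applies the length rule from the quoted documentation and ignores the stopword list and any other relevance rules. `searchable_terms` is a hypothetical helper name.

```python
import re


def searchable_terms(query: str, min_word_len: int = 4) -> list[str]:
    """Keep only words that are at least min_word_len characters long."""
    words = re.findall(r"[A-Za-z0-9_']+", query.lower())
    return [w for w in words if len(w) >= min_word_len]


# With the default minimum of four characters, 'the' and 'bob' drop out:
print(searchable_terms("+the +three +mariners +bob"))     # ['three', 'mariners']
print(searchable_terms("+the +three +mariners +pepper"))  # ['three', 'mariners', 'pepper']

# With ft_min_word_len = 3, three-character words are kept by this model:
print(searchable_terms("the three mariners", min_word_len=3))  # ['the', 'three', 'mariners']
```

If short words are ignored in this way, it might explain why adding '+bob' did not hide the record while '+pepper' did: 'pepper' is long enough to count as a required word, and it is not in the record.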


Sergey
Total posts: 13,748
08 Sep 2014 00:48

techutopia do you think I'm onto something?

Absolutely. This is what the fulltext index is. That is why the parameters start with ft_.
