Shoshana
Posts: 5
Joined: Tue Jun 06, 2017 1:31 pm

SIML and Hebrew with *

Tue Jun 06, 2017 4:00 pm

I am trying to use *.siml files.
Hebrew does not seem to be fully supported:
when I use <Item> with * (as a wildcard), it does not work.
Can anyone help?

Fantom
Help & Support
Posts: 304
Joined: Fri Oct 25, 2013 9:20 pm

Re: SIML and Hebrew with *

Tue Jun 06, 2017 4:09 pm

Please paste some test SIML code so we can advise you further. Also, where are you using <Item>: within a Pattern or in some other element?

Shoshana
Posts: 5
Joined: Tue Jun 06, 2017 1:31 pm

Re: SIML and Hebrew with *

Tue Jun 06, 2017 4:32 pm

Hi,
Thanks for the reply.
This is my code:

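<Model>
  <!-- Pattern text: "How many users are in the system *" -->
  <Pattern>
    <Item>כמה משתמשים במערכת *</Item>
  </Pattern>
  <!-- Response text: "The system currently has 200 active users" -->
  <Response>
    במערכת יש כעת 200 משתמשים פעילים
  </Response>
</Model>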

Fantom
Help & Support
Posts: 304
Joined: Fri Oct 25, 2013 9:20 pm

Re: SIML and Hebrew with *

Tue Jun 06, 2017 5:27 pm

Does the sample code work without *? I mean if you just make the pattern "כמה משתמשים במערכת" does the model work?

Shoshana
Posts: 5
Joined: Tue Jun 06, 2017 1:31 pm

Re: SIML and Hebrew with *

Wed Jun 07, 2017 9:27 am

Yes, of course, that works fine!

Shoshana
Posts: 5
Joined: Tue Jun 06, 2017 1:31 pm

Re: SIML and Hebrew with *

Wed Jun 07, 2017 10:19 am

Hi, I found the { or } tag, which does the same thing, and it works, but only for two sides.
The * still doesn't work.

Fantom
Help & Support
Posts: 304
Joined: Fri Oct 25, 2013 9:20 pm

Re: SIML and Hebrew with *

Wed Jun 07, 2017 1:27 pm

This looks like an interesting find. I will have the team look into this and get back to me with a resolution, hopefully within the next 24 hours if they are able to successfully reproduce this behavior.

Shoshana
Posts: 5
Joined: Tue Jun 06, 2017 1:31 pm

Re: SIML and Hebrew with *

Wed Jun 07, 2017 2:03 pm

OK, I'll wait.
Thanks in advance!

Leslie
Lead Software Architect
Posts: 353
Joined: Fri Sep 14, 2012 12:20 pm
Contact: Website

Re: SIML and Hebrew with *

Fri Jun 09, 2017 2:13 am

Just my 2 cents on what might be going on internally with your Pattern and the Tokenizer.

By default, the entire architecture of SIML has been designed and tested for left-to-right (LTR) languages, and the default SIML Tokenizer is tailored to them. Hebrew is a right-to-left (RTL) language.

Let me take your Pattern as an example.

כמה משתמשים במערכת *

In the above, the fragment כמה משתמשים במערכת is RTL, whereas the symbol * is LTR. Despite being LTR, the symbol is expected to capture an RTL text fragment as per your Pattern. This is where the current Tokenizer runs into a problem.
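To see the directionality of each character in the Pattern, here is a small Python sketch (not SIML code, and not how the SIML Tokenizer itself works) that uses the standard `unicodedata` module. The helper name `bidi_classes` is made up for illustration. One detail worth noting: Unicode classifies `*` as a direction-neutral character (class `ON`) rather than strongly LTR, so where it is drawn on screen depends on the surrounding text and the paragraph direction.

```python
import unicodedata


def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each non-space character."""
    return [unicodedata.bidirectional(ch) for ch in text if not ch.isspace()]


# The Pattern from this thread, stored in logical (typing) order.
pattern = "כמה משתמשים במערכת *"

classes = bidi_classes(pattern)

# Hebrew letters report 'R' (strong right-to-left); '*' reports 'ON' (other neutral).
print(sorted(set(classes)))  # ['ON', 'R']
print(unicodedata.bidirectional("*"))  # ON
```
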

After the tokenization is completed the list of tokens would be arranged in the following order (ascending index):
  • 0. כמה
  • 1. משתמשים
  • 2. במערכת
  • 3. * (visually, this symbol should have been the first in the list)
Note: The above order is actually correct given the fact that the words are tokenized in their logical order and not their display order.

As a result, the decision tree nodes generated from the tokenized pattern expect the wildcard entry at the end of your Hebrew sentence instead of the beginning. This is why a combination of RTL + LTR tokens may generate shuffled decision tree nodes and give unexpected results.
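The effect of logical-order tokenization can be reproduced outside SIML with a rough Python sketch. Everything here is an assumption for illustration: `tokenize` is a plain whitespace split and `matches` is a toy matcher in which `*` captures one or more tokens; neither is the actual SIML Tokenizer or decision tree. The extra word היום ("today") is only a sample input.

```python
def tokenize(text: str) -> list[str]:
    """Split on whitespace; tokens come out in logical (storage) order."""
    return text.split()


def matches(pattern: list[str], tokens: list[str]) -> bool:
    """Toy token-wise match where '*' captures one or more tokens."""
    if not pattern:
        return not tokens
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # Try every non-empty run of tokens as the wildcard capture.
        return any(matches(rest, tokens[i:]) for i in range(1, len(tokens) + 1))
    return bool(tokens) and head == tokens[0] and matches(rest, tokens[1:])


# Wildcard typed after the Hebrew words (the Pattern from this thread).
trailing = tokenize("כמה משתמשים במערכת *")
# Wildcard typed before the Hebrew words.
leading = tokenize("* כמה משתמשים במערכת")

print(trailing.index("*"))  # 3
print(leading.index("*"))   # 0

# Sample inputs: the extra word typed after, or before, the fixed phrase.
typed_after = tokenize("כמה משתמשים במערכת היום")
typed_before = tokenize("היום כמה משתמשים במערכת")

print(matches(trailing, typed_after))   # True
print(matches(trailing, typed_before))  # False
print(matches(leading, typed_before))   # True
```

In this sketch the wildcard only captures text that sits at the same logical position as the `*` in the Pattern, regardless of which side of the sentence it appears on when displayed.
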
