Trying to match up words

shabbysquire

Client
Joined
25.11.2012
Messages
544
Thanks
26
Points
28
Facing an issue with matching words to product titles.

I'm scraping a site for product titles, e.g. 'Bamboo 8 piece kitchen knife set'. I have a list of 100 banned items that I need to compare against each product title.

So if the banned item is 'kitchen knife', how do I compare it? Banned items are one to three words long, e.g. knife, air rifle, etc.

I'm able to break the title up into separate words and then compare them to the banned list, but that way I can only match single-word banned items. This clearly won't work for multi-word items.

Any other suggestions?

Thanks.
 

HuangXiangCai(黄祥财)

Administrator
Forum staff
Joined
03.10.2012
Messages
21
Thanks
14
Points
3
Do you want to filter titles against the "100 banned items", or measure how closely each product title relates to them?
I think you can treat "kitchen knife" as a single keyword and check for that keyword in the list of product titles.
Secondly, you could build a temporary list and use its RowCount to check the length (number of words).
I ran into a similar problem before and handled it this way.
Hope this helps.
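
The keyword idea above could be sketched roughly as follows. This is only an illustration in Python, not the tool used in this thread; the function name `find_banned` and the sample lists are assumptions made for the example. Each banned item is kept as a whole phrase and searched for in the title on word boundaries, so 'air' would not match inside 'hairbrush'.

```python
import re

def find_banned(title, banned_items):
    """Return the banned items that occur in the title as whole words or phrases."""
    hits = []
    for item in banned_items:
        # Escape each word and allow any run of whitespace between the words.
        words = [re.escape(word) for word in item.split()]
        if not words:
            continue  # skip empty entries in the banned list
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        if re.search(pattern, title, flags=re.IGNORECASE):
            hits.append(item)
    return hits

# Hypothetical sample data based on the examples in this thread.
banned = ["kitchen knife", "air rifle", "knife"]
title = "Bamboo 8 piece kitchen knife set"
print(find_banned(title, banned))
```

With 100 banned items this is 100 small searches per title, which should be acceptable for a scraper; the word-boundary check assumes banned items start and end with a letter or digit.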
 

shabbysquire

Not sure if that's possible, because I usually save the title in a file, with each word on its own line:

Code:
Bamboo
8
piece
kitchen
knife
set
I then take one line at a time and compare it to the banned items. That works when the banned item is a single word, e.g. knife, but I can't match something like 'air rifle'.

It's a tricky one.
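
One possible way around the one-word-per-line limitation, sketched here in Python under stated assumptions: since banned items are at most three words long, the word list can be expanded into every run of one, two, or three consecutive words, and each run compared to the banned list. The names `word_ngrams` and `banned_in_title` are hypothetical helpers invented for this example, not part of any tool mentioned in the thread.

```python
def word_ngrams(words, max_len=3):
    """Yield every run of 1..max_len consecutive words, joined by one space."""
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length <= len(words):
                yield " ".join(words[start:start + length])

def banned_in_title(title_words, banned_items, max_len=3):
    """Return the banned items found among the title's word runs, sorted."""
    # Normalise case and spacing so 'Air  Rifle' and 'air rifle' compare equal.
    banned = {" ".join(item.lower().split()) for item in banned_items}
    lowered = [word.lower() for word in title_words]
    return sorted({gram for gram in word_ngrams(lowered, max_len) if gram in banned})

# The word list is the one shown above; the banned list is sample data.
title_words = ["Bamboo", "8", "piece", "kitchen", "knife", "set"]
print(banned_in_title(title_words, ["knife", "air rifle", "kitchen knife"]))
```

Keeping the banned items in a set means each word run is checked with a single lookup, so the cost depends on the title length rather than on the size of the banned list.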
 

Tobbe

Client
Joined
01.08.2013
Messages
428
Thanks
148
Points
43


:dm:
 

Attachments

  • Thanks
Reactions: shabbysquire

shabbysquire

Perfect solution, Toby, thanks!

I can now match banned items of more than one word against the title.

:bz:
 