Please Help With Amazon Description RegEx

Discussion in 'Regular expressions' started by LBrown, Nov 20, 2011.

  1. LBrown

    LBrown Client

    Joined:
    Nov 20, 2011
    Messages:
    6
    Likes Received:
    0
    This should be really obvious, and I've been able to scrape other elements on the page, but I just can't pick up the description.

    So, for this page:

    http://www.amazon.com/gp/product/B004X6TSOG/

    the whole description is between

    Code (text):
    <div class="productDescriptionWrapper"></div class="EmptyClear">
    so the RegEx should be

    Code (text):
    (?<=\<div class\=\"productDescriptionWrapper\"\>).*?(?=\<div class\=\"emptyClear\"\>)
    But it doesn't work when I test it.

    Can someone show me where I've gone wrong and what I need to do to fix it, please?
     
  2. shifu

    shifu Client

    Joined:
    Apr 4, 2011
    Messages:
    147
    Likes Received:
    18
    Your regexp is wrong. You need this (?<=\<div class\=\"productDescriptionWrapper\"\>).*?(?=\<\/div class\=\"EmptyClear\"\>)
     
  3. LBrown

    LBrown Client

    Joined:
    Nov 20, 2011
    Messages:
    6
    Likes Received:
    0
    Thank you for your answer, but when I try "Test a Regular Expression" with yours, nothing comes up. I've tried with both DOM HTML and Source HTML.
     
  4. archel

    archel Client

    Joined:
    May 2, 2011
    Messages:
    173
    Likes Received:
    22
    You should also put a \ before spaces:
    (?<=\<div\ class\=\"productDescriptionWrapper\"\>).*?(?=\<\/div\ class\=\"EmptyClear\"\>)

    Though shifu's answer should work in test mode.
    Also check whether <div class="productDescriptionWrapper"></div class="EmptyClear"> is correct in the DOM, because on my PC it's <div class=productDescriptionWrapper></div class=EmptyClear>
     
    LBrown likes this.
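
    archel's point that the DOM view may drop the quotes and change the case can be covered by one tolerant pattern. The following is a minimal Python sketch, not the regex engine of the tool discussed in this thread; the sample fragments and the `extract_description` helper are hypothetical, and the real Amazon markup may differ. It uses a capture group instead of a lookbehind because Python's `re` module only accepts fixed-width lookbehinds.

```python
import re

# Hypothetical fragments imitating the two renderings mentioned in the thread.
source_html = ('<div class="productDescriptionWrapper">A sturdy widget.'
               '<div class="emptyClear"></div></div>')
dom_html = ('<DIV class=productDescriptionWrapper>A sturdy widget.'
            '<DIV class=emptyClear></DIV></DIV>')

# "? makes the quotes around attribute values optional.
# IGNORECASE covers DIV/div and emptyClear/EmptyClear.
# DOTALL lets the description span several lines.
pattern = re.compile(
    r'<div\s+class="?productDescriptionWrapper"?\s*>'
    r'(.*?)'
    r'<div\s+class="?emptyClear"?\s*>',
    re.IGNORECASE | re.DOTALL,
)

def extract_description(html):
    """Return the text between the two marker tags, or None if not found."""
    match = pattern.search(html)
    return match.group(1).strip() if match else None

print(extract_description(source_html))
print(extract_description(dom_html))
```

    Both calls return the same text, so the same pattern can be tried against either the Source HTML or the DOM HTML view.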
  5. LBrown

    LBrown Client

    Joined:
    Nov 20, 2011
    Messages:
    6
    Likes Received:
    0
    Okay, so I tried this with Source HTML: (?<=\<div\ class\=\"productDescriptionWrapper\"\>).*?(?=\<\/div\ class\=\"EmptyClear\"\>)

    And this with DOM HTML (?<=\<DIV\ class\=productDescriptionWrapper\>).*(?=\<DIV\ class\=emptyClear\>)

    And still nothing. Is there something wrong with the Regular expression builder? Has anyone else been able to get the Regular Expression builder to give results from these?
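
    One common reason a lookaround pattern like these returns nothing is that `.` does not match a newline by default, so a description spread over several lines never reaches the closing marker. The thread does not confirm that this was the cause here; the sketch below only illustrates the effect in Python, using a hypothetical multi-line fragment rather than the real page.

```python
import re

# Hypothetical multi-line fragment; the real page markup may differ.
html = ('<div class="productDescriptionWrapper">\n'
        'Line one.\n'
        'Line two.\n'
        '<div class="emptyClear"></div></div>')

# Same lookbehind/lookahead shape as in the thread, without the extra backslashes.
pattern = r'(?<=<div class="productDescriptionWrapper">).*?(?=<div class="emptyClear">)'

# Default mode: "." stops at the first newline, so no match is found.
without_dotall = re.search(pattern, html)

# DOTALL mode: "." also matches newlines, so the whole description is captured.
with_dotall = re.search(pattern, html, re.DOTALL)

print(without_dotall)
print(with_dotall.group(0).strip())
```

    In engines that support inline flags, prefixing the pattern with `(?s)` has the same effect as passing `re.DOTALL`.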
     
  6. drvosjeca

    drvosjeca Client

    Joined:
    Oct 26, 2011
    Messages:
    512
    Likes Received:
    453
    Get me on Skype... I will see if I can help you with this.
     
    LBrown likes this.
  7. LBrown

    LBrown Client

    Joined:
    Nov 20, 2011
    Messages:
    6
    Likes Received:
    0
    drvosjeca fixed it for me. Thanks to all for the help.
     
  8. dongle132

    dongle132 Client

    Joined:
    Jan 22, 2011
    Messages:
    36
    Likes Received:
    0
    Would you please be so kind as to tell us all what the fixed/correct RegEx finally was :-) ?!
    I'm stuck at almost the same point in one of my new Amazon templates.

    THX
     
  9. drvosjeca

    drvosjeca Client

    Joined:
    Oct 26, 2011
    Messages:
    512
    Likes Received:
    453
    Which part is bothering you?
     
  10. MD. Shamid Islam

    MD. Shamid Islam Newbie

    Joined:
    Feb 13, 2019
    Messages:
    3
    Likes Received:
    0
