If Logic match of an article

jp1

Client
Joined
23.01.2011
Messages
234
Thanks
2
Points
0
Even though I use the DOM Html of the regex builder, for many articles the regex is extremely hard to build because of the line breaks in the original DOM Html. In other words, I can scrape the article, but it comes with line breaks attached, so when I save it alongside other, simpler values, it breaks my CSV or TSV files. There are three rows where there should be only one, and two of those rows are empty.

I thought of scraping the articles into a separate file and joining them to the simpler values later to create my final CSV file. But sometimes there is no article, and then it becomes very hard to match the articles with the other values: the article column has gaps, so it no longer lines up with the other columns. This happens because I have to strip the empty rows produced by the line breaks, and that step also erases, by accident, the rows where no article was found.
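One way to keep the columns aligned, sketched in Python on the assumption that the scraped values can be post-processed outside the template: write each record with a CSV library that quotes fields containing line breaks, and keep an empty field when no article was found. The `records` list and the helper names are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical scraped records: (title, article). The second has no article.
records = [
    ("First title", "\nArticle body, line one.\nLine two.\n"),
    ("Second title", ""),
    ("Third title", "\nAnother article.\n"),
]

def write_rows(rows):
    """Write rows to CSV text; csv.writer quotes fields that contain line breaks."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
    for title, article in rows:
        # An empty article stays as an empty field, so the row is never dropped.
        writer.writerow([title, article])
    return buffer.getvalue()

def read_rows(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text)))

csv_text = write_rows(records)
parsed = read_rows(csv_text)
```

Here `parsed` contains three rows again, and the second row carries an empty article field instead of vanishing. This only helps when the file is read by a CSV-aware parser; a tool that splits on raw line breaks will still see extra lines.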

I thought about adding a prefix to the article value so that, if there is no article, the row at least wouldn't be erased later on. I quickly abandoned the idea, because the prefix wouldn't sit on the same row as the article but on the row preceding it.

If you go to an article on ezinearticles, you'll see what I mean. The regex (?<=<div\ id="article-content">)[\w\W]*(?=<div\ id="article-resource">) gives me three rows instead of one, and two of them are empty. Same story with Amazon, etc. I just don't know how to match the empty rows.
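For illustration, here is a minimal Python sketch of the same pattern followed by a whitespace collapse, so the match ends up on a single row. It assumes the extraction can be reproduced outside the template; `sample_html` is made-up markup that only reuses the two div ids from the regex above, and `extract_article` is a hypothetical helper name.

```python
import re

# The pattern from the post; [\w\W]* matches any character, including line breaks.
ARTICLE_RE = re.compile(
    r'(?<=<div\ id="article-content">)[\w\W]*(?=<div\ id="article-resource">)'
)

def extract_article(html):
    """Return the article as a single line, or '' when nothing matches."""
    match = ARTICLE_RE.search(html)
    if match is None:
        return ""
    # split() with no arguments splits on any whitespace run, including \r and \n,
    # so joining with one space removes the leading and trailing empty rows.
    return " ".join(match.group(0).split())

# Made-up sample: the article is surrounded by line breaks, as in the post.
sample_html = (
    '<div id="article-content">\n'
    "<p>First paragraph.</p>\n"
    "<p>Second paragraph.</p>\n"
    '<div id="article-resource">'
)

one_line = extract_article(sample_html)
```

The empty rows come from the line breaks right after the opening div and right before the closing one; collapsing whitespace after the match avoids having to encode them in the regex itself.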

What I also don't know is how to write an If Logic match statement where '{-Variable.article-}'!=''. The If statements seem to fail either because the variable holds a big article instead of a short string, or because three strings are involved instead of one.

I then thought of extracting the middle line, the one with my article, to see whether a logical statement that perhaps only tolerates strings can also match an article. The problem is that this middle line has no separators. Open it in Notepad and it shows none, because it's all part of the same match number. But even without separators, it still screws up my bloody CSV and TSV files, because it takes the liberty of invading three rows instead of one.

I wanted to add a note each time there's no article, using an If '{-Variable.article-}'!='' check, but it breaks my template. Either it doesn't like that there are three rows inside Variable.article, or it doesn't like that the value is an article and not a short string.

I thought that if the regexp on the DOM Html for a specific Match No. wasn't working, I could at least output the result of the regexp into a file. But that doesn't work either. And a Get line action on the first line of that file also breaks my bloody template.
 

rostonix

Well-known member
Joined
23.12.2011
Messages
29 067
Thanks
5 707
Points
113
Before checking whether the variable is empty, you can run it through Word processing - Prepare JavaScript.
Only after that, check whether the result of this action is empty.
 

jp1

What JavaScript do you suggest I write in this box?
 

rostonix

jp1 wrote: "But what I also don't know how to do is do an If Logical match statement where '{-Variable.article-}'!='' because If statements seem to fail either because the variable is a big article instead of a string, or because there are three strings involved instead of one."
Put the text of your article into the action I mentioned, and check the result of that action in the If condition.
It will help because all unnecessary symbols and line breaks will be deleted.
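The clean-then-check order suggested here can be sketched in Python. This is only an analogue of the idea, not the actual behavior of the Prepare JavaScript action: it assumes that "cleaning" means collapsing whitespace and line breaks, and the function names and the placeholder text are hypothetical.

```python
import re

def clean(text):
    """Collapse every run of whitespace (including line breaks) into one space."""
    return re.sub(r"\s+", " ", text).strip()

def article_or_note(article, note="NO ARTICLE FOUND"):
    """Return the cleaned article, or a placeholder note when nothing is left."""
    cleaned = clean(article)
    # The emptiness check runs on the cleaned value, so a variable that holds
    # only line breaks is treated the same as a truly empty one.
    return cleaned if cleaned != "" else note
```

Comparing the raw variable against '' is unreliable when it contains stray line breaks; comparing the cleaned value makes the "no article" case a plain empty string.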
 

hotohori

Client
Joined
10.02.2012
Messages
154
Thanks
40
Points
28
Like this?

[Attached image: Capture.JPG]
 

rostonix

Yes
 

hotohori

Thanks! I was always wondering what "Prepare JavaScript" is used for.
 

jp1

Yes, thanks. Now it works well!
 
