Removing blacklisted lines from a file with millions of lines, when the blacklist also has millions of lines

LightWood

Moderator
C#:
List<string> bad = project.Lists["stopList"].ToList(); // stop words (the blacklist)
List<string> mix = project.Lists["allwordsList"].ToList(); // full base of words
var good = project.Lists["resultList"]; // result list
// "kostyl" means "crutch": a key element of "Indian code" (not important, a Russian joke)
// A block of very complex "Indian code" follows (also not important, also a Russian joke)
// Except keeps the distinct lines of mix that do not occur in bad
List<string> kostyl = mix.Except(bad).ToList();
foreach (string data in kostyl)
{
    good.Add(data);
}
The author of the code is Lexicon.
For your enjoyment, guys.

Source file: 2.1 million lines
Blacklist: 2.5 million lines
Result: 0.8 million lines

Template execution time: 2 seconds!!!
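
A note on why this is so fast, plus a small standalone sketch. Enumerable.Except is hash-based: it puts the second sequence into a set and then does one lookup per line of the first, so the work grows roughly linearly with the two sizes instead of comparing every line against every other. It is also a set operation, so duplicate lines in the source come out only once, and the default comparison is case-sensitive. The sketch below is not the author's code and does not use ZennoPoster at all; the class name BlacklistFilter and both method names are made up for illustration. It shows mix.Except(bad) next to a variant that keeps duplicates, in case that matters for your base.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of ZennoPoster or the original snippet.
public static class BlacklistFilter
{
    // Same technique as mix.Except(bad): set difference.
    // Duplicate lines in the source are collapsed into one.
    public static List<string> ExceptDistinct(
        IEnumerable<string> source, IEnumerable<string> blacklist)
    {
        return source.Except(blacklist).ToList();
    }

    // Variant that keeps duplicates and the original order:
    // build a HashSet from the blacklist once, then do one lookup per source line.
    public static List<string> ExceptKeepDuplicates(
        IEnumerable<string> source, IEnumerable<string> blacklist)
    {
        var blocked = new HashSet<string>(blacklist);
        return source.Where(line => !blocked.Contains(line)).ToList();
    }
}
```

For a source of "a", "b", "a", "c" and a blacklist of "b", ExceptDistinct gives "a", "c", while ExceptKeepDuplicates gives "a", "a", "c". If you need case-insensitive matching, both Except and the HashSet constructor accept a comparer such as StringComparer.OrdinalIgnoreCase.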


More cool snippets here:
http://zennolab.com/discussion/threads/meet-it-its-c-simple-fast-convenient-a-selection-of-snippets-inside.37340/
 
