{"id":2929,"date":"2012-09-05T18:25:41","date_gmt":"2012-09-05T15:25:41","guid":{"rendered":"http:\/\/daimon.me\/blog\/?p=2929"},"modified":"2012-09-05T18:48:02","modified_gmt":"2012-09-05T15:48:02","slug":"dupa-5-luni-zelist-2","status":"publish","type":"post","link":"http:\/\/daimon.me\/blog\/2012\/09\/dupa-5-luni-zelist-2\/","title":{"rendered":"After 5 months, Zelist &#8230; #2"},"content":{"rendered":"<div class=\"entry\">\n<p style=\"text-align: justify;\">In the <a href=\"https:\/\/daimon.me\/blog\/2012\/09\/dupa-5-luni-zelist\/\" target=\"_blank\">previous article<\/a> I explained how data can be extracted from the web pages of the Zelist site, with a practical demonstration: obtaining the current ranking in text format. Today I will actually put that data to use.<\/p>\n<p style=\"text-align: center;\">~*~<\/p>\n<p style=\"text-align: justify;\">For example, let us satisfy the curiosity that started this whole discussion: how many sites that were listed in April no longer appear in the ranking at all today? 
It is quite simple to find out:<\/p>\n<blockquote><p>sort f4.txt &gt; present ((the sort command is needed because comm expects sorted input))<br \/>\nsort forig.txt &gt; past ((before running this, I copied the file with the April results into the working directory and named it forig.txt))<\/p>\n<p>comm -23 past present &gt; pastonly ((the comm command produces 3 sets of data: the lines unique to the first file, the lines unique to the second file, and the lines present in both; the -23 parameter therefore suppresses sets 2 and 3, leaving us with the domains that have been removed in the meantime))<br \/>\ncomm -13 past present &gt; presentonly ((here we are left with the domains that have been added))<\/p><\/blockquote>\n<p style=\"text-align: justify;\">Opening the files in Notepad++, we find that the file <span style=\"color: #800000;\">pastonly<\/span> has 477 lines. Assuming there is a sound reason for removing domains from the ranking (e.g. the blog has been gone for long enough), it follows that ~95 blogs disappear per month, averaged over the 5 months. We leave it as an exercise for the reader to check the data in the file and confirm or refute the assumption.<\/p>\n<p style=\"text-align: justify;\">Examining the presentonly file, we discover 1377 new blogs, i.e. ~275 new blogs appearing each month. So the natural increase, if we can speak of such a thing, is positive for blogs: about 180 net new blogs per month! This assumes, of course, that Treeworks has crawlers (<em>spiders<\/em>) that reach every corner of the Romanian-language internet without significant latency; we leave it as an exercise for the reader to defend or refute that assumption. 
There are other implicit assumptions in these calculations as well, so the numbers should be taken with a grain of salt.<\/p>\n<p style=\"text-align: justify;\">Finally, note that the lines can also be counted with the Linux wc command:<\/p>\n<blockquote><p>wc -l pastonly ((the -l parameter requests a line count))<br \/>\n477 pastonly ((unless told otherwise, the command prints its results to the screen))<\/p>\n<p>wc -l presentonly<br \/>\n1377 presentonly<\/p>\n<p>wc -l forig.txt<br \/>\n63932 forig.txt<\/p><\/blockquote>\n<p style=\"text-align: justify;\">While we are at it, we count how many blogs Zelist had in April and work out the percentages: 477\/63932 have disappeared, i.e. <span style=\"color: #800000;\">0.746%<\/span>. Likewise, relative to the count back then, 1377\/63932 have appeared, so there are <span style=\"color: #800000;\">2.154%<\/span> new blogs in 5 months. Is that a low growth rate? We leave it to others to decide ((I have added the files produced by comm <a href=\"http:\/\/daimon.me\/storage\/zr310812.rar\" target=\"_blank\">to the archive<\/a> on the server)).<\/p>\n<p style=\"text-align: justify;\">That is about it for today; thank you for your attention, etc.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In the previous article I explained how data can be extracted from the web pages of the Zelist site, with a practical demonstration: obtaining the current ranking in text format. Today I will actually put that data to use. 
~*~ For example, let us satisfy the curiosity that started this whole discussion: how many sites that were listed in April no longer appear today &#8230;<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40],"tags":[],"class_list":{"0":"post-2929","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-internet-si-tehnica","7":"anons"},"_links":{"self":[{"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/posts\/2929","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/comments?post=2929"}],"version-history":[{"count":1,"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/posts\/2929\/revisions"}],"predecessor-version":[{"id":2931,"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/posts\/2929\/revisions\/2931"}],"wp:attachment":[{"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/media?parent=2929"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/categories?post=2929"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/daimon.me\/blog\/wp-json\/wp\/v2\/tags?post=2929"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}