These days (bad karma; I surely committed a lot of crimes in a previous life…) I happened to read some linguistics stuff [2], [1].
What is it about, briefly (see [2] to join me in the frustration…)? Those guys started counting words in different literary texts. Let's say there is a set X of words (the literary piece). They perform some magic and detect a subset Y of distinct words. Pretty exciting… but OK, we all have hobbies. Well, now comes a new step: they compute the ratio(s)
K = card(Y)/card(X)*100, k = card(Y)/card(X)
“which corresponds to the lexical wealth in terms of percentage” ([2], p.3) … Tough enough for you?
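For the record, the whole “lexical wealth” machinery fits in a few lines. Here is a minimal Python sketch, assuming a crude tokenizer of my own choosing (lowercased runs of letters and apostrophes); [2] may well split words differently, and the name lexical_wealth is mine, not theirs.

```python
import re


def lexical_wealth(text):
    """Return (k, K): the distinct/total word ratio and the same value as a percentage."""
    # Crude tokenizer (an assumption, not taken from [2]):
    # lowercase the text and keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    k = len(set(words)) / len(words)  # card(Y) / card(X)
    return k, 100 * k                 # K = k * 100


# 11 words in total, 8 of them distinct.
k, K = lexical_wealth("the cat sat on the mat and the dog sat too")
print(k, K)
```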
Well, wait a little. There is much more than this.
They quote Zipf’s law in a Mandelbrot way:
N = A * exp(k, -FI)
where exp(x,y) = “x to the power y” (my innovation, for lack of math-writing tools in the WordPress editor…), and “A is a constant amplitude and FI exponent which should be characteristics of a given author.” ([2], p.3).
The next step is even deeper: they compute the A and FI values through linear regression. All of it relates back to the initial X. Or to a set of Xes.
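The regression step is just as short. A minimal sketch, assuming the usual trick of taking logarithms, so that N = A * exp(k, -FI) becomes the straight line log N = log A - FI * log k, fitted by ordinary least squares. The function name fit_power_law and the sample points are my own illustration (synthetic, generated from A = 1000 and FI = 1.5), not data from [2].

```python
import math


def fit_power_law(k_values, n_values):
    """Least-squares fit of N = A * k**(-FI) on log-log axes; returns (A, FI)."""
    xs = [math.log(k) for k in k_values]
    ys = [math.log(n) for n in n_values]
    m = len(xs)
    if m < 2 or m != len(ys):
        raise ValueError("need at least two (k, N) pairs")
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all k values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # slope of the log-log line = -FI
    intercept = mean_y - slope * mean_x   # intercept = log A
    return math.exp(intercept), -slope


# Synthetic points (made up for illustration, not taken from [2]).
ks = [0.1, 0.2, 0.4, 0.8]
ns = [1000 * k ** -1.5 for k in ks]
A, FI = fit_power_law(ks, ns)
print(A, FI)
```

Note that least squares hands back some A and some FI for any cloud of positive points with at least two distinct k values, which is exactly why the fit alone proves nothing.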
That’s all? Yes.
Take a deep breath and look at the typical approach here:
(A) at the beginning, in section “2. ZIPF’S LAW IN LITERATURE”:
“By assuming a power law behaviour for these quantities…” blah-blah-blah (… some black box in the article …)
and, you know what?
(B) in “4. CONCLUDING REMARKS”: “We have shown, from the corpus analysis of the literary production of English authors and non-literary texts, that a power fractal law can be associated with the lexical wealth of the authors.”
What sort of pseudo-science is this? The same as the Dehmer papers I quoted in Random graphs (1/4): [1], [2], [3].
Actually, this is what it is about: a lot of pseudo-scientists stating various things with no beginning and no end. Just academic blah-blah-blah.
Coming back to these Brazilian guys, with their fractals and so on: there are two possible ways to present their work: (1) as a mathematical theorem, derived from a set of axioms, or (2) as a scientific theory.
We can agree that [2] is whatever one wants it to be, but it is 100% not a theorem. Is it a scientific theory? Even supposing its predictions are to be taken seriously, there is not even a minimal verification of the hypothesis. Don’t believe me? OK, brothers and sisters, let's agree that there is a “statistical signature” in the sense of [2]. It should be easy to check: take several random texts (random in the sense of a meaningful statistical sample) and check each one: does it belong to poet Alpha? To great storyteller Beta? To… Dehmer (see my random graphs posts… plz, plz, plz!)…?
Relax, none of these minimal things that a decent approach requires was done.
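Just to show how little such a check would cost, here is a sketch of one possible version: attribute each held-out text to the author with the nearest (A, FI) pair and count the hits. Everything here is my assumption: the distance (Euclidean on log A and FI), the function names, and the made-up numbers, none of which come from [2].

```python
import math


def nearest_author(signatures, sample):
    """Return the author whose (A, FI) pair lies closest to the sample's pair."""
    def distance(sig):
        # Compare log A rather than A so the amplitude does not swamp
        # the exponent (my choice, not something stated in [2]).
        return math.hypot(math.log(sig[0]) - math.log(sample[0]),
                          sig[1] - sample[1])
    return min(signatures, key=lambda author: distance(signatures[author]))


def attribution_accuracy(signatures, held_out):
    """Fraction of held-out (true_author, (A, FI)) samples attributed correctly."""
    hits = sum(nearest_author(signatures, sig) == author
               for author, sig in held_out)
    return hits / len(held_out)


# Made-up signatures and held-out samples, for illustration only.
signatures = {"Alpha": (1000.0, 1.5), "Beta": (400.0, 0.9)}
held_out = [
    ("Alpha", (950.0, 1.4)),
    ("Beta", (420.0, 1.0)),
    ("Beta", (900.0, 1.45)),  # a Beta text that happens to look like Alpha
]
accuracy = attribution_accuracy(signatures, held_out)
print(accuracy)
```

On these toy numbers the third sample goes to the wrong author; whether a real “signature” does any better is precisely the question [2] leaves unanswered.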
Anyway, a nasty type of article. No one forces us to read them, but there are plenty of similarly dumb ways of talking about nothing.
That’s all. Yeah!
By the way, I put some of my early-days attempts in the PDF below… I was almost one of them!
Homework
[1] Gary Davis, Adam Callahan, Entropy.
[2] L. L. Goncalves, L. B. Goncalves, Fractal power law in literary English, arXiv:cond-mat/0501361v2 [cond-mat.other], 3 Jun 2005.
Regrettable update: Reference [1] was somehow mysteriously assassinated a couple of days after my post. It actually hosted the link to [2]. I don’t know: do we have to thank them, or maybe just continue the “hunt”…?
Encouraging update (March 12th): Be proud of me, old and tough teachers from my childhood!
It is my turn to say: My name is Bond. James Bond. I discovered how to put some math formulas inside WordPress.
Therefore, instead of (1) there should be (2):
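| (1) | (2) |
| --- | --- |
| N = A * exp(k, -FI) | $N = A \, k^{-\Phi}$ |
| K = card(Y)/card(X)*100 | $K = \frac{\mathrm{card}(Y)}{\mathrm{card}(X)} \cdot 100$ |
| k = card(Y)/card(X) | $k = \frac{\mathrm{card}(Y)}{\mathrm{card}(X)}$ |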
I hunted them down, evaluated the best moment for the lethal attack, prepared my shot… and there was no chance for those damn formulas.
… Or should I rather try: … Doe. John Doe…? Nope. The first one.

Copyright ©2009 http://marius09.wordpress.com
oh yeah!