acknak wrote: I really don't understand why it seems to be so hard to get this feature right.
But, acknak, no software gets this feature right. Until we have proper fuzzy-logic software that can understand, reproduce, and apply the real-world rules of language, no matter which language, a word-count tool, grammar checker, or spell-checker is limited entirely to whatever logic humans can program into binary form as rules of language.
The saddest outcome of MS's attempts to do so is that people tend to regard computers as infallible and word-count utilities as equally infallible, by some phantom baseline measure that has no foundation in the real world. It's an agonizingly frustrating situation for people like me who are versed in both technology and language, when the majority of the world is grounded in either one or the other, and expectations on both sides are unrealistic.
But technology does creep closer, bit by bit (pardon the pun), as it improves upon itself. That's the way of progress.
We haven't arrived at the ideal language level yet, and there's no indication that we should actually want technology to process language the way humans do, let alone better. I personally believe that true language capability in machines is beyond the reach of technology for at least the next ten years... but I also believe that even if machines can be taught the finely nuanced rules of language, into which culture is inextricably bound, social, cultural, and economic balances will limit or eliminate such capabilities (very likely some time after those capabilities are offered to the public). Language is intrinsically human, and we humans perhaps cannot, and perhaps should not, promote machines as a benefit in processing language. Computers are still too limited, even if we could imbue them with our personal culture.
Language isn't limited to one dictionary, one style guide, and one set of rules. And computers are limited by their inability to fully parse, process, and understand language.
Word counts aren't limited to one set of rules, either. Who should decide what is or isn't a word?
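To make that concrete, here is a rough Python sketch of three perfectly defensible ways to count words. None of these is how any particular word processor does it; the function names and the sample sentence are made up purely for illustration, and the rules only look at ASCII letters and digits, which is itself an assumption that falls apart outside English.

```python
import re

# Three hypothetical counting rules. Each is a reasonable definition of
# "word", and each gives a different answer for the same sentence.

def count_whitespace(text):
    """Rule 1: a word is any run of characters between whitespace."""
    return len(text.split())

def count_alnum_runs(text):
    """Rule 2: a word is any unbroken run of ASCII letters or digits."""
    return len(re.findall(r"[A-Za-z0-9]+", text))

def count_with_joiners(text):
    """Rule 3: like rule 2, but an apostrophe or hyphen between two
    runs joins them into a single word (Don't, well-known, e-mail)."""
    return len(re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text))

sample = "Don't count well-known e-mail addresses -- or should we?"

print(count_whitespace(sample))    # 9  (the bare "--" counts as a word)
print(count_alnum_runs(sample))    # 11 ("Don't" becomes "Don" and "t")
print(count_with_joiners(sample))  # 8
```

Same sentence, three answers: 9, 11, or 8. Which one is "right" depends entirely on who wrote the rule.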