Monday, 31 July 2017

The Vision Thing

A critique of the notion of perfect objectivity as represented by the 'Universal Language' and other similar notions.

The effect of this vision on the modern mind has for the last few centuries been progressive and profound. It shows, for instance, in the pervasive attachment of educated opinion in the West to the belief that unless moral principles can be shown to be “objective,” which is to say, somehow or other inherent in a “Nature” untouched by human hands, we have no option but to embrace a noncognitivism according to which morality is a tissue of subjective “feelings” or “commitments,” and as such immune from rational criticism. In another way it shows in the conviction, widespread in literary studies, that there is ultimately no distinction of value to be drawn between great literature and the most trivial piece of kitsch, as literature per se is a fantasy; a further layer of coloured illusion that we interpose between ourselves and the realities of which we would be glumly confined to speaking, did we but speak a language as hostile to fantasy as the Universal Language would be, and as such putative fragments of it as, say, the languages of the physical sciences already are.
(from: Word and World by Patricia Hanna and Bernard Harrison)


john doyle said...

Have you seen or heard about The Bestseller Code (2016)? The authors, a former acquisitions editor for a big US publishing house and a lit prof specializing in digital humanities, analyzed hundreds of measurable variables extracted from thousands of texts to construct an algorithm, which they call the "bestseller-ometer," that predicts with pretty good accuracy the likelihood that any newly released novel will go on to achieve bestseller status.

ombhurbhuva said...

I haven’t heard of this book. Is it itself a cynical exercise leaving out the basic missing element - writing talent? Check all the theme boxes you wish, but if you can’t write distinctive prose you will probably fail. Hold on - The DaVinci Code. Is there any best seller that has been known to use it or suspected of using it? Amazon's Kindle reader, if you register with them (which I didn’t), logs all your underlining, which books you finished or didn’t, how far you got, etc. They will have a key to readability in that.

john doyle said...

The bestseller-ometer deemed Da Vinci Code a likely winner. As I recall it got high marks for thematic focus, episodes of interpersonal closeness, emotional ups and downs, and the protagonist taking action in pursuit of wants and needs. The authors claim that writers couldn't intentionally craft prose in conformity with the algo's preferences, but I wouldn't doubt that a manuscript could be edited so as to improve its score. Surely that's what the authors are or will soon be doing with their device: consulting with publishing houses on selection and editing to maximize the bestseller likelihoods of their portfolios.

I don't know if it's been done and, if so, how successfully, but similar techniques could be deployed to distinguish statistically between "great literature" and "trivial kitsch." The algo's universal language for explaining its findings might not make much intuitive sense to readers of natural languages, likely consisting of multivariate factor-analytic vectors, eigenvalues and eigenvectors, confidence intervals, etc.

john doyle said...

This isn't "perfect objectivity" though. Bestsellerness is a popularity indicator measured in financial terms, a metric of collective subjectivity; the algorithm's task is to identify whole books that have high probability of matching this aggregate financial metric. So too would be the case for an algorithm that identifies great literature: it would attempt to match, using data extracted from books and aggregating those data into complex patterns, the judgments of literary greatness as rendered by humans qualified to render such judgments. The algorithm refines itself across a large collection of test cases; then, after having achieved a high level of statistical correspondence between its judgments and those of the humans it is attempting to model, the algorithm is set loose on new texts. This process of machine learning isn't that different from human learning: by devoting careful reading to many texts judged excellent by literary masters, students should gradually acquire the ability to discriminate between excellence and ordinariness in unfamiliar texts not taught in the classroom.

ombhurbhuva said...

Gregory Bateson, in his neglected collection of essays Steps to an Ecology of Mind, calls rote learning (machine learning?) Learning 1, learning how to learn Learning 2, and awareness of context and altering of context Learning 3. Learning 4, he speculates, might be super consciousness. The trouble with machine/computer learning is that it doesn’t get contexts or analogies, and therefore poetry, wit and wisdom are beyond it. Dear Hillary was full to the brim with algos and was trumped (I’ll find my own way out) by an idiot savant.

john doyle said...

Echoing your post: Can a distinction be drawn between great literature and trivial kitsch by using scientific methods? Yes, almost surely it can, and probably it already has. Is this distinction meaningful? Well I found meaning in the distinctions an AI made between bestsellers and not. I would rather the authors had written a 20-page scientific article describing their methods and findings than a 260-page exposition in which they attempted to reduce the complex patterns embedded in the machine's algo and outputs to the simpler terms that humans can easily understand. For humans to derive more meaning from the algo would require them to learn to think more like the machine. Does the algo find the distinctions it makes meaningful to itself? No, it's just doing its job.

Regarding Trump and algos, see this article. A quote:

“Pretty much every message that Trump put out was data-driven,” Alexander Nix remembers. On the day of the third presidential debate between Trump and Clinton, Trump’s team tested 175,000 different ad variations for his arguments, in order to find the right versions above all via Facebook. The messages differed for the most part only in microscopic details, in order to target the recipients in the optimal psychological way: different headings, colors, captions, with a photo or video. This fine-tuning reaches all the way down to the smallest groups, Nix explained in an interview with us. “We can address villages or apartment blocks in a targeted way. Even individuals.”

In grad school in the early 80s I served on the colloquium committee, which gave me the opportunity to go out to eat for free once every couple of months. Once we invited Daniel Robinson, a philosopher of psychology from Georgetown University, to speak. Over dinner we performed the obligatory going around the table ritual, with each student giving the honoree a brief summary of interests and research. Robinson found most of it rebarbative, making his objections known in caustic but amusing one-liners. By the time my turn came around I had learned that my job was to be Robinson's straight man. "You know," I informed him, "computers are being programmed to simulate human intelligence." Robinson's reply: "I wish I knew more humans who could simulate human intelligence."

john doyle said...

Reasoning by analogy has for some time been a topic of interest in AI because it obviates the need for the brute-force iterative processing of big data sets on which machine learning relies. I've not been following this line of research, but one gets a glimpse of incremental progress in the authors' description of the bestseller-ometer. There's a natural language comprehension program involved, which has to infer from the textual context which alternative meaning of a particular word is meant. This is accomplished largely by network associations between the target word and other words in close proximity. As an example the authors used the word "bar" -- does it refer to a saloon or a legal proceeding? If, say, words like "whiskey" and "music" show up in the immediate context the machine draws one conclusion; if "attorney" and "proceeding," then the other. (I feel an AI joke coming on: two lawyers walk into a bar...) Identifying emotional ups and downs in a text relies on a similar context network interpretation: when words like "cry" and "horror" and "crumple" cluster in adjacent paragraphs, the bestseller-ometer infers an emotional dip in the narrative flow.
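For anyone who wants to see how little machinery the "bar" example needs, here is a minimal sketch in Python. It is not the authors' program, whose internals the book does not spell out in this detail. The cue-word lists, the function name `guess_sense`, and the five-word window are all assumptions made up for illustration. It simply counts which sense's cue words turn up near the target word.

```python
# Hypothetical cue words for each sense of "bar". The real program's
# vocabulary and weighting are not described here; these are invented.
SENSE_CUES = {
    "saloon": {"whiskey", "music", "drink", "bartender"},
    "legal": {"attorney", "proceeding", "court", "judge"},
}


def guess_sense(text, target="bar", window=5):
    """Return the sense whose cue words appear most often near `target`.

    Returns None when no cue word is found within the window.
    On a tie, the sense listed first in SENSE_CUES wins.
    """
    # Crude tokenisation: split on whitespace, strip punctuation, lowercase.
    words = [w.strip('.,;:!?"()').lower() for w in text.split()]
    scores = {sense: 0 for sense in SENSE_CUES}
    for i, word in enumerate(words):
        if word != target:
            continue
        # Look at up to `window` words on either side of the target.
        nearby = words[max(0, i - window): i + window + 1]
        for sense, cues in SENSE_CUES.items():
            scores[sense] += sum(1 for w in nearby if w in cues)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("She ordered a whiskey at the bar while the music played."))
print(guess_sense("The attorney was admitted to the bar after the proceeding."))
```

A real system would presumably learn its associations from a large corpus rather than from a hand-written list, but the principle of letting neighbouring words vote is the same.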

Now I'll click the "I'm not a robot" button at the bottom of this page and send my comment on its way.

ombhurbhuva said...

I'm reading the motherboard article. It's also easy to find this:

Could it all be a little salting of the gold mine by Cambridge Analytica?
I'm a terrible cynic. I'm not on FB or Twitter and I may stop eating cookies.

john doyle said...

"I would have believed in the efficiency of these shamanic manipulations had I not been the recipient of numerous e-mail messages from the Trump campaign that designated me as a "Big League Supporter" and doggedly asked for contributions and moral support, though I am disqualified as a Russian citizen."

This article was published a month after the election. In light of subsequent revelations (or is it fake news?), his being Russian might be precisely what qualified him as a Trump supporter.

Is the hype surrounding Trump's Facebook campaign justified? Surely the data miners could throw some empirical light on the subject. Trump's campaign focused intensively on swing states while accepting the virtual inevitability of either winning or losing other states. This sets up a naturally occurring experiment. The 2016 election data can be broken down by state and by district within each state. Detailed demographic data, including Facebook data, can also be compiled on the populations of those states and districts. An algorithmic district-by-district model could then be built that provides the closest statistical fit to the actual voting results, adjusting for demographics. Then, once the district results are modeled, add into the algo an additional variable that measures the degree of intensity of Facebook campaign intervention in each district. Does adding the Facebook-intensity variable increase the accuracy of the algo? If so, by how much -- how many more votes for Trump, how many fewer for Clinton? It seems highly likely that these sorts of analyses have been conducted by both parties, but that the results will remain confidential until there's a leak.
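The two-step comparison described above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the districts are synthetic numbers generated on the spot (not election data), the variable names are hypothetical, and plain least-squares regression stands in for whatever model an analyst would actually choose. The sketch fits vote share on demographics alone, then again with a Facebook-intensity variable added, and compares the fit.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rows, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic districts, invented purely for illustration (NOT election data).
rng = random.Random(0)
districts = []
for _ in range(200):
    income = rng.gauss(0, 1)   # hypothetical standardised demographic
    age = rng.gauss(0, 1)      # hypothetical standardised demographic
    fb = rng.random()          # hypothetical Facebook-intensity score, 0 to 1
    share = 0.5 + 0.05 * income + 0.03 * age + 0.04 * fb + rng.gauss(0, 0.02)
    districts.append((income, age, fb, share))

y = [d[3] for d in districts]
r2_base = r_squared([d[:2] for d in districts], y)  # demographics only
r2_full = r_squared([d[:3] for d in districts], y)  # plus Facebook intensity
gain = r2_full - r2_base
print(f"demographics only: {r2_base:.3f}; with Facebook intensity: {r2_full:.3f}")
```

One caution on reading the result: adding any variable to a least-squares model never lowers the in-sample fit, so the question is how large the gain is and whether it holds up on districts the model has not seen. And because campaigns choose where to spend, even a solid gain would show association rather than prove the ads moved votes.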

I have no Facebook or Twitter presence either, so solicitations directed at me were limited to mail, telephone, and neighborhood canvassing.