Pete Hindle

Pictures and stuff from a guy who likes coffee.

Tag: dms8002

Basic Tech III – Life, NCL.AC.UK and Everything

I realised today that I could have titled this “Life, the University, and Everything”, which would have worked a lot better. Hey ho.

The grandiose title of this piece could be read as a sign that I’m going to write about things other than those relevant to the course. In general, I’m going to steer clear of that sort of approach in this piece of text. I am slightly tempted to do a cross-comparative chart of my mental state versus the deadline of this module, but the time for that sort of navel-gazing isn’t now, and this isn’t really the place. So, what to do with such a grandiose title?

I know: we’ll talk about Lev Manovich.

Manovich is famous for putting together two things. First, his book, The Language of New Media, which was an early foray into serious notions about the academic reception of New Media artworks. Its aligning of the concepts behind computing as being analogous to early cinema was a masterstroke of metaphor, allowing humanities departments the world over to finally get their heads around the fact that yes, really, we are going to be using these computer things for artistic purposes and we had better get used to it.

The other thing that Manovich is famous for is his de/reconstructed film software “Soft Cinema”, which puts into practice the more theoretical notions that he talks about in his book. This work was, in fact, shown in the Baltic at an early stage in its gestation, where I walked in and then promptly walked out again (having a very low tolerance for the sort of abstract narrative found in most art films).

But these are not the features of Manovich’s practice that I’m going to discuss here. In his recent work, Manovich has looked at the way that society is pressurising all information onto a digital plane, and concluded that as more raw data is available in this form, it is the practice of data-mining that will become valuable. This is a conclusion actually being reached independently in several different structures at the same time, by researchers working in different fields.

This polyphyletic idea is ideally suited to Manovich’s position as somebody who can talk about the practice of art and computers in a way that those working in other fields can’t. For instance, whilst both Martin Wattenberg and Ben Fry are creating, promoting, and even working as artists in these fields, they still do not have the necessary academic chutzpah to propel the idea under discussion out of the ballpark. They are, essentially, knocking the idea around between a few like-minded friends.

Franco Moretti is not a like-minded friend, nor is he particularly interested in what we would term “New Media” (from what I can make out, which should be regarded as limited). However, what he is interested in, as a leading left-wing literary critic, is a method of understanding texts. And, as Manovich would point out, these texts are merely data awaiting transmutation into a computerised form. Therefore, coming to the point and the birth of yet another instance of our polyphyletic idea, Moretti suggests the use of quantitative data analysis for literature in his book “Graphs, Maps, Trees”.

I find the fact that infovisualization is being suggested as a research tool in the humanities particularly interesting, and when I attended a recent afterparty for a Newcastle University conference on Crime Fiction I had a chance to quiz those doing stylistic analysis of texts in other fields. It was regarded as impossible that a visual program could be analysed by a computer (not so, either by using jit.cv or by web services such as Mechanical Turk). But I’m not sure that these people were participating in leading-edge research, and besides, I was being plied with mojitos at the time.

The final point of this is, however, that there will be an expanding bubble of interest around these themes of data-mining and the humanities, and that Newcastle University already has some projects and researchers that are interested in this field (by which I am not referring to myself, but rather people working within the English department whom I’ve met very briefly). There needs to be a way of gathering the tools, or creating accessible tools for these researchers, and as soon as possible, so that Moretti’s idea of quantitative tools for qualitative purposes can become a reality.

Having said that, I’m now ready to share my own set of quantitative tools. Be aware that this is a rough and ready – but working – version, and merely produces a small line-graph and a text file that counts specific words. In the next section of this (essay? Series of blog posts?) I’ll discuss the road not taken, by which I mean the false starts and horrific crushing disappointments of working in code.
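For anybody wondering what “a text file that counts specific words” boils down to, here is a minimal sketch in plain Java (the language Processing sits on top of). It is not the actual tool: the class name WordReport, the watch-list idea and the tab-separated output are all assumptions made for illustration, and the tokenising only handles ASCII letters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not the real project code.
class WordReport {
    // Count how often each watch-list word occurs in the text (case-insensitive).
    static Map<String, Integer> count(String text, String[] watchList) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : watchList) {
            counts.put(word.toLowerCase(), 0); // every watched word starts at zero
        }
        // Split on anything that is not an ASCII letter or apostrophe.
        for (String token : text.toLowerCase().split("[^a-z']+")) {
            if (counts.containsKey(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // One "word<TAB>count" line per watched word, ready to be saved as a text file.
    static String toReport(Map<String, Integer> counts) {
        StringBuilder report = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            report.append(entry.getKey()).append('\t')
                  .append(entry.getValue()).append('\n');
        }
        return report.toString();
    }
}
```

Calling WordReport.toReport(WordReport.count(text, words)) gives a string that could be written to disk, for example with java.nio.file.Files.writeString on Java 11 or later. The same per-word totals are what a line-graph would plot.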

[Image: orange_text_test]

Basic Tech II – The Poptart at the End of the Universe

I’m eating a lot of pop tarts at the minute. So let’s have a quick diversion into the history of pop tarts.

Pop tarts are a form of sugary pastry sold by Kellogg’s. They achieved a small amount of notoriety in the early 1990s, in the UK at least, for burning the hands of a kid who’d microwaved his. The sight of this child, waving his bandaged hands at the camera, was probably a sight of great pathos for many people. I never saw it. I did, however, hang out a lot in Stevenage, with people who knew the poor pop-tart scarred kid, and therefore pop tarts are forever linked in my mind to people doing unpitying renditions of the words “I didn’t know it was going to be hot” in the most moronic Stevenage accent possible.

According to Wikipedia, the US Forces dropped 2.4 million pop-tarts on Afghanistan in 2001. Currently, you can only buy two out of the total of forty-three flavours of pop tarts in the UK, those being Chocolate and Strawberry. There is no information on Wikipedia as to what flavours native Afghans received in 2001.

My presentation was done with the entire aim of reproducing my thinking structure. I did consider adding a distracting audio element to it as well, in order to allow people to experience the jarring cuts in concentration I seem to suffer, but I thought that it would be taking it a little far, and anyway, I needed to be talking about my program, rather than anything else. Sadly, as noted in part one, I didn’t have a great deal of success to talk about.

In terms of the presentation and its marking, I have to feel some regret that I couldn’t produce a working version of the program at that point, nor show a clarity of aesthetic; however, I think my aesthetic sympathies during the course of the taught module have run more towards the conceptual idea as represented by text rather than the visual. What use is the visual in the age of repetitive machine-produced images? And how can the aesthetic idea compete against the barrage of the new?

I think that the best form of communication in this period is to resort to clear thinking and simple communication. To that end, showing text to others is the clearest way, so that your thought processes can be evaluated in a simple way at the leisure of others. Whilst I have some issues with essays as a form of academic measurement, I do not have a problem with a longer or shorter form of idea and expression, where original thought can be laid out for the use of others.

It is this project’s aim of laying out original thought in a special way that I have been working towards. My lack of ability to achieve that with programming is not the issue for me, except in terms of earning marks (and without those marks I’ll not pass the course, something which does give me ‘the fear’). What I have achieved is an ability to deal with and understand lumps of text that I have generated. You can read the presentation as I planned it in the attached file, and I’ll discuss the knock-on effects of this text assimilation in part three, the grandiosely titled “Life, NCL.AC.UK and Everything”.

PDF Download of Presentation:

basic_tech_pres

ADDENDUM: The reason that this article/blogpost talks about poptarts – as I forgot to mention why I spiralled off into such a diversion – is stress. During times of biological and physiological stress our bodies seek out sugary and fatty foods, which are not necessarily the best thing for us to eat at such times. However, having just moved, I treated myself to a packet of Pop Tarts whilst restocking my kitchen, and found myself hooked on their sugar content. In between staring woefully at code that I was growing to loathe, I occupied the vacant space in my belly with glucose and inverted syrup, and felt almost like a true hacker (although the last true hacker I met was just coming off a vegetarian rice-only diet, which is far away from the stereotypical programmer’s food consumption as depicted in media).

Basic Tech I – (The Hitchhikers Guide to Regex)

The current state of my Basic Techniques project is this:

It doesn’t work.

However, this is a defeatist attitude. Not quite as defeatist as I’ve been considering (it doesn’t work, I’m never going to understand regex, and I’m going to stop bothering with programming being the other considered viewpoint).

On the other hand, sometimes the ways that it doesn’t work make no sense to me. For instance, one piece of code I wrote matched the string being read to a specific string, and incremented a counter once using the ‘++’ operator. Except that it didn’t: it decided to increment the counter 559 times, and then it decided that all the words I was looking for were there, 559 times each.
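For comparison, this is roughly what the loop was meant to do, sketched in plain Java rather than as a Processing sketch. It is not the original code (which isn’t reproduced here), and the names TokenCounter and countWord are made up. A runaway total like 559 usually suggests the increment is running once per token, or inside an extra loop, instead of only when the comparison succeeds, though that is a guess at the cause. It is also worth knowing that Java strings need comparing with equals() or equalsIgnoreCase(); the == operator compares object identity and tends to give surprising answers.

```java
// Hypothetical helper, not the author's code: count how often one
// target word appears in an array of already-split tokens.
class TokenCounter {
    static int countWord(String[] tokens, String target) {
        int count = 0;
        for (String token : tokens) {
            // Compare the characters (ignoring case), never use == on strings.
            if (token.equalsIgnoreCase(target)) {
                count++; // runs once per matching token, not once per token
            }
        }
        return count;
    }
}
```

The increment sits inside the if, and the if sits inside a single loop, so the counter can never exceed the number of tokens that actually match.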

Back to the drawing board from that code then. I really thought that loop was going to work as well; it had all the indications of when and where, as it cycled through the newly created string array that contained the compartmentalised (granulised?) longer string.

Then, when that failed I was back at regex. And I now hate regex deeply and purely, for being such a dense science that needs introduction. A big ‘thanks’ to everybody who pointed me at the same damn impenetrable tutorial. I sort of wish I’d chosen to do a project with Arduino controlled rockets instead, because whilst rocket science might have a reputation for being hard it never involves typing a string of impenetrable characters into a search box and hoping against hope that this would be the last leap. (http://www.youtube.com/watch?v=1XBwWAu2a5U)

Even the more seasoned programmers threw some askance glances at my code when they saw the way that splitTokens() works – i.e., you throw all the tokens you want to use to split up the text together in a big line. For me, this was the lump of code ” ,.?!;: “, which I’d inherited from Daniel Shiffman’s example code in “Learning Processing”. This actually made a lot more sense to me than the output of match().

According to its documentation, match() outputs an array if the sequence searched for matches what is in the inputted string. In its own words, “if the sequence did match, an array is returned. If there are groups (specified by sets of parentheses) in the regexp, then the contents of each will be returned in the array. Element [0] of a regexp match returns the entire matching string, and the match groups start at element [1] (the first group is [1], the second [2], and so on).”

Okay: first problem. Putting parentheses in doesn’t make it work with multiple choices. I guess we can swap over to matchAll() for that, but without multiple parentheses and therefore multiple choices, what is the point of the items being returned as an array? It could, surely, be a yes/no answer? In fact, it returns an array which flummoxed me for several days as I realised that no matter what I put into the string as input, it always returned the same value. Two.
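To show what that “big line” of tokens does, here is a small plain-Java sketch using java.util.StringTokenizer, which treats every character in the delimiter string as a separator in its own right and merges runs of them. Treat it as an approximation of how splitTokens() behaves rather than a statement about its internals; the class name Tokens is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for a splitTokens()-style call.
class Tokens {
    // Each character here is a separator by itself: space, comma, full stop, etc.
    static final String DELIMITERS = " ,.?!;:";

    static String[] split(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text, DELIMITERS);
        List<String> words = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken()); // delimiters are dropped, never returned
        }
        return words.toArray(new String[0]);
    }
}
```

So “tag, you” becomes the two tokens “tag” and “you”: the comma and the space after it are both delimiters, and consecutive delimiters never produce empty strings.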

Searching for the word ‘will’ in the phrase “Inside will a tag, you will find will will content” will only ever return the value of two. Or rather, the value of ‘yes’ transmuted into ‘two’ by way of the length of an array, which is an entirely erroneous way of doing it. Almost as erroneous as the previous way of counting through the text as a string and looking at each individual part and then counting them (again, erroneously – to the tune of 559). Balls.
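The quoted documentation does explain the stubborn “two”: the array holds the whole match in element [0] plus one element per pair of parentheses, so its length depends on the pattern and not on how many times the word turns up. Assuming the pattern had a single pair of parentheses, something like (will), the length would be two whatever the input. The plain-Java sketch below (java.util.regex, not Processing’s own functions; the class and method names are invented) contrasts that length with an actual count of occurrences.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of "array length" versus "number of matches".
class RegexCount {
    // Length of the array a match()-style call would hand back for the FIRST
    // match: the whole match plus one slot per parenthesised group.
    // Returns 0 here when nothing matches.
    static int matchArrayLength(String text, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.groupCount() + 1 : 0;
    }

    // Count every occurrence of a whole word by calling find() repeatedly.
    static int countOccurrences(String text, String word) {
        // \b marks a word boundary; Pattern.quote() stops the word being read as regex.
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++; // each successful find() is one more occurrence
        }
        return count;
    }
}
```

For the test phrase, matchArrayLength(phrase, "(will)") is 2, while countOccurrences(phrase, "will") is 4. In Processing terms, this is the difference between looking at the length of one match() result and counting the rows that matchAll() returns.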

In my presentation – which I guess I’ll be covering in Basic Tech II (The Poptart at the End of the Universe) – I was told by Atau that I was only a half step away from solving a few of the problems. Maybe. I can see a functioning end to this problem, just not from here. Should I use the match function and the logic structure that I’ve been working on? There’s no guarantee that the logic structure will even work (559!) Five-five-nine! My least best guess is that my Macbook wants to emigrate to the People’s Republic of China and move to computing division 559.

Creating Displays with Processing

Again, this blog post is part of my basic techniques module, so you might not find this thrilling… casual readers might want to skip this blog post and come back later.

As part of my basic techniques module I wanted to work on something quite simple. I’ve broken it down into lots of smaller chunks, and this chunk that I’ve been working on refers to how you would move the data gathered from counting words within a text to a graphical display.

[Image: fake_values]

Here I’ve used an ellipse to represent a number between zero and three hundred and sixty. The finished project won’t have that sort of word limit, and the use of an ellipse would not be a good design feature, but for the purposes of this sketch it works pretty well. Some of the important parts are that the sketch communicates the values from inside a for loop (which, in the final project, will count through the words of the document as an array), it uses rollover style data display, and it has a selection of choices from which you can pick to display different data.

Obviously, this is a mock-up in several different ways, and the data doesn’t actually mean anything – it’s more an experiment to see the data and see how it would be crafted in Processing. Some things that I’m not happy with are the immense amount of code that it takes to do the rollover effect (should that be a class by itself?) and the placement of the text above the smaller arcs. I also think that the actual main display could be better, by rounding off the value to a straight int rather than a four-point decimal.
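On the “should that be a class by itself?” question, one possible shape for it is sketched below in plain Java. None of this is taken from the sketch itself: the name RolloverArc, its fields and the helper methods are assumptions, and angles are taken to be in radians measured clockwise from the positive x axis with y pointing down the screen, which is how I understand Processing’s arc() to work.

```java
// Hypothetical rollover helper: one object per arc on screen.
class RolloverArc {
    final float centreX, centreY, radius;
    final float startAngle, endAngle; // radians, 0 up to 2*PI

    RolloverArc(float centreX, float centreY, float radius,
                float startAngle, float endAngle) {
        this.centreX = centreX;
        this.centreY = centreY;
        this.radius = radius;
        this.startAngle = startAngle;
        this.endAngle = endAngle;
    }

    // True when the mouse is inside the circle AND within the arc's angles.
    boolean contains(float mouseX, float mouseY) {
        float dx = mouseX - centreX;
        float dy = mouseY - centreY;
        if (dx * dx + dy * dy > radius * radius) {
            return false; // outside the circle altogether
        }
        double angle = Math.atan2(dy, dx); // range -PI..PI
        if (angle < 0) {
            angle += 2 * Math.PI;          // shift into 0..2*PI
        }
        return angle >= startAngle && angle < endAngle;
    }

    // Map a value (e.g. a word count) onto an angle, so maxValue is a full circle.
    static float valueToAngle(float value, float maxValue) {
        return (float) (value / maxValue * 2 * Math.PI);
    }

    // Round to the nearest whole number for display instead of a long decimal.
    static String label(float value) {
        return Integer.toString(Math.round(value));
    }
}
```

With something like this, the draw loop only has to ask each arc whether it contains the mouse position and, if so, draw the text returned by label(), which keeps the rollover logic in one place.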

Click on the image above to see the sketch in action and to download the source code.

Data Mining Yourself as Artistic Practice

After my presentation for DMS8002 (the Basic Techniques module that I’m doing on my course) it was suggested to me that using my coursework as a platform to generate visualisations actually means that – to some extent – the artwork I’m creating is a reflection upon the work I’m doing for my major module.

This reminded me of the Mail Trends project, which takes the contents of an IMAP-enabled email address and combs it for information. With that information, it then produces a bunch of graphs relating to the usage of the email address. Above, you can see that I’m unlikely to send you an email at 6am. Below, you can see that I’m in touch with Brian Degger a lot. But, strangely, not as much as I’m in touch with myself… Why my own name comes up more than anybody else’s, I’m not sure; this might be a side effect of having two or three other email accounts plumbed into my Gmail account.

This project is interesting to me as it gives you the chance to look at a body of work you produce, but it’s a body of work that you produce by accident. Artistically, the output isn’t fantastic; it has colour and shape, but those are really secondary concerns as to displaying the data graphically. The Feltron Report stands at the other end of this sort of practice; it’s the work of an artist who obsessively records whatever he does and produces an annual report on his activity.