Radical Saussurean semantics?

I have been following Benjamin Schmidt’s posts (and here and here) about word-embedding models (WEMs). I don’t claim to have any grasp of the underlying mathematics, but the results are very interesting – I’m planning to download the R package and play with some text that I am working on with Brian Zuccala (when I have time [he said optimistically]).

Here I just wanted to draw attention to one foundational aspect of this approach. Schmidt comments:

The question that word embedding models ask is: what if we could model all relationship between words as spatial ones? Or put another way: how can we reduce words into a field where they are purely defined by their relations?

This seems to me to be a very Saussurean approach to semantics – each word is defined by its place in the system, and that place is in turn defined by the relationships (especially the differences) between the target word and all other words. The problem for Saussurean semantics has been the scale of the task of establishing all those relationships, except in circumscribed domains such as pronouns. But if that task can be handled by machine learning procedures, then suddenly this is a viable approach! Of course there are problems: the learning data will only ever be a snapshot; very rare words may not occur; and are numerical estimates of relationships the ones we really want? (I’m sure there are others I’m not thinking of yet.) Using a very large corpus should give some traction on these sorts of problems, but even acknowledging them, these methods seem like potentially a huge advance for empirical semantics.
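
To make the idea of a word being “purely defined by its relations” a little more concrete, here is a minimal Python sketch. It is not Schmidt’s code and not the R package; the three-dimensional vectors are invented for illustration (a trained model learns its vectors from a corpus and uses many more dimensions), and the names `vectors`, `cosine_similarity` and `relations` are my own. Cosine similarity is one common way of measuring how close two word vectors are.

```python
import math

# Toy vectors, invented purely for illustration; a trained model
# would learn these values from a large corpus.
vectors = {
    "he": [0.9, 0.1, 0.2],
    "she": [0.8, 0.2, 0.3],
    "island": [0.1, 0.9, 0.7],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relations(word):
    """Describe a word only by its similarity to every other word, closest first."""
    others = [
        (other, cosine_similarity(vectors[word], vectors[other]))
        for other in vectors
        if other != word
    ]
    return sorted(others, key=lambda pair: pair[1], reverse=True)


for other, score in relations("he"):
    print(f"he ~ {other}: {score:.2f}")
```

In this sketch the “meaning” of “he” is nothing more than the ranked list that `relations("he")` returns: with these made-up numbers it sits close to “she” and far from “island”.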


More on citation styles

Following from my last post, I decided I should have a crack at editing a stylesheet for Zotero. I’m working on a paper at the moment where I rely a good deal on a seventeenth-century text via two modern editions. The lack of a field for original date of publication is a long-standing issue for Zotero users; there is a work-around which enters the additional information in the Zotero ‘Extra’ field (see the bottom of p2 in the forum), but it means adding a few lines to stylesheets which have not been tweaked already. My normal style is the Unified Stylesheet for Linguistics Journals, which is untweaked as yet.

So I set to work using the Zotero Style Editor, following the advice provided in the forum linked above (adamsmith Dec 24 2014). And it all went well – because the Style Editor is dynamic, it’s actually very easy to track what you are doing and experiment with the style. I found the macro I needed to adjust, I added the snippet of code from the forum post (which recovers the original date information from the database) and I added “prefix” and “suffix” attributes to get the format I wanted. The results look like this:

(Rumphius 1983 [1648])

Rumphius, G.E. 1983 [1648]. Ambonsche Landbeschrijving. (Ed.) Z.J. Manusama. Jakarta: Arsip Nasional Republik Indonesia.


(Rumphius 2002 [1648])

Rumphius, G.E. 2002 [1648]. De Ambonse Eilanden Onder de VOC Zoals Opgetekend in de Ambonese Landbeschrijving. (Ed.) Chris Frans van Fraassen & Hans Straver. Utrecht: Landelijk Steunpunt Educatie Molukkers.
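
For anyone curious about the logic behind that output, here is a rough Python sketch of what the tweak amounts to. It is not the stylesheet code itself (that is CSL, edited in the Style Editor) and not the snippet from the forum; it assumes the original date is stored in the Extra field as a line of the form `original-date: 1648`, and the function names are hypothetical.

```python
import re


def original_year(extra):
    """Return the year from a line like 'original-date: 1648', or None.

    Assumption: the work-around stores the date on its own line in Extra.
    """
    match = re.search(r"^original-date:\s*(\d{4})", extra, flags=re.MULTILINE)
    return match.group(1) if match else None


def in_text_citation(author, year, extra=""):
    """Build an author-date citation, adding '[original year]' when available."""
    original = original_year(extra)
    # The " [" and "]" here play the role of the prefix and suffix attributes.
    date = f"{year} [{original}]" if original else str(year)
    return f"({author} {date})"


print(in_text_citation("Rumphius", 1983, "original-date: 1648"))
print(in_text_citation("Rumphius", 2002, "original-date: 1648"))
```

An item with nothing in the Extra field simply falls back to the ordinary author-date form.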

I am very happy to share the revised stylesheet with anyone who is interested. And I’m feeling much braver about the next tussle with a publisher – I think I have a decent chance of making changes in a stylesheet to match a house style. But of course that assumes that the publisher can tell me what style to start from…