Simple Interlinear Glosses Shortcode Plugin for WordPress

[Last updated: 2012-02-16] For a long time I’ve been slightly annoyed at having to format interlinear glosses in HTML by hand. I had hoped that WordPress at least, as a widely used content management system, would have a plugin that does this automatically and to my liking. But as far as I can tell, nobody has published anything like that so far. So I finally sat down and wrote a shortcode plugin for very simple interlinear glosses myself, hoping that it may be useful for others, too. Especially blogging linguists and conlangers, of course :)

Now, what does this thing do? Basically, what you enter in your post or page editing pane is:

[gloss]Ang kamayan para dagās vala, bahu ca!
ang kama-yan para daga-as vala, bahu ca!
AF be_as_as-3PM quick turtle-P lovely, shout-IMP 3PM.LOC[/gloss]
‘If they are as quick as a lovely turtle, shout at them!’

And what you get out is:

Ang  kamayan       para   dagās     vala,    bahu       ca!
ang  kama-yan      para   daga-as   vala,    bahu       ca!
AF   be_as_as-3PM  quick  turtle-P  lovely,  shout-IMP  3PM.LOC

‘If they are as quick as a lovely turtle, shout at them!’

If you want to know the details, the HTML behind this formatted gloss looks basically like this:

<div class="cb-gloss">
  <dl style="display: inline-block;">
    <dt>Ang</dt>
    <dd>ang</dd>
    <dd>AF</dd>
  </dl>
</div>
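To make the structure concrete, here is a minimal sketch in Python of how such markup could be generated from a list of columns. It is not the plugin's actual code: the function name render_gloss is made up for illustration, and as a simplification the sketch escapes all markup in the cells, whereas the plugin lets a few basic styling tags through.

```python
from html import escape


def render_gloss(columns):
    """Render columns (each a list of strings, top line first) as HTML.

    Hypothetical helper: the first cell of a column becomes a <dt>,
    every following cell a <dd>, mirroring the markup shown above.
    """
    parts = ['<div class="cb-gloss">']
    for column in columns:
        if not column:
            continue  # skip empty columns instead of emitting an empty <dl>
        head, *rest = column
        parts.append('  <dl style="display: inline-block;">')
        parts.append(f"    <dt>{escape(head)}</dt>")
        for cell in rest:
            parts.append(f"    <dd>{escape(cell)}</dd>")
        parts.append("  </dl>")
    parts.append("</div>")
    return "\n".join(parts)


html_output = render_gloss([["Ang", "ang", "AF"]])
```

Each column is its own inline-block list, which is why the columns can wrap to the next line on narrow pages.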

The formatting of the elements can be adjusted in the style.css file in the plugin’s folder. The basic layout is hard-wired as inline styles, so columns ought to show up next to each other rather than underneath each other even when the article containing the gloss is read through RSS and the reader doesn’t load your website’s CSS definitions. This assumes, of course, that your RSS reader doesn’t strip CSS or even HTML completely.

The shortcode does not demand a certain number of lines, but simply maps the first, second, third etc. word of the first line you enter to the first, second, third etc. word of the subsequent lines. The main option the shortcode takes as an argument is div="…", which sets the dividing character for the individual columns. For example, if you choose div=" / " as your divider, the individual lines will be split at that sequence:

[gloss div=" / "]Ang / kamayan
ang / kama-yan
AF / be_as_as-3PM[/gloss]

becomes

Ang  kamayan
ang  kama-yan
AF   be_as_as-3PM
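
The mapping of words to columns can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's code: the function name split_into_columns is hypothetical, and padding shorter lines with empty cells is an assumption, since the post doesn't say what happens when lines differ in length.

```python
from itertools import zip_longest


def split_into_columns(text, div=" "):
    """Split each line at `div` and group the n-th items of all lines.

    Assumption: lines with fewer items are padded with empty strings.
    """
    lines = [line for line in text.strip().splitlines() if line.strip()]
    rows = [line.split(div) for line in lines]
    # zip_longest transposes rows into columns, padding short rows with "".
    return [list(column) for column in zip_longest(*rows, fillvalue="")]


columns = split_into_columns(
    "Ang / kamayan\nang / kama-yan\nAF / be_as_as-3PM", div=" / "
)
```

Here columns holds two lists, one per column, each with the three lines from top to bottom.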

If nothing is set, a blank space is assumed. You can also assign CSS classes to individual lines by simply putting their names into the tag, so for example [gloss "&nbsp;" ipa small] will assign no class to the first line, the class ipa to the second line and the class small to the third. You can also add the most basic styling to individual elements inside the [gloss] block, e.g. to emphasize certain parts:

[gloss]<strong>Example</strong>[/gloss]

becomes

​Example​

One thing the plugin does automatically is turn whatever it guesses to be an abbreviation for a functional morpheme into small capitals. The script assumes that every sequence of more than two capital letters, numbers, or punctuation marks is to be put in small caps. If you want to circumvent that, for example if acronyms are involved, you need to put the sequence into backticks, for example `CEO`. You can also put things into small caps explicitly by putting the pipe character around the sequence, for example |text|. Compare the following illustration:

[gloss]3SG\MASC.AGT CEO `CEO` text |text|[/gloss]

becomes

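3SG\MASC.AGT  CEO  CEO  text  text

(In the rendered post, 3SG\MASC.AGT, the first CEO and the last text appear in small caps; the backticked CEO and the plain text are left as typed.)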
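
The three rules (automatic guess, backticks to protect, pipes to force) can be sketched with one regular expression in Python. This is an illustration under assumptions, not the plugin's code: the set of punctuation marks that count, the function name small_caps and the wrapper class smallcaps are all made up here.

```python
import re

# One alternative per rule; earlier alternatives win at the same position.
TOKEN = re.compile(
    r"`(?P<literal>[^`]+)`"                      # `CEO`  -> left untouched
    r"|\|(?P<explicit>[^|\s](?:[^|]*[^|\s])?)\|"  # |text| -> forced, but not | text |
    r"|(?P<auto>[A-Z0-9.:/\\\-]{3,})"             # 3+ capitals, digits, punctuation
)


def small_caps(line):
    """Wrap guessed or explicitly marked sequences in a small-caps span."""
    def replace(match):
        if match.group("literal") is not None:
            return match.group("literal")  # drop the backticks, keep the text
        text = match.group("explicit") or match.group("auto")
        return f'<span class="smallcaps">{text}</span>'
    return TOKEN.sub(replace, line)


result = small_caps(r"3SG\MASC.AGT CEO `CEO` text |text|")
```

Because the backtick rule is tried first and replaced text is not scanned again, the protected CEO never reaches the automatic rule.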

Another function that relies on brackets is grouping multiple words into one unit. In this case, you need to put curly brackets around the text, for example {like this}. The script then replaces each regular space inside the brackets with a non-breaking one (i.e. &nbsp; or U+00A0), so the group is not split into separate columns. For example:

[gloss]ang kamayan para
ang kama-yan para
AF be_as_as-3PM quick
Ø {they are as} quick[/gloss]

becomes

ang  kamayan       para
ang  kama-yan      para
AF   be_as_as-3PM  quick
Ø    they are as   quick
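
The grouping step is easy to sketch in Python. Again this is an illustration rather than the plugin's code, and the function name group_words is hypothetical: the spaces inside curly brackets become U+00A0 before the line is split, so the group survives as a single item.

```python
import re

NBSP = "\u00a0"  # non-breaking space


def group_words(line):
    """Replace spaces inside {...} with non-breaking spaces and drop the brackets."""
    return re.sub(
        r"\{([^{}]*)\}",
        lambda match: match.group(1).replace(" ", NBSP),
        line,
    )


# Split on a literal space: a bare .split() would also split at U+00A0.
items = group_words("Ø {they are as} quick").split(" ")
```

Note the explicit split(" "): Python treats U+00A0 as whitespace, so calling split() without an argument would undo the grouping.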

As you can see, handling this is rather simple. You may of course tweak the code to suit your needs, since the plugin is open-source and has been released under GPLv2 as required by WordPress. I don’t consider myself a programmer, so if you find awkward coding that direly needs improvement, you may of course tell me, especially if bugs are involved.

Download at WordPress.org

To install, download the file, unzip it and upload the folder simple-interlinear-glosses to ./wp-content/plugins. You can then log in to WordPress as an administrator and activate the plugin on the Plugins page. I have tested this only with WordPress 3.1.3 and 3.3.1, so I can’t guarantee that the plugin works with much older versions.


Known Bugs, Issues, and Limitations

  • Because I applied for the plugin repository at WordPress.org, I had to rename some files as well as the plugin folder. If you installed an earlier version, just delete your cb-glosses folder, upload the simple-interlinear-glosses one and reactivate the plugin. (2012-02-15/16)
  • HTML tags like <a> and <span> aren’t supported yet because I still need to figure out how to keep the script from matching punctuation inside the tags. All HTML tags except <strong>, <b>, <em>, <i>, <s>, <strike>, <u>, <big>, <small>, <sup> and <sub> are currently stripped. (2012-02-15)
  • Updating overwrites the plugin’s files, including any changes you made to style.css, because the plugin is simple and doesn’t store settings in WordPress’s database. (2012-02-15)
  • Unfortunate, but important drawback: WordPress.com doesn’t allow you to use custom plugins on their servers in the unpaid version :( (2012-02-14)
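
To illustrate the tag limitation above, here is a rough Python sketch of stripping every tag that is not on an allow-list. It is not the plugin's implementation; the function name strip_unsupported_tags and the simple regular expression are assumptions, and a regex like this is no substitute for a real HTML parser.

```python
import re

# The basic styling tags listed above as supported.
ALLOWED = {"strong", "b", "em", "i", "s", "strike", "u", "big", "small", "sup", "sub"}

# Matches an opening or closing tag and captures its name.
TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def strip_unsupported_tags(text):
    """Remove every tag whose name is not in ALLOWED, keeping its inner text."""
    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in ALLOWED else ""
    return TAG.sub(keep_or_drop, text)


cleaned = strip_unsupported_tags('<a href="x"><strong>Example</strong></a>')
```

Only the tags are removed; the text between them stays in the gloss.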

2012-02-16:
Revision of 2nd release (0.2.1)

  • Upload to WordPress.org
  • Fixed some typos in readme.txt
  • Fixed some typos in README.pdf

2012-02-15:
2nd release (0.2)

2012-02-14:

  • FIX: Allow HTML markup tags in glosses: do not treat the sequence “</” as a trigger for small caps.
  • FIX: HTML tags now also get a zero-width space (U+200B) around them so that they don’t collide with the function that turns things into small caps. Only basic styling tags are supported, i.e. <strong>, <b>, <em>, <i>, <s>, <strike>, <u>, <big>, <small>, <sup> and <sub>. (FIXME)
  • FIX: Process only |this|, but not | this | and don’t eat the |.

2012-02-13:
Initial release (0.1)

7 thoughts on “Simple Interlinear Glosses Shortcode Plugin for WordPress”

  1. Before I even finish reading this blog post, let me say THANK YOU, CARSTEN! I’ll make sure that every conlanger that uses WordPress knows about this plugin by the end of the day.

  2. Just an update: My own RSS reader (the one that comes with Apple’s Mail application) does break up the coding, so single lines get broken up (so each set of words is its own table). To me, this is no big deal, because I use the reader to get a gist of the article, and if I’m interested, I go to the main site and read it there (as it was intended to be read). But I just thought I’d note it. Again, incredible work!

    • Hi David,

      thanks for advertising (also on Conlang-L) ;) Well, to my shame I must admit that I’ve basically only tested this in my browser while I was programming, though after I had published the article I was happy to see that it worked in my RSS reader as well (that is, Google Reader, which is browser-based). As I’ve already written in the description of the plugin, the formatting is done entirely with CSS, and the reason it doesn’t work in your reader is likely that it somehow doesn’t process the hardwired styles, so you are left without the nifty formatting. I’d expect an email application to process HTML, though, since emails and especially RSS feeds often contain it. Maybe it just doesn’t support display: inline-block? This is a little unfortunate because I had specifically looked for a way to avoid proper HTML tables and instead use dynamic alignment with CSS’s float or display: inline-block properties, so that you don’t need to worry about page widths: the overflow wraps to the next line automatically when it hits the right border. I guess the problem you describe is something I don’t know how to fix offhand :(

      • And just to reiterate: I was simply reporting the problem; I don’t actually think it’s a major concern. A CSS solution is far preferable to using HTML tables. It looks to me like the Mail app does, indeed, strip CSS, while rendering HTML correctly (so, e.g., if you use italics HTML tags, that shows up; if you use “font-style: italic;” in an associated CSS file, it doesn’t show up).

        Thinking long term, I don’t believe this is a problem that should be fixed. Ultimately, I believe CSS is going to enjoy broader support in more programs (not just web browsers), and eventually things like stand-alone RSS readers will catch up. Plus, it’s not as if the content is obscured—it’s still there, just spaced out a little.

        Anyway, look out for some interlinears in future posts from me. :)

  3. Pingback: A quote from Aurelian philosophy | jonafras.conlang.org

  4. Pingback: » Conlangery #39: Noun Incorporation Conlangery Podcast

  5. Pingback: » Rhaeshi Ajjalani Dothraki
