Wednesday 18 July 2018

Creating a proxy (parasite) project

Notes for the Oracc workshop participants at RAI64, Innsbruck, July 2018. Updated for the "Oracc for online teaching" workshop, August 2020.

Setting up

  1. I've emailed you your password and userID. Let me know if you haven't got it or if it doesn't work!
  2. Get yourself a free FTP programme such as Fugu, Cyberduck or WinSCP. You'll use this a lot to upload stuff to Oracc. Make sure you use SFTP (secure FTP) when you log in to build-oracc.museum.upenn.edu; you can't log in with insecure FTP.
  3. You'll also need to connect to Oracc with a command-line Unix terminal, to give commands to Oracc. Read this page for basic instructions on how to find and use one, including how to change your Oracc password.

Basics

  1. Creating a list of texts for your corpus
    • On your own computer, make a list of the P-numbers and/or Q-numbers you want in your corpus. Follow these instructions (but ignore the last sentence as I've already done that for you).
    • Save it as a text-only file with the name proxy.lst.
  2. Uploading your text-list to Oracc
    • Open your FTP programme and log in to build-oracc.museum.upenn.edu on the SFTP setting. Use your Oracc user ID and password.
    • You'll see a list of folders, most of which begin with 00. Look for 00lib and open it by double-clicking.
    • Drag and drop your proxy.lst file into this folder. You can edit and replace it as many times as you like, but make sure the file name stays the same.
  3. Building your corpus
    • Open Terminal or PuTTY and type: ssh [your userID]@build-oracc.museum.upenn.edu. Press return, enter your password, and press return again.
    • The first time you do this you'll get a scary message about security. Type yes and continue.
    • Now you're in Oracc. Type oracc build clean and press return to build your project from proxy.lst.
    • When the process has finished, see what it looks like at http://build-oracc.museum.upenn.edu/[your user ID].

You can repeat the three steps under Basics as many times as you like.
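
If you're comfortable in a terminal on your own computer, here's a minimal sketch of a sanity check you could run on proxy.lst before uploading it. It assumes each line is a qualified ID, i.e. a project name, a colon, then a P- or Q-number with six digits. That format is my assumption here, so check it against the linked instructions. The function name count_malformed and the example IDs are made up for illustration.

```shell
#!/bin/sh
# Count the lines in a proxy list that do NOT look like "project:P123456"
# or "project:Q123456". The pattern is an assumed format; adjust it if the
# Oracc instructions for your list say otherwise.
count_malformed() {
    grep -vcE '^[a-z0-9/]+:[PQ][0-9]{6}$' "$1"
}

# Write a small example list. These IDs are placeholders for illustration only.
cat > proxy.lst <<'EOF'
cdli:P123456
cdli:P234567
rinap/rinap1:Q003414
EOF

# Report how many lines need fixing before you upload the file to 00lib.
echo "malformed lines: $(count_malformed proxy.lst)"
```

Anything other than 0 means a line has a typo, a stray space or a missing project name. Blank lines are counted as malformed too.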

Next steps

  1. Organising your corpus into lists
    • If you want your texts to appear all together online but want to link to groups of them from your VLE, for instance to set weekly readings, you have two choices.
    • Both are described on Oracc here.
  2. Organising your corpus into subprojects
    • If you're using Oracc for several different classes, or want to keep each week's corpus separate, then you can very simply set up one or more subprojects. Each of them behaves like its own project.
    • See the Oracc website for more information.
  3. Building your glossary/glossaries
    • In the Terminal or PuTTY, type: oracc harvest. Press return. This command "harvests" all the available lemmatisations in your corpus across all ancient languages and the proper nouns. It will also generate a list of glossaries with the number of new entries in each.
    • Merge those entries, glossary by glossary, by typing oracc merge [lang-code] and pressing return, e.g., oracc merge sux, oracc merge akk-x-stdbab or oracc merge qpn.
    • If you want, there are more detailed instructions on the Oracc website.
    • Follow the instructions under "Building your corpus" above to build the new glossary entries into the website.
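
The glossary steps above can be strung together in one go. This is only a sketch under a few assumptions: the three language codes are the examples from this post, not necessarily the ones in your corpus, and I'm assuming the final rebuild is the same oracc build clean you used under Basics. The names ORACC, LANGS and update_glossaries are mine. By default the script does a dry run and just prints the commands, so you can see what would happen before running anything for real.

```shell
#!/bin/sh
# Dry run by default: ORACC is "echo oracc", so each command is printed, not run.
# On the Oracc server, set ORACC=oracc in the environment to execute them.
ORACC="${ORACC:-echo oracc}"

# Hypothetical selection: use the codes that "oracc harvest" lists for your corpus.
LANGS="sux akk-x-stdbab qpn"

update_glossaries() {
    # Collect the new lemmatisations from the corpus.
    $ORACC harvest
    # Merge the new entries, one glossary at a time.
    for lang in $LANGS; do
        $ORACC merge "$lang"
    done
    # Rebuild so the merged entries appear on the website (assumed rebuild command).
    $ORACC build clean
}

update_glossaries
```

In practice you'll probably want to run oracc harvest by hand first and read its list of glossaries, then edit LANGS to match before merging.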

Going further

  1. Managing the look and feel of your corpus
  2. Creating an XTML portal page/site for your corpus
  3. Editing your corpus
Contact me for a further tutorial on any or all of these things — and/or read up on the Oracc website.
Please also contact me if any of this is wrong or confusing!
