Access Planning in Progress: Text Transcription Issues
Thu, Apr. 1st, 2010 03:41 pm
One element she's working on is real-time captioning, abbreviated as CART or RTC. SF/F cons provide a peculiarly challenging environment for real-time captioning: we tend to all talk at once; we talk over each other; we use plenty of made-up words, names, and acronyms; and our discussions swoop unpredictably between grade-school humor and post-doc details (sometimes in one sentence).
CART is created by a highly trained steno-captionist (court reporter) who uses a chording keyboard to transcribe what speakers say, sound for sound. Computer software translates this into text, which is projected on a screen behind the speaker. This phonic-based system means that CART transcribers do best when they can program in names, neologisms, and acronyms in advance. Without that advance prep, ER SUE LA LUG WIN and I SACK AS HIM OFF might be showing up in a panel discussion. On the plus side, the CART transcript is verbatim, which creates a good record of the event.
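To make the "program it in advance" point concrete, here is a toy sketch of dictionary lookup with a longest-match rule. Everything in it is an assumption for illustration: the function name `translate`, the idea of representing each stroke as a plain string like "ER", and the two dictionaries are all made up, and real steno software and real chords are far richer than this.

```python
def translate(strokes, custom, fallback):
    """Toy model: prefer the longest multi-stroke custom entry,
    otherwise fall back to a stroke-by-stroke guess."""
    out = []
    i = 0
    # Longest custom entry, measured in strokes (1 if there are none).
    longest = max((len(k) for k in custom), default=1)
    while i < len(strokes):
        for n in range(min(longest, len(strokes) - i), 0, -1):
            key = tuple(strokes[i:i + n])
            if key in custom:
                out.append(custom[key])
                i += n
                break
        else:
            # No prepared entry: emit whatever the single stroke sounds like.
            out.append(fallback.get(strokes[i], strokes[i]))
            i += 1
    return " ".join(out)


# Hypothetical strokes and entries, invented for this example.
strokes = ["ER", "SUE", "LA", "LUG", "WIN"]
fallback = {s: s for s in strokes}
prepared = {("ER", "SUE", "LA"): "Ursula", ("LUG", "WIN"): "Le Guin"}

unprepared_text = translate(strokes, {}, fallback)
prepared_text = translate(strokes, prepared, fallback)
```

With an empty custom dictionary the name comes out sound by sound; with the two prepared entries the same strokes come out as the name.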
There's another approach to text-based transcription: "meaning for meaning" or "m4m" systems. At present there are two in the U.S.: TypeWell and C-PRINT. Both provide online training which prepares a transcriber in 60 hours or less. The transcriber uses a standard laptop with extensive abbreviation-expansion software, and basically liveblogs the event. The same concerns arise with personal names; the finished transcript is briefer and hopefully meatier. RTC stenocaptionists earn a minimum of $120/hour; TypeWell transcribers start at around $50/hour.
You can read a spirited discussion of the pros and cons of CART and TypeWell in the college classroom in the Deafness section at About.com. Jamie Berke has been editing this section for decades, and she totally knows her stuff.
Finally, here's a good elevator overview of the assistive technologies most helpful for people who have hearing impairments.
Sign language interpreters are a whole 'nother post.
(no subject)
Date: 2010-04-03 03:37 am (UTC)
(no subject)
Date: 2010-04-08 10:56 pm (UTC)
(no subject)
Date: 2010-04-08 11:35 pm (UTC)
And again, it depends on the skill of the captioner-- typing skill for stenotyping, enunciation for voicewriting. CapTel uses voicewriting, and I've had some calls that were transcribed awesomely and some where there were constant mis-transcriptions.
(no subject)
Date: 2010-04-08 08:13 am (UTC)
Though depending on the con, it seems that that would be a strikingly inefficient way to actually do it, since it's a huge body of words and some of them are just regular words with idiosyncratic definitions, and some of them are fandom-specific and not likely to come up at a more focused con.
If you took all the words from the last X months of journals from some of the attending fen in question, and/or journals of that segment of fandom (or representative samples of any/all identifiable segments of fandom that will be attending), sorted the words in order of frequency, compared it to a basic dictionary / that operator's dictionary, dumped all the duplicates, and then looked at the non-duplicates in order of frequency ...
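The word-frequency idea above can be sketched in a few lines. This is a minimal sketch, not a tested tool: the function name `candidate_terms`, the `top_n` cutoff, and the simple letters-and-apostrophes tokenizer are all assumptions, and how you would actually collect the journal text is left out entirely.

```python
import re
from collections import Counter


def candidate_terms(journal_texts, known_words, top_n=50):
    """Count words across the texts, drop the ones already in the
    dictionary, and return the rest, most frequent first."""
    counts = Counter()
    for text in journal_texts:
        # Lowercase, then keep runs of letters and apostrophes as words.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    known = {w.lower() for w in known_words}
    unknown = Counter({w: c for w, c in counts.items() if w not in known})
    return unknown.most_common(top_n)


# Made-up sample text and a made-up "basic dictionary".
texts = ["The fen love filk. Filk is great.", "More filk and fen at the con."]
basic_dictionary = ["the", "love", "is", "great", "more", "and", "at"]
terms = candidate_terms(texts, basic_dictionary)
```

One thing this does not catch is the case raised earlier in the comment: regular words with idiosyncratic definitions are already in the dictionary, so they get filtered out along with everything else.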
This has me thinking about which words a hypothetical CART operator would have to program into their dictionary if they were attending a Dreamwidth developer/volunteer/enthusiast meetup. I don't know whether that situation will ever come up, but I'm going to start keeping this in mind when maintaining the internal jargon page.
(no subject)
Date: 2010-04-08 10:58 pm (UTC)
There's a project to standardize science, technology and math ASL signs at
http://aslstem.cs.washington.edu
(no subject)
Date: 2010-04-08 12:08 pm (UTC)
I'll add a couple more specifics here...
1) While what we're looking at is SF-con based, it is not actually a SF con - we're looking at a small Pagan hotel-based convention.
2) Which means we still have language that will be unfamiliar to the transcriptionist. (But we can almost certainly provide a pretty accurate dictionary to code in advance, once we know which specific topics are going to be transcribed.)
3) But which also means that word-for-word is actually likely to be substantially more accurate for us than meaning-for-meaning, unless we happened to find someone who's got the transcription skills *and* the Pagan community background knowledge.
4) We're expecting there will be a keynote, and that probably 2/3 or so of the events will be single presenter + audience, rather than a panel. (I expect that for cost reasons, we're looking at transcription for the keynote, one other large event, and we might be able to provide it for other events if the budget and attendance go the way I hope, because we just got a *very* nice bid from the hotel I really wanted.) But that also makes transcription somewhat easier.
5) We're lucky to already know an excellent CART provider locally. A good friend of mine had major surgery last summer, and had CART services provided during her hospital stay and rehab, as she's adult-hearing-loss and really wanted to be sure she understood conversations with doctors, PT, etc. (If you're in Minneapolis/St. Paul, Minnesota: http://www.paradigmreporting.com/ )
We had all of the transcriptionists from that company who do anything other than court reporting over the course of those weeks, and they were a) all neat people who were highly competent at what they were doing and b) did really well with our casual conversation (which included wide-ranging topics in both SF and Paganism, actually) once we gave them a few basic terms and concepts to run with. The company also does art events, so they're used to doing things that aren't straight business meetings, medical info, etc. (From various conversations: not all companies are, and folks who *only* do business/medical are likely to have a harder time getting up to speed in other types of conversation at first.)
Which means we're pretty sure they'd be able to handle what we're looking for competently (and from conversations, because we were already thinking about this last summer, would probably find it interesting and intriguing, which is especially nice).
But I'm also fascinated by the other approaches to transcription, so the info was really handy.
(no subject)
Date: 2010-04-08 11:01 pm (UTC)
Good point that in a sacred, ceremonial context, word-for-word better reflects the meaning of the proceedings than any m4m.
And triple yay! for good hotel bids!