OAQA / OpenQA Setup Guide

We consider this guide obsolete, as we think we now have nicer software available: see BlanQA!

BlanQA has nicer source code, is easier to clone and set up (no Indri JNI!), can run on top of a variety of text corpora (e.g. Wikipedia!), and features an interactive question-answering mode (and an IRC gateway); we are working to make it even better. Its algorithms are not yet as advanced as those of helloqa-prototype (below), though.

This is a step-by-step setup guide for your very own IBM Watson-like software!

Here, we will set up a Linux-based, open-source question answering system that can process a free-text corpus and answer free-form English questions (with fairly low precision). Note that its performance and precision are obviously nothing like those of DeepQA / IBM Watson, but it's a start, and it is built on the same foundations as the IBM project.

Note that the outcome is still an executable that is not user-friendly at all. It just spews a lot of cryptic output; it does not load Wikipedia and open a nice conversation window. It answers questions from batch files, and you will need to wade through its debug output to figure out what's going on and what its answers are. It's all very experimental at this point.

OpenQA was developed mainly at CMU (probably with indirect support from IBM). It is being tweaked for a more user-friendly setup by Pasky @ brmlab. OpenQA uses UIMA by IBM+Apache, Indri by the Lemur project, etc.

See the brmson project page to learn more about the results of our research on state-of-the-art open-source QA systems.

Installation

To get this running on your machine (Debian Wheezy or similar), it should be enough to follow these steps, command by command: