[[Brmson]]
 

Differences

This shows you the differences between two versions of the page.

project:brmson [2014/01/03 20:30]
pasky
project:brmson [2015/04/28 20:35] (current)
pasky
Line 1: Line 1:
-====== Brmson ======
+====== Brmson ======
  
 {{template>infobox|
Line 11: Line 11:
 }}
  
-Our very own [[wp>IBM Watson]] - approximated using open source technology, replicated in the brmlab hackerspace environment
+Our very own [[wp>IBM Watson]] - approximated using open source technology, replicated in a non-supercomputer environment.
-
+
-The first goal is to build a system that can chew on few open semantic databases, Wikipedia and Sbrm and then be able to answer general questions like "List the biggest nuclear explosions in Russia" or "When do I pay brmlab membership fees?" or "What was the best time travel movie?". The primary language in this phase will be English.
+
 
+The goal is to build a system that can chew on a few open semantic databases, Wikipedia and Sbrm and then be able to answer general questions like "List the biggest nuclear explosions in Russia" or "When do I pay brmlab membership fees?" or "What was the best time travel movie?". The primary language in this phase will be English.
 Then, we can take things further - start supporting more advanced inference ("What resistance do I need in series with a random red LED on 5V?"), add some autonomous goal-based processing etc. Only the Strong AI is the limit!
  
-The current primary focus is on working software stack (regardless of speed), then we can think of how to speed this up using cluster technologies.
+We already **have a working software stack** with reasonable performance, currently focusing on reviewing it for bugs and wrapping it up for a milestone scientific paper publication. Speed has not been a focus so far. Accuracy on our 430 trivia question testset is a little above 30% as of Jan 2015.
  
 We aim to do as //little// coding as possible, at least initially, instead focusing on integration of existing technologies. Most impressive initial results in the shortest time! :-)
 +
 +**Homepage: [[http://ailao.eu/yodaqa/]]**
 +
 +**Live demo: [[http://live.ailao.eu/]]**
 +
 +Pre-print of the first paper on brmson: [[http://pasky.or.cz/dev/brmson/yodaqa-poster2015.pdf]]
 +
 +(Original story on Pasky's blog: [[http://log.or.cz/?p=317]])
 +
 +===== Status and Planning =====
 +
 +A question-answering engine "[[https://github.com/brmson/yodaqa|YodaQA]]" (custom-made from the ground up) by Pasky is set up on his home server (AMD FX-8350, 24G RAM), together with an enwiki fulltext index and DBpedia. It is connected to IRC and hangs out at #brmson on freenode.
 +
 +All our code, incl. documentation and setup instructions, is **open source** and lives in the [[https://github.com/brmson|GitHub brmson organization]].
 +
 +===== (Historical) Knowledge Base =====
  
 Starting points:
Line 30: Line 45:
  
 In bold are our current choices that we are running with.
- 
-===== Status ===== 
- 
-UIMA toolchain and OAQA HelloQA example is being set up at brmson.dyn.brm (virtual machine on sargon). 
- 
-===== Plan ===== 
- 
-Next steps: 
-  * Study OAQA more 
-    * Set up UIMA, few data sources and OAQA somewhere 
-    * Try it out! 
- 
-===== Knolwedge Base ===== 
  
 ==== Data Sources ====
Line 53: Line 55:
     * DBPedia
   * Unstructured:
-    * Wikipedia, Everything2 ?, Wiktionary
+    * **Wikipedia**, Everything2 ?, Wiktionary
+    * TVTropes, Urban Dictionary
     * News articles (theregister, /., bbc, cnn, reuters)
     * Sbrm, laws, patents, ...
Line 76: Line 79:
  
 Off-the-shelf solutions:
-  * **OAQA** http://oaqa.github.io/
+  * **OpenQA/OAQA** http://oaqa.github.io/
     * Opensource framework (on top of UIMA) that seems very close to actual IBM Watson tech; https://github.com/oaqa
     * Some inspiration may come from the old website? See e.g. https://mu.lti.cs.cmu.edu/trac/oaqa/wiki/OAQADocumentation/Architecture
     * [[http://domino.watson.ibm.com/library/CyberDig.nsf/1e4115aea78b6e7c85256b360066f0d4/d12791eaa13bb952852575a1004a055c?OpenDocument&Highlight=0,rc24789|Joint IBM-CMU paper - Open Advancement of Question Answering]] (some interesting problems defined! esp. "learning by reading", "sustained investigation")
 +    * Full-fledged OAQA pipeline instances publicly available:
 +      * We have rolled our own, **[[https://​github.com/​brmson/​blanqa|blanqa]]**,​ loosely inspired by the DSO project codebase
 +      * [[https://​github.com/​oaqa/​helloqa/​tree/​prototype|helloqa-prototype]] ([[https://​github.com/​oaqa/​helloqa/​wiki/​DSO-Project|DSO project]]) - for setup instructions,​ see [[project/​brmson/​helloqa-prototype-howto]]
 +      * [[https://​github.com/​rzhao1/​helloqa/​tree/​prototype|rzhao-prototype]]
   * UIMA-based http://www.iiitb.ac.in/sites/default/files/uploads/IIITB-TR-2012-001.pdf http://sourceforge.net/projects/questnanswering/
-    * Small gradstudent project, but it may be a good prototyping base; investigate
+    * Small gradstudent project, but it may be a good prototyping base; got no response from affiliated people
   * OpenEphyra http://www.ephyra.info/
-    * Seems to be a kind of obsolete, old-style solution? but ready-to-use; fallback
+    * Seems to be a kind of obsolete, old-style solution? but ready-to-use; we can (and do) use some of its components in OAQA-based solution, at least as placeholders for better solutions
 +  * QA component of the "​Taming Text" book's codebase https://​github.com/​tamingtext/​book 
 +    * Simple, a bit hackish, tightly integrated with solr
  
 Custom (Watson-inspired) solution structure:
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported