The ARX approach is described in:
Matthew Michelson and Craig A. Knoblock, "Unsupervised Information Extraction from Unstructured, Ungrammatical Data Sources on the World Wide Web," International Journal of Document Analysis and Recognition (IJDAR), Special Issue on Noisy Text Analytics, 10, pp. 211-226, 2007.
You can read the paper here (bibtex).
The Phoebus approach is described in:
Matthew Michelson and Craig A. Knoblock, "Creating Relational Data from Unstructured and Ungrammatical Data Sources," Journal of Artificial Intelligence Research (JAIR), 31, pp. 543-590, 2008.
You can read the paper here (bibtex).
To build and run the software, download ARXPhoebus.zip and unzip it. The README.txt file in the archive explains in detail how to configure and run the system. Note before you start that the software requires Java and MySQL. Please direct all inquiries to Matt Michelson (here).