MetaSeeker Toolkits

The Web is often said to be exploding, because a huge amount of data is added to it every day. To retrieve information and knowledge from the Web effectively, that data must be processed automatically by computers. This can be called Web automation; examples include Mashup services, Web portals, and Web integration. Before such services can be built, one key problem must be solved: computers do not know how to manipulate the data on the Web. The solution is not to tell computers what the data means. Instead, it is to tell them what the data is about, that is, to supply metadata, or a data schema, which plays the same role as a database schema. Computers can then aggregate and collate data on the Web just as they would with a relational database. Unfortunately, data schemas are not easy to recognize, because content on the Web today is presented for human readers rather than for computers.

MetaSeeker is a toolkit built for exactly this purpose: defining data schemas and extracting structured data from the Web. It provides convenient ways to define the schemas of Web pages, to generate wrappers without coding, and to extract data efficiently. Its distinguishing feature is that all of this is done in a distributed environment and in a collaborative manner.
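As a rough illustration of the idea of a schema-driven wrapper, the sketch below uses only the Python standard library. It is not MetaSeeker's file format or API: the schema layout, the class names `book-title` and `book-price`, the `SchemaWrapper` class, and the sample page are all hypothetical, chosen only to show how a declared schema can tell a program which parts of a page carry which fields.

```python
from html.parser import HTMLParser

# Hypothetical schema: field name -> HTML class attribute that marks the field.
# This is an assumption for illustration, not a MetaSeeker schema file.
SCHEMA = {"title": "book-title", "price": "book-price"}


class SchemaWrapper(HTMLParser):
    """Collects the text of elements whose class matches a schema field."""

    def __init__(self, schema):
        super().__init__()
        # Invert the schema so a class name can be looked up directly.
        self.class_to_field = {cls: field for field, cls in schema.items()}
        self.current_field = None
        self.record = {}

    def handle_starttag(self, tag, attrs):
        # An element may carry several space-separated class names.
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in self.class_to_field:
                self.current_field = self.class_to_field[cls]

    def handle_data(self, data):
        # Store the first non-empty text found after a matching start tag.
        if self.current_field and data.strip():
            self.record[self.current_field] = data.strip()
            self.current_field = None


def extract(html, schema):
    """Return one structured record extracted from html according to schema."""
    parser = SchemaWrapper(schema)
    parser.feed(html)
    parser.close()
    return parser.record


# Hypothetical sample page, written for human readers but marked up by class.
SAMPLE_PAGE = (
    '<div><span class="book-title">Web Data Basics</span>'
    '<span class="book-price">12.50</span></div>'
)

record = extract(SAMPLE_PAGE, SCHEMA)
print(record)
```

The point of the design is that the extraction logic stays generic: supporting a different page layout means supplying a different schema, not writing a new parser by hand.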



MetaSeeker's Target Users

MetaSeeker is a valuable toolkit for individuals or enterprises that plan to provide the following services:

  1. vertical search engines (also called professional search engines)
  2. information aggregation portals
  3. Mashup services
  4. intelligent agents
  5. personalized information retrieval systems
  6. information mining facilities

The MetaSeeker toolkit provides a series of tools that semantically describe the data schemas of target Web pages, construct Data Schema Specification Files and Data and Clue Extraction Instruction Files, continuously extract information in bulk from the Web, and produce and store Data Extraction Result Files with semantic metadata. All of these activities are needed to collect content when building information services.



MetaSeeker's Tools



Professional Web Data Extraction and Web Integration Services

Although Web data extraction is a concept that emerged in the previous century, it remains a challenge for most Web site operators because data on the Web comes in very different formats. Owners of information aggregation services have to spend a lot of money on data extraction, most of it on implementing a separate HTML wrapper for each target site. Although MetaSeeker sharply cuts the cost of building and operating an information aggregation service, site owners may still want to outsource Web data extraction or Web integration tasks so that they can focus on their core business. We provide two paid professional services: