Paper Title
Demonstration of LogicLib: An Expressive Multi-Language Interface over Scalable Datalog System
Authors
Abstract
With the ever-increasing volume of data, there is an urgent need for expressive and efficient tools to support Big Data analytics. The declarative logic language Datalog has proven very effective at concisely expressing graph, machine learning, and knowledge discovery applications via recursive queries. In this demonstration, we present Logic Library (LLib), a library of recursive algorithms written in Datalog that can be executed in BigDatalog, a Datalog engine we developed on top of Apache Spark. LLib encapsulates complex logic-based algorithms in high-level APIs, which simplify development and provide a unified interface akin to that of Spark MLlib. Because LLib is fully compatible with DataFrame, its built-in applications and new Datalog queries can be used together with existing Spark functions, such as those provided by MLlib and Spark SQL. Through a variety of examples, we will (i) show how to write programs with LLib to express a range of applications; (ii) illustrate its user experience in the Apache Spark ecosystem; and (iii) present a user-friendly interface for interacting with the LLib framework and monitoring query results.
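To make the notion of a recursive Datalog query concrete, the following is a minimal sketch of the textbook transitive-closure program, evaluated semi-naively in plain Python. It is an illustration only: the two rules in the comment are the standard example and are not taken from LLib, the function name `transitive_closure` is hypothetical, and nothing here uses LLib's API or reflects how BigDatalog executes queries on Spark.

```python
# Datalog program being evaluated (standard textbook example, shown as a comment):
#   tc(X, Y) <- edge(X, Y).
#   tc(X, Y) <- tc(X, Z), edge(Z, Y).


def transitive_closure(edges):
    """Semi-naive evaluation of the two tc rules over (src, dst) pairs."""
    edges = set(edges)

    # Index edges by source so the join tc(X, Z), edge(Z, Y) is a lookup on Z.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    tc = set(edges)      # facts produced by the base rule
    delta = set(edges)   # facts that were new in the latest iteration
    while delta:
        # Apply the recursive rule only to the newly derived facts.
        derived = {(x, y) for (x, z) in delta for y in successors.get(z, ())}
        delta = derived - tc   # keep only facts not seen before
        tc |= delta
    return tc


if __name__ == "__main__":
    example_edges = [(1, 2), (2, 3), (3, 4)]
    print(sorted(transitive_closure(example_edges)))
    # prints [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

The loop stops once an iteration derives no new facts, which is the fixpoint that gives a recursive Datalog query its meaning. Joining only the newly derived facts against `edge` avoids recomputing pairs already known.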