Big data row and column mixed permissions fine management practice

朱江

Chinese Session 2023-08-18 16:15 GMT+8  #olap

Background: In recent years, the issue of data security has gradually attracted the attention of governments and enterprises. With the promulgation and implementation of national data security Law and Personal Information Protection Law, clear requirements have been put forward for the principle of minimum sufficient data. Therefore, how to manage permissions at a more granular level has become a problem that every enterprise must solve. Current Issues: The industry usually extracts permission points in SQL based on rules, and manages these permission points horizontally according to the row dimension or vertically according to the column dimension. The granularity of the single-dimension permission control is too coarse to support the combination relationship between multiple permissions. It is difficult to meet the requirements of fine-grained data permission control in the scenario of large-width table with multi-service line unified storage such as Bytedance.

Solution: Based on the above problems, Bytedance designed a set of fine management scheme of column and row mixed permissions based on Apache Calcite and self-developed permission service Gemini.

  • Accurate access point extraction based on Calcite consanguinity
  • Based on consanguineous ability, accurately locate the permission point information (table, row, column, etc.) that is really used in SQL, and carry out refined permission extraction.
  • Row and column mixed permission Multi-dimensional permission control
  • On top of the traditional library permissions, table permissions, and column permissions, a new row restriction permission has been added. The row permissions can be attached to the table/column permissions as a special resource.
  • Each table permission/column permission can bind multiple row permission resources at the same time, and the row restrictions of different table permission/column permissions are independent of each other.
  • Locate query resources to ‘resource cells’ with overlapping rows and rows through a bundle combination of horizontal/vertical permission points to achieve more granular resource level permissions

Program advantages: Under the new scheme, resource control is refined from a horizontal row or vertical column to a ‘resource cell’ with overlapping rows and columns through precise fine-grained access point extraction and multi-dimensional mixed access support. The scope of permission control is further refined, and the required permissions are granted at the minimum granularity on the premise of ensuring the normal use of users. Specific typical cases and implementation principles will be introduced in the presentation PPT.

Speakers:


Zhu Jiang: bytedance, R&d engineer, Bytedance R&D engineer